[DS&AL] 트라이(Trie) 설명

February 7, 2026

Trie

Trie는 문자열을 저장하고 효율적으로 탐색하기 위한 트리 형태의 자료 구조를 말합니다. 검색 시 사용하는 자동 완성, 사전 검색 등의 문자열을 탐색하는 것에 특화되어 있는 자료 구조라고 할 수 있습니다.

Trie 알고리즘의 작동원리는 다음과 같습니다.

한 문자열에서 다ㅓ음에 나오는 문자가 현재 문자의 자식 노드가 되고, 위 이미지의 빨간색 노드가 문자열의 끝을 의미합니다.
다음 글자에 해당하는 노드가 연결되어 있다면, 그 노드를 계속 타고 따라갑니다.
루트에서 내려가는 경로에서 만나는 글자들을 모으면, 찾고자 하는 문자열의 접두사를 얻을 수 있습니다.
“rebro”를 예시로 든다면,
- r -> re -> reb -> rebr -> rebro로 이어집니다. 각 노드는 접두사를 별도로 저장할 필요 없이, 문자 하나만 저장해도 충분합니다.

Trie의 시간복잡도는 다음과 같이 설명할 수 있습니다.

제일 긴 문자열 길이를 L, 총 문자열 수를 M이라 했을 때, 생성 시간복잡도는 O(ML)
- 모든 문자열들을 넣어야 하는데, 이 Trie 구조에 넣는 것은 가장 긴 문자열만큼 시간이 소요됨.
탐색 시 시간복잡도 : O(L)

Trie 구현

Trie는 파이썬으로 다음과 같이 구현할 수 있습니다.

from collections import deque

class Node:
    def __init__(self, key, data=None):
        self.key = key
        self.data = data      # 단어 끝이면 전체 문자열 저장 (flag 역할)
        self.children = {}    # char -> Node
class Trie:
    def __init__(self):
        self.head = Node(None)

    def insert(self, string):
        current_node = self.head
        for char in string:
            if char not in current_node.children:
                current_node.children[char] = Node(char)
            current_node = current_node.children[char]
        current_node.data = string
    def search(self, string):
        current_node = self.head
        for char in string:
            if char not in current_node.children:
                return False
            current_node = current_node.children[char]
        return current_node.data is not None
    def starts_with(self, prefix):
        current_node = self.head

        # 1) prefix 노드까지 이동
        for ch in prefix:
            if ch not in current_node.children:
                return []   # None 대신 빈 리스트 추천
            current_node = current_node.children[ch]

        # 2) prefix 아래 서브트리 BFS로 단어 수집
        words = []
        q = deque([current_node])
        while q:
            node = q.popleft()
            if node.data is not None:
                words.append(node.data)
            for child in node.children.values():
                q.append(child)

        return words
words_in_image = ["rebro", "replay", "hi", "high", "algo"]

trie = Trie()
for w in words_in_image:
    trie.insert(w)

# --- search 테스트 ---
print(trie.search("hi"))      # True
print(trie.search("high"))    # True
print(trie.search("h"))       # False
print(trie.search("replay"))  # True
print(trie.search("rebro"))   # True
print(trie.search("algo"))    # True
print(trie.search("alg"))     # False

# --- prefix 테스트 ---
print(trie.starts_with("h"))   # ['hi', 'high'] (순서는 BFS 탐색이라 달라질 수 있음)
print(trie.starts_with("re"))  # ['rebro', 'replay'] (순서 달라질 수 있음)
print(trie.starts_with("a"))   # ['algo']
print(trie.starts_with("z"))   # []

코드의 각 부분에 대해 먼저 설명드리면,

class Node:
    def __init__(self, key, data=None):
        self.key = key              # 노드에 저장된 문자
        self.data = data            # 단어 끝이면 전체 문자열 저장 (flag 역할)
        self.children = {}          # char -> Node

key는 노드가 대표하는 문자를 말하고, 루트 노드의 경우는 문자가 없기 때문에, None으로 표시됩니다.
data를 통해 단어가 여기서 끝나는지를 표시합니다. 문자열의 종료를 알리는 flag로서, 문자열이 종료된다면 전체 문자열을 저장합니다.
children은 자식 노드들의 정보를 나타내며, K:V는 다음문자와 그 문자에 해당하는 노드를 말하빈다. 예시는 다음과 같이 저장됩니다.
- node.children['i'] = Node('i')

Trie 클래스를 다음으로 이제 구축합니다.

class Trie:
    def __init__(self):
        self.head = Node(None)

Trie 클래스에서는 head가 Trie의 루트를 말합니다. 루트는 이전에서 설명드렸다시피, 특정 문자를 의미하거나 담지 않기 때문에, key=None으로 저장됩니다.

다음으로 단어를 삽입하는 insert 함수에 대한 내용입니다.

def insert(self, string: str) -> None:
    current_node = self.head
    for char in string:
        if char not in current_node.children:
            current_node.children[char] = Node(char)
        current_node = current_node.children[char]
    current_node.data = string

단어를 삽입하는 과정은 다음과 같은 흐름으로 이어집니다.

current_node (현재 노드)를 루트 노드(head)부터 시작
문자열을 한 글자씩 순회
현재 노드의 자식들 중 char가 없으면, 새로운 노드 생성 (새로운 갈림길을 만드는 과정입니다.)
char가 있다면 그 자식 노드로 이동
마지막 글자까지 끝났으면, 그 노드에 data=string을 넣어 단어의 끝을 표시합니다.

다음으로는 단어가 정확하게 존재하는지 확인하는 search 함수에 대한 설명입니다.

def search(self, string: str) -> bool:
    current_node = self.head
    for char in string:
        if char not in current_node.children:
            return False
        current_node = current_node.children[char]
    return current_node.data is not None

Search에서는 단어의 모든 문자를 따라 내려갈 수 있어야 하며, 마지막 노드가 단어 종료 표시 (data)를 가지고 있어야 정확히 그 단어가 존재한다고 봅니다.

예를 들어 설명드리면, “h” vs “hi”일 때, “hi”가 있어도, “h”가 단어로 저장되지 않았다면 data가 없기 때문에 False로 기록됩니다.

다음으로는 이제 접두사로 시작하는 모든 단어를 찾는 starts_with 함수입니다.

def starts_with(self, prefix: str):
    current_node = self.head

    # 1) prefix 노드까지 이동
    for ch in prefix:
        if ch not in current_node.children:
            return []
        current_node = current_node.children[ch]

    # 2) prefix 아래 서브트리 BFS로 단어 수집
    words = []
    q = deque([current_node])

    while q:
        node = q.popleft()
        if node.data is not None:
            words.append(node.data)
        for child in node.children.values():
            q.append(child)

    return words

이 과정에서는 크게 다음의 과정을 수행합니다.

먼저, prefix 노드까지 내려가는 과정에서 루트 노드에서부터 이동합니다. 이 과정에서, 중간에 경로가 끊기게 되면 (즉, prefix)가 존재하지 않는다면 []를 반환합니다.

다음으로, 서브트리에서 단어를 전부 수집합니다. prefix 노드 아래에는 prefix로 시작하는 단어들이 모두 매달려 있는데, BFS 큐를 돌리면서 data가 있으면 단어의 끝이므로 words에 추가하고, 자식 노드들은 전부 큐에 넣어서 계속 탐색합니다.

이 과정들을 거쳐 Trie 알고리즘이 완성됩니다.

[DS&AL] 트라이(Trie) 설명

Trie

Trie 구현

Leave a comment

You may also enjoy

[Paper Review] Learning Effective Language Representations for Sequential Recommendation via Joint Embedding Predictive Architecture (COLM 2025)

[Paper Review] Trie-Aware Transformers for Generative Recommendation

[Paper Review] Token-level Collaborative Alignment for LLM-based Generative Recommendation (WWW 2026)

[DS&AL] Heap 자료구조