Trie Tree


Definition: Trie or prefix tree is a tree data structure, which is used for retrieval of a key in a dataset of strings.

Time Complexity: Trie could use less space compared to Hash Table when storing many keys with the same prefix. In this case using trie has only O(m) time complexity, where m is the key length. Searching for a key in a balanced tree costs O(mlogn) time complexity.


Tree Node Structure

Trie is a rooted tree. Each nodes have the following fields:

  • Maximum of R links to its children, where each link corresponds to one of R character values from dataset alphabet. We can assume that R is 26, the number of lowercase latin letters.
  • Boolean field which specifies whether the node is the end of the key, or is just a key prefix.
  • Can assume the path between parent and child represent the character!

class TrieNode {
  TrieNode[] children;
  boolean isEnd;

  public TrieNode() {
    children = new TrieNode[26];
    isEnd = false;
  }

  public void setEnd() { isEnd = true; }

  public boolean containsChar(char c) {
    return children[c - 'a'] != null;
  } 

  public void addChild(char c) {
    children[c - 'a'] = new TrieNode();
  }

  public TrieNode getChild(char c) {
    return children[c - 'a'];
  }
}

Insert a Key

We start from the root and search a link which corresponds to the first key character. There are two cases :

  • A link exists. Then we move down the tree following the link to the next child level.
  • A link does not exist. Then we create a new node and link it with the parent's link matching the current key character. We repeat this step until we encounter the last character of the key, then we mark the current node as an end node and the algorithm finishes.

Complexity Analysis: O(n) in time O(n) in space, worst case when none of the characters in inserting string contained in TrieTree

    public void insert(String word) {
        if (word == null || word.length() == 0) {
            return;
        }

        TrieNode pointer = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (!pointer.containsChar(c)) {
                pointer.addChild(c);
            } 
            pointer = pointer.getChild(c);
        }

        pointer.setEnd();
    }

Search for a key

Each key is represented in the trie as a path from the root to the internal node or leaf. We start from the root with the first key character. We examine the current node for a link corresponding to the key character. There are two cases :

  • A link exist. We move to the next node in the path following this link, and proceed searching for the next key character until we find the last character of the key and the corresponding TrieNode is set as end;
  • A link does not exist. We can't find the path from first character to the last character, or this path exists but the last TrieNode is not marked as end.

Complexity Analysis: O(n) in time O(1) in space

    public boolean search(String word) {
        if (word == null || word.length() == 0) {
            return true;
        }

        TrieNode pointer = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (!pointer.containsChar(c)) {
                return false;
            } 
            pointer = pointer.getChild(c);
        }

        return pointer.isEnd();
    }

Search for a key prefix

Searching a prefix in TrieTree is very similar to search for a string. The only difference is that when we reach the last character, we don't need to check if this node is marked as end. Complexity Analysis: O(n) in time O(1) in space

    public boolean search(String word) {
        if (word == null || word.length() == 0) {
            return true;
        }

        TrieNode pointer = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (!pointer.containsChar(c)) {
                return false;
            } 
            pointer = pointer.getChild(c);
        }

        return true;
    }

Get all words start with a prefix

Given a prefix. find all words in this TrieTree start with this prefix. We can use DFS to traverse the TrieTree from the end node of the prefix. Complexity Analysis: O(E + V) in time O(1) in space

    public List<String> prefix(String prefix) {
        List<String> result = new ArrayList<>();
        if (prefix == null || prefix.length() == 0) {
            return result;
        }

        TrieNode temp = root;
        for (int i = 0; i < prefix.length(); ++i) {
            char c = prefix.charAt(i);
            if (!temp.containsChild(c)) {
                return result;
            }
            temp = temp.getChild(c);
        }

        StringBuilder sb = new StringBuilder();
        dfs(temp, result, sb, prefix);
        return result;
    }

    private void dfs(TrieNode head, List<String> result, StringBuilder sb, String prefix) {
        if (head == null) {
            return;
        }
        if (head.isEnd) {
            result.add(prefix + sb.toString());
        }

        for (int i = 0; i < head.children.length; ++i) {
            sb.append((char)('a' + i));
            dfs(head.children[i], result, sb, prefix);
            sb.deleteCharAt(sb.length() - 1);
        }
    }

results matching ""

    No results matching ""