Index Every Suffix Once to Answer Any Substring Query in O(m)
Suffix Tries and Substring Queries
A suffix trie inserts every suffix of a string into one trie so that any substring query runs in O(m) time; learn when to upgrade to a suffix tree or Ukkonen's algorithm for production use.
What you'll learn
- Build a suffix trie by inserting all suffixes of a string
- Answer substring existence and occurrence-count queries in O(m)
- Describe the space trade-off and when a suffix automaton is preferable
A regular trie stores a hand-picked set of words. A suffix trie stores every suffix of a single string, which gives constant-time substring queries for the price of O(n^2) build time and O(n^2) space.
Building the Suffix Trie
class TrieNode {
constructor() {
this.children = new Map(); // char → TrieNode
this.isEnd = false;
this.count = 0; // how many suffixes pass through this node
}
}
class SuffixTrie {
constructor(text) {
this.root = new TrieNode();
this._build(text);
}
_build(text) {
// Insert every suffix: text[0..], text[1..], ..., text[n-1..]
for (let i = 0; i < text.length; i++) {
this._insert(text.slice(i));
}
}
_insert(suffix) {
let node = this.root;
for (const char of suffix) {
if (!node.children.has(char)) {
node.children.set(char, new TrieNode());
}
node = node.children.get(char);
node.count++;
}
node.isEnd = true;
}
} count on each node records how many suffixes pass through it, which equals
the number of times the corresponding substring occurs in the original text.
Substring Queries in O(m)
class SuffixTrie {
// ... (constructor and build from above)
/** Does the pattern appear anywhere in the text? O(m) */
contains(pattern) {
let node = this.root;
for (const char of pattern) {
if (!node.children.has(char)) return false;
node = node.children.get(char);
}
return true;
}
/** How many times does pattern appear in the text? O(m) */
occurrences(pattern) {
let node = this.root;
for (const char of pattern) {
if (!node.children.has(char)) return 0;
node = node.children.get(char);
}
return node.count;
}
} const st = new SuffixTrie("banana");
st.contains("ana"); // true
st.contains("xyz"); // false
st.occurrences("ana"); // 2 ("b[ana]na" and "ban[ana]")
st.occurrences("an"); // 2
st.occurrences("a"); // 3 Complexity
| Operation | Time | Space |
|---|---|---|
| Build | O(n^2) | O(n^2) nodes worst case |
| contains(pattern) | O(m) | O(1) |
| occurrences(pattern) | O(m) | O(1) |
n is the text length, m is the pattern length. The quadratic build cost is acceptable for short strings or offline preprocessing.
Upgrading to a Suffix Tree
A suffix tree compresses runs of single-child nodes into labeled edges, shrinking space to O(n) and build time to O(n) with Ukkonen’s algorithm. Ukkonen’s is complex to implement from scratch in an interview, so it is usually mentioned rather than coded.
Key differences at a glance:
| Property | Suffix Trie | Suffix Tree (Ukkonen) |
|---|---|---|
| Build time | O(n^2) | O(n) |
| Space | O(n^2) | O(n) |
| Query time | O(m) | O(m) |
| Implementation complexity | Low | High |
Suffix Automaton
The suffix automaton (DAWG) is the minimal DFA that accepts all substrings of the text. It occupies O(n) space like a suffix tree but is often easier to implement than Ukkonen’s. Each transition in the automaton still corresponds to reading one character, so queries are still O(m). For competitive programming and production string libraries, the suffix automaton is frequently the structure of choice.
Up Next
Leave string structures behind and move into dynamic programming, where overlapping subproblems are cached to turn exponential recursion into polynomial solutions.
Introduction to Dynamic Programming →