Z-Algorithm

Computes the Z-array where Z[i] is the length of the longest substring starting at position i that is also a prefix of the string; used for O(n + m) pattern search.

Syntax

zSearch(text, pattern)

Parameters

NameTypeRequiredDescription
text string Yes The string to search within.
pattern string Yes The pattern to find. Concatenated with text as `pattern + '$' + text` internally.

Returns

number[] — Starting indices in text where pattern occurs.

Examples

function buildZ(s) {
  const z = new Array(s.length).fill(0);
  let l = 0, r = 0;
  for (let i = 1; i < s.length; i++) {
    if (i < r) z[i] = Math.min(r - i, z[i - l]);
    while (i + z[i] < s.length && s[z[i]] === s[i + z[i]]) z[i]++;
    if (i + z[i] > r) { l = i; r = i + z[i]; }
  }
  return z;
}

function zSearch(text, pattern) {
  const s = pattern + '$' + text;
  const z = buildZ(s);
  const m = pattern.length;
  return z
    .slice(m + 1)
    .map((v, i) => [v, i])
    .filter(([v]) => v === m)
    .map(([, i]) => i);
}

console.log(zSearch('aabxaabxcaabx', 'aabx'));
Output
[ 0, 4, 9 ]
// Z-array itself is useful for many string problems
function buildZ(s) {
  const z = new Array(s.length).fill(0);
  let l = 0, r = 0;
  for (let i = 1; i < s.length; i++) {
    if (i < r) z[i] = Math.min(r - i, z[i - l]);
    while (i + z[i] < s.length && s[z[i]] === s[i + z[i]]) z[i]++;
    if (i + z[i] > r) { l = i; r = i + z[i]; }
  }
  return z;
}

// Find smallest period of a string
const s = 'abcabcabc';
const z = buildZ(s);
const period = z.slice(1).findIndex((v, i) => v + i + 1 === s.length) + 2;
console.log(period); // 'abc' has period 3
Output
3

Notes

| Case | Time | Space | |---------|----------|-------| | Best | O(n + m) | O(n) | | Average | O(n + m) | O(n) | | Worst | O(n + m) | O(n) | The Z-array is computed in linear time using the Z-box [l, r] which tracks the rightmost match seen. When i falls inside the Z-box, we can initialise z[i] from z[i - l] without rescanning. Compared to KMP, Z-algorithm is often simpler to implement from scratch. The sentinel character ('$') separates pattern from text to prevent the Z-box from spanning both. Z-algorithm is also used to find the minimum period of a string and the lexicographically smallest rotation.

See also