Algoritmo Z (algoritmo de búsqueda de patrón de tiempo lineal)

Este algoritmo encuentra todas las apariciones de un patrón en un texto en tiempo lineal. Sea n la longitud del texto y m el patrón, entonces el tiempo total tomado es O(m + n) con complejidad de espacio lineal. Ahora podemos ver que la complejidad tanto del tiempo como del espacio es la misma que el algoritmo KMP, pero este algoritmo es más simple de entender.
En este algoritmo, construimos una array Z.

¿Qué es la array Z?  
Para una string str[0..n-1], la array Z tiene la misma longitud que la string. Un elemento Z[i] de la array Z almacena la longitud de la substring más larga a partir de str[i], que también es un prefijo de str[0..n-1]. La primera entrada de la array Z significa menos, ya que la string completa siempre es un prefijo de sí misma. 

Example:
Index            0   1   2   3   4   5   6   7   8   9  10  11 
Text             a   a   b   c   a   a   b   x   a   a   a   z
Z values         X   1   0   0   3   1   0   0   2   2   1   0 
More Examples:
str  = "aaaaaa"
Z[]  = {x, 5, 4, 3, 2, 1}

str = "aabaacd"
Z[] = {x, 1, 0, 2, 1, 0, 0}

str = "abababab"
Z[] = {x, 0, 6, 0, 4, 0, 2, 0}
 

¿Cómo es útil la array Z en la búsqueda de patrones en tiempo lineal?  
La idea es concatenar patrón y texto, y crear una string «P$T» donde P es patrón, $es un carácter especial que no debe estar presente en patrón y texto, y T es texto. Construya la array Z para strings concatenadas. En la array Z, si el valor Z en cualquier punto es igual a la longitud del patrón, entonces el patrón está presente en ese punto. 

Example:
Pattern P = "aab",  Text T = "baabaa"

The concatenated string is = "aab$baabaa"

Z array for above concatenated string is {x, 1, 0, 0, 0, 
                                          3, 1, 0, 2, 1}.
Since length of pattern is 3, the value 3 in Z array 
indicates presence of pattern. 

¿Cómo construir una array Z?  
     Una solución simple es ejecutar dos bucles anidados, el bucle externo va a cada índice y el bucle interno encuentra la longitud del prefijo más largo que coincide con la substring que comienza en el índice actual. La complejidad temporal de esta solución es O(n 2 ).
      Podemos construir una array Z en tiempo lineal. 

The idea is to maintain an interval [L, R] which is the interval with max R
such that [L,R] is prefix substring (substring which is also prefix). 

Steps for maintaining this interval are as follows – 

1) If i > R then there is no prefix substring that starts before i and 
   ends after i, so we reset L and R and compute new [L,R] by comparing 
   str[0..] to str[i..] and get Z[i] (= R-L+1).

2) If i <= R then let K = i-L,  now Z[i] >= min(Z[K], R-i+1)  because 
   str[i..] matches with str[K..] for atleast R-i+1 characters (they are in
   [L,R] interval which we know is a prefix substring).     
   Now two sub cases arise – 
      a) If Z[K] < R-i+1  then there is no prefix substring starting at 
         str[i] (otherwise Z[K] would be larger)  so  Z[i] = Z[K]  and 
         interval [L,R] remains same.
      b) If Z[K] >= R-i+1 then it is possible to extend the [L,R] interval
         thus we will set L as i and start matching from str[R]  onwards  and
         get new R then we will update interval [L,R] and calculate Z[i] (=R-L+1).

Para comprender mejor el procedimiento paso a paso anterior, consulte esta animación: http://www.utdallas.edu/~besp/demo/John2010/z-algorithm.htm
El algoritmo se ejecuta en tiempo lineal porque nunca comparamos caracteres menores que R y con el emparejamiento aumentamos R en uno para que haya como máximo T comparaciones. En caso de desajuste, el desajuste ocurre solo una vez para cada i (debido a lo cual R se detiene), esa es otra comparación como máximo T que crea una complejidad lineal general.

A continuación se muestra la implementación del algoritmo Z para la búsqueda de patrones. 

C++

// A C++ program that implements Z algorithm for pattern searching
#include<iostream>
using namespace std;
 
void getZarr(string str, int Z[]);
 
// prints all occurrences of pattern in text using Z algo
void search(string text, string pattern)
{
    // Create concatenated string "P$T"
    string concat = pattern + "$" + text;
    int l = concat.length();
 
    // Construct Z array
    int Z[l];
    getZarr(concat, Z);
 
    // now looping through Z array for matching condition
    for (int i = 0; i < l; ++i)
    {
        // if Z[i] (matched region) is equal to pattern
        // length we got the pattern
        if (Z[i] == pattern.length())
            cout << "Pattern found at index "
                << i - pattern.length() -1 << endl;
    }
}
 
// Fills Z array for given string str[]
void getZarr(string str, int Z[])
{
    int n = str.length();
    int L, R, k;
 
    // [L,R] make a window which matches with prefix of s
    L = R = 0;
    for (int i = 1; i < n; ++i)
    {
        // if i>R nothing matches so we will calculate.
        // Z[i] using naive way.
        if (i > R)
        {
            L = R = i;
 
            // R-L = 0 in starting, so it will start
            // checking from 0'th index. For example,
            // for "ababab" and i = 1, the value of R
            // remains 0 and Z[i] becomes 0. For string
            // "aaaaaa" and i = 1, Z[i] and R become 5
            while (R<n && str[R-L] == str[R])
                R++;
            Z[i] = R-L;
            R--;
        }
        else
        {
            // k = i-L so k corresponds to number which
            // matches in [L,R] interval.
            k = i-L;
 
            // if Z[k] is less than remaining interval
            // then Z[i] will be equal to Z[k].
            // For example, str = "ababab", i = 3, R = 5
            // and L = 2
            if (Z[k] < R-i+1)
                Z[i] = Z[k];
 
            // For example str = "aaaaaa" and i = 2, R is 5,
            // L is 0
            else
            {
                // else start from R and check manually
                L = i;
                while (R<n && str[R-L] == str[R])
                    R++;
                Z[i] = R-L;
                R--;
            }
        }
    }
}
 
// Driver program
int main()
{
    string text = "GEEKS FOR GEEKS";
    string pattern = "GEEK";
    search(text, pattern);
    return 0;
}

Java

// A Java program that implements Z algorithm for pattern
// searching
class GFG {
 
    //  prints all occurrences of pattern in text using
    // Z algo
    public static void search(String text, String pattern)
    {
 
        // Create concatenated string "P$T"
        String concat = pattern + "$" + text;
 
        int l = concat.length();
 
        int Z[] = new int[l];
 
        // Construct Z array
        getZarr(concat, Z);
 
        // now looping through Z array for matching condition
        for(int i = 0; i < l; ++i){
 
            // if Z[i] (matched region) is equal to pattern
            // length we got the pattern
 
            if(Z[i] == pattern.length()){
                System.out.println("Pattern found at index "
                              + (i - pattern.length() - 1));
            }
        }
    }
 
    // Fills Z array for given string str[]
    private static void getZarr(String str, int[] Z) {
 
        int n = str.length();
         
        // [L,R] make a window which matches with
        // prefix of s
        int L = 0, R = 0;
 
        for(int i = 1; i < n; ++i) {
 
            // if i>R nothing matches so we will calculate.
            // Z[i] using naive way.
            if(i > R){
 
                L = R = i;
 
                // R-L = 0 in starting, so it will start
                // checking from 0'th index. For example,
                // for "ababab" and i = 1, the value of R
                // remains 0 and Z[i] becomes 0. For string
                // "aaaaaa" and i = 1, Z[i] and R become 5
 
                while(R < n && str.charAt(R - L) == str.charAt(R))
                    R++;
                 
                Z[i] = R - L;
                R--;
 
            }
            else{
 
                // k = i-L so k corresponds to number which
                // matches in [L,R] interval.
                int k = i - L;
 
                // if Z[k] is less than remaining interval
                // then Z[i] will be equal to Z[k].
                // For example, str = "ababab", i = 3, R = 5
                // and L = 2
                if(Z[k] < R - i + 1)
                    Z[i] = Z[k];
 
                // For example str = "aaaaaa" and i = 2, R is 5,
                // L is 0
                else{
 
 
                // else start from R and check manually
                    L = i;
                    while(R < n && str.charAt(R - L) == str.charAt(R))
                        R++;
                     
                    Z[i] = R - L;
                    R--;
                }
            }
        }
    }
     
    // Driver program
    public static void main(String[] args)
    {
        String text = "GEEKS FOR GEEKS";
        String pattern = "GEEK";
 
        search(text, pattern);
    }
}
 
// This code is contributed by PavanKoli.

Python3

# Python3 program that implements Z algorithm
# for pattern searching
 
# Fills Z array for given string str[]
def getZarr(string, z):
    n = len(string)
 
    # [L,R] make a window which matches
    # with prefix of s
    l, r, k = 0, 0, 0
    for i in range(1, n):
 
        # if i>R nothing matches so we will calculate.
        # Z[i] using naive way.
        if i > r:
            l, r = i, i
 
            # R-L = 0 in starting, so it will start
            # checking from 0'th index. For example,
            # for "ababab" and i = 1, the value of R
            # remains 0 and Z[i] becomes 0. For string
            # "aaaaaa" and i = 1, Z[i] and R become 5
            while r < n and string[r - l] == string[r]:
                r += 1
            z[i] = r - l
            r -= 1
        else:
 
            # k = i-L so k corresponds to number which
            # matches in [L,R] interval.
            k = i - l
 
            # if Z[k] is less than remaining interval
            # then Z[i] will be equal to Z[k].
            # For example, str = "ababab", i = 3, R = 5
            # and L = 2
            if z[k] < r - i + 1:
                z[i] = z[k]
 
            # For example str = "aaaaaa" and i = 2,
            # R is 5, L is 0
            else:
 
                # else start from R and check manually
                l = i
                while r < n and string[r - l] == string[r]:
                    r += 1
                z[i] = r - l
                r -= 1
 
# prints all occurrences of pattern
# in text using Z algo
def search(text, pattern):
 
    # Create concatenated string "P$T"
    concat = pattern + "$" + text
    l = len(concat)
 
    # Construct Z array
    z = [0] * l
    getZarr(concat, z)
 
    # now looping through Z array for matching condition
    for i in range(l):
 
        # if Z[i] (matched region) is equal to pattern
        # length we got the pattern
        if z[i] == len(pattern):
            print("Pattern found at index",
                      i - len(pattern) - 1)
 
# Driver Code
if __name__ == "__main__":
    text = "GEEKS FOR GEEKS"
    pattern = "GEEK"
    search(text, pattern)
 
# This code is contributed by
# sanjeev2552

C#

// A C# program that implements Z
// algorithm for pattern searching
using System;
 
class GFG
{
 
// prints all occurrences of
// pattern in text using Z algo
public static void search(string text,
                          string pattern)
{
 
    // Create concatenated string "P$T"
    string concat = pattern + "$" + text;
 
    int l = concat.Length;
 
    int[] Z = new int[l];
 
    // Construct Z array
    getZarr(concat, Z);
 
    // now looping through Z array
    // for matching condition
    for (int i = 0; i < l; ++i)
    {
 
        // if Z[i] (matched region) is equal
        // to pattern length we got the pattern
 
        if (Z[i] == pattern.Length)
        {
            Console.WriteLine("Pattern found at index " +
                             (i - pattern.Length - 1));
        }
    }
}
 
// Fills Z array for given string str[]
private static void getZarr(string str,
                            int[] Z)
{
 
    int n = str.Length;
 
    // [L,R] make a window which
    // matches with prefix of s
    int L = 0, R = 0;
 
    for (int i = 1; i < n; ++i)
    {
 
        // if i>R nothing matches so we will
        // calculate. Z[i] using naive way.
        if (i > R)
        {
            L = R = i;
 
            // R-L = 0 in starting, so it will start
            // checking from 0'th index. For example,
            // for "ababab" and i = 1, the value of R
            // remains 0 and Z[i] becomes 0. For string
            // "aaaaaa" and i = 1, Z[i] and R become 5
            while (R < n && str[R - L] == str[R])
            {
                R++;
            }
 
            Z[i] = R - L;
            R--;
 
        }
        else
        {
 
            // k = i-L so k corresponds to number
            // which matches in [L,R] interval.
            int k = i - L;
 
            // if Z[k] is less than remaining interval
            // then Z[i] will be equal to Z[k].
            // For example, str = "ababab", i = 3,
            // R = 5 and L = 2
            if (Z[k] < R - i + 1)
            {
                Z[i] = Z[k];
            }
 
            // For example str = "aaaaaa" and
            // i = 2, R is 5, L is 0
            else
            {
 
 
                // else start from R and
                // check manually
                L = i;
                while (R < n && str[R - L] == str[R])
                {
                    R++;
                }
 
                Z[i] = R - L;
                R--;
            }
        }
    }
}
 
// Driver Code
public static void Main(string[] args)
{
    string text = "GEEKS FOR GEEKS";
    string pattern = "GEEK";
 
    search(text, pattern);
}
}
 
// This code is contributed by Shrikant13

Javascript

<script>
 
// A JavaScript program that implements Z algorithm for pattern
// searching
 
//  prints all occurrences of pattern in text using
    // Z algo
function search(text,pattern)
{
    // Create concatenated string "P$T"
        let concat = pattern + "$" + text;
  
        let l = concat.length;
  
        let Z = new Array(l);
  
        // Construct Z array
        getZarr(concat, Z);
  
        // now looping through Z array for matching condition
        for(let i = 0; i < l; ++i){
  
            // if Z[i] (matched region) is equal to pattern
            // length we got the pattern
  
            if(Z[i] == pattern.length){
                document.write("Pattern found at index "
                              + (i - pattern.length - 1)+"<br>");
            }
        }
}
 
 // Fills Z array for given string str[]
function getZarr(str,Z)
{
    let n = str.length;
          
        // [L,R] make a window which matches with
        // prefix of s
        let L = 0, R = 0;
  
        for(let i = 1; i < n; ++i) {
  
            // if i>R nothing matches so we will calculate.
            // Z[i] using naive way.
            if(i > R){
  
                L = R = i;
  
                // R-L = 0 in starting, so it will start
                // checking from 0'th index. For example,
                // for "ababab" and i = 1, the value of R
                // remains 0 and Z[i] becomes 0. For string
                // "aaaaaa" and i = 1, Z[i] and R become 5
  
                while(R < n && str[R - L] == str[R])
                    R++;
                  
                Z[i] = R - L;
                R--;
  
            }
            else{
  
                // k = i-L so k corresponds to number which
                // matches in [L,R] interval.
                let k = i - L;
  
                // if Z[k] is less than remaining interval
                // then Z[i] will be equal to Z[k].
                // For example, str = "ababab", i = 3, R = 5
                // and L = 2
                if(Z[k] < R - i + 1)
                    Z[i] = Z[k];
  
                // For example str = "aaaaaa" and i = 2, R is 5,
                // L is 0
                else{
  
  
                // else start from R and check manually
                    L = i;
                    while(R < n && str[R - L] == str[R])
                        R++;
                      
                    Z[i] = R - L;
                    R--;
                }
            }
        }
}
 
// Driver program
let text = "GEEKS FOR GEEKS";
let pattern = "GEEK";
 
search(text, pattern);
 
 
// This code is contributed by rag2127
 
</script>

Producción: 

Pattern found at index 0
Pattern found at index 10

Complejidad de tiempo: O(m+n), donde m es la longitud del patrón y n es la longitud del texto.

Espacio Auxiliar: O(m+n)

Este artículo es una contribución de Utkarsh Trivedi . Escriba comentarios si encuentra algo incorrecto o si desea compartir más información sobre el tema tratado anteriormente.
 

Publicación traducida automáticamente

Artículo escrito por GeeksforGeeks-1 y traducido por Barcelona Geeks. The original can be accessed here. Licence: CCBY-SA

Deja una respuesta

Tu dirección de correo electrónico no será publicada. Los campos obligatorios están marcados con *