Programa en C para el algoritmo de Rabin-Karp para la búsqueda de patrones

Dado un texto txt[0..n-1] y un patrón pat[0..m-1] , escriba una función search(char pat[], char txt[]) que imprima todas las apariciones de pat[] en txt [] . Puede suponer que n > m.

Ejemplos:

Input:  txt[] = "THIS IS A TEST TEXT"
        pat[] = "TEST"
Output: Pattern found at index 10

Input:  txt[] =  "AABAACAADAABAABA"
        pat[] =  "AABA"
Output: Pattern found at index 0
        Pattern found at index 9
        Pattern found at index 12

El algoritmo Naive String Matching desliza el patrón uno por uno. Después de cada diapositiva, comprueba uno por uno los caracteres en el turno actual y, si todos los caracteres coinciden, imprime la coincidencia.
Al igual que el algoritmo Naive, el algoritmo de Rabin-Karp también desliza el patrón uno por uno. Pero a diferencia del algoritmo Naive, el algoritmo de Rabin Karp hace coincidir el valor hash del patrón con el valor hash de la substring de texto actual, y si los valores hash coinciden, solo comienza a hacer coincidir los caracteres individuales. Entonces, el algoritmo Rabin Karp necesita calcular los valores hash para las siguientes strings.

1) Patrón en sí.
2) Todas las substrings de texto de longitud m.

C/C++

/* Following program is a C implementation of Rabin Karp
Algorithm given in the CLRS book */
#include <stdio.h>
#include <string.h>
  
// d is the number of characters in the input alphabet
#define d 256
  
/* pat -> pattern
    txt -> text
    q -> A prime number
*/
void search(char pat[], char txt[], int q)
{
    int M = strlen(pat);
    int N = strlen(txt);
    int i, j;
    int p = 0; // hash value for pattern
    int t = 0; // hash value for txt
    int h = 1;
  
    // The value of h would be "pow(d, M-1)%q"
    for (i = 0; i < M - 1; i++)
        h = (h * d) % q;
  
    // Calculate the hash value of pattern and first
    // window of text
    for (i = 0; i < M; i++) {
        p = (d * p + pat[i]) % q;
        t = (d * t + txt[i]) % q;
    }
  
    // Slide the pattern over text one by one
    for (i = 0; i <= N - M; i++) {
  
        // Check the hash values of current window of text
        // and pattern. If the hash values match then only
        // check for characters on by one
        if (p == t) {
            /* Check for characters one by one */
            for (j = 0; j < M; j++) {
                if (txt[i + j] != pat[j])
                    break;
            }
  
            // if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
            if (j == M)
                printf("Pattern found at index %d \n", i);
        }
  
        // Calculate hash value for next window of text: Remove
        // leading digit, add trailing digit
        if (i < N - M) {
            t = (d * (t - txt[i] * h) + txt[i + M]) % q;
  
            // We might get negative value of t, converting it
            // to positive
            if (t < 0)
                t = (t + q);
        }
    }
}
  
/* Driver program to test above function */
int main()
{
    char txt[] = "GEEKS FOR GEEKS";
    char pat[] = "GEEK";
    int q = 101; // A prime number
    search(pat, txt, q);
    return 0;
}

Producción:

Pattern found at index 0 
Pattern found at index 10

Consulte el artículo completo sobre el algoritmo de Rabin-Karp para la búsqueda de patrones para obtener más detalles.

Publicación traducida automáticamente

Artículo escrito por GeeksforGeeks-1 y traducido por Barcelona Geeks. The original can be accessed here. Licence: CCBY-SA

C/C++

Deja una respuesta Cancelar la respuesta