Anchors in Perl regex do not match any character. Instead, they match a particular position: before, after, or between characters. They are used to check not the contents of the string but its positional boundaries.
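The following are the anchors in Perl regex, together with the related constructs covered here:
'^', '$', '\b', '\A', '\Z', '\z', '\G', '\p{....}', '\P{....}', '[:class:]'
Strictly speaking, '\p{....}', '\P{....}', and '[:class:]' are character classes rather than anchors, because they match a character instead of a position.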
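Because an anchor matches a position rather than a character, the text it matches is always empty. The following is a minimal sketch of that idea; the variable names are illustrative and not part of the original examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $s = "galaxy";

# A bare anchor matches successfully but consumes no characters.
my $matched_len = ($s =~ /^/) ? length($&) : -1;

# Substituting at an anchor inserts text at that position
# without removing anything from the string.
(my $prefixed = $s) =~ s/^/> /;
(my $suffixed = $s) =~ s/\z/!/;
(my $marked   = $s) =~ s/\b/|/g;

print "matched length: $matched_len\n";
print "$prefixed\n";
print "$suffixed\n";
print "$marked\n";
```

The matched length is 0, and each substitution only adds characters: "> galaxy", "galaxy!", and "|galaxy|".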
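^ or \A : Matches the pattern at the beginning of the string. With the /m flag, ^ also matches at the start of each line, whereas \A always means the start of the string.
Syntax: (/^pattern/, /\Apattern/).
Example: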
#!/usr/bin/perl
use strict;
use warnings;

my $str = "guardians of the galaxy";

# Prints 'guardians': the string starts with it
print "$&\n" if ($str =~ /^guardians/);

# Prints 'gua': \A also anchors at the start of the string
print "$&\n" if ($str =~ /\Agua/);

# Prints nothing: 'ans' occurs in the string, but not at position 0
print "$&\n" if ($str =~ /^ans/);
Output:
guardians
gua
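The difference between ^ and \A only shows up on strings that contain embedded newlines. The sketch below uses a made-up two-line string and collects the captured words with each variant. The /m flag is the "multiline" modifier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-line input, used only for illustration.
my $text = "first line\nsecond line\n";

# Without /m, ^ matches only at the very start of the string.
my @plain = $text =~ /^(\w+)/g;

# With /m, ^ also matches right after each embedded newline.
my @multi = $text =~ /^(\w+)/mg;

# \A ignores /m and still matches only at the start of the string.
my @abs = $text =~ /\A(\w+)/mg;

print "plain: @plain\n";
print "multi: @multi\n";
print "abs:   @abs\n";
```

Only the /m variant with ^ picks up "second"; the other two return just "first".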
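$ or \z : Matches the pattern at the end of the string. $ also matches just before a newline at the end of the string, whereas \z matches only at the very end.
Syntax: (/pattern$/, /pattern\z/).
Example: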
#!/usr/bin/perl
use strict;
use warnings;

my $str = "guardians of the galaxy";

# Prints nothing: the string does not end with 'guardians'
print "$&\n" if ($str =~ /guardians$/);

# Prints 'y': it is the last character of the string
print "$&\n" if ($str =~ /y\z/);

# Prints 'galaxy': the string ends with it
print "$&\n" if ($str =~ /galaxy$/);
Output:
y
galaxy
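The choice between $ and \z matters most when validating input that may still carry a trailing newline. A minimal sketch, assuming a line that has not been chomped (the variable names are illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input, e.g. a line read from a file and not yet chomped.
my $input = "12345\n";

# $ tolerates one trailing newline, so this check passes.
my $loose = ($input =~ /^\d+$/) ? "accepted" : "rejected";

# \A ... \z demands that the digits span the entire string.
my $exact = ($input =~ /\A\d+\z/) ? "accepted" : "rejected";

print "loose: $loose\n";
print "exact: $exact\n";
```

The loose check accepts the input and the exact check rejects it, which is why \A and \z are often preferred for whole-string validation.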
\b : Matches at a word boundary, that is, a position with a word character (\w) on one side and a non-word character (\W) on the other. The start and end of the string count as non-word positions, so \b also matches there when the adjacent character is a word character.
Syntax: (/\bpattern\b/).
Example:
#!/usr/bin/perl
use strict;
use warnings;

my $str = "guardians-of-the-galaxy";

# Prints '-galaxy': '-' is a non-word character, so there is a
# boundary between 'e' and '-', and another after the final 'y'
print "$&\n" if ($str =~ /\b-galaxy\b/);

# Prints 'guardians-': there are boundaries at the start of the
# string and between '-' and 'o'
print "$&\n" if ($str =~ /\bguardians-\b/);

# Prints nothing: 'e' is preceded by 'h', and both are word
# characters, so there is no boundary before 'e'
print "$&\n" if ($str =~ /\be-galaxy\b/);

# Prints 'guardians-of-the-galaxy': the match is bounded by the
# start and the end of the string
print "$&\n" if ($str =~ /\bguardians-of-the-galaxy\b/);
Output:
-galaxy
guardians-
guardians-of-the-galaxy
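A common use of \b is to match a whole word without also matching it inside longer words. The sketch below uses a made-up sentence to compare an unanchored search with a \b-delimited one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample text, used only for illustration.
my $text = "the cat scattered the catalog";

# Unanchored: also counts 'cat' inside 'scattered' and 'catalog'.
my $anywhere = () = $text =~ /cat/g;

# With \b on both sides, only the standalone word qualifies.
my $whole_word = () = $text =~ /\bcat\b/g;

# The same boundaries keep a substitution from touching longer words.
(my $replaced = $text) =~ s/\bcat\b/dog/g;

print "anywhere: $anywhere, whole word: $whole_word\n";
print "$replaced\n";
```

The unanchored pattern finds three occurrences, the bounded one finds one, and the substitution yields "the dog scattered the catalog".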
\Z : Matches at the end of the string, or just before a newline at the end of the string. Both \z and \Z differ from $ in that they are not affected by the /m ("multiline") flag, which allows $ to match at the end of any line.
Example:
#!/usr/bin/perl
use strict;
use warnings;

# Prints 'one': \z matches at the very end of the string
print "one\n" if ('galaxy' =~ m/galaxy\z/);

# Prints 'two': \Z also matches at the very end of the string
print "two\n" if ('galaxy' =~ m/galaxy\Z/);

# Prints 'three': \Z also matches just before a trailing newline
print "three\n" if ("galaxy\n" =~ m/galaxy\Z/);

# Prints 'four': the pattern consumes the newline, so \z is at the end
print "four\n" if ("galaxy\n" =~ m/galaxy\n\z/);

# Prints 'five': once the newline is consumed, \Z matches at the end
print "five\n" if ("galaxy\n" =~ m/galaxy\n\Z/);

# Prints nothing: a newline follows 'galaxy', and \z does not skip it
print "six\n" if ("galaxy\n" =~ m/galaxy\z/);
Output:
one
two
three
four
five
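To see how the /m flag separates $ from \Z and \z, the following sketch runs all three against the same multi-line string. The string and variable names are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical three-line input.
my $text = "alpha\nbeta\ngamma\n";

# Under /m, $ matches before every newline, so each line's word is found.
my @dollar_m = $text =~ /(\w+)$/mg;

# \Z ignores /m: only the word before the final newline matches.
my @big_z = $text =~ /(\w+)\Z/mg;

# \z ignores /m and the trailing newline: no word touches the true end.
my @small_z = $text =~ /(\w+)\z/mg;

print "\$ with /m: @dollar_m\n";
print "\\Z:        @big_z\n";
print "\\z:        ", scalar(@small_z), " matches\n";
```

Here $ finds all three words, \Z finds only "gamma", and \z finds nothing because the string ends with a newline.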
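\G : Matches at the position where the previous m//g match on the same string ended, or at the start of the string if there was no previous match. For example, if the pattern consumes 5 characters, the first match covers the first 5 positions of the string, the next attempt is forced to start at the 6th position, and matching proceeds this way until the pattern fails or the string ends. With the /c flag, a failed match leaves the current position unchanged instead of resetting it.
Example: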
#!/usr/bin/perl
use strict;
use warnings;

my $str = "galaxy8222as";

# Matches pairs of lowercase letters from the start; stops at '8'
print "one: $& " while ($str =~ /\G[a-z]{2}/gc);
print "\n";

# Matches pairs of digits; stops at 'a'
print "two: $& " while ("1122a44" =~ /\G\d\d/gc);
print "\n";

# This literal is a separate string with its own position, so
# matching starts from the beginning and runs to the end
print "three: $& " while ("galaxy8222as" =~ /\G\w{2}/gc);

# $str is still at the position where 'one' failed (kept by /c).
# '8' is not a lowercase letter, so this prints nothing and the
# position stays where it was
print "four: $& " while ($str =~ /\G[a-z]{2}/gc);
print "\n";

# Resumes from that same position and matches the remaining pairs
print "five: $& " while ($str =~ /\G\w{2}/gc);
print "\n";
Output:
one: ga one: la one: xy
two: 11 two: 22
three: ga three: la three: xy three: 82 three: 22 three: as
five: 82 five: 22 five: as
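A typical use of \G with /gc is a small tokenizer, where each token must begin exactly where the previous one ended. The following is a sketch under assumed token rules (names, numbers, and the characters = and ;); the input string and the token labels are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input and token kinds, for illustration only.
my $input = "x=42;y=7";
my @tokens;

# Each branch is anchored with \G, so a token must start exactly where
# the previous one ended; /c keeps pos() intact when a branch fails.
while (1) {
    if    ($input =~ /\G([a-z]+)/gc) { push @tokens, "NAME($1)" }
    elsif ($input =~ /\G(\d+)/gc)    { push @tokens, "NUM($1)" }
    elsif ($input =~ /\G([=;])/gc)   { push @tokens, "PUNCT($1)" }
    else                             { last }
}

# pos() reports where scanning stopped; it equals the string length
# when the whole input was consumed.
my $stopped_at = pos($input) // 0;

print "@tokens\n";
print "stopped at offset $stopped_at of ", length($input), "\n";
```

If the input contained a character that none of the branches accept, the loop would stop there and pos() would point at the offending offset.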
\p{…} and \P{…} : \p{…} matches a single character that has the given Unicode property, such as IsLower or IsAlpha, while \P{…} is its complement and matches any character that does not have that property.
Example:
#!/usr/bin/perl
use strict;
use warnings;

# Prints the alphabetic characters
print "$&" while ('guardians!@#%^&*123' =~ /\p{isalpha}/gc);
print "\n";

# Prints the alphanumeric characters
print "$&" while ('guardians!@#%^&*123' =~ /\p{isalnum}/gc);
print "\n";

# L is the Unicode 'Letter' property; \P{L} is its complement,
# so this prints everything that is not a letter
print "$&" while ('guardians!@#%^&*123' =~ /\P{L}/gc);
print "\n";

# \p{L} prints only the letters
print "$&" while ('guardians!@#%^&*123' =~ /\p{L}/gc);
print "\n";
Output:
guardians
guardians123
!@#%^&*123
guardians
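Unicode properties matter most once the text goes beyond ASCII. The sketch below uses the made-up sentence "Señor Müller paid 25€", written with \x{...} escapes so the script itself stays ASCII, and compares \p{L} with an ASCII-only range. \p{Sc} (the Unicode "currency symbol" category) is an extra property chosen for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The output contains non-ASCII characters.
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical sample text with two accented letters and a euro sign.
my $text = "Se\x{F1}or M\x{FC}ller paid 25\x{20AC}";

# \p{L}+ treats accented letters as letters.
my @unicode_words = $text =~ /(\p{L}+)/g;

# An ASCII range splits the words at every accented letter.
my @ascii_runs = $text =~ /([A-Za-z]+)/g;

# \p{Sc} matches a character in the currency symbol category.
my ($currency) = $text =~ /(\p{Sc})/;

print "Unicode words: @unicode_words\n";
print "ASCII runs:    @ascii_runs\n";
print "currency:      $currency\n";
```

\p{L}+ returns three whole words, while [A-Za-z]+ breaks them into five fragments.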
[:class:] : POSIX character classes such as digit, lower, and ascii. They are valid only inside a bracketed character class, hence the double brackets in the syntax.
Syntax: (/[[:class:]]/)
The POSIX character classes are the following:
alpha, alnum, ascii, blank, cntrl, digit, graph, lower, punct, space, upper, xdigit, word
Example:
#!/usr/bin/perl
use strict;
use warnings;

# Prints only the alphabetic characters
print "$&" while ('guardians!@#%^&*123' =~ /[[:alpha:]]/gc);
print "\n";

# Prints the letters and digits
print "$&" while ("guardians!@#%^&*123" =~ /[[:alnum:]]/gc);
print "\n";

# Prints only the digits
print "$&" while ("guardians!@#%^&*123" =~ /[[:digit:]]/gc);
print "\n";

# Prints every visible character, which excludes the space
# and the newline
print "$&" while ("guardians!@#%^& 123\n" =~ /[[:graph:]]/gc);
print "\n";

# Prints '1' once for each space or horizontal tab;
# the string contains a single space
print "1" while ("guardians!@#%^& 123\n" =~ /[[:blank:]]/gc);
print "\n";

# Prints the lowercase characters, so the leading 'G' is skipped
print "$&" while ("Guardians!@#%^& 123\n" =~ /[[:lower:]]/gc);
print "\n";

# Prints all ASCII characters, including the space and the newline
print "$&" while ("guardians!@#%^& 123\n" =~ /[[:ascii:]]/gc);
Output:
guardians
guardians123
123
guardians!@#%^&123
1
uardians
guardians!@#%^& 123
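POSIX classes can also be negated with [:^class:] and combined with other classes or literal characters inside the same brackets. The following sketch applies both to a made-up order line; the string and variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input line.
my $raw = "Order #A-17, qty: 3\t(rush)";

# [[:^digit:]] is the negated form: delete everything that is not a digit.
(my $digits = $raw) =~ s/[[:^digit:]]//g;

# A POSIX class and a literal '-' can share one bracketed class.
my @codes = $raw =~ /([[:upper:]][[:digit:]-]+)/g;

# Runs of letters only.
my @words = $raw =~ /([[:alpha:]]+)/g;

print "digits: $digits\n";
print "codes:  @codes\n";
print "words:  @words\n";
```

Note that the class name always sits inside an outer pair of brackets: /[[:alpha:]]/ is a POSIX class, whereas /[:alpha:]/ is just a bracketed class containing the characters ':', 'a', 'l', 'p', and 'h'.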