Regular Expressions
Pikt regular expressions follow the usual regular expression rules. The clarifications and amplifications specific to Pikt are given below.
Here are the regular expression operators:
    OPERATOR    MEANING
    a =~ b      string b matches at least one substring within a
    a =~~ b     like the above, but without case sensitivity
    a !~ b      string b matches no substring within a
    a !~~ b     like the above, but without case sensitivity

For example, all of the following are true:
"this is a test" =~ "is" "this is a test" =~~ "IS" "this is a test" !~ "THIS" "this is a test" !~~ "that" "this is a test" =~ "" "" !~ "this is a test"These characters have special meaning within Pikt regular expressions:
    CHARACTER(S)    MEANING
    .               matches any single character
    *               matches zero or more instances of the preceding character/pattern
    ?               matches zero or one instance(s) of the preceding character/pattern
    +               matches one or more instances of the preceding character/pattern
    {m,n}           matches as few as m, or as many as n, instances of the preceding character/pattern
    ( )             enclose a subexpression, or set of subexpressions separated by |
    |               separates subexpressions (think of "or")
    [ ]             enclose a set of characters/character ranges
    ^               as the first character in a [ ] subexpression, indicates set negation;
                    as the first character in a regular expression, anchors to the beginning
                    of the string expression on the left-hand side of the regexp operator
    $               anchors to the end of the string expression on the left-hand side
                    of the regexp operator

In addition to user-specified character classes, Pikt supports these built-in predefined character classes:
    [[:alnum:]]    the set of alphanumeric characters
    [[:alpha:]]    the set of letters
    [[:blank:]]    tab and space
    [[:cntrl:]]    the control characters
    [[:digit:]]    the decimal digits
    [[:graph:]]    the printable characters except space
    [[:lower:]]    the lower-case letters
    [[:print:]]    the printable characters
    [[:punct:]]    the punctuation characters
    [[:space:]]    whitespace characters
    [[:upper:]]    the upper-case letters

Backslash escapes suppress a character's specialness. So, "\\*" is a literal asterisk, and the following are all true:
"fo*bar" !~ "fo*bar" // left side literal string, // right side regexp "fo*bar" !~ "fo\*bar" "fo*bar" =~ "fo\\*bar" "fo*bar" =~ "\\*" "*" =~ "\\*"In any of the above left-hand expressions, you could substitute "fo\*bar", and the statements would all still be true.
Usually, just a single backslash is required for this purpose. In Pikt, however, backslashes are a general escape character. If, for example, you want to output the literal text string "$x" without the $x being interpreted as a variable (which Pikt would attempt to resolve to a value), you would use "\$x". So, if you require a backslash in the final product, you must supply double backslashes going in. Again, see the sample config files for examples of double-backslash usage.
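As a rough illustration of the special characters, predefined classes, and backslash rule described above, the statements below should all be true if Pikt follows the usual regular expression semantics. The test strings are made up for this sketch and do not come from the sample config files. The last line is an assumption, reasoned by analogy with the "fo\*bar" case.

```pikt
"this is a test" =~ "^this"                 // ^ anchors to the start of the left side
"this is a test" !~ "^is"                   // "is" occurs, but not at the start
"this is a test" =~ "t.st"                  // . matches the "e"
"this is a test" =~ "(this|that) is"        // | separates the alternatives
"aaa" =~ "^a{2,3}"                          // two or three a's at the start
"abc123" =~ "^[[:alpha:]]+[[:digit:]]+"     // letters followed by digits
"abc123" !~ "^[[:digit:]]"                  // the first character is not a digit
"abc" !~ "[^a-c]"                           // no character outside the range a-c
"a.b" =~ "a\\.b"                            // \\. is a literal period
"axb" !~ "a\\.b"                            // so "x" does not match it
"axb" =~ "a\.b"                             // assumption: the single backslash is consumed
                                            // by Pikt, leaving the regexp a.b
```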
Note that every time a regular expression containing matching parentheses is invoked, for example in any of the following situations

    dat "([^:]*):([^:]*)"
    if $line =~ "^([^:]*):([^:]*)"
    do #split($rdline, "([^:]*):([^:]*)")

you can reference the first parentheses-enclosed matched subexpression with $1, the second with $2, and so on. $0 references the entire matched subexpression.
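To make the numbering concrete, here is a minimal sketch built on the if form shown above. It assumes that $line holds the made-up value "root:x:0:0"; the values in the comments follow from the usual leftmost, greedy matching rules and are not taken from actual Pikt output.

```pikt
// assumption: $line is "root:x:0:0" (a hypothetical input line)
if $line =~ "^([^:]*):([^:]*)"
        output $1       // "root"   (first parenthesized subexpression)
        output $2       // "x"      (second parenthesized subexpression)
        output $0       // "root:x" (everything the regexp matched)
fi
```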
Note well: The $0, $1, and so on only persist until the next regexp pattern match. The next time you use =~ (or any of the other regexp operators), or the next time you invoke the #split() function (in any of its forms), any previous $0, $1, ... values get supplanted by the values in the latest regexp. You will encounter many strange bugs unless you keep this in mind!
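The following sketch shows one way the pitfall can arise, and how copying the matches into ordinary variables avoids it. The names $name and $value are hypothetical, chosen only for this example.

```pikt
if $line =~ "^([^:]*):([^:]*)"
        set $name  = $1                 // copy the matches right away
        set $value = $2
        if $value =~ "^[[:digit:]]+"    // this test supplants $0, $1, $2
                output $name            // still the first field of $line
                // $1 no longer holds the first field here: the latest
                // regexp has no ( ) subexpressions at all
        fi
fi
```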
Alternate forms for referencing regexp matches are: $[0], $[1], $[2], and so on. These make it possible to reference the matched expressions within for loops:
    set #n = #split($rdlin)
    for #i=1 #i<=#n #i+=1
            output $[#i]
    endfor

Here is a technique for saving $0, $1, ... before a subsequent regexp action:
    set #n = #split($rdlin)
    for #i=1 #i<=#n #i+=1
            set $f[#i] = $[#i]
    endfor
    ...
    if $f[3] =~ "cantata|sonata|toccata"    // wipes out
                                            // $3 & $[3] value
            output $f[3]
    fi

Better still is to use the #split() function (with all three arguments required) this way:
    do #split($f, $rdlin, " ")
    ...
    if $f[3] =~ "cantata|sonata|toccata"    // wipes out
                                            // $3 & $[3] value
            output $f[3]
    fi

If you failed to save the previous regexp values in the $f[] array and simply referenced $3 or $[3] after the =~ test, that value would be undefined, since the =~ test puts no ( )'s around any third subexpression. Even if it did (around "toccata", say), the previous $3 value would still be lost.
For further coverage of regular expressions, see the GNU RX info pages.
Refer to the sample alarms.cfg for examples.