Class for matching strings and substrings to patterns
Contract
| Parameter key | Default value | Type |
| a_type | opt | |
| regex_string | opt | string |
| Parameter kind | Default value | Type |
| Other unkeyed arguments | opt | type.<one_of string/> |
Water Contract
<class pattern
a_type =opt
regex_string =opt=string
_native_object=null
_other_unkeyed=opt=<type.one_of string/>=ekind.code="_add_to_environment"
/>
See also: key_of, subvector, subvectors, replace, string_segment
This class and its subclasses may be used to describe character
patterns that may be matched to a target string. Methods exist to
check if a match is found, to get the matched substrings, or to
replace the matched substrings.
Simple Patterns
Use the is_type_for method to compare the pattern to the
target string.
Example: Check for match of a simple pattern
<pattern "ABCD" />.<is_type_for "ABCD" />
 | true |
Unless otherwise instructed, the pattern will look for a match
anywhere in the target string.
Example: Patterns match anywhere in the string
<pattern "ABCD" />.<is_type_for "xxxyyyABCDzzz" />
 | true |
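For readers used to conventional regex engines, the unanchored behavior shown above resembles a substring search. The sketch below is a Python analogy built on the standard re module, not Water code. It assumes that an unanchored Water pattern behaves like re.search and that the quoted pattern text is taken literally; both points are inferred from the examples rather than stated by this page. The helper name matches_anywhere is hypothetical.

```python
import re


def matches_anywhere(literal: str, target: str) -> bool:
    """Return True if the literal text occurs anywhere in target."""
    # re.escape makes every character literal (assumed to mirror "ABCD" above).
    return re.search(re.escape(literal), target) is not None


print(matches_anywhere("ABCD", "ABCD"))           # True
print(matches_anywhere("ABCD", "xxxyyyABCDzzz"))  # True
print(matches_anywhere("ABCD", "xxxyyyABzzz"))    # False
```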
Anchoring the Pattern to the Beginning and/or End of a Line
A pattern may be anchored to the beginning or end of a line by using the
special fields pattern.start_of_line and pattern.end_of_line.
Example: Pattern is not found against beginning of line
<pattern pattern.start_of_line "ABCD" />.<is_type_for "xxxABCD" />
 | false |
Example: Pattern is found against end of line
<pattern "ABCD" pattern.end_of_line />.<is_type_for "xxxABCD" />
 | true |
Matching the Whole Line
A pattern may use both start_of_line and end_of_line to
anchor a pattern to match only the entire line from beginning to end.
Example: Pattern must match entire line
<pattern pattern.start_of_line "ABCD" pattern.end_of_line />.<is_type_for "ABCD" />
 | true |
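The anchoring rules can be compared with the ^ and $ anchors of a conventional regex engine. The following is a Python analogy using the standard re module, not Water code. The constants START_OF_LINE and END_OF_LINE and the helper is_match are hypothetical stand-ins, and the sketch assumes single-line target strings.

```python
import re

# Hypothetical stand-ins for pattern.start_of_line and pattern.end_of_line.
START_OF_LINE = "^"
END_OF_LINE = "$"


def is_match(regex: str, target: str) -> bool:
    """Unanchored search; anchors in the regex restrict where it may match."""
    return re.search(regex, target) is not None


print(is_match(START_OF_LINE + "ABCD", "xxxABCD"))               # False
print(is_match("ABCD" + END_OF_LINE, "xxxABCD"))                 # True
print(is_match(START_OF_LINE + "ABCD" + END_OF_LINE, "ABCD"))    # True
print(is_match(START_OF_LINE + "ABCD" + END_OF_LINE, "xABCDx"))  # False
```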
Matching Using Regular Expressions
| Parameter key | Default value | Type |
| regex_string | opt | string |
The contents of the pattern are transformed into a regular-expression
string, which is held in the regex_string field. The value of this field
is computed from the contents of the pattern, or it may be
set by the programmer in the call to pattern.
Example: Setting the regex_string field
<pattern regex_string="x+" />.<is_type_for "zzzxxyyy" />
 | true |
In the example above, the pattern attempts to match one or more
'x's in the target string.
Note that changing regex_string after the pattern instance has been
created has no effect on the underlying pattern match. If you want a
different regex_string, create a new instance of the pattern
rather than changing the regex_string field of an existing one.
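One way such behavior can arise is when the matcher is built once, at construction time, from the regex string. The sketch below models that in Python with the standard re module. It is an illustration of the observable behavior only, not a description of how Water implements patterns; the class name FrozenPattern and its private attribute are hypothetical.

```python
import re


class FrozenPattern:
    """Hypothetical model: the regex is compiled once, at construction."""

    def __init__(self, regex_string: str):
        self.regex_string = regex_string
        self._compiled = re.compile(regex_string)  # fixed from here on

    def is_type_for(self, target: str) -> bool:
        return self._compiled.search(target) is not None


p = FrozenPattern("x+")
print(p.is_type_for("zzzxxyyy"))  # True

p.regex_string = "q+"             # changes the field, not the compiled matcher
print(p.is_type_for("zzzxxyyy"))  # True (still matches on "x+")
print(p.is_type_for("qqq"))       # False

q = FrozenPattern("q+")           # a new instance picks up the new regex
print(q.is_type_for("qqq"))       # True
```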
Matching Special Sets of Characters
The following predefined fields may be used to construct
patterns for matching.
Upper and lowercase alphabetic characters
(regex_string="[a-zA-Z]").
pattern.alphabetic.<is_type_for "A" />
 | true |
Uppercase alphabetic characters
(regex_string="[A-Z]").
pattern.uppercase.<is_type_for "A" />
 | true |
Lowercase alphabetic characters
(regex_string="[a-z]").
pattern.lowercase.<is_type_for "a" />
 | true |
Alphanumeric characters
(regex_string="[a-zA-Z0-9]").
pattern.alphanumeric.<is_type_for "9" />
 | true |
Digit characters
(regex_string="\d").
pattern.digit.<is_type_for "7" />
 | true |
Non-digit characters
(regex_string="\D").
pattern.digit_not.<is_type_for "A" />
 | true |
Hexadecimal digit characters
(regex_string="[0123456789abcdefABCDEF]").
pattern.hexdigit.<is_type_for "F" />
 | true |
"Word" characters
(regex_string="\w").
pattern.word_char.<is_type_for "A" />
 | true |
Non-word characters
(regex_string="\W").
pattern.word_char_not.<is_type_for char."!" />
 | true |
Whitespace characters
(regex_string="\s").
pattern.whitespace.<is_type_for char.tab />
 | true |
Non-whitespace characters
(regex_string="\S").
pattern.whitespace_not.<is_type_for "A" />
 | true |
Space or tab characters
(regex_string="[ \t]").
pattern.space_or_tab.<is_type_for char.tab />
 | true |
Punctuation characters
(regex_string="[!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]").
pattern.punctuation.<is_type_for "!" />
 | true |
Control characters
(regex_string="[\x00-\x1F\x7F]").
pattern.control.<is_type_for char.19 />
 | true |
"Graphic" characters
(regex_string="[a-zA-Z0-9]|[!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]").
pattern.graphic.<is_type_for "&" />
 | true |
Printing characters
(regex_string="[a-zA-Z0-9]|[!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]| ").
pattern.printing.<is_type_for "A" />
 | true |
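The regex strings listed for these fields can be tried out in any conventional regex engine. The sketch below copies a subset of them into a Python dictionary and checks each against the sample character used in the corresponding example. It is a Python analogy using the standard re module, not Water code. The names CLASSES, SAMPLES, and is_type_for are hypothetical, char.19 is assumed to mean the character with decimal code 19, and Python's \d, \w, and \s are Unicode-aware, so they may accept characters that Water's equivalents do not.

```python
import re

# Regex strings copied from the field descriptions above (a subset).
CLASSES = {
    "alphabetic": r"[a-zA-Z]",
    "uppercase": r"[A-Z]",
    "lowercase": r"[a-z]",
    "alphanumeric": r"[a-zA-Z0-9]",
    "digit": r"\d",
    "digit_not": r"\D",
    "hexdigit": r"[0123456789abcdefABCDEF]",
    "word_char": r"\w",
    "word_char_not": r"\W",
    "whitespace": r"\s",
    "whitespace_not": r"\S",
    "space_or_tab": r"[ \t]",
    "control": r"[\x00-\x1F\x7F]",
}

# Sample characters taken from the examples above.
SAMPLES = {
    "alphabetic": "A", "uppercase": "A", "lowercase": "a",
    "alphanumeric": "9", "digit": "7", "digit_not": "A",
    "hexdigit": "F", "word_char": "A", "word_char_not": "!",
    "whitespace": "\t", "whitespace_not": "A",
    "space_or_tab": "\t", "control": chr(19),
}


def is_type_for(name: str, target: str) -> bool:
    """Search target for one character of the named class."""
    return re.search(CLASSES[name], target) is not None


# Every sample should be accepted by its own class.
print(all(is_type_for(name, ch) for name, ch in SAMPLES.items()))  # True
# A letter is not a digit, and a digit is not a non-digit.
print(is_type_for("digit", "A"), is_type_for("digit_not", "7"))    # False False
```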
Constructing Complex Patterns
Several simple patterns may be put together by calling pattern to
construct a complex pattern for matching more complex target strings.
Example: A complex pattern
<pattern pattern.start_of_line <!-- matches start of line -->
"xxx" <!-- must exactly match "xxx" -->
pattern.<one_or_more pattern.alphabetic />
"yyy" <!-- must exactly match "yyy" -->
pattern.<one_or_more pattern.whitespace />
"zzz" <!-- must exactly match "zzz" -->
pattern.end_of_line />.
<is_type_for "xxxABcdEfGyyy zzz" />
 | true |
Notice that the pattern is very precise in its matching capabilities.
Adding a digit to the target string causes the match to fail.
Example: A digit causes it to fail
<pattern pattern.start_of_line <!-- matches start of line -->
"xxx" <!-- must exactly match "xxx" -->
pattern.<one_or_more pattern.alphabetic />
"yyy" <!-- must exactly match "yyy" -->
pattern.<one_or_more pattern.whitespace />
"zzz" <!-- must exactly match "zzz" -->
pattern.end_of_line />.
<is_type_for "xxxABc9EfGyyy zzz" />
 | false |
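The two complex-pattern examples above can be reproduced with a single conventional regular expression. The sketch below is a Python analogy using the standard re module, not Water code. It assumes that one_or_more corresponds to the + quantifier and that the alphabetic and whitespace fields expand to the regex strings listed earlier; the name COMPLEX is hypothetical.

```python
import re

# Each fragment mirrors one argument of the Water pattern call above.
COMPLEX = re.compile(
    r"^"          # pattern.start_of_line
    r"xxx"        # must exactly match "xxx"
    r"[a-zA-Z]+"  # one or more alphabetic characters
    r"yyy"        # must exactly match "yyy"
    r"\s+"        # one or more whitespace characters
    r"zzz"        # must exactly match "zzz"
    r"$"          # pattern.end_of_line
)

print(COMPLEX.search("xxxABcdEfGyyy zzz") is not None)  # True
print(COMPLEX.search("xxxABc9EfGyyy zzz") is not None)  # False
```

The second target fails because the digit interrupts the run of alphabetic characters before "yyy" is reached, and the start anchor leaves no other position at which the match could begin.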
See the documentation on the methods and subclasses of this class for
more examples of using Water patterns.
© Copyright 2007 Clear Methods, Inc.