Parle pattern matching
Содержание
- Character representations
- Character classes
- Unicode character classes
- Alternation and repetition
- Anchors
- Grouping
Parle supports regex matching similar to flex. Also supported are the following POSIX character sets: [:alnum:]
, [:alpha:]
, [:blank:]
, [:cntrl:]
, [:digit:]
, [:graph:]
, [:lower:]
, [:print:]
, [:punct:]
, [:space:]
, [:upper:]
and [:xdigit:]
.
The Unicode character classes are currently not enabled by default, pass --enable-parle-utf32 to make them available. A particular encoding can be mapped with a correctly constructed regex. For example, to match the EURO symbol encoded in UTF-8, the regular expression [\xe2][\x82][\xac]
can be used. The pattern for an UTF-8 encoded string could be [ -\x7f]{+}[\x80-\xbf]{+}[\xc2-\xdf]{+}[\xe0-\xef]{+}[\xf0-\xff]+
.