Regex package:hxt-regex-xmlschema

W3C XML Schema Regular Expression Matcher. The grammar can be found at http://www.w3.org/TR/xmlschema11-2/#regexs
csh style Glob Pattern Parser for Regular Expressions
W3C XML Schema Regular Expression Parser. This parser supports the full W3C standard (the complete grammar can be found at http://www.w3.org/TR/xmlschema11-2/#regexs) plus extensions for the set operations the standard lacks: intersection, difference, exclusive or, interleave, and complement.
A regular expression library for W3C XML Schema regular expressions. This library supports full W3C XML Schema regular expressions, including all Unicode character sets and blocks. The complete grammar can be found at http://www.w3.org/TR/xmlschema11-2/#regexs. It is implemented using the technique of derivatives of regular expressions. The W3C syntax is extended to support not only union of regular sets, but also intersection, set difference, and exclusive or (exor). Matching of subexpressions is also supported. The library can be used for constructing lightweight scanners and tokenizers. It is a standalone library; no external regex libraries are used. Extensions in 9.2: the library supports not only String, but also ByteString and Text in their strict and lazy variants.
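
The derivative technique mentioned above can be illustrated with a minimal sketch. This is not the library's implementation: the type `Re` and the functions `nullable`, `delta` and `matches` are hypothetical names chosen for illustration, and only single-character symbols are modelled. It shows why intersection, difference and exclusive or come almost for free with derivatives: the derivative distributes over each of these operations.

```haskell
-- Hypothetical sketch of matching by derivatives; not the library's API.
data Re
  = Zero            -- the empty set of words
  | Unit            -- the set containing only the empty word
  | Sym Char        -- a single character
  | Seq Re Re       -- concatenation
  | Alt Re Re       -- union
  | Star Re         -- repetition
  | Isect Re Re     -- intersection
  | Diff Re Re      -- set difference
  | Exor Re Re      -- exclusive or
  deriving (Eq, Show)

-- Does the expression accept the empty word?
nullable :: Re -> Bool
nullable Zero        = False
nullable Unit        = True
nullable (Sym _)     = False
nullable (Seq a b)   = nullable a && nullable b
nullable (Alt a b)   = nullable a || nullable b
nullable (Star _)    = True
nullable (Isect a b) = nullable a && nullable b
nullable (Diff a b)  = nullable a && not (nullable b)
nullable (Exor a b)  = nullable a /= nullable b

-- Derivative of an expression with respect to one input character:
-- the set of all suffixes w such that (c : w) is in the original set.
delta :: Char -> Re -> Re
delta _ Zero        = Zero
delta _ Unit        = Zero
delta c (Sym d)     = if c == d then Unit else Zero
delta c (Seq a b)
  | nullable a      = Alt (Seq (delta c a) b) (delta c b)
  | otherwise       = Seq (delta c a) b
delta c (Alt a b)   = Alt (delta c a) (delta c b)
delta c (Star a)    = Seq (delta c a) (Star a)
delta c (Isect a b) = Isect (delta c a) (delta c b)
delta c (Diff a b)  = Diff (delta c a) (delta c b)
delta c (Exor a b)  = Exor (delta c a) (delta c b)

-- A word matches if the expression left after deriving by every
-- character accepts the empty word.
matches :: Re -> String -> Bool
matches re = nullable . foldl (flip delta) re
```

For example, with `ab = Star (Alt (Sym 'a') (Sym 'b'))`, the expression `Diff ab (Seq (Sym 'a') (Sym 'a'))` accepts "ab" but rejects "aa". A practical implementation would also simplify expressions after each derivative step to keep them small; the sketch omits this.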
parse a glob pattern
parse a regular expression surrounded by a context spec. A leading ^ denotes start of text, a trailing $ denotes end of text, a leading \< denotes word start, a trailing \> denotes word end. The first parameter is the regex parser (parseRegex or parseRegexExt)
parse a standard W3C XML Schema regular expression
parse an extended syntax W3C XML Schema regular expression. The syntax of the W3C XML Schema spec is extended by further useful set operations, like intersection, difference, and exclusive or (exor). Subexpression matching becomes possible with "named" pairs of parentheses. The multi-char escape sequence \a represents any Unicode char; the multi-char escape sequence \A represents any Unicode word (\A = \a*). All syntactically wrong inputs are mapped to the Zero expression, which represents the empty set of words. Zero carries a string as a data field for an error message, so error checking after parsing is possible by testing the result against Zero (isZero predicate)
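
The error-handling convention described above (a failed parse yields Zero with a message, and the caller tests with an isZero predicate) can be sketched as follows. This is a toy illustration, not the library's parser: `parseTiny`, `errMsg` and this `Re` type are hypothetical, and the toy syntax covers only literal characters with an optional postfix `*`. Only the name `isZero` is taken from the description above.

```haskell
-- Hypothetical sketch: the parser is total and reports syntax errors
-- through the Zero constructor instead of Maybe or Either.
data Re
  = Zero String     -- empty set of words, carrying an error message
  | Unit            -- the empty word
  | Sym Char
  | Star Re
  | Seq Re Re
  deriving (Eq, Show)

-- True for any Zero value, whatever its message.
isZero :: Re -> Bool
isZero (Zero _) = True
isZero _        = False

-- The message stored in a Zero; empty for every other expression.
errMsg :: Re -> String
errMsg (Zero msg) = msg
errMsg _          = ""

-- Smart constructor that drops redundant Unit operands.
mkSeq :: Re -> Re -> Re
mkSeq Unit r = r
mkSeq r Unit = r
mkSeq a b    = Seq a b

-- Toy syntax: a sequence of literal characters, each optionally
-- followed by a single '*'. A '*' with nothing to repeat is an error.
parseTiny :: String -> Re
parseTiny = go Unit
  where
    go acc []               = acc
    go _   ('*' : _)        = Zero "'*' without a preceding symbol"
    go acc (c : '*' : rest) = go (mkSeq acc (Star (Sym c))) rest
    go acc (c : rest)       = go (mkSeq acc (Sym c)) rest
```

Under these assumptions, `isZero (parseTiny "ab*")` is False, while `isZero (parseTiny "*a")` is True and `errMsg` returns the stored message. Because Zero is an ordinary expression denoting the empty set, a caller that skips the check simply gets a regex that matches nothing.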
This function wraps the whole regex in a subexpression before starting the parse. This is done to get access to the whole parsed string. Therefore we need one special label: this label is the Nothing value, while all explicit labels are Just labels.
The main scanner function
speedup version for splitWithRegex'. This function checks whether the input starts with a char from FIRST re. If this is not the case, the split fails. The FIRST set can be computed once for a whole tokenizer and reused by every call of split
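
The FIRST-set shortcut described above can be sketched as follows. This is an illustration only, not the library's code: `Re`, `firstSet`, `splitLongest` and `splitWithFirst` are hypothetical names, the FIRST set is modelled as a plain list of characters, and the sketch assumes that a successful split must consume at least one character.

```haskell
import Data.List (nub)

-- Hypothetical sketch; single-character symbols only.
data Re = Zero | Unit | Sym Char | Seq Re Re | Alt Re Re | Star Re
  deriving (Eq, Show)

nullable :: Re -> Bool
nullable Zero      = False
nullable Unit      = True
nullable (Sym _)   = False
nullable (Seq a b) = nullable a && nullable b
nullable (Alt a b) = nullable a || nullable b
nullable (Star _)  = True

delta :: Char -> Re -> Re
delta _ Zero      = Zero
delta _ Unit      = Zero
delta c (Sym d)   = if c == d then Unit else Zero
delta c (Seq a b)
  | nullable a    = Alt (Seq (delta c a) b) (delta c b)
  | otherwise     = Seq (delta c a) b
delta c (Alt a b) = Alt (delta c a) (delta c b)
delta c (Star a)  = Seq (delta c a) (Star a)

-- Conservative emptiness test: True only if no word can match.
dead :: Re -> Bool
dead Zero      = True
dead (Seq a b) = dead a || dead b
dead (Alt a b) = dead a && dead b
dead _         = False

-- All characters that can start a non-empty word of the expression.
firstSet :: Re -> [Char]
firstSet (Sym c)   = [c]
firstSet (Seq a b)
  | nullable a     = nub (firstSet a ++ firstSet b)
  | otherwise      = firstSet a
firstSet (Alt a b) = nub (firstSet a ++ firstSet b)
firstSet (Star a)  = firstSet a
firstSet _         = []

-- Longest non-empty prefix accepted by the expression, plus the rest.
splitLongest :: Re -> String -> Maybe (String, String)
splitLongest re0 input = go re0 [] input Nothing
  where
    go _ _ [] best = best
    go re acc (c : cs) best
      | dead re'  = best
      | otherwise = go re' acc' cs best'
      where
        re'   = delta c re
        acc'  = c : acc
        best' = if nullable re' then Just (reverse acc', cs) else best

-- Fast path: reject at once when the first character is not in FIRST.
splitWithFirst :: [Char] -> Re -> String -> Maybe (String, String)
splitWithFirst first re input@(c : _)
  | c `elem` first = splitLongest re input
splitWithFirst _ _ _ = Nothing
```

A tokenizer would compute `firstSet re` once and pass the result to every `splitWithFirst` call, so inputs that cannot start a token are rejected with a single membership test instead of a derivative step.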