Regex package:hledger-lib

Easy regular expression helpers, currently based on regex-tdfa. These should:
  • be cross-platform, not requiring C libraries
  • support unicode
  • support extended regular expressions
  • support replacement, with backreferences etc.
  • support splitting
  • have mnemonic names
  • have simple monomorphic types
  • work with simple strings
Regex strings are automatically compiled into regular expressions the first time they are seen, and these are cached. If you use a huge number of unique regular expressions, this might lead to increased memory usage. Several functions have memoised variants (*Memo), which also trade space for time.
Currently two APIs are provided:
  • The old partial one (with ' suffixes), which will call error on any problem (eg with malformed regexps). This comes from hledger's origin as a command-line tool.
  • The new total one which will return an error message. This is better for long-running apps like hledger-web.
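The difference between the two API styles can be sketched as follows. This is a minimal illustration built directly on regex-tdfa, not hledger's actual definitions: the names compileTotal and compilePartial and the error text are hypothetical.

```haskell
import Text.Regex.TDFA (Regex, makeRegexM, matchTest)

-- Hypothetical names, for illustration only.

-- Total style: a malformed pattern yields Left with an error message.
-- makeRegexM is run in the Maybe monad, so failure becomes Nothing.
compileTotal :: String -> Either String Regex
compileTotal pat =
  maybe (Left ("malformed regular expression: " ++ pat)) Right (makeRegexM pat)

-- Partial style: the same compilation, but any problem calls error.
compilePartial :: String -> Regex
compilePartial = either error id . compileTotal

main :: IO ()
main = do
  -- A well-formed pattern compiles and can be tested against a string.
  print (fmap (`matchTest` "expenses:food") (compileTotal "^expenses"))
  -- An unbalanced parenthesis is reported as a value, not an exception.
  putStrLn (either id (const "compiled") (compileTotal "(unclosed"))
```

A long-running program can inspect the Left value and keep going, whereas the partial style ends the program at the first malformed pattern.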
Current limitations:
  • (?i) and similar are not supported
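Since inline flags are unavailable, case-insensitive matching has to be requested when the pattern is compiled. A minimal sketch using regex-tdfa's compile options follows; the helper name compileCI is hypothetical and this is not necessarily how hledger implements it.

```haskell
import Text.Regex.TDFA
  ( CompOption (..), Regex
  , defaultCompOpt, defaultExecOpt, makeRegexOptsM, matchTest )

-- Hypothetical helper: compile a pattern with case sensitivity switched off
-- through the compile options, instead of an inline (?i) flag.
-- Returns Nothing if the pattern is malformed.
compileCI :: String -> Maybe Regex
compileCI = makeRegexOptsM (defaultCompOpt { caseSensitive = False }) defaultExecOpt

main :: IO ()
main =
  -- The lowercase pattern is tested against a capitalised account name.
  print (fmap (`matchTest` "Assets:Cash") (compileCI "^assets"))
```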
An error message arising during a regular expression operation. Eg: trying to compile a malformed regular expression, or trying to apply a malformed replacement pattern.
Regular expression. The syntax is roughly that of extended regular expressions, but inline flags such as (?i) are not supported.
Test whether a Regexp matches a String. This is an alias for matchTest for consistent naming.
Test whether a Regexp matches a Text. This currently unpacks the Text to a String, to work around a performance bug in regex-tdfa (#9), which may or may not be relevant here.
Return a (possibly empty) list of match groups derived by applying the Regex to a Text.
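To illustrate what "match groups" means here, the following sketch extracts the parenthesised groups of the first match using regex-tdfa directly. It works on String rather than Text and takes the pattern as a plain string; the name matchGroups is hypothetical. Note that (=~) compiles the pattern on each call and calls error on a malformed pattern.

```haskell
import Text.Regex.TDFA ((=~))

-- Hypothetical helper: the capture groups of the first match of pat in s.
-- The result is empty when nothing matches or the pattern has no groups.
matchGroups :: String -> String -> [String]
matchGroups pat s = groups
  where
    -- The 4-tuple result is (text before, matched text, text after, groups).
    (_, _, _, groups) = s =~ pat :: (String, String, String, [String])

main :: IO ()
main = do
  print (matchGroups "([a-z]+):([a-z]+)" "assets:cash")
  print (matchGroups "([0-9]+)" "assets:cash")
```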
A memoising version of regexReplace. Caches the result for each (search pattern, replacement pattern, target string) tuple. This won't generate a regular expression parsing error, since the search pattern is pre-compiled nowadays, but there can still be a runtime error from the replacement pattern, eg with a backreference referring to a nonexistent match group.
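The runtime error mentioned above can be illustrated with a small sketch of backreference expansion. It assumes backreferences are written as a backslash followed by digits, with \0 standing for the whole match; the function name expandTemplate and the error text are hypothetical, and this is not hledger's code.

```haskell
import Data.Char (isDigit)

-- Hypothetical helper: expand \N backreferences in a replacement template
-- against the groups of one match (index 0 is the whole match).
-- A reference to a group that does not exist yields Left.
expandTemplate :: [String] -> String -> Either String String
expandTemplate groups = go
  where
    go ('\\' : rest)
      -- Only treat the backslash as a backreference if digits follow it;
      -- otherwise fall through and copy it literally.
      | (ds@(_ : _), rest') <- span isDigit rest =
          let n = read ds :: Int
          in if n < length groups
               then ((groups !! n) ++) <$> go rest'
               else Left ("no match group exists for backreference \\" ++ ds)
    go (c : rest) = (c :) <$> go rest
    go [] = Right []

main :: IO ()
main = do
  -- Groups of a match of "([0-9]+)-([0-9]+)" against "2024-01".
  let groups = ["2024-01", "2024", "01"]
  print (expandTemplate groups "\\2/\\1")
  print (expandTemplate groups "\\3")
```

Here the search pattern has already matched successfully; the failure in the second call comes only from the replacement template.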
Convert an account name to a regular expression matching it but not its subaccounts.
Convert an account name to a regular expression matching it but not its subaccounts, case insensitively.
Convert an account name to a regular expression matching it and its subaccounts.
Convert an account name to a regular expression matching it and its subaccounts, case insensitively.
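The patterns behind these conversions can be sketched as plain strings. This assumes account name components are separated by ":" and that regex metacharacters in the name are escaped first; the function names and the exact escape set below are illustrative assumptions, not hledger's definitions. The case-insensitive variants would use the same patterns, compiled with case sensitivity turned off.

```haskell
-- Hypothetical helper: backslash-escape characters that are special in
-- extended regular expressions, so the account name matches literally.
escapeName :: String -> String
escapeName = concatMap esc
  where
    esc c
      | c `elem` "\\^$.[|()?*+{" = ['\\', c]
      | otherwise = [c]

-- Pattern matching exactly this account, not its subaccounts.
accountOnlyPattern :: String -> String
accountOnlyPattern a = "^" ++ escapeName a ++ "$"

-- Pattern matching this account and anything below it: the name must be
-- followed by a ":" separator or by the end of the string.
accountAndSubsPattern :: String -> String
accountAndSubsPattern a = "^" ++ escapeName a ++ "(:|$)"

main :: IO ()
main = do
  putStrLn (accountOnlyPattern "assets:cash")
  putStrLn (accountAndSubsPattern "assets:cash")
```

Requiring ":" or end of string after the name keeps a pattern for "assets:cash" from matching an unrelated account such as "assets:cashback".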
Regular expressions matching common English top-level account names, used as a fallback when account types are not declared.