Text package:base

String I/O functions
The API of this module is unstable and not meant to be consumed by the general public. If you absolutely must depend on it, make sure to use a tight upper bound, e.g., base < 4.X rather than base < 5, because the interface can change rapidly without much warning.
Show the text as is.
A TextEncoding is a specification of a conversion scheme between sequences of bytes and sequences of Unicode characters. For example, UTF-8 is an encoding of Unicode characters into a sequence of bytes. The TextEncoding for UTF-8 is utf8.
Returns a string that can be passed to mkTextEncoding to create an equivalent TextEncoding.
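The relationship between the Show output of a TextEncoding and mkTextEncoding can be sketched as follows. This is a minimal illustration, not part of the library: the helper name reopenEncoding is hypothetical, and the sketch only assumes that show yields a name mkTextEncoding accepts, as the entry above states.

```haskell
import System.IO (TextEncoding, mkTextEncoding, utf8)

-- Hypothetical helper: rebuild an encoding from its textual name.
-- Relies on `show` returning a name that mkTextEncoding understands.
reopenEncoding :: TextEncoding -> IO TextEncoding
reopenEncoding = mkTextEncoding . show

main :: IO ()
main = do
  -- Print the name of the built-in UTF-8 encoding.
  putStrLn (show utf8)
  -- Look the encoding up again by that name and print the result's name.
  enc <- reopenEncoding utf8
  putStrLn (show enc)
```

Going through the name is useful when an encoding has to be stored in a configuration file or passed on a command line, since TextEncoding values themselves cannot be serialized.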
Wraps a particular exception exposing its ExceptionContext. Intended to be used when catching exceptions in cases where access to the context is desired.
Like try but also returns the exception context, which is useful if you intend to rethrow the exception later.
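A minimal sketch of catching an exception together with its context. It assumes a recent base (one that exports tryWithContext and ExceptionWithContext from Control.Exception and provides Control.Exception.Context); the function name describeFailure is hypothetical.

```haskell
import Control.Exception
  ( ErrorCall (..), ExceptionWithContext (..), throwIO, tryWithContext )
import Control.Exception.Context (displayExceptionContext)

-- Hypothetical helper: run an action and describe an ErrorCall failure,
-- including whatever context was attached to the exception.
describeFailure :: IO () -> IO String
describeFailure action = do
  r <- tryWithContext action
  case r of
    Left (ExceptionWithContext ctx (ErrorCall msg)) ->
      -- The context may be empty or may hold backtraces and annotations,
      -- depending on how the exception was thrown.
      return ("failed: " ++ msg ++ "\n" ++ displayExceptionContext ctx)
    Right () ->
      return "ok"

main :: IO ()
main = do
  putStrLn =<< describeFailure (return ())
  putStrLn =<< describeFailure (throwIO (ErrorCall "boom"))
```

Matching on the ExceptionWithContext constructor gives access to both the original exception and its context, so the context can be logged or preserved when the exception is rethrown.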
Exception context and annotations.
Exception context represents a list of ExceptionAnnotations. These are attached to SomeExceptions via addExceptionContext and can be used to capture various ad-hoc metadata about the exception including backtraces and application-specific context. ExceptionContexts can be merged via concatenation using the Semigroup instance or mergeExceptionContext. Note that GHC will automatically solve implicit constraints of type ExceptionContext with emptyExceptionContext.
Render ExceptionContext to a human-readable String.
An ExceptionContext containing no annotations.
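The following sketch builds contexts from application-specific annotations, merges them with the Semigroup instance, and renders the result. It assumes a base version that ships Control.Exception.Context and Control.Exception.Annotation; the annotation types RequestId and Stage are hypothetical and exist only for illustration.

```haskell
import Control.Exception.Annotation (ExceptionAnnotation)
import Control.Exception.Context
  ( ExceptionContext
  , addExceptionAnnotation
  , displayExceptionContext
  , emptyExceptionContext
  )

-- Hypothetical annotation types. The class's default rendering uses Show,
-- so an empty instance body is enough here.
newtype RequestId = RequestId Int deriving Show
instance ExceptionAnnotation RequestId

newtype Stage = Stage String deriving Show
instance ExceptionAnnotation Stage

-- Two single-annotation contexts, each built from the empty context.
requestCtx, stageCtx :: ExceptionContext
requestCtx = addExceptionAnnotation (RequestId 42) emptyExceptionContext
stageCtx   = addExceptionAnnotation (Stage "parse") emptyExceptionContext

-- Contexts merge by concatenation through the Semigroup instance.
merged :: ExceptionContext
merged = requestCtx <> stageCtx

main :: IO ()
main = putStrLn (displayExceptionContext merged)
```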
Look up the named Unicode encoding. May fail with isDoesNotExistError if the encoding is unknown. The set of known encodings is system-dependent, but includes at least:
  • UTF-8
  • UTF-16, UTF-16BE, UTF-16LE
  • UTF-32, UTF-32BE, UTF-32LE
There is additional notation (borrowed from GNU iconv) for specifying how illegal characters are handled:
  • a suffix of //IGNORE, e.g. UTF-8//IGNORE, will cause all illegal sequences on input to be ignored, and on output will drop all code points that have no representation in the target encoding.
  • a suffix of //TRANSLIT will choose a replacement character for illegal sequences or code points.
  • a suffix of //ROUNDTRIP will use a PEP383-style escape mechanism to represent any invalid bytes in the input as Unicode codepoints (specifically, as lone surrogates, which are normally invalid in UTF-32). Upon output, these special codepoints are detected and turned back into the corresponding original byte.
In theory, this mechanism allows arbitrary data to be roundtripped via a String with no loss of data. In practice, there are two limitations to be aware of:
  1. This only stands a chance of working for an encoding which is an ASCII superset, as for security reasons we refuse to escape any bytes smaller than 128. Many encodings of interest are ASCII supersets (in particular, you can assume that the locale encoding is an ASCII superset) but many (such as UTF-16) are not.
  2. If the underlying encoding is not itself roundtrippable, this mechanism can fail. Roundtrippable encodings are those which have an injective mapping into Unicode. Almost all encodings meet this criterion, but some do not. Notably, Shift-JIS (CP932) and Big5 contain several different encodings of the same Unicode codepoint.
On Windows, you can access supported code pages with the prefix CP; for example, "CP1250".
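The lookup and the //ROUNDTRIP suffix described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive recipe: the helper names lookupEncoding and roundTripBytes and the file name roundtrip-demo.bin are hypothetical, and the sketch assumes an unknown name is reported as an IOError satisfying isDoesNotExistError.

```haskell
import Control.Exception (tryJust)
import Control.Monad (guard)
import System.IO
import System.IO.Error (isDoesNotExistError)

-- Hypothetical helper: look up an encoding by name, returning Nothing
-- when the lookup fails with a "does not exist" error.
lookupEncoding :: String -> IO (Maybe TextEncoding)
lookupEncoding name = do
  r <- tryJust (guard . isDoesNotExistError) (mkTextEncoding name)
  return (either (const Nothing) Just r)

-- Hypothetical helper: write raw bytes to a scratch file, decode them with
-- UTF-8//ROUNDTRIP, re-encode the decoded String, and return the bytes that
-- end up on disk. Each Char of the argument is treated as one byte.
roundTripBytes :: FilePath -> String -> IO String
roundTripBytes path bytes = do
  enc <- mkTextEncoding "UTF-8//ROUNDTRIP"
  -- Binary mode writes each Char as a single byte.
  withBinaryFile path WriteMode $ \h -> hPutStr h bytes
  -- Decode; invalid bytes become escape code points instead of errors.
  decoded <- withFile path ReadMode $ \h -> do
    hSetEncoding h enc
    s <- hGetContents h
    length s `seq` return s   -- force the read before the handle closes
  -- Encode again; escape code points turn back into the original bytes.
  withFile path WriteMode $ \h -> do
    hSetEncoding h enc
    hPutStr h decoded
  withBinaryFile path ReadMode $ \h -> do
    s <- hGetContents h
    length s `seq` return s

main :: IO ()
main = do
  known <- lookupEncoding "UTF-8//IGNORE"
  putStrLn (maybe "UTF-8//IGNORE is not available" show known)
  -- "abc" followed by the byte 0xFF, which is never valid in UTF-8.
  out <- roundTripBytes "roundtrip-demo.bin" "abc\xff"
  putStrLn (if out == "abc\xff" then "bytes preserved" else "bytes changed")
```

The test data deliberately avoids newlines, since text-mode handles may translate them on some platforms, and uses only a byte above 127 as the invalid input, in line with the first limitation listed above.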
Creates a means of decoding bytes into characters. The result must not be shared between several byte sequences or used simultaneously across threads.
Creates a means of encoding characters into bytes. The result must not be shared between several character sequences or used simultaneously across threads.