Read

Parsing of Strings, producing values. Derived instances of Read make the following assumptions, which derived instances of Show obey:
  • If the constructor is defined to be an infix operator, then the derived Read instance will parse only infix applications of the constructor (not the prefix form).
  • Associativity is not used to reduce the occurrence of parentheses, although precedence may be.
  • If the constructor is defined using record syntax, the derived Read will parse only the record-syntax form, and furthermore, the fields must be given in the same order as the original declaration.
  • The derived Read instance allows arbitrary Haskell whitespace between tokens of the input string. Extra parentheses are also allowed.
For example, given the declarations
infixr 5 :^:
data Tree a =  Leaf a  |  Tree a :^: Tree a
the derived instance of Read in Haskell 2010 is equivalent to
instance (Read a) => Read (Tree a) where

        readsPrec d r =  readParen (d > app_prec)
                         (\r -> [(Leaf m,t) |
                                 ("Leaf",s) <- lex r,
                                 (m,t) <- readsPrec (app_prec+1) s]) r

                      ++ readParen (d > up_prec)
                         (\r -> [(u:^:v,w) |
                                 (u,s) <- readsPrec (up_prec+1) r,
                                 (":^:",t) <- lex s,
                                 (v,w) <- readsPrec (up_prec+1) t]) r

          where app_prec = 10
                up_prec = 5
Note that right-associativity of :^: is unused. The derived instance in GHC is equivalent to
instance (Read a) => Read (Tree a) where

        readPrec = parens $ (prec app_prec $ do
                                 Ident "Leaf" <- lexP
                                 m <- step readPrec
                                 return (Leaf m))

                     +++ (prec up_prec $ do
                                 u <- step readPrec
                                 Symbol ":^:" <- lexP
                                 v <- step readPrec
                                 return (u :^: v))

          where app_prec = 10
                up_prec = 5

        readListPrec = readListPrecDefault
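The assumptions listed above can be checked against a derived instance. The following is a small sketch, assuming GHC and the Tree declaration from the example; the Point record and the sample* names are hypothetical and exist only for illustration.

```haskell
import Text.Read (readMaybe)

infixr 5 :^:
data Tree a = Leaf a | Tree a :^: Tree a
  deriving (Show, Read, Eq)

-- Hypothetical record type, used only to illustrate the record-syntax rule.
data Point = Point { px :: Int, py :: Int }
  deriving (Show, Read, Eq)

-- Infix application with extra whitespace and redundant parentheses: accepted.
sampleInfix :: Maybe (Tree Int)
sampleInfix = readMaybe " ( Leaf 1   :^: ((Leaf 2)) ) "

-- Prefix application of an infix constructor: rejected.
samplePrefix :: Maybe (Tree Int)
samplePrefix = readMaybe "(:^:) (Leaf 1) (Leaf 2)"

-- Associativity is not used, so an unparenthesised chain is rejected...
sampleChain :: Maybe (Tree Int)
sampleChain = readMaybe "Leaf 1 :^: Leaf 2 :^: Leaf 3"

-- ...while the explicitly parenthesised form is accepted.
sampleChainParens :: Maybe (Tree Int)
sampleChainParens = readMaybe "Leaf 1 :^: (Leaf 2 :^: Leaf 3)"

-- Record syntax with fields in declaration order: accepted.
sampleRecord :: Maybe Point
sampleRecord = readMaybe "Point { px = 1, py = 2 }"

-- Fields out of order, or the positional form: rejected.
sampleRecordSwapped, sampleRecordPositional :: Maybe Point
sampleRecordSwapped    = readMaybe "Point { py = 2, px = 1 }"
sampleRecordPositional = readMaybe "Point 1 2"
```

Loading this into GHCi and evaluating each sample value shows which inputs parse (Just ...) and which are rejected (Nothing).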
Why do both readsPrec and readPrec exist, and why does GHC implement readPrec rather than readsPrec in derived Read instances? readsPrec is based on the ReadS type, which is mentioned in the Haskell 2010 Report but is not a very efficient parser data structure. readPrec is based on the much more efficient ReadPrec datatype (a.k.a. "new-style parsers"), whose definition relies on the RankNTypes language extension; for that reason readPrec and its cousin readListPrec are marked as GHC-only.
Nevertheless, using readPrec instead of readsPrec is recommended whenever possible, for the efficiency improvements it brings. As mentioned above, derived Read instances in GHC implement readPrec, and the default implementations of readsPrec and its cousin readList simply use readPrec under the hood. If you are writing a Read instance by hand, it is recommended to write it like so:
instance Read T where
  readPrec     = ...
  readListPrec = readListPrecDefault
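As a concrete illustration of that template, here is a minimal sketch of a hand-written instance, assuming GHC. The Metres type and the example* names are hypothetical; the parser accepts the same text a derived Show instance would print, such as "Metres 5".

```haskell
import Text.Read
  ( Lexeme (Ident), Read (..), lexP, parens, prec
  , readListPrecDefault, readMaybe, step )

-- Hypothetical type, used only for illustration.
newtype Metres = Metres Int
  deriving (Show, Eq)

instance Read Metres where
  -- Parse the constructor name followed by its argument, at
  -- function-application precedence, allowing optional parentheses.
  readPrec = parens $ prec appPrec $ do
    Ident "Metres" <- lexP
    n <- step readPrec
    return (Metres n)
    where
      appPrec = 10

  -- Lists of Metres reuse readPrec via the default list parser.
  readListPrec = readListPrecDefault

exampleSingle :: Maybe Metres
exampleSingle = readMaybe "Metres 5"

exampleList :: Maybe [Metres]
exampleList = readMaybe "[Metres 1, (Metres 2)]"

-- As an argument of another constructor, parentheses are required.
exampleNested, exampleNestedNoParens :: Maybe (Maybe Metres)
exampleNested         = readMaybe "Just (Metres 3)"
exampleNestedNoParens = readMaybe "Just Metres 3"
```

Only readPrec and readListPrec are defined here; readsPrec and readList fall back to their defaults, which are implemented in terms of readPrec.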
Converting strings to values. The Text.Read module is the canonical place to import for Read-class facilities. For GHC only, it offers an extended and much improved Read class, which constitutes a proposed alternative to the Haskell 2010 Read. In particular, writing parsers is easier, and the parsers are much more efficient.
The Read class and instances for basic data types.
Common internal functions for reading textual data.
Functions used frequently when reading textual data.
TextShow instance for Lexeme (and Number, if using a recent-enough version of base). Since: 2
Convert a human readable string to a physical value.
Generic implementation of Read

Warning

This is an internal module: it is not subject to any versioning policy, and breaking changes can happen at any time. If something here seems useful, please report it or create a pull request to export it from an external module.
Tools for reading values in a CBOR-encoded format back into ordinary values.
This is the entry point to hledger's reading system, which can read Journals from various data formats. Use this module if you want to parse journal data or read journal files. Generally it should not be necessary to import modules below this one.

Journal reading

Reading an input file (in journal, csv, timedot, or timeclock format, etc.) involves these steps:
  • select an appropriate file format "reader" based on the filename extension, file path prefix, or function parameter. A reader contains a parser and a finaliser (usually journalFinalise).
  • run the parser to get a ParsedJournal (this may run additional sub-parsers to parse included files)
  • run the finaliser to get a complete Journal, which passes standard checks
  • if reading multiple files: merge the per-file Journals into one overall Journal
  • if using -s/--strict: run additional strict checks
  • if running print --new: save .latest files for each input file. (import also does this, as its final step.)

Journal merging

Journal implements the Semigroup class, so two Journals can be merged into one Journal with j1 <> j2. This is implemented by the journalConcat function, whose documentation explains what merging Journals means exactly.
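
To illustrate what merging through a Semigroup instance looks like, here is a toy sketch. ToyJournal, its fields, and the merge rule (plain concatenation) are hypothetical simplifications and not hledger's actual Journal type or journalConcat behaviour; consult the journalConcat documentation for the real semantics.

```haskell
-- Hypothetical, heavily simplified stand-in for a journal.
data ToyJournal = ToyJournal
  { tjFiles        :: [FilePath]  -- source files, in reading order
  , tjTransactions :: [String]    -- transactions, kept as plain text here
  } deriving (Show, Eq)

-- Assumed merge rule for this sketch: concatenate each field,
-- keeping the first journal's entries before the second's.
instance Semigroup ToyJournal where
  ToyJournal f1 t1 <> ToyJournal f2 t2 = ToyJournal (f1 <> f2) (t1 <> t2)

j1, j2, merged :: ToyJournal
j1 = ToyJournal ["a.journal"] ["2024-01-01 rent"]
j2 = ToyJournal ["b.journal"] ["2024-01-02 food"]
merged = j1 <> j2
```

Because concatenation is associative, merging more than two toy journals gives the same result however the merges are grouped, which is the property a Semigroup instance is expected to satisfy.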

Journal finalising

This is post-processing done after parsing an input file, such as inferring missing information, normalising amount styles, checking for errors and so on - a delicate and influential stage of data processing. In hledger it is done by journalFinalise, which converts a preliminary ParsedJournal to a validated, ready-to-use Journal. This is called immediately after the parsing of each input file. It is not called when Journals are merged.

Journal reading API

There are three main Journal-reading functions:
  • readJournal to read from a Text value. Selects a reader and calls its parser and finaliser, then does strict checking if needed.
  • readJournalFile to read one file, or stdin if the file path is -. Uses the file path/file name to help select the reader, calls readJournal, then writes .latest files if needed.
  • readJournalFiles to read multiple files. Calls readJournalFile for each file (without strict checking or .latest file writing) then merges the Journals into one, then does strict checking and .latest file writing at the end if needed.
Each of these also has an easier variant with a ' suffix, which uses default options and has a simpler type signature. One more variant, readJournalFilesAndLatestDates, is like readJournalFiles but also exposes the latest transaction date (and how many transactions fall on that same day) seen for each file. This is used by the import command.