This is the entry point to hledger's reading system, which can read
Journals from various data formats. Use this module if you want to
parse journal data or read journal files. Generally it should not be
necessary to import modules below this one.
Journal reading
Reading an input file (in journal, csv, timedot, timeclock, or another
supported format) involves these steps:
- select an appropriate file format "reader" based on filename
extension, file path prefix, or function parameter. A reader contains
a parser and a finaliser (usually journalFinalise).
- run the parser to get a ParsedJournal (this may run additional
sub-parsers to parse included files)
- run the finaliser to get a complete Journal, which passes standard
checks
- if reading multiple files: merge the per-file Journals into one
overall Journal
- if using -s/--strict: run additional strict checks
- if running print --new: save .latest files for each input file.
(import also does this, as its final step.)
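The first step above, choosing a reader, can be sketched with a small toy model. This is not hledger's code: the Format type and the functions formatFromName, splitPrefix, formatFromExtension and selectFormat are hypothetical names made up for illustration, and the priority order (function parameter, then path prefix, then extension, then a fallback to journal format) is an assumption.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toLower)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for the formats named in the text above.
data Format = JournalFmt | CsvFmt | TimedotFmt | TimeclockFmt
  deriving (Show, Eq)

-- Map a format name such as "csv" to a Format, if recognised.
formatFromName :: String -> Maybe Format
formatFromName name = case map toLower name of
  "journal"   -> Just JournalFmt
  "csv"       -> Just CsvFmt
  "timedot"   -> Just TimedotFmt
  "timeclock" -> Just TimeclockFmt
  _           -> Nothing

-- Split an optional "FORMAT:" prefix from a path, e.g. "csv:bank.txt".
-- Anything before the first colon that is not a known format name
-- is treated as part of the path.
splitPrefix :: FilePath -> (Maybe Format, FilePath)
splitPrefix path = case break (== ':') path of
  (pre, ':' : rest) | Just fmt <- formatFromName pre -> (Just fmt, rest)
  _ -> (Nothing, path)

-- Guess a format from the text after the last dot, if there is a dot.
formatFromExtension :: FilePath -> Maybe Format
formatFromExtension path
  | '.' `elem` path = formatFromName (reverse (takeWhile (/= '.') (reverse path)))
  | otherwise       = Nothing

-- Assumed priority: explicit parameter, then path prefix, then extension,
-- falling back to journal format when nothing matches.
selectFormat :: Maybe Format -> FilePath -> (Format, FilePath)
selectFormat explicit path =
  let (prefixFmt, barePath) = splitPrefix path
      chosen = explicit <|> prefixFmt <|> formatFromExtension barePath
  in (fromMaybe JournalFmt chosen, barePath)

main :: IO ()
main = mapM_ (print . selectFormat Nothing)
  ["2024.journal", "csv:bank.txt", "time.timedot"]
```

Returning the path with its prefix removed alongside the chosen format keeps the later "open the file" step independent of how the format was selected.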
Journal merging
Journal implements the Semigroup class, so two Journals can be merged
into one Journal with j1 <> j2. This is implemented by the journalConcat
function, whose documentation explains what merging Journals means
exactly.
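To illustrate what a Semigroup instance buys you, here is a minimal sketch using a toy record rather than hledger's Journal. ToyJournal and its fields are hypothetical, and plain list concatenation is only an assumed simplification; the real journalConcat does considerably more.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import Data.Semigroup (sconcat)

-- Toy stand-in for a journal: the files it came from and its
-- transactions (represented here as plain description strings).
data ToyJournal = ToyJournal
  { toyFiles :: [FilePath]
  , toyTxns  :: [String]
  } deriving (Show, Eq)

-- Merging keeps the left journal's items first, then the right's.
instance Semigroup ToyJournal where
  a <> b = ToyJournal (toyFiles a ++ toyFiles b) (toyTxns a ++ toyTxns b)

main :: IO ()
main = do
  let j1 = ToyJournal ["a.journal"] ["2024-01-01 rent"]
      j2 = ToyJournal ["b.journal"] ["2024-01-02 groceries"]
  -- Merge two journals.
  print (j1 <> j2)
  -- sconcat merges a whole non-empty list of journals in order.
  print (sconcat (j1 :| [j2, j1]))
```

Because `<>` is associative, a list of per-file journals can be folded into one overall journal without worrying about how the merges are grouped, only about their order.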
Journal finalising
This is post-processing done after parsing an input file, such as
inferring missing information, normalising amount styles, checking for
errors and so on - a delicate and influential stage of data
processing. In hledger it is done by journalFinalise, which converts a
preliminary ParsedJournal to a validated, ready-to-use Journal. This is
called immediately after the parsing of each input file. It is not
called when Journals are merged.
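As a rough illustration of "inferring missing information" and "checking for errors", the following toy function fills in a single missing posting amount and verifies that the result sums to zero. It does not reproduce journalFinalise; the Posting type and finaliseTxn are hypothetical, and amounts are simplified to integer cents in a single currency.

```haskell
import Data.Maybe (fromMaybe)

-- A posting whose amount may have been left blank in the input:
-- (account name, amount in cents).
type Posting = (String, Maybe Integer)

-- Fill in at most one missing amount so the postings sum to zero,
-- or report why the transaction cannot be finalised.
finaliseTxn :: [Posting] -> Either String [(String, Integer)]
finaliseTxn ps =
  case length [() | (_, Nothing) <- ps] of
    0 -> check [(a, n) | (a, Just n) <- ps]
    1 -> Right [(a, fromMaybe (negate known) m) | (a, m) <- ps]
    _ -> Left "more than one posting has no amount"
  where
    -- Sum of the amounts that were written explicitly.
    known = sum [n | (_, Just n) <- ps]
    check qs
      | sum (map snd qs) == 0 = Right qs
      | otherwise             = Left "transaction does not balance"

main :: IO ()
main = do
  print (finaliseTxn [("assets:bank", Just 10000), ("equity:opening", Nothing)])
  print (finaliseTxn [("assets:bank", Just 10000), ("expenses:food", Just 500)])
```

The output type no longer allows a missing amount, which mirrors the idea of turning a preliminary parsed value into a validated, ready-to-use one.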
Journal reading API
There are three main Journal-reading functions:
- readJournal to read from a Text value. Selects a reader and calls
its parser and finaliser, then does strict checking if needed.
- readJournalFile to read one file, or stdin if the file path is "-".
Uses the file path/file name to help select the reader, calls
readJournal, then writes .latest files if needed.
- readJournalFiles to read multiple files. Calls readJournalFile for
each file (without strict checking or .latest file writing) then
merges the Journals into one, then does strict checking and .latest
file writing at the end if needed.
Each of these also has an easier variant with a ' suffix (readJournal',
readJournalFile', readJournalFiles'), which uses default options and
has a simpler type signature.
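A minimal usage sketch of the simplest variant follows. It assumes a hledger-lib version in which readJournal' takes a Text value and returns the Journal in IO, as described above, and in which jtxns (the Journal's transaction list) is available from Hledger.Data; check both against the version you build with. The journal text is made-up sample data.

```haskell
import qualified Data.Text as T
import Hledger.Data (jtxns)
import Hledger.Read (readJournal')

main :: IO ()
main = do
  -- A tiny journal with one transaction; the second posting's amount
  -- is left blank for hledger to infer.
  let input = T.unlines
        [ T.pack "2024-01-01 opening balance"
        , T.pack "    assets:bank       $100"
        , T.pack "    equity:opening"
        ]
  -- Parse and finalise using default options.
  j <- readJournal' input
  putStrLn ("transactions read: " ++ show (length (jtxns j)))
```

The file-based variants are used the same way, with a file path (or a list of paths) in place of the Text argument.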
One more variant, readJournalFilesAndLatestDates, is like
readJournalFiles but also exposes the latest transaction date (and how
many transactions share that date) seen for each file. This is used by
the import command.