This module provides functions to parse an XML document to a tree
structure, either strictly or lazily.
The GenericXMLString type class allows you to use any string type.
Three string types are supported here: String, ByteString and Text.
Here is a complete example to get you started:
-- | A "hello world" example of hexpat that lazily parses a document, printing
-- it to standard out.

import Text.XML.Expat.Tree
import Text.XML.Expat.Format
import System.Environment
import System.Exit
import System.IO
import qualified Data.ByteString.Lazy as L

main = do
    args <- getArgs
    case args of
        [filename] -> process filename
        _          -> do
            hPutStrLn stderr "Usage: helloworld <file.xml>"
            exitWith $ ExitFailure 1

process :: String -> IO ()
process filename = do
    inputText <- L.readFile filename
    -- Note: Because we're not using the tree, Haskell can't infer the type of
    -- strings we're using so we need to tell it explicitly with a type signature.
    let (xml, mErr) = parse defaultParseOptions inputText :: (UNode String, Maybe XMLParseError)
    -- Process document before handling error, so we get lazy processing.
    L.hPutStr stdout $ format xml
    putStrLn ""
    case mErr of
        Nothing -> return ()
        Just err -> do
            hPutStrLn stderr $ "XML parse failed: "++show err
            exitWith $ ExitFailure 2
Error handling in strict parses is straightforward: just check the
Either return value. Lazy parses are not so simple. The two working
examples that follow illustrate the ways to handle errors in a lazy parse.
Way no. 1 - Using a Maybe value
import Text.XML.Expat.Tree
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Internal (c2w)

-- This is the recommended way to handle errors in lazy parses
main = do
    let (tree, mError) = parse defaultParseOptions
                   (L.pack $ map c2w $ "<top><banana></apple></top>")
    print (tree :: UNode String)

    -- Note: We check the error _after_ we have finished our processing
    -- on the tree.
    case mError of
        Just err -> putStrLn $ "It failed : "++show err
        Nothing -> putStrLn "Success!"
Way no. 2 - Using exceptions
parseThrowing can throw an exception from pure code, which is
generally a bad way to handle errors, because Haskell's lazy
evaluation makes it hard to predict where the exception will be thrown
from. However, it may be acceptable in situations where a parse error is
not expected during normal operation, depending on the design of your program.
...
import Control.Exception.Extensible as E

-- This is not the recommended way to handle errors.
main = do
    do
        let tree = parseThrowing defaultParseOptions
                       (L.pack $ map c2w $ "<top><banana></apple></top>")
        print (tree :: UNode String)
        -- Because of lazy evaluation, you should not process the tree outside
        -- the 'do' block, or exceptions could be thrown that won't get caught.
    `E.catch` (\exc ->
        case E.fromException exc of
            Just (XMLParseException err) -> putStrLn $ "It failed : "++show err
            Nothing -> E.throwIO exc)
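
For comparison, the strict case mentioned earlier (checking the Either return value) might look like the following minimal sketch. It assumes the strict entry point is parse', taking a strict ByteString and returning Either XMLParseError (Node tag text). The helper name describe is hypothetical and exists only for illustration.

```haskell
import Text.XML.Expat.Tree
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper (not part of hexpat): summarise a strict parse result.
describe :: Either XMLParseError (UNode String) -> String
describe (Left err) = "It failed : " ++ show err
describe (Right _)  = "Success!"

main :: IO ()
main = do
    -- Assumption: parse' is the strict parser and takes a strict ByteString.
    let good = parse' defaultParseOptions (B.pack "<top><banana/></top>")
        bad  = parse' defaultParseOptions (B.pack "<top><banana></apple></top>")
    putStrLn (describe good)
    putStrLn (describe bad)
```

Because the whole document is parsed before the Either is returned, there is no ordering concern here: a Left means no tree was produced, and a Right means the tree is complete.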