split package:split

Combinator library for splitting lists. A collection of various methods for splitting lists into parts, akin to the "split" function found in several mainstream languages.
Here is its tale: Once upon a time the standard Data.List module held no function for splitting a list into parts according to a delimiter. Many a brave lambda-knight strove to add such a function, but their striving was in vain, for Lo, the Supreme Council fell to bickering amongst themselves what was to be the essential nature of the One True Function which could cleave a list in twain (or thrain, or any required number of parts). And thus came to pass the split package, comprising divers functions for splitting a list asunder, each according to its nature. And the Supreme Council had no longer any grounds for argument, for the favored method of each was contained therein.
To get started, see the Data.List.Split module.
Split a list according to the given splitting strategy. This is how to "run" a Splitter that has been built using the other combinators.
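As a minimal sketch of what "running" a Splitter looks like, the snippet below builds two strategies from combinators documented in this package (oneOf, dropDelims) and applies each with split. The names keptDelims and droppedDelims are hypothetical, chosen only for this illustration.

```haskell
import Data.List.Split (dropDelims, oneOf, split)

-- Hypothetical name: only the delimiter is overridden, so each
-- delimiter is kept in the output as its own chunk.
keptDelims :: [String]
keptDelims = split (oneOf ",;") "a,b;c"
-- ["a",",","b",";","c"]

-- Hypothetical name: the same delimiter, with the delimiters
-- themselves discarded from the output.
droppedDelims :: [String]
droppedDelims = split (dropDelims (oneOf ",;")) "a,b;c"
-- ["a","b","c"]
```

The strategy is an ordinary value, so it can be built once and passed to split for as many inputs as needed.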
The Data.List.Split module contains a wide range of strategies for splitting lists with respect to some sort of delimiter, mostly implemented through a unified combinator interface. The goal is to be flexible yet simple. See below for usage, examples, and detailed documentation of all exported functions. If you want to learn about the implementation, see Data.List.Split.Internals. A git repository containing the source (including a module with over 40 QuickCheck properties) can be found at https://github.com/byorgey/split.
Split on the given sublist. Equivalent to split . dropDelims . onSublist.
>>> splitOn ":" "12:35:07"
["12","35","07"]
>>> splitOn "x" "axbxc"
["a","b","c"]
>>> splitOn "x" "axbxcx"
["a","b","c",""]
>>> splitOn ".." "a..b...c....d.."
["a","b",".c","","d",""]
In some parsing combinator frameworks this is also known as sepBy. Note that this is the right inverse of the intercalate function from Data.List, that is,
intercalate x . splitOn x === id
splitOn x . intercalate x is the identity on certain lists, but it is tricky to state the precise conditions under which this holds. (For example, it is not enough to say that x does not occur in any elements of the input list. Working out why is left as an exercise for the reader.)
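The two compositions can be tried out on small inputs. The sketch below is illustrative only: roundTrip and otherWay are hypothetical names, and the second value shows just one input on which the reverse composition does not return what it was given, not a characterisation of all such inputs.

```haskell
import Data.List (intercalate)
import Data.List.Split (splitOn)

-- Splitting and then joining with the same separator restores
-- the input, including the trailing empty chunk.
roundTrip :: Bool
roundTrip = intercalate "x" (splitOn "x" "axbxcx") == "axbxcx"
-- True

-- Joining and then splitting: "aa" occurs in neither element,
-- yet the joined string "aaaa" splits into three empty chunks.
otherWay :: [String]
otherWay = splitOn "aa" (intercalate "aa" ["a", "a"])
-- ["","",""]
```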
Split on any of the given elements. Equivalent to split . dropDelims . oneOf.
>>> splitOneOf ";.," "foo,bar;baz.glurk"
["foo","bar","baz","glurk"]
Split a list into chunks of the given lengths.
>>> splitPlaces [2,3,4] [1..20]
[[1,2],[3,4,5],[6,7,8,9]]
>>> splitPlaces [4,9] [1..10]
[[1,2,3,4],[5,6,7,8,9,10]]
>>> splitPlaces [4,9,3] [1..10]
[[1,2,3,4],[5,6,7,8,9,10]]
If the input list is longer than the total of the given lengths, then the remaining elements are dropped. If the list is shorter than the total of the given lengths, then the result may contain fewer chunks than requested, and the last chunk may be shorter than requested.
Split a list into chunks of the given lengths. Unlike splitPlaces, the output list will always be the same length as the first input argument. If the input list is longer than the total of the given lengths, then the remaining elements are dropped. If the list is shorter than the total of the given lengths, then the last several chunks will be shorter than requested or empty.
>>> splitPlacesBlanks [2,3,4] [1..20]
[[1,2],[3,4,5],[6,7,8,9]]
>>> splitPlacesBlanks [4,9] [1..10]
[[1,2,3,4],[5,6,7,8,9,10]]
>>> splitPlacesBlanks [4,9,3] [1..10]
[[1,2,3,4],[5,6,7,8,9,10],[]]
Notice the empty list in the output of the third example, which differs from the behavior of splitPlaces.
Split on elements satisfying the given predicate. Equivalent to split . dropDelims . whenElt.
>>> splitWhen (<0) [1,3,-4,5,7,-9,0,2]
[[1,3],[5,7],[0,2]]
>>> splitWhen (<0) [1,-2,3,4,-5,-6,7,8,-9]
[[1],[3,4],[],[7,8],[]]
Given a delimiter to use, split a list into an internal representation with chunks tagged as delimiters or text. This transformation is lossless; in particular,
concatMap fromElem (splitInternal d l) == l.
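A minimal sketch of checking this property on one input follows. It assumes that Data.List.Split.Internals exports the Splitter record together with its delimiter field, which is used here to obtain a delimiter from oneOf; rebuild is a hypothetical helper name.

```haskell
import Data.List.Split.Internals
  (Splitter (..), fromElem, oneOf, splitInternal)

-- Hypothetical helper: tag the chunks of the input as delimiters or
-- text, then flatten them back into a single list.
-- Assumption: the delimiter field of Splitter is exported by Internals.
rebuild :: String -> String
rebuild l = concatMap fromElem (splitInternal (delimiter (oneOf ",;")) l)

-- The flattened result equals the input, as the property states.
lossless :: Bool
lossless = rebuild "a,b;;c," == "a,b;;c,"
```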
A splitting strategy.
Internal representation of a split list that tracks which pieces are delimiters and which aren't.
The default splitting strategy: keep delimiters in the output as separate chunks, don't condense multiple consecutive delimiters into one, keep initial and final blank chunks. The default delimiter is the constantly false predicate. Note that defaultSplitter should normally not be used; use oneOf, onSublist, or whenElt instead, which are the same as the defaultSplitter with just the delimiter overridden. The defaultSplitter strategy with any delimiter gives a maximally information-preserving splitting strategy, in the sense that (a) taking the concat of the output yields the original list, and (b) given only the output list, we can reconstruct a Splitter which would produce the same output list again given the original input list. This default strategy can be overridden to allow discarding various sorts of information.
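Property (a) and the effect of overriding the default can be seen on a small input. This is a sketch only; input, preserved, lossless and lossy are hypothetical names introduced for the illustration.

```haskell
import Data.List.Split (dropDelims, oneOf, split)

input :: String
input = ",a,,b,"

-- Default strategy with only the delimiter overridden: delimiters
-- and blank chunks (initial, inner and final) are all kept.
preserved :: [String]
preserved = split (oneOf ",") input
-- ["",",","a",",","",",","b",",",""]

-- Property (a): concatenating the chunks restores the input.
lossless :: Bool
lossless = concat preserved == input
-- True

-- Overriding the default with dropDelims discards the delimiters,
-- so the input can no longer be recovered by concat.
lossy :: [String]
lossy = split (dropDelims (oneOf ",")) input
-- ["","a","","b",""]
```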
Split over a different type of element by performing a preprocessing step.
>>> split (mapSplitter snd $ oneOf "-_") $ zip [0..] "a-bc_d"
[[(0,'a')],[(1,'-')],[(2,'b'),(3,'c')],[(4,'_')],[(5,'d')]]
>>> import Data.Char (toLower)

>>> split (mapSplitter toLower $ dropDelims $ whenElt (== 'x')) "abXcxd"
["ab","c","d"]