Rope

A rope is a data structure for efficiently storing and manipulating long strings. Wikipedia provides a nice overview: https://en.wikipedia.org/wiki/Rope_(data_structure)
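To make the idea concrete, here is a minimal sketch of a rope as a binary tree of String chunks with cached lengths. It is a toy under stated assumptions, not the finger-tree or splay-tree implementations described in the entries on this page; every name in it (Rope, fromString, len, append, index, splitRope, insertAt) is hypothetical, and it does no rebalancing.

```haskell
-- A toy rope: leaves hold String chunks, nodes cache the total length
-- so that indexing and splitting can skip whole subtrees.
-- All names here are hypothetical; no rebalancing is attempted.
data Rope
  = Leaf !Int String     -- cached chunk length, chunk
  | Node !Int Rope Rope  -- cached total length, left part, right part

fromString :: String -> Rope
fromString s = Leaf (length s) s

len :: Rope -> Int
len (Leaf n _)   = n
len (Node n _ _) = n

-- Concatenation builds one new node and copies no characters.
append :: Rope -> Rope -> Rope
append l r = Node (len l + len r) l r

toString :: Rope -> String
toString (Leaf _ s)   = s
toString (Node _ l r) = toString l ++ toString r

-- Look up the character at a zero-based position.
index :: Rope -> Int -> Maybe Char
index (Leaf n s) i
  | i >= 0 && i < n = Just (s !! i)
  | otherwise       = Nothing
index (Node _ l r) i
  | i < len l = index l i
  | otherwise = index r (i - len l)

-- Split into the first i characters and the rest.
splitRope :: Int -> Rope -> (Rope, Rope)
splitRope i (Leaf _ s) =
  let (a, b) = splitAt i s in (fromString a, fromString b)
splitRope i (Node _ l r)
  | i <= len l = let (a, b) = splitRope i l in (a, append b r)
  | otherwise  = let (a, b) = splitRope (i - len l) r in (append l a, b)

-- Insert a rope at position i by splitting and re-joining.
insertAt :: Int -> Rope -> Rope -> Rope
insertAt i new rope =
  let (a, b) = splitRope i rope in a `append` new `append` b
```

Loaded into GHCi, `toString (insertAt 5 (fromString "!") (fromString "Hello, " `append` fromString "world"))` evaluates to `"Hello!, world"`. A production rope additionally keeps the tree balanced, which is what the finger trees and splay trees mentioned on this page are for.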
This module defines a rope data structure for use in Yi. This specific implementation uses a fingertree over Text. In contrast to our old implementation, we can now reap all the benefits of Text: automatic Unicode handling and a blazing fast implementation on the underlying strings. This frees us from a lot of book-keeping. We don't lose anything by not using ByteString directly, because the old implementation encoded it into UTF-8 anyway, making it unsuitable for storing anything but text.
Rope of Text chunks with logarithmic concatenation. This rope offers three interfaces: one based on code points, one based on UTF-16 code units, and one based on UTF-8 code units. This comes with a price of more bookkeeping and is less performant than Data.Text.Rope, Data.Text.Utf8.Rope, or Data.Text.Utf16.Rope.
Rope of Text chunks with logarithmic concatenation. This rope offers an interface based on code points. Use Data.Text.Utf16.Rope if you need UTF-16 code units, or Data.Text.Mixed.Rope if you need both interfaces.
Rope of Text chunks with logarithmic concatenation. This rope offers an interface based on UTF-16 code units. Use Data.Text.Rope if you need code points, or Data.Text.Mixed.Rope if you need both interfaces.
Rope of Text chunks with logarithmic concatenation. This rope offers an interface based on UTF-8 code units. Use Data.Text.Rope if you need code points, or Data.Text.Mixed.Rope if you need both interfaces.
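The three kinds of interface differ because the same text has a different length in each unit. The following sketch uses only base and does not touch any of the rope modules above; the helper names (utf16Units, utf8Units, lengths, utf16Offset) are hypothetical and exist only to illustrate the arithmetic for valid Unicode scalar values.

```haskell
import Data.Char (ord)

-- Number of UTF-16 code units needed for one code point:
-- anything above U+FFFF needs a surrogate pair.
utf16Units :: Char -> Int
utf16Units c
  | ord c > 0xFFFF = 2
  | otherwise      = 1

-- Number of UTF-8 code units (bytes) needed for one code point.
utf8Units :: Char -> Int
utf8Units c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- (code points, UTF-16 code units, UTF-8 code units)
lengths :: String -> (Int, Int, Int)
lengths s = (length s, sum (map utf16Units s), sum (map utf8Units s))

-- Translate an offset counted in code points into one counted in
-- UTF-16 code units.
utf16Offset :: Int -> String -> Int
utf16Offset n s = sum (map utf16Units (take n s))
```

For the four code points U+0061, U+00E9, U+20AC and U+1F600, `lengths "a\xE9\x20AC\x1F600"` gives `(4,5,10)`: an editor protocol that counts UTF-16 code units and a file format that counts bytes will disagree about every position after the first non-ASCII character.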
If you're accustomed to working with text in almost any other programming language, you'll know that a "string" typically refers to an in-memory array of characters. Traditionally this was a single ASCII byte per character; more recently it is a variable-width UTF-8 encoding, which dramatically complicates finding offsets but gives efficient support for the entire Unicode character space. In Haskell, the original text type, String, is implemented as a list of Char which, because a Haskell list is a linked list of boxed values, is wildly inefficient at any kind of scale.
In modern Haskell there are two primary ways to represent text.
The first is the (rather poorly named) ByteString from the bytestring package, which is an array of bytes in pinned memory. The Data.ByteString.Char8 submodule gives you ways to manipulate those arrays as if they were ASCII characters. Confusingly, there are both strict (Data.ByteString) and lazy (Data.ByteString.Lazy) variants, which are often hard to tell apart when reading function signatures or Haddock documentation. The performance problem an immutable array-backed data type runs into is that appending a character (that is, an ASCII byte) or concatenating a string (that is, another array of ASCII bytes) is very expensive: it requires allocating a new, larger array and copying the whole thing into it. This led to the development of "builders", which amortize this reallocation cost over time, but it can be cumbersome to switch between Builder, the lazy ByteString that results, and then inevitably having to convert to a strict ByteString because that's what the next function in your sequence requires.
The second way is the opaque Text type of Data.Text from the text package, which is well tuned and high-performing but suffers from the same design; it is likewise backed by arrays. (Historically, the storage backing Text objects was encoded in UTF-16, meaning that every time you wanted to work with Unicode characters that came in from anywhere else, and which were inevitably UTF-8 encoded, they had to be converted to UTF-16 and copied into yet another new array! Fortunately Haskell has recently adopted a UTF-8 backed Text type, reducing this overhead. The challenge of appending pinned allocations remains, however.)
In this package we introduce Rope, a text type backed by the 2-3 FingerTree data structure from the fingertree package. This is not an uncommon solution in many languages, as finger trees support exceptionally efficient appending to either end and good performance inserting anywhere else (you often find them as the backing data type underneath text editors for this reason). Rather than Char, the pieces of the rope are ShortText from the text-short package, which are UTF-8 encoded and held in normal memory managed by the Haskell runtime. Conversion from other Haskell text types is not O(1) (UTF-8 validity must be checked, or UTF-16 decoded, or...), but in our benchmarking the performance has been comparable to the established types, and you may find the resultant interface for combining chunks is comparable to using a Builder, without being forced to use a Builder.
Rope is used as the text type throughout this library. If you use the functions within this package (rather than converting to other text types), operations are quite efficient. When you do need to convert to another type you can use fromRope or intoRope from the Textual typeclass.
Note that we haven't tried to cover the entire gamut of operations or customary convenience functions you would find in the other libraries; so far Rope is concentrated on aiding interoperation, being good at appending (lots of) small pieces, and then efficiently taking the resultant text object out to a file handle, be that the terminal console, a file, or a network socket.
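For comparison, the Builder workflow described above can be sketched with the bytestring package alone. This is an illustration of the strict-append cost and the Builder detour, not code from the Rope package; encode, appendNaive and appendBuilder are hypothetical names, while stringUtf8, toLazyByteString and toStrict are the bytestring functions of those names.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Builder (stringUtf8, toLazyByteString)

-- Encode one String as a strict, UTF-8 encoded ByteString.
encode :: String -> B.ByteString
encode = BL.toStrict . toLazyByteString . stringUtf8

-- Repeated strict appends: every (<>) allocates a fresh array and
-- copies both operands, so the accumulated prefix is copied again
-- for each new piece.
appendNaive :: [String] -> B.ByteString
appendNaive = foldl (\acc s -> acc <> encode s) B.empty

-- The Builder route: combine cheap Builder values, run them once into
-- a lazy ByteString, then convert to strict because that is what the
-- caller wants.
appendBuilder :: [String] -> B.ByteString
appendBuilder = BL.toStrict . toLazyByteString . foldMap stringUtf8
```

Both functions produce the same bytes; the difference is how much copying happens on the way, and how many types (Builder, lazy ByteString, strict ByteString) the caller has to pass through.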
A type for textual data. A rope is text backed by a tree data structure, rather than a single large contiguous array, as is the case for strings. There are three use cases:
  • Referencing externally sourced data. Often we interpret large blocks of data sourced from external systems as text. Ideally we would hold onto this without copying the memory, but (as in the case of ByteString, which is the most common source of data) before we can treat it as text we have to validate the UTF-8 content. Safety first. We also copy it out of pinned memory, allowing the Haskell runtime to manage the storage.
  • Interoperating with other libraries. The only constant of the Haskell universe is that you won't have the right combination of {strict, lazy} × {Text, ByteString, String, [Word8], etc} you need for the next function call. The Textual typeclass provides for moving between different text representations. To convert between Rope and something else use fromRope; to construct a Rope from textual content in another type use intoRope. You can get at the underlying finger tree with the unRope function.
  • Assembling text to go out. This involves considerable appending of data, and very, very occasionally inserting it. Often the pieces are tiny. To add text to a Rope use the appendRope method or the (<>) operator from Data.Monoid (like you would have with a Builder). Output to a Handle can be done efficiently with hWrite.
A SplayTree of Text values optimised for being indexed by and modified at UTF-16 code units and row/column (RowColumn) positions. Internal invariant: No empty Chunks in the SplayTree
Tools for manipulating fingertrees of bytestrings with optional annotations
Construct a Rope out of a single ByteString strand.
Ropes optimised for updating using UTF-16 code units and row/column pairs. This implementation uses splay trees instead of the usual finger trees. According to my benchmarks, splay trees are faster in most situations.
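To show what indexing by row/column pairs in UTF-16 code units means, here is a small sketch that converts such a position into a code-unit offset by scanning a plain String. It assumes zero-based rows and columns and '\n' as the only line terminator; it does not use the splay-tree rope or its API, and rowColumnToOffset and utf16Width are hypothetical names. A real rope caches this information in the tree instead of scanning.

```haskell
import Data.Char (ord)

-- Width of one code point in UTF-16 code units.
utf16Width :: Char -> Int
utf16Width c
  | ord c > 0xFFFF = 2
  | otherwise      = 1

-- Convert a zero-based (row, column) position, with the column counted
-- in UTF-16 code units, to an offset in UTF-16 code units from the
-- start of the text. Returns Nothing if the position is past the end
-- of its line or the text, or falls inside a surrogate pair.
rowColumnToOffset :: String -> (Int, Int) -> Maybe Int
rowColumnToOffset s (r, c) = go 0 0 0 s
  where
    go row col off _
      | row == r && col == c               = Just off
      | row > r || (row == r && col > c)   = Nothing
    go row col off (x : xs)
      | x == '\n' = go (row + 1) 0 (off + 1) xs
      | otherwise = go row (col + w) (off + w) xs
      where
        w = utf16Width x
    go _ _ _ [] = Nothing
```

With the text `"a\x1F600\nbc"`, position (1, 0) maps to offset 4, because the emoji on the first row occupies two code units and the newline one more.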
The function properFraction takes a real fractional number x and returns a pair (n,f) such that x = n+f, and:
  • n is an integral number with the same sign as x; and
  • f is a fraction with the same type and sign as x, and with absolute value less than 1.
The default definitions of the ceiling, floor, truncate and round functions are in terms of properFraction.
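As a small illustration of that last point, truncate, floor and ceiling can be written in terms of properFraction as follows. The names truncateVia, floorVia and ceilingVia are hypothetical, and the functions are specialised to Double and Integer for simplicity; they follow the shape of the default definitions but are not the library source.

```haskell
-- truncate keeps the integral part, which already rounds toward zero.
truncateVia :: Double -> Integer
truncateVia x = fst (properFraction x)

-- floor steps down by one when the fractional part is negative.
floorVia :: Double -> Integer
floorVia x =
  let (n, f) = properFraction x
   in if f < 0 then n - 1 else n

-- ceiling steps up by one when the fractional part is positive.
ceilingVia :: Double -> Integer
ceilingVia x =
  let (n, f) = properFraction x
   in if f > 0 then n + 1 else n
```

For example, `properFraction (-3.75 :: Double)` yields `(-3, -0.75)`, so `truncateVia (-3.75)` is `-3`, `floorVia (-3.75)` is `-4`, and `ceilingVia (-3.75)` is `-3`.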