This post announces the new version 0.2.0.0 of the safe-coloured-text library. The safe-coloured-text library lets you safely output coloured text to a terminal. The idea for version 0.2.0.0 came from a very smart and annoyingly sensible comment on reddit. The first (0.1.0.0) version made the now-considered-erroneous decision to require the user to use UTF8. The newest (0.2.0.0) version relaxes that requirement by using Text instead of ByteString.
A quick primer on character encodings
Human language is seriously complex. Representing human language in computers is even more complex. This has to do with human history, but more importantly with the history of computing and its (historical) efficiency requirements. Here is an extremely simplified summary.
Human text consists of characters[*].
Unicode assigns a number to every[*] character. We call these numbers code points.
A character encoding lets you map a sequence of code points from and to a sequence of octets (bytes[*]).
We would like encodings to be efficient for common use-cases like "English text only" or "Text with European languages only"[*].
UTF8 is a common encoding that is a good compromise for most use-cases.
UTF8 is not the standard everywhere, and even Haskell's text package used UTF16 internally until recently.
Systems try to specify the encoding that they want programs to use in various ways like, for example, the LANG environment variable.
[*]: Not really, but we've more or less been able to pretend so anyway.
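To make the difference between code points and octets concrete, here is a minimal sketch using the text and bytestring packages. The helper names codePoints and encodedSizes are made up for this illustration; only the imported functions come from the libraries.

```haskell
import Data.Char (ord)
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | The Unicode code point of every character in a string.
-- (Hypothetical helper, named for this example only.)
codePoints :: String -> [Int]
codePoints = map ord

-- | Number of octets the same text occupies when encoded as
-- UTF8, UTF16 (little endian) and UTF32 (little endian).
-- (Hypothetical helper, named for this example only.)
encodedSizes :: T.Text -> (Int, Int, Int)
encodedSizes t =
  ( BS.length (TE.encodeUtf8 t)
  , BS.length (TE.encodeUtf16LE t)
  , BS.length (TE.encodeUtf32LE t)
  )
```

For the five ASCII characters of "hello", encodedSizes (T.pack "hello") gives (5, 10, 20): the code points are the same in all three cases, only the number of octets differs. This is the sense in which UTF8 is efficient for "English text only".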
Relevant Haskell types
With that in mind, these are the relevant types in Haskell:
Char: A Unicode code point.
String: A list of Chars: type String = [Char].
Text: Like String, but more performant for most use-cases. (Text also doesn't support certain code points, like unmatched UTF16 surrogate code points, in versions before text-2.0.)
ByteString: Like [Word8], but more performant for most use-cases.
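As a rough illustration of how these four types relate, the sketch below converts between them. The function names are chosen for this example and are not part of any library; the imported functions are from base, text and bytestring.

```haskell
import Data.Char (toUpper)
import Data.Word (Word8)
import qualified Data.ByteString as BS
import qualified Data.Text as T

-- | String is just [Char], so ordinary list functions work on it.
shoutString :: String -> String
shoutString = map toUpper

-- | Text offers a comparable operation without going through a list.
shoutText :: T.Text -> T.Text
shoutText = T.toUpper

-- | String and Text both hold characters, so converting between
-- them does not involve choosing an encoding.
toText :: String -> T.Text
toText = T.pack

fromText :: T.Text -> String
fromText = T.unpack

-- | ByteString holds octets, not characters: unpacking yields Word8s.
octets :: BS.ByteString -> [Word8]
octets = BS.unpack
```

Note that none of these functions turns a ByteString into a Text or back. That step is the one that needs an encoding.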
These types are different for Real and Important reasons. Some examples include:
Programmers often want to be able to talk about single Chars.
Lists are a fundamental data (and control) structure in Haskell.
In order for Text to roundtrip with ByteString, one must choose an encoding.
For some encodings, not every ByteString represents a valid encoding of a sequence of characters. (UTF8, for example.) This means that decoding must be able to fail.
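The last two points can be sketched with the UTF8 functions from Data.Text.Encoding. The names toUtf8, fromUtf8 and invalidBytes are hypothetical and exist only for this example.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (UnicodeException)

-- | Encoding Text as UTF8 always succeeds.
toUtf8 :: T.Text -> BS.ByteString
toUtf8 = TE.encodeUtf8

-- | Decoding can fail, so the result is an Either.
fromUtf8 :: BS.ByteString -> Either UnicodeException T.Text
fromUtf8 = TE.decodeUtf8'

-- | The octet 0xFF never occurs in valid UTF8, so these
-- bytes do not encode any sequence of characters.
invalidBytes :: BS.ByteString
invalidBytes = BS.pack [0xFF, 0xFE]
```

Here fromUtf8 (toUtf8 t) is a Right containing t for any Text t, while fromUtf8 invalidBytes is a Left. The Either in the type of the decoder is what "decoding must be able to fail" looks like in practice.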
Notable changes
Version 0.2.0.0 of the safe-coloured-text library
The default output of the safe-coloured-text library is now Text instead of ByteString. Existing functions are deprecated according to the following scheme:
renderChunks is now a deprecated synonym of renderChunksUtf8BSBuilder.
renderChunksUtf8BSBuilder is a new function that outputs a ByteString.Builder.
renderChunksBuilder is a new function that outputs a Text.Builder.
renderChunksText is a new function that outputs a Text.
renderChunksBS is now a deprecated synonym of renderChunksUtf8BS.
renderChunksUtf8BS is a new function that outputs a ByteString.
Note that the new version of the library requires you to choose an encoding in order to continue outputting raw bytes, but does not break reverse dependencies that want to keep using renderChunks or renderChunksBS.
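The shape of this scheme can be sketched as follows. This is not the library's code: the Chunk type here is a hypothetical stand-in that carries no styling, and the signatures are simplified for illustration (the real functions differ and do the actual colour handling). Only the layering is the point: a Text renderer at the bottom, a byte renderer that names its encoding, and the old name kept as a deprecated synonym.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Hypothetical stand-in for a chunk of output: plain text, no colours.
newtype Chunk = Chunk T.Text

-- | Text-first renderer: no encoding decision is made here.
-- (Simplified signature, assumed for this sketch.)
renderChunksText :: [Chunk] -> T.Text
renderChunksText = T.concat . map (\(Chunk t) -> t)

-- | Byte renderer whose name states the encoding it commits to.
renderChunksUtf8BS :: [Chunk] -> BS.ByteString
renderChunksUtf8BS = TE.encodeUtf8 . renderChunksText

-- | Old name kept as a synonym so that existing callers still compile;
-- callers in other modules get a deprecation warning instead of an error.
renderChunksBS :: [Chunk] -> BS.ByteString
renderChunksBS = renderChunksUtf8BS
{-# DEPRECATED renderChunksBS "Use renderChunksUtf8BS instead" #-}
```

With this layout, code that wants raw bytes has to pick a function with an encoding in its name, while code that only wants characters never has to think about encodings at all.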
Version 0.2.0.0 of the autodocodec-yaml library
The autodocodec-yaml library lets you output a schema for a JSON (and YAML) codec in a nice and colourful way. The functions that output these nicely coloured schemas now produce Text values instead of ByteStrings.
Version 0.11.0.0 of the sydtest library
The sydtest testing framework now tries to respect the system's locale by using the functions in Data.Text.IO instead of outputting UTF8 bytes directly.
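The difference between the two approaches can be sketched like this. It is an illustration of the general idea, not sydtest's code, and the function names are hypothetical.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.IO as TIO
import System.IO (hGetEncoding, stdout)

-- | Writes characters. The encoding set on the handle, which is
-- initialised from the system's locale, turns them into bytes.
putLocaleAware :: T.Text -> IO ()
putLocaleAware = TIO.putStrLn

-- | Writes UTF8 bytes directly, whatever the locale says.
putUtf8Bytes :: T.Text -> IO ()
putUtf8Bytes t = BS.putStr (TE.encodeUtf8 (T.snoc t '\n'))

-- | Name of the encoding currently set on stdout, if there is one.
stdoutEncodingName :: IO String
stdoutEncodingName = maybe "binary (no encoding)" show <$> hGetEncoding stdout
```

On a system whose locale is not UTF8, putLocaleAware follows the locale (and may fail on characters the locale's encoding cannot represent), whereas putUtf8Bytes emits bytes that the terminal may misinterpret.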