Converting Markup to HTML
One key part is missing before we can glue everything together, and that is
to convert our Markup
data types to Html
.
We'll start by creating a new module and importing both the Markup
and the Html
modules.
module Convert where
import qualified Markup
import qualified Html
Qualified Imports
This time, we've imported the modules qualified. Qualified imports mean that instead of exposing the names that we've defined in the imported module to the general module namespace, they now have to be prefixed with the module name.
For example, parse
becomes Markup.parse
.
If we would've imported Html.Internal
qualified, we'd have to write
Html.Internal.el
, which is a bit long.
We can also give the module a new name with the as
keyword:
import qualified Html.Internal as HI
And write HI.el
instead.
I like using qualified imports because readers do not have to guess where a
name comes from. Some modules are even designed to be imported qualified.
For example, the APIs of many container types, such as maps, sets, and vectors, are very similar.
If we want to use multiple containers in a single module, we pretty much have
to use qualified imports so that when we write a function such as singleton
,
which creates a container with a single value, GHC will know which singleton
function we are referring to.
Some people prefer to use import lists instead of qualified imports, because qualified names can be a bit verbose and noisy. I will often prefer qualified imports to import lists, but feel free to try both solutions and see which fits you better. For more information about imports, see this wiki article.
Converting Markup.Structure
to Html.Structure
Converting a markup structure to an HTML structure is mostly straightforward at this point, we need to pattern match on the markup structure and use the relevant HTML API.
convertStructure :: Markup.Structure -> Html.Structure
convertStructure structure =
case structure of
Markup.Heading 1 txt ->
Html.h1_ txt
Markup.Paragraph p ->
Html.p_ p
Markup.UnorderedList list ->
Html.ul_ $ map Html.p_ list
Markup.OrderedList list ->
Html.ol_ $ map Html.p_ list
Markup.CodeBlock list ->
Html.code_ (unlines list)
Notice that running this code with -Wall
will reveal that the pattern matching
is non-exhaustive. This is because we don't currently have a way to build
headings that are not h1
. There are a few ways to handle this:
- Ignore the warning - this will likely fail at runtime one day, and the user will be sad
- Pattern match other cases and add a nice error with the
error
function - it has the same disadvantage above, but will also no longer notify of the unhandled cases at compile time - Pattern match and do the wrong thing - user is still sad
- Encode errors in the type system using
Either
, we'll see how to do this in later chapters - Restrict the input - change
Markup.Heading
to not include a number but rather specific supported headings. This is a reasonable approach - Implement an HTML function supporting arbitrary headings. Should be straightforward to do
What are these
$
?The dollar sign (
$
) is an operator that we can use to group expressions, like we do with parenthesis. we can replace the$
with invisible parenthesis around the expressions to the left of it, and around the expression to the right of it. So that:Html.ul_ $ map Html.p_ list
is understood as:
(Html.ul_) (map Html.p_ list)
It is a function application operator, it applies the argument on the right of the dollar to the function on the left of the dollar.
$
is right-associative and has very low precedence, which means that: it groups to the right, and other operators bind more tightly. For example the following expression:filter (2<) $ map abs $ [-1, -2, -3] <> [4, 5, 6]
is understood as:
(filter (2<)) ((map abs) ([1, -2, 3] <> [-4, 5, 6]))
Which is also equivalent to the following code with less parenthesis:
filter (2<) (map abs ([1, -2, 3] <> [-4, 5, 6]))
See how information flows from right to left and that
<>
binds more tightly?This operator is fairly common in Haskell code and it helps us reduce some clutter, but feel free to avoid it in favor of parenthesis if you'd like, it's not like we're even saving keystrokes with
$
!
Exercise: Implement h_ :: Natural -> String -> Structure
which we'll use to define arbitrary headings (such as <h1>
, <h2>
, and so on).
Solution
import Numeric.Natural
h_ :: Natural -> String -> Structure
h_ n = Structure . el ("h" <> show n) . escape
Don't forget to export it from Html.hs
!
Exercise: Fix convertStructure
using h_
.
Solution
convertStructure :: Markup.Structure -> Html.Structure
convertStructure structure =
case structure of
Markup.Heading n txt ->
Html.h_ n txt
Markup.Paragraph p ->
Html.p_ p
Markup.UnorderedList list ->
Html.ul_ $ map Html.p_ list
Markup.OrderedList list ->
Html.ol_ $ map Html.p_ list
Markup.CodeBlock list ->
Html.code_ (unlines list)
Document -> Html
To create an Html
document, we need to use the html_
function.
This function expects two things: a Title
and a Structure
.
For a title, we could just supply it from outside using the file name.
To convert our markup Document
(which is a list of markup Structure
)
to an HTML Structure
, we need to convert each markup Structure
and then
concatenate them together.
We already know how to convert each markup Structure
; we can use the
convertStructure
function we wrote and map
. This will provide
us with the following function:
map convertStructure :: Markup.Document -> [Html.Structure]
To concatenate all of the Html.Structure
, we could try to write a recursive
function. However, we will quickly run into an issue
with the base case: what to do when the list is empty?
We could just provide a dummy Html.Structure
that represents an empty
HTML structure.
Let's add this to Html.Internal
:
empty_ :: Structure
empty_ = Structure ""
Now we can write our recursive function. Try it!
Solution
concatStructure :: [Structure] -> Structure
concatStructure list =
case list of
[] -> empty_
x : xs -> x <> concatStructure xs
Remember the <>
function we implemented as an instance of the Semigroup
type class? We mentioned that Semigroup
is an abstraction for things
that implements (<>) :: a -> a -> a
, where <>
is associative
(a <> (b <> c) = (a <> b) <> c
).
It turns out that having an instance of Semigroup
and also having a value that represents
an "empty" value is a fairly common pattern. For example, a string can be concatenated,
and the empty string can serve as an "empty" value.
And this is actually a well known abstraction called monoid.
Monoids
Actually, "empty" isn't a very good description of what we want, and isn't very useful as an abstraction. Instead, we can describe it as an "identity" element that satisfies the following laws:
x <> <identity> = x
<identity> <> x = x
In other words, if we try to use this "empty" - this identity value,
as one argument to <>
, we will always get the other argument back.
For String
, the empty string, ""
, satisfies this:
"" <> "world" = "world"
"hello" <> "" = "hello"
This is, of course, true for any value we'd write and not just "world" and "hello".
Actually, if we move out of the Haskell world for a second, even integers
with +
as the associative binary operations +
(in place of <>
)
and 0
in place of the identity member form a monoid:
17 + 0 = 17
0 + 99 = 99
So integers together with the +
operation form a semigroup, and
together with 0
form a monoid.
We learn new things from this:
- A monoid is a more specific abstraction over semigroup; it builds on it by adding a new condition (the existence of an identity member)
- This abstraction can be useful! We can write a general
concatStructure
that could work for any monoid
And indeed, there exists a type class in base
called Monoid
, which has
Semigroup
as a super class.
class Semigroup a => Monoid a where
mempty :: a
Note: this is a simplified version. The actual is a bit more complicated because of backward compatibility and performance reasons.
Semigroup
was actually introduced in Haskell afterMonoid
!
We could add an instance of Monoid
for our HTML Structure
data type:
instance Monoid Structure where
mempty = empty_
And now, instead of using our own concatStructure
, we can use the library function:
mconcat :: Monoid a => [a] -> a
Which could theoretically be implemented as:
mconcat :: Monoid a => [a] -> a
mconcat list =
case list of
[] -> mempty
x : xs -> x <> mconcat xs
Notice that because Semigroup
is a super class of Monoid
,
we can still use the <>
function from the Semigroup
class
without adding the Semigroup a
constraint to the left side of =>
.
By adding the Monoid a
constraint, we implicitly add a Semigroup a
constraint as well!
This mconcat
function is very similar to the concatStructure
function,
but this one works for any Monoid
, including Structure
!
Abstractions help us identify common patterns and reuse code!
Side note: integers with
+
and0
aren't actually an instance ofMonoid
in Haskell. This is because integers can also form a monoid with*
and1
! But there can only be one instance per type. Instead, two othernewtype
s exist that provide that functionality, Sum and Product. See how they can be used inghci
:ghci> import Data.Monoid ghci> Product 2 <> Product 3 -- note, Product is a data constructor Product {getProduct = 6} ghci> getProduct (Product 2 <> Product 3) 6 ghci> getProduct $ mconcat $ map Product [1..5] 120
Another abstraction?
We've used map
and then mconcat
twice now. Surely there has to be a function
that unifies this pattern. And indeed, it is called
foldMap
,
and it works not only for lists but also for any data structure that can be "folded",
or "reduced", into a summary value. This abstraction and type class is called Foldable.
For a simpler understanding of Foldable
, we can look at fold
:
fold :: (Foldable t, Monoid m) => t m -> m
-- compare with
mconcat :: Monoid m => [m] -> m
mconcat
is just a specialized version of fold
for lists.
And fold
can be used for any pair of a data structure that implements
Foldable
and a payload type that implements Monoid
. This
could be []
with Structure
, or Maybe
with Product Int
, or
your new shiny binary tree with String
as the payload type. But note that
the Foldable
type must be of kind * -> *
. So, for example Html
cannot be a Foldable
.
foldMap
is a function that allows us to apply a function to the
payload type of the Foldable
type right before combining them
with the <>
function.
foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
-- compare to a specialized version with:
-- - t ~ []
-- - m ~ Html.Structure
-- - a ~ Markup.Structure
foldMap
:: (Markup.Structure -> Html.Structure)
-> [Markup.Structure]
-> Html.Structure
True to its name, it really "maps" before it "folds". You might pause here
and think, "this 'map' we are talking about isn't specific for lists; maybe
that's another abstraction?" Yes. It is actually a very important and
fundamental abstraction called Functor
.
But I think we had enough abstractions for this chapter.
We'll cover it in a later chapter!
Finishing our conversion module
Let's finish our code by writing convert
:
convert :: Html.Title -> Markup.Document -> Html.Html
convert title = Html.html_ title . foldMap convertStructure
Now we have a full implementation and can convert markup documents to HTML:
-- Convert.hs
module Convert where
import qualified Markup
import qualified Html
convert :: Html.Title -> Markup.Document -> Html.Html
convert title = Html.html_ title . foldMap convertStructure
convertStructure :: Markup.Structure -> Html.Structure
convertStructure structure =
case structure of
Markup.Heading n txt ->
Html.h_ n txt
Markup.Paragraph p ->
Html.p_ p
Markup.UnorderedList list ->
Html.ul_ $ map Html.p_ list
Markup.OrderedList list ->
Html.ol_ $ map Html.p_ list
Markup.CodeBlock list ->
Html.code_ (unlines list)
Summary
We learned about:
- Qualified imports
- Ways to handle errors
- The
Monoid
type class and abstraction - The
Foldable
type class and abstraction
Next, we are going to glue our functionality together and learn about I/O in Haskell!
You can view the git commit of the changes we've made and the code up until now.