There has been a lot of discussion recently about free (and operational) monads. And perhaps you know what a free monad is. But you may still be confused as to when or why.

You might wonder if the free and operational monads actually solve any useful problems in the real world, or if they are just theoretical wanking.

In this post I will show how we can rewrite the routing combinators in Happstack to use the Free and operational monads and in the process solve a couple real world problems.

Happstack is a Haskell web programming toolkit. There are two url routing systems you can use in Happstack. web-routes provides a flexible system for type-safe URL routing. But we also have an older system that works around simple string based combinators. In this post, we are going to look into improvements we can make to the simple, combinator based approach.

To provide basic url routing you need three things:

  1. match - a way to match on a static path segment.
  2. capture - a way to capture a path segment and try to decode it.
  3. choice - a way to pick from multiple alternatives.

match and capture work on entire path segments. If we have a url like:

/foo/bar/baz

we first split it on the / and then decode the path segments to get a list like:

["foo","bar","baz"]

That is what gets feed into the routing system.

match is really just a special case of capture. But we will keep it as a separate case for two reasons:

  1. the implementation of match is easier to understand than capture
  2. there are optimizations we can only perform when match is a separate case

Version 1: Monad Transformers

The traditional way to implement a router like this is by using some common monads and monad transformers. So let's start with that.

First we need some imports:

> {-# LANGUAGE DeriveFunctor, GADTs, GeneralizedNewtypeDeriving, ExistentialQuantification #-}
> import Control.Monad             (MonadPlus(mzero), msum, join)
> import Control.Monad.State       (StateT, MonadState(get, put), evalStateT, modify)
> import Control.Monad.Free        (Free(Pure, Free), liftF)
> import Control.Monad.Operational (Program(..), ProgramT(..), ProgramView, ProgramViewT(Return, (:>>=)), singleton, view)
> import Data.List                 (groupBy)
> import Data.Maybe                (isNothing)
> import Text.PrettyPrint.HughesPJ (Doc, (<+>), ($+$), (<>), char, doubleQuotes, nest, space, text, vcat, empty)
> import Text.Show.Functions       () -- instance Show (a -> b)

You will need to install the free and operational libraries from hackage (used in later sections).

Next we define a newtype for our routing monad:

> newtype RouteMT a = RouteMT { unRoute :: StateT [String] Maybe a }
>     deriving (Functor, Monad, MonadPlus, MonadState [String])
>
> runRouteMT :: RouteMT a -> [String] -> Maybe a
> runRouteMT route paths = evalStateT (unRoute route) paths

MT is short for MonadTransformer here. Our RouteMT monad is created by combining two familiar monads: State and Maybe. StateT [String] contains the path segments in the url. Everytime we successfully consume a path segment, we pop it off the list. We use String instead of Text just to keep this blog post simple. A real implementation would probably use Text.

Maybe is used to indicate failure. We can use its MonadPlus instance to provide the choice operation.

> choiceMT :: [RouteMT a] -> RouteMT a
> choiceMT = msum

So, we need only implement match and capture. We can define match as:

> matchMT :: String  -- ^ path segment to match on
>         -> RouteMT ()
> matchMT p' =
>     do paths <- get
>        case paths of
>          (p:ps) | p == p' -> put ps
>          _                -> mzero

If the path matches, then we pop it off the stack, otherwise we call mzero.

We can implement capture as:

> captureMT :: (String -> Maybe a) -- ^ function to decode path segment
>           -> RouteMT a
> captureMT parse =
>     do paths <- get
>        case paths of
>          (p : ps) ->
>              case parse p of
>                Nothing  -> mzero
>                (Just a) -> return a
>          _        ->
>              mzero

capture is very much like match except we use the supplied parsing function instead of plain old ==.

We will also want a helper function so that we can use read with captureMT:

> readMaybe :: (Read a) => String -> Maybe a
> readMaybe s =
>    case reads s of
>      [(n,[])] -> Just n
>      _        -> Nothing

Now we can implement a simple route:

> route1MT :: RouteMT String
> route1MT =
>     choiceMT [ do matchMT "foo"
>                   i <- captureMT readMaybe
>                   return $ "You are looking at /foo/" ++ show (i :: Int)
>              , do matchMT "bar"
>                   i <- captureMT readMaybe
>                   return $ "You are looking at /bar/" ++ show (i :: Double)
>              , do matchMT "foo"
>                   matchMT "cat"
>                   return $ "You are looking at /foo/cat"
>              ]

Being lazy programmers, we will define some unit tests rather than a formal proof of correctness. Here is our simple test function:

> testRouteMT :: (Eq a) => RouteMT a -> [([String], Maybe a)] -> Bool
> testRouteMT r tests =
>     all (\(paths, result) -> (runRouteMT r paths) == result) tests

And to make things pretty, we will define ==> as an alias for (,):

> (==>) :: a -> b -> (a,b)
> a ==> b = (a, b)

Now we can write down our unit tests for route:

> route1_results =
>       [ ["foo", "1"]     ==> Just "You are looking at /foo/1"
>       , ["foo", "cat"]   ==> Just "You are looking at /foo/cat"
>       , ["bar", "3.141"] ==> Just "You are looking at /bar/3.141"
>       , ["baz"]          ==> Nothing
>       ]

and combining it all together:

> route1MT_test =
>     testRouteMT route1MT route1_results

Version 2: Interpreter Approach

While the monad transformer code works fine, there are ways it could be better:

  1. In order to find the matching route, it has to start at the top of the list and work all the way to the bottom until it finds a match or gets to the end of the list. For example, we have one route that starts with "/foo" at the top and another that starts with "/foo" at the bottom. Additionally, if we match on "bar" but fail to decode the next path as an Int, there is no point in trying any additional routes, because no other routes start with "/bar". But there is no way to impart that information into the router.

  2. when routes fail, there is no record of why it failed. We just get back Nothing. On a live site, that is fine, but during development, you sometimes do care.

We can solve both these issues by using a data-type to build the router instead:

> data Route' a
>     = Match' String (Route' a)
>     | forall b. Capture' (String -> Maybe b) (b -> Route' a)
>     | Choice' [Route' a]
>     | Term' a

The last argument of the Match' and Capture' constructors is what to do next if that match or capture succeeds.

We can create Functor and Monad instances for the Route' type:

> instance Functor Route' where
>     fmap f (Match' s r)   = Match' s (fmap f r)
>     fmap f (Capture' p r) = Capture' p (\b -> fmap f (r b))
>     fmap f (Choice' rs)   = Choice' (map (fmap f) rs)
>     fmap f (Term' a)      = Term' (f a)
>
> instance Monad Route' where
>     return a = Term' a
>     (Term' a)      >>= f = f a
>     (Match' str r) >>= f = Match' str (r >>= f)
>     (Choice' rs)   >>= f = Choice' (map (\r -> r >>= f) rs)
>     (Capture' p r) >>= f = Capture' p (\b -> r b >>= f)

These instances can be a little tricky to understand at first. You might want to finish this section and then come back to them after you have seen the bigger picture and some examples.

Our routing functions no longer do any real work directly. Instead they just construct Route' values:

> match' :: String -> Route' ()
> match' p = Match' p (Term' ())
> capture' :: (String -> Maybe b) -> Route' b
> capture' p = Capture' p (\b -> Term' b)
> choice' :: [Route' a] -> Route' a
> choice' = Choice'

Now all the real work happens in runRoute':

> runRoute' :: Route' a -> [String] -> Maybe a
> runRoute' (Term' a) _ = Just a
> runRoute' (Match' p' r) (p:ps) | p == p' = runRoute' r ps
> runRoute' (Match' _ _)  _                = Nothing
> runRoute' (Choice' []) _ = Nothing
> runRoute' (Choice' (r:rs)) paths =
>     case runRoute' r paths of
>       (Just a) -> Just a
>       Nothing  -> runRoute' (Choice' rs) paths
> runRoute' (Capture' parse r) (p:ps) =
>     case parse p of
>       Nothing  -> Nothing
>       (Just b) -> runRoute' (r b) ps

We can test a Route' with runRoute' and see that it acts just like RouteMT. We can reimplement route1MT using the new functions. The only difference is that the names have been changed from fooMT to foo'. If we did not change the names then the new implementation would be a drop-in replacement for the old code:

> route1' :: Route' String
> route1' =
>     choice' [ do match' "foo"
>                  i <- capture' readMaybe
>                  return $ "You are looking at /foo/" ++ show (i :: Int)
>             , do match' "bar"
>                  i <- capture' readMaybe
>                  return $ "You are looking at /bar/" ++ show (i :: Double)
>             , do match' "foo"
>                  match' "cat"
>                  return $ "You are looking at /foo/cat"
>             ]
> testRoute' :: (Eq a) => Route' a -> [([String], Maybe a)] -> Bool
> testRoute' r tests =
>     all (\(paths, result) -> (runRoute' r paths) == result) tests
> route1'_test =
>     testRoute' route1' route1_results

If you look at runRoute' closely you will notice that we don't ever pass back the unconsumed path segments. So you might wonder how something like this could work:

> route2' :: Route' ()
> route2' =
>     do choice' [ match' "foo"
>                , match' "bar"
>                ]
>        match' "baz"

Specifically, after choice' has successfully matched on "foo" or "bar", how does the match' function get access to the remaining path segments?

If we expand the functions and monad operations, though, the answer becomes clearer. First let's substitute in the match', capture', and choice' operations:

> route2'Expanded'1 :: Route' ()
> route2'Expanded'1 =
>     do Choice' [ Match' "foo" (Term' ())
>                     , Match' "bar" (Term' ())
>                     ]
>        Match' "baz" (Term' ())

Next let's desugar the monad syntax:

> route2'Expanded'2 :: Route' ()
> route2'Expanded'2 =
>     (Choice' [ Match' "foo" (Term' ())
>             , Match' "bar" (Term' ())
>             ]) >>=
>     \_ -> Match' "baz" (Term' ())

And now we can substitute the >>= using this rule from the Monad instance:

(Choice' rs)   >>= f = Choice' (map (\r -> r >>= f) rs)
> route2'Expanded'3 :: Route' ()
> route2'Expanded'3 =
>     (Choice' (map (\r -> r >>= \_ -> Match' "baz" (Term' ()))
>               [ Match' "foo" (Term' ())
>               , Match' "bar" (Term' ())
>               ]))

substituting the map gives us:

> route2'Expanded'4 :: Route' ()
> route2'Expanded'4 =
>     (Choice' [ Match' "foo" (Term' ()) >>= \_ -> Match' "baz" (Term' ())
>              , Match' "bar" (Term' ()) >>= \_ -> Match' "baz" (Term' ())
>              ])

now we can apply this rule to expand the remaining >>=:

(Match' str r) >>= f = Match' str (r >>= f)
> route2'Expanded'5 :: Route' ()
> route2'Expanded'5 =
>     (Choice' [ Match' "foo" (Term' () >>= \_ -> Match' "baz" (Term' ()))
>              , Match' "bar" (Term' () >>= \_ -> Match' "baz" (Term' ()))
>              ])

next we can apply this rule:

(Term' a)      >>= f = f a
> route2'Expanded'6 :: Route' ()
> route2'Expanded'6 =
>     (Choice' [ Match' "foo" ((\_ -> Match' "baz" (Term' ())) ())
>              , Match' "bar" ((\_ -> Match' "baz" (Term' ())) ())
>              ])

and finally, we can apply the \_ -> to () which gives us:

> route2'Expanded'7 :: Route' ()
> route2'Expanded'7 =
>     (Choice' [ Match' "foo" (Match' "baz" (Term' ()))
>              , Match' "bar" (Match' "baz" (Term' ()))
>              ])

So, we can see here that the monad syntax is just used to build a tree. Each valid parse is represented as a straight path from the root to a leaf. So, we need to pass the remaining segments in as we travel down the tree. But we don't have to worry about coming back up again, so the recursive runRoute' calls don't have to return the unconsumed path segments to the callers.

Alternative Interpreters

The monad transformer version was a lot shorter to write, and easy to understand. For people studying the implementation, it leverages existing knowledge about the State and Maybe monads.

The new interpreter version has a far more complex type and the runRoute' function is harder to understand. For the end user, none of this really matters, because the API is exactly the same -- we only changed the names so we could shove this entire example in a single literate Haskell file.

So, what have we gained?

One thing we can do is write alternative interpreters which address the two complaints we had about the monad transformer based direct implementation.

For example, we can analyze the Route' type and optimize the routes. As an example, we can rewrite this:

choice' [ do match' "foo"
             match' "bar"
             return "/foo/bar"
        , do match' "foo"
             match' "baz"
             return "/foo/baz"
        ]

to this:

choice' [ do match' "foo"
             choice' [ do match' "bar"
                          return "/foo/bar"
                     , do match' "baz"
                          return "/foo/baz"
                     ]
        ]

Because routes with the same prefix are now nested, we do not need to do any backtracking. if "foo" matches, but "bar" and "baz" fail, we do not need to backtrack and see if there are any other routes that start with "foo". We could implement this as an alternative function to runRoute' leaving runRoute' still intact.

We could also implement a debugRoute function that shows us what path we tried to match at each step and whether it succeeded or not. We are not going to implement these functions quite yet though.

Version 3: Free Monad

As we saw, the Route' type is essentially building a specialized tree with the values at the leaves. As Haskell users, we like to abstract and reuse things. What if we could get rid of the explicit recursion in the Route' type and get a valid Monad instance with out having to do any real work? That should simplify our code, and reduce the chances of introducing a bug. This is where the Free monad comes into play. The Free type is defined as:

data Free f a = Pure a | Free (f (Free f a))

If we look at that type we can see how we might be able to use the Pure constructor for the values in the leaves, and the Free constructor to provide the recusion. So, now we can define a non-recursive type that just operates on a single path segment.

Using the GADT syntax makes things a bit prettier, because the constructor types look like the related function types:

> data Segment a where
>     Match   :: String -> a         -> Segment a
>     Capture :: (String -> Maybe a) -> Segment a
>     Choice  :: [a]                 -> Segment a
>     Zero    ::                        Segment a
>     deriving (Functor, Show)

we can be extra lazy and derive the Functor instance automatically (and correctly!).

Compared to the Route' type, we see that the constructors are a little simpler now. For example, Match takes the String to match on and the value to return on success. But, we do not have to explicitly spell out the recursion. And, because Capture does not have a forall b. anymore, we can use the DeriveFunctor extension to derive the Functor instance automatically.

While we do not explicitly have recursion in the Segment type -- we do leave holes where recursion can happen. For example we can write:

Match "foo" (Match "bar" Zero)

Free already has a Monad instance, so to make a Monad out of Segment we can just use a type alias:

> type Route = Free Segment

So, this is pretty nice! We got valid Functor and Monad instances for free! You might think that is why it is called the Free monad -- and it sort of is. The term free actually comes from abstract algebra and category theory -- and they have some other idea about what the free part is.

All sorts of things like monoids, functors, monads, etc can be free. Something is free if it satisfies exactly the required laws but nothing extra. In our example, we created the Route monad by just making the type alias type Route = Free Segment. By design the Haskell Free monad doesn't do anything except satisfy the monad laws. And adding the type alias type Route = Free Segment is clearly not going to suddenly make it do more things. So, presumably Route is free as well. Yes, it really is that simple.

There are a bunch of other blog posts and wiki pages about the underlying theory, so we are just going to move on. We are aiming for gaining an hands-on understanding in this post, not a theoretical one.

We define the routing combinators similar to how we did for Route':

> -- | match on a static path segment
> match :: String
>       -> Route ()
> match p = liftF (Match p ())

liftF has the type:

liftF :: Functor f => f a -> Free f a

we could have written match as:

match p = Free (Match p (Pure ()))

but liftF gets rid of some of the noise for us. The other combinators are pretty much the same:

> -- | match on a path segment and attempt to convert it to a type
> capture :: (String -> Maybe a)
>     -> Route a
> capture convert =
>     liftF (Capture convert)
> -- | try several routes, using the first that succeeds
> choice :: [Route a]
>        -> Route a
> choice a = join $ liftF (Choice a)
> -- | a route that will always fail
> zero :: Route a
> zero = liftF Zero

To perform the routing, we create a runRoute function like before:

> -- | run a route, full backtracking on failure
> runRoute :: Route a -> [String] -> Maybe a
> runRoute (Pure a) _  = Just a
> runRoute _        [] = Nothing
> runRoute (Free (Match p' r)) (p:ps)
>     | p == p'   = runRoute r ps
>     | otherwise = Nothing
> runRoute (Free (Capture convert)) (p:ps) =
>     case convert p of
>       Nothing  -> Nothing
>       (Just r) -> runRoute r ps
> runRoute (Free (Choice choices)) paths =
>     msum $ map (flip runRoute paths) choices
> runRoute (Free Zero) _ =
>     Nothing

You'll note that this runRoute function looks quite a bit like the previous runRoute' function. It does contain a bit of extra noise because we of the Free constructors.

As before, the API remains unchanged (aside from renames to avoid name clashes):

> route1Free :: Route String
> route1Free =
>     choice [ do match "foo"
>                 i <- capture readMaybe
>                 return $ "You are looking at /foo/" ++ show (i :: Int)
>            , do match "bar"
>                 i <- capture readMaybe
>                 return $ "You are looking at /bar/" ++ show (i :: Double)
>            , do match "foo"
>                 match "cat"
>                 return $ "You are looking at /foo/cat"
>            ]

And our test results remain unchanged:

> testRoute :: (Eq a) => Route a -> [([String], Maybe a)] -> Bool
> testRoute r tests =
>     all (\(paths, result) -> (runRoute r paths) == result) tests
> route1Free_tests =
>     testRoute route1Free route1_results

As with Route', we are just building a tree. For example if we have `route2' again:

> route2 :: Route ()
> route2 =
>     do choice [ match "foo"
>               , match "bar"
>               ]
>        match "baz"

and we show it using the show function we get:

Free (Choice [ Free (Match "foo" (Free (Match "baz" (Pure ()))))
             , Free (Match "bar" (Free (Match "baz" (Pure ()))))
             ]
     )

That looks nearly identical to route2'Expanded'7 except for with Pure instead of Term and with a bunch of Free constructors inserted. The same basic tree structure still remains.

Interpreter with Debugging

Now, let's look at actually implementing the alternative interpreters we mentioned earlier.

We can implement debugRoute like this:

> -- | run a route, also returning debug log
> debugRoute :: Route a -> [String] -> (Doc, Maybe a)
> debugRoute (Pure a) _  = (text "Pure", Just a)
> debugRoute _        [] = (text "-- ran out of path segments before finding 'Pure'", Nothing)
> debugRoute (Free (Match p' next)) (p:ps)
>     | p == p'   =
>         let (doc, ma) = debugRoute next ps
>         in (text "dir" <+> text p' <+> text "-- matched" <+> text p $+$ doc, ma)
>     | otherwise =
>        (text "dir" <+> text p' <+> text "-- did not match" <+> text p $+$ text "-- aborted", Nothing)
> debugRoute (Free (Capture convert)) (p:ps) =
>    case convert p of
>      Nothing  -> (text "path <func>" <+> text "-- was not able to convert" <+> text p $+$ text "-- aborted", Nothing)
>      (Just r) ->
>          let (doc, ma) = debugRoute r ps
>          in (text "path <func>" <+> text "-- matched" <+> text p $+$ doc, ma)
> debugRoute (Free (Choice choices)) paths =
>     let debugs (doc, Nothing) (docs, Nothing)  = (doc:docs, Nothing)
>         debugs (doc, Just a)  (docs, Nothing)  = (doc:docs, Just a)
>         debugs _              r@(docs, Just a) = r
>         (docs, ma) = foldr debugs ([], Nothing) $ reverse $ map (flip debugRoute paths) choices
>    in (text "choice" <+> showPrettyList (map (\d -> text "do" <+> d) $ reverse docs), ma)
> debugRoute (Free Zero) _      =
>     (text "zero", Nothing)
> showPrettyList  :: [Doc] -> Doc
> showPrettyList []     = text "[]"
> showPrettyList [x]    = char '[' <+> x $+$ char ']'
> showPrettyList (h:tl) = char '[' <+> h $+$ (vcat (map showTail tl)) $+$ char ']'
>     where
>      showTail x = char ',' <+> x

debugRoute is pretty straight-forward. Feel free to skip over the implementation. The interesting part is that with out modifying route2 we can next get a debug log:

*GHCi> let (d, r) = debugRoute route2 ["foo","bar"] in (print d >> print r)
choice [ do dir foo -- matched foo
            dir baz -- did not match bar
            -- aborted
       , do dir bar -- did not match foo
            -- aborted
       ]
Nothing

Route optimizing

Because the routing is now represented by a data-type, we can also write a simple optimizing function for it:

> optimize :: Route a -> Route a
> optimize (Free (Match p n))   = Free (Match p (optimize n))
> optimize (Free (Choice cs)) = optimize' cs
> optimize r                  = r
>
> optimize' :: [Route a] -> Route a
> optimize' cs =
>     case map flatten $ groupBy sameDir cs of
>       []  -> zero
>       [x] -> x
>       xs  -> choice xs
>     where
>       flatten :: [Route a] -> Route a
>       flatten []  = zero
>       flatten [x] = x
>       flatten xs@(Free (Match p _) : _) = Free (Match p (optimize' [ next | (Free (Match _ next)) <- xs]))
>       flatten _ = error "flatten assertion failed."
>       sameDir (Free (Match p _)) (Free (Match p' _)) = p == p'
>       sameDir _         _          = False

And a helper function that shows the Route type as if it was Haskell code:

> prettyRoute :: (Show a) => Route a -> Doc
> prettyRoute (Pure a)              = text "return" <+> text (show a)
> prettyRoute (Free (Match p next)) = text "match" <+> doubleQuotes (text p) $+$ prettyRoute next
> prettyRoute (Free (Capture f))    = text "capture <func>" <> text (show (fmap prettyRoute (f "")))
> prettyRoute (Free (Choice cs))    = text "choice" <+> (showPrettyList $ map (\r -> text "do" <+> (nest 4 $ prettyRoute r)) cs)
> prettyRoute (Free Zero)           = text "zero"

Consider this route table:

> route4 :: Route String
> route4 =
>     choice [ do match "foo"
>                 match "bar"
>                 match "one"
>                 return "foo/bar/one"
>            , do match "foo"
>                 match "bar"
>                 match "two"
>                 return "foo/bar/two"
>            , do match "foo"
>                 match "baz"
>                 match "three"
>                 return "foo/baz/three"
>            ]

It has a bunch of overlapping in the patterns -- they all start with match "foo" which means that the router is going to have to do a linear search of all the patterns to make sure that none match. Additionally it will have to keep rematching on "foo" even though it already has.

If we call prettyRoute on route4 we get the original route map:

prettyRoute route4

choice [ do match "foo"
            match "bar"
            match "one"
            return "foo/bar/one"
       , do match "foo"
            match "bar"
            match "two"
            return "foo/bar/two"
       , do match "foo"
            match "baz"
            match "three"
            return "foo/baz/three"
       ]

And if we optimize the route:

prettyRoute (optimize route4)

Then we see that the overlapping prefixes have been combined:

match "foo"
choice [ do match "bar"
            choice [ do match "one"
                        return "foo/bar/one"
                   , do match "two"
                        return "foo/bar/two"
                   ]
       , do match "baz"
            match "three"
            return "foo/baz/three"
       ]

This version should run a bit faster than the original version because it will only need to match on "foo" once. And the embedded lists are shorter than the original. So when it does need to try all the alternatives, there are fewer to try.

One remaining problem is that for a url like "/foo/bar/apple", runRoute is going to backtrack and try the "baz" branch. But, that is pointless, because the optimizer ensures that backtracking is never going to be needed. (Actually that is not true, but let's pretend for a second that it is).

So, we can instead use this non-backtracking variant to run the route:

> runOptRoute :: Route a -> [String] -> (Bool, Maybe a)
> runOptRoute (Pure a) _  = (False, Just a)
> runOptRoute _        [] = (False, Nothing)
> runOptRoute (Free (Match p' next)) (p:ps)
>     | p == p'   = (True, snd $ runOptRoute next ps)
>     | otherwise = (False, Nothing)
> runOptRoute (Free (Capture convert)) (p:ps) =
>     case convert p of
>       (Just r) -> (True, snd $ runOptRoute r ps)
>       Nothing  -> (False, Nothing)
> runOptRoute (Free (Choice choices)) paths  =
>     tryChoices paths choices
> runOptRoute (Free Zero) _ =
>     (False, Nothing)
> tryChoices :: [String] -> [Route a] -> (Bool, Maybe a)
> tryChoices [] _  = (False, Nothing)
> tryChoices _  [] = (False, Nothing)
> tryChoices paths [r]    = runOptRoute r paths
> tryChoices paths (r:rs) =
>     case runOptRoute r paths of
>       (False, Nothing) -> tryChoices paths rs
>       x                -> x

Unfortunately, this won't actually work correctly. We said that we would never have to backtrack once a match succeeds because the optimizer has combined all the other branches that matched on the same path into a single branch of the tree. However, the optimizer has no way of knowing what the capture clauses are matching on because capture just takes an arbitrary function to do the matching.

Also, it is valid for there to be more than one possible match for a particular URL. The route that matches first is the correct route. However, the optimizer does not take that into consideration. So, it is possible that after optimization a different route will start matching.

These problems can be addressed, but are outside the scope of this blog post. We just wanted to see that the possibility exists. A correct, and more powerful, solution will likely appear in Happstack 8.

Version 4: Operational Monad

Instead of using the Free monad we could use the operational monad.

The operational monad was designed from the ground up to be used for defining programs which are run by interpreters -- like what we have been doing in the last couple sections.

In the original Route type we had explicit recursive types. In the Free monad section, we simplified that and had the Segment type which had polymorphic places where you could use recursion, but you were not forced to. But, that made the type a bit odd -- looking at the Segment type by itself, it is not really clear what the point of a type variable is supposed to be.

A much more natural way of encoding a program that matches on routes would be something like this (using GADTs):

> data SegmentCommand a where
>     MatchOp   :: String              -> SegmentCommand ()
>     CaptureOp :: (String -> Maybe a) -> SegmentCommand a
>     ChoiceOp  :: [RouteOp a]         -> SegmentCommand a
>     FailOp    ::                        SegmentCommand a
>

the monad provided by the operational package is actually called Program not Operational. As with the Free monad, we create our route monad via a simple type alias.

> type RouteOp = Program SegmentCommand

The SegmentCommand type now shows no signs of recursion at all. And the constructor types look just like the corresponding function types.

To turn a single command (like SegmentCommand) into a program (like RouteOp) we use the singleton function.

singleton :: instr a -> ProgramT instr m a

Using that we can define our familiar routing combinators very trivially:

> matchOp :: String -> RouteOp ()
> matchOp = singleton . MatchOp
>
> captureOp :: (String -> Maybe a) -> RouteOp a
> captureOp = singleton . CaptureOp
>
> choiceOp :: [RouteOp a] -> RouteOp a
> choiceOp = singleton . ChoiceOp
>
> failOp :: RouteOp a
> failOp = singleton FailOp

Next we can define an interpreter for our program. The Program monad does not expose its internals directly. Instead we use the view function:

view :: Program instr a -> ProgramView instr a

to produce a ProgramView:

type ProgramView instr = ProgramViewT instr Identity

data ProgramViewT instr m a where
    Return :: a -> ProgramViewT instr m a
    (:>>=) :: (instr b)
           -> (b -> ProgramT instr m a)
           -> ProgramViewT instr m a

We see that the ProgramViewT data-type looks almost exactly like the Monad type-class.

Knowing that, we can now define an interpreter for our RouteOp:

> interpretRouteOp :: RouteOp a -> [String] -> Maybe (a, [String])
> interpretRouteOp router' paths = go paths router'
>     where
>       go :: [String] -> RouteOp a -> Maybe (a, [String])
>       go paths router =
>           case view router of
>           Return a    -> Just (a, paths)
>
>           (MatchOp p :>>= k) ->
>               case paths of
>                 (p':ps) | p == p ->
>                             go ps (k ())
>                 _ -> Nothing
>
>           (CaptureOp pat :>>= k) ->
>               case paths of
>                (p':ps) ->
>                    case pat p' of
>                      Nothing -> Nothing
>                      (Just a) -> go ps (k a)
>                _ -> Nothing
>           (ChoiceOp [] :>>= k) ->
>               Nothing
>
>           (ChoiceOp choices :>>= k) ->
>               let tryChoiceOps cs =
>                       case cs of
>                         [] -> Nothing
>                         (c:cs') ->
>                             case go paths c of
>                               (Just (a,paths')) -> go paths' (k a)
>                               Nothing -> tryChoiceOps cs'
>               in tryChoiceOps choices
>
>           (FailOp :>>= _) ->
>               Nothing
> route1Op :: RouteOp String
> route1Op =
>     choiceOp [ do matchOp "foo"
>                   i <- captureOp readMaybe
>                   return $ "You are looking at /foo/" ++ show (i :: Int)
>              , do matchOp "bar"
>                   i <- captureOp readMaybe
>                   return $ "You are looking at /bar/" ++ show (i :: Double)
>              , do matchOp "foo"
>                   matchOp "cat"
>                   return $ "You are looking at /foo/cat"
>              ]
>
>
> testRouteOp :: (Eq a) => RouteOp a -> [([String], Maybe a)] -> Bool
> testRouteOp r tests =
>     all (\(paths, result) -> (fmap fst $ interpretRouteOp r paths) == result) tests
>
> route1Op_tests =
>     testRouteOp route1Op route1_results

Clearly, we could make an interpreter that included debug information as well. What is a little less obvious is how to create an optimizer. With the Free monad approach, the optimize function had the type:

optimize :: Route a -> Route a

where Route had the type:

type Route = Free Segment

We were able to directly inspect the Free monad structure and transform it. However, the operational monad does not directly expose its internals to us. Instead we have to use the view function to turn the Program into a ProgramView.

We also need to convert a ProgramView back into a Program. The operational library does not provide an unview/unviewT function, but we can define it ourselves as:

> unviewT :: (Monad m) => ProgramViewT instr m a -> ProgramT instr m a
> unviewT (Return a)     = return a
> unviewT (instr :>>= k) = singleton instr >>= k

Once that is done, we can then create a route optimizer:

>
> optimizeOp :: RouteOp a
>            -> RouteOp a
> optimizeOp route =
>      case view route of
>        (ChoiceOp cs :>>= k) ->
>            do a <- optimizeOp' (map view cs)
>               k a
>        _ -> route
>
> optimizeOp' :: [ProgramView SegmentCommand a]
>             -> RouteOp a
> optimizeOp' cs =
>     case map flatten $ groupBy sameDir cs of
>       []  -> failOp
>       [x] -> x
>       xs  -> choiceOp xs
>     where
>       flatten :: [ProgramView SegmentCommand a] -> RouteOp a
>       flatten []  = failOp
>       flatten [x] = unviewT x
>       flatten xs@((MatchOp p :>>= _) : _) =
>           do matchOp p
>              optimizeOp' [ view (next ()) | (MatchOp _ :>>= next) <- xs ]
>
>       sameDir (MatchOp p1 :>>= _) (MatchOp p2 :>>= _) = p1 == p2
>       sameDir _                   _                   = False

As with the Free monad solution, we could define a function to run the optimized route. However, it would have the same issues outlined before. So we will skip it.

Conclusions

In the original RouteMT based around monad transformers, the set of primitives can easily be extended by adding new functions. For example, we could add a primitive that reverses the order of all the remaining path segments:

> reverseMT :: RouteMT ()
> reverseMT = modify reverse

In Route' and Route, we would need to extend the data-type and also the interpreter functions. But we get the ability to inspect, rewrite, or interpret the actions differently.

As we saw, we don't need the Free or operational monads to reap these benefits. So, a big question is, what do those monads actually buy us?

In this example, we saw that it allowed us to derive the Functor instance and we got the Monad instance automatically too. So, we can be sure that our Functor and Monad instances actually follow the laws.

But, we also paid a price. In the Free implementation, our interpreter functions are cluttered up by a bunch of Free and Pure constructors.

The operational monad provided benefits similar to the Free monad, but with a nicer interface. The operational monad is reportedly nearly isomorphic to the Free monad. So, it should be able to do almost everything the Free monad can.

I've heard rumors that the Free monad can have quadratic runtime, while the operational monad does stuff to avoid that? Though there is also some way to use the codensity monad to fix the Free monad?

So, experts, which implementation should I use for this example? And what are the benefits of that solution over the other alternatives? My inclination is to use the operational. The interpretive solution clearly provides the most flexibility. Of the three intepretive solutions, the operational monad seems like the easiest solution to implement and to understand. I am not sure what (if anything) I am missing out on by using operational instead of Free...

Discussion can be found here