There has been a lot of discussion recently about free (and operational) monads. And perhaps you know what a free monad is. But you may still be confused as to when or why.
You might wonder if the free and operational monads actually solve any useful problems in the real world, or if they are just theoretical wanking.
In this post I will show how we can rewrite the routing combinators in
Happstack to use the Free
and operational
monads and in the
process solve a couple real world problems.
Happstack
is a Haskell web programming toolkit. There are two url
routing systems you can use in Happstack
. web-routes
provides a
flexible system for type-safe URL routing. But we also have an older
system that works around simple string based combinators. In this
post, we are going to look into improvements we can make to the
simple, combinator based approach.
To provide basic url routing you need three things:
match
- a way to match on a static path segment.capture
- a way to capture a path segment and try to decode it.choice
- a way to pick from multiple alternatives.match
and capture
work on entire path segments. If we have a url like:
/foo/bar/baz
we first split it on the / and then decode the path segments to get a list like:
["foo","bar","baz"]
That is what gets feed into the routing system.
match
is really just a special case of capture
. But we will keep it as a separate case for two reasons:
match
is easier to understand than capture
match
is a separate caseThe traditional way to implement a router like this is by using some common monads and monad transformers. So let's start with that.
First we need some imports:
> {-# LANGUAGE DeriveFunctor, GADTs, GeneralizedNewtypeDeriving, ExistentialQuantification #-}
> import Control.Monad (MonadPlus(mzero), msum, join) > import Control.Monad.State (StateT, MonadState(get, put), evalStateT, modify) > import Control.Monad.Free (Free(Pure, Free), liftF) > import Control.Monad.Operational (Program(..), ProgramT(..), ProgramView, ProgramViewT(Return, (:>>=)), singleton, view) > import Data.List (groupBy) > import Data.Maybe (isNothing) > import Text.PrettyPrint.HughesPJ (Doc, (<+>), ($+$), (<>), char, doubleQuotes, nest, space, text, vcat, empty) > import Text.Show.Functions () -- instance Show (a -> b)
You will need to install the free
and operational
libraries from hackage (used in later sections).
Next we define a newtype
for our routing monad:
> newtype RouteMT a = RouteMT { unRoute :: StateT [String] Maybe a } > deriving (Functor, Monad, MonadPlus, MonadState [String]) > > runRouteMT :: RouteMT a -> [String] -> Maybe a > runRouteMT route paths = evalStateT (unRoute route) paths
MT
is short for MonadTransformer
here. Our RouteMT
monad is
created by combining two familiar monads: State
and Maybe
. StateT
[String]
contains the path segments in the url. Everytime we
successfully consume a path segment, we pop it off the list. We use
String
instead of Text
just to keep this blog post simple. A real
implementation would probably use Text
.
Maybe
is used to indicate failure. We can use its MonadPlus
instance to provide the choice
operation.
> choiceMT :: [RouteMT a] -> RouteMT a > choiceMT = msum
So, we need only implement match
and capture
. We can define match
as:
> matchMT :: String -- ^ path segment to match on > -> RouteMT () > matchMT p' = > do paths <- get > case paths of > (p:ps) | p == p' -> put ps > _ -> mzero
If the path matches, then we pop it off the stack, otherwise we call mzero
.
We can implement capture as:
> captureMT :: (String -> Maybe a) -- ^ function to decode path segment > -> RouteMT a > captureMT parse = > do paths <- get > case paths of > (p : ps) -> > case parse p of > Nothing -> mzero > (Just a) -> return a > _ -> > mzero
capture
is very much like match
except we use the supplied parsing function instead of plain old ==
.
We will also want a helper function so that we can use read
with captureMT
:
> readMaybe :: (Read a) => String -> Maybe a > readMaybe s = > case reads s of > [(n,[])] -> Just n > _ -> Nothing
Now we can implement a simple route:
> route1MT :: RouteMT String > route1MT = > choiceMT [ do matchMT "foo" > i <- captureMT readMaybe > return $ "You are looking at /foo/" ++ show (i :: Int) > , do matchMT "bar" > i <- captureMT readMaybe > return $ "You are looking at /bar/" ++ show (i :: Double) > , do matchMT "foo" > matchMT "cat" > return $ "You are looking at /foo/cat" > ]
Being lazy programmers, we will define some unit tests rather than a formal proof of correctness. Here is our simple test function:
> testRouteMT :: (Eq a) => RouteMT a -> [([String], Maybe a)] -> Bool > testRouteMT r tests = > all (\(paths, result) -> (runRouteMT r paths) == result) tests
And to make things pretty, we will define ==>
as an alias for (,)
:
> (==>) :: a -> b -> (a,b) > a ==> b = (a, b)
Now we can write down our unit tests for route:
> route1_results = > [ ["foo", "1"] ==> Just "You are looking at /foo/1" > , ["foo", "cat"] ==> Just "You are looking at /foo/cat" > , ["bar", "3.141"] ==> Just "You are looking at /bar/3.141" > , ["baz"] ==> Nothing > ]
and combining it all together:
> route1MT_test = > testRouteMT route1MT route1_results
While the monad transformer code works fine, there are ways it could be better:
In order to find the matching route, it has to start at the top of the list and work all the way to the bottom until it finds a match or gets to the end of the list. For example, we have one route that starts with "/foo"
at the top and another that starts with "/foo"
at the bottom. Additionally, if we match on "bar"
but fail to decode the next path as an Int
, there is no point in trying any additional routes, because no other routes start with "/bar"
. But there is no way to impart that information into the router.
when routes fail, there is no record of why it failed. We just get back Nothing
. On a live site, that is fine, but during development, you sometimes do care.
We can solve both these issues by using a data-type to build the router instead:
> data Route' a > = Match' String (Route' a) > | forall b. Capture' (String -> Maybe b) (b -> Route' a) > | Choice' [Route' a] > | Term' a
The last argument of the Match'
and Capture'
constructors is what
to do next if that match or capture succeeds.
We can create Functor
and Monad
instances for the Route'
type:
> instance Functor Route' where > fmap f (Match' s r) = Match' s (fmap f r) > fmap f (Capture' p r) = Capture' p (\b -> fmap f (r b)) > fmap f (Choice' rs) = Choice' (map (fmap f) rs) > fmap f (Term' a) = Term' (f a) > > instance Monad Route' where > return a = Term' a > (Term' a) >>= f = f a > (Match' str r) >>= f = Match' str (r >>= f) > (Choice' rs) >>= f = Choice' (map (\r -> r >>= f) rs) > (Capture' p r) >>= f = Capture' p (\b -> r b >>= f)
These instances can be a little tricky to understand at first. You might want to finish this section and then come back to them after you have seen the bigger picture and some examples.
Our routing functions no longer do any real work directly. Instead
they just construct Route'
values:
> match' :: String -> Route' () > match' p = Match' p (Term' ())
> capture' :: (String -> Maybe b) -> Route' b > capture' p = Capture' p (\b -> Term' b)
> choice' :: [Route' a] -> Route' a > choice' = Choice'
Now all the real work happens in runRoute'
:
> runRoute' :: Route' a -> [String] -> Maybe a > runRoute' (Term' a) _ = Just a > runRoute' (Match' p' r) (p:ps) | p == p' = runRoute' r ps > runRoute' (Match' _ _) _ = Nothing > runRoute' (Choice' []) _ = Nothing > runRoute' (Choice' (r:rs)) paths = > case runRoute' r paths of > (Just a) -> Just a > Nothing -> runRoute' (Choice' rs) paths > runRoute' (Capture' parse r) (p:ps) = > case parse p of > Nothing -> Nothing > (Just b) -> runRoute' (r b) ps
We can test a Route'
with runRoute'
and see that it acts just like
RouteMT
. We can reimplement route1MT
using the new functions. The only difference is that the names have been changed from fooMT
to foo'
. If we did not change the names then the new implementation would be a drop-in replacement for the old code:
> route1' :: Route' String > route1' = > choice' [ do match' "foo" > i <- capture' readMaybe > return $ "You are looking at /foo/" ++ show (i :: Int) > , do match' "bar" > i <- capture' readMaybe > return $ "You are looking at /bar/" ++ show (i :: Double) > , do match' "foo" > match' "cat" > return $ "You are looking at /foo/cat" > ]
> testRoute' :: (Eq a) => Route' a -> [([String], Maybe a)] -> Bool > testRoute' r tests = > all (\(paths, result) -> (runRoute' r paths) == result) tests
> route1'_test = > testRoute' route1' route1_results
If you look at runRoute'
closely you will notice that we don't ever
pass back the unconsumed path segments. So you might wonder how
something like this could work:
> route2' :: Route' () > route2' = > do choice' [ match' "foo" > , match' "bar" > ] > match' "baz"
Specifically, after choice'
has successfully matched on "foo"
or "bar"
, how does the match'
function get access to the remaining path segments?
If we expand the functions and monad operations, though, the answer becomes clearer. First let's substitute in the match'
, capture'
, and choice'
operations:
> route2'Expanded'1 :: Route' () > route2'Expanded'1 = > do Choice' [ Match' "foo" (Term' ()) > , Match' "bar" (Term' ()) > ] > Match' "baz" (Term' ())
Next let's desugar the monad syntax:
> route2'Expanded'2 :: Route' () > route2'Expanded'2 = > (Choice' [ Match' "foo" (Term' ()) > , Match' "bar" (Term' ()) > ]) >>= > \_ -> Match' "baz" (Term' ())
And now we can substitute the >>=
using this rule from the Monad
instance:
(Choice' rs) >>= f = Choice' (map (\r -> r >>= f) rs)
> route2'Expanded'3 :: Route' () > route2'Expanded'3 = > (Choice' (map (\r -> r >>= \_ -> Match' "baz" (Term' ())) > [ Match' "foo" (Term' ()) > , Match' "bar" (Term' ()) > ]))
substituting the map
gives us:
> route2'Expanded'4 :: Route' () > route2'Expanded'4 = > (Choice' [ Match' "foo" (Term' ()) >>= \_ -> Match' "baz" (Term' ()) > , Match' "bar" (Term' ()) >>= \_ -> Match' "baz" (Term' ()) > ])
now we can apply this rule to expand the remaining >>=
:
(Match' str r) >>= f = Match' str (r >>= f)
> route2'Expanded'5 :: Route' () > route2'Expanded'5 = > (Choice' [ Match' "foo" (Term' () >>= \_ -> Match' "baz" (Term' ())) > , Match' "bar" (Term' () >>= \_ -> Match' "baz" (Term' ())) > ])
next we can apply this rule:
(Term' a) >>= f = f a
> route2'Expanded'6 :: Route' () > route2'Expanded'6 = > (Choice' [ Match' "foo" ((\_ -> Match' "baz" (Term' ())) ()) > , Match' "bar" ((\_ -> Match' "baz" (Term' ())) ()) > ])
and finally, we can apply the \_ ->
to ()
which gives us:
> route2'Expanded'7 :: Route' () > route2'Expanded'7 = > (Choice' [ Match' "foo" (Match' "baz" (Term' ())) > , Match' "bar" (Match' "baz" (Term' ())) > ])
So, we can see here that the monad syntax is just used to build a
tree. Each valid parse is represented as a straight path from the
root to a leaf. So, we need to pass the remaining segments in as we
travel down the tree. But we don't have to worry about coming back up
again, so the recursive runRoute'
calls don't have to return the
unconsumed path segments to the callers.
The monad transformer version was a lot shorter to write, and easy to
understand. For people studying the implementation, it leverages
existing knowledge about the State
and Maybe
monads.
The new interpreter version has a far more complex type and the
runRoute'
function is harder to understand. For the end user, none
of this really matters, because the API is exactly the same -- we only
changed the names so we could shove this entire example in a single
literate Haskell file.
So, what have we gained?
One thing we can do is write alternative interpreters which address the two complaints we had about the monad transformer based direct implementation.
For example, we can analyze the Route'
type and optimize the routes. As an example, we can rewrite this:
choice' [ do match' "foo"
match' "bar"
return "/foo/bar"
, do match' "foo"
match' "baz"
return "/foo/baz"
]
to this:
choice' [ do match' "foo"
choice' [ do match' "bar"
return "/foo/bar"
, do match' "baz"
return "/foo/baz"
]
]
Because routes with the same prefix are now nested, we do not need to do any backtracking. if "foo"
matches, but "bar"
and "baz"
fail, we do not need to backtrack and see if there are any other routes that start with "foo"
. We could implement this as an alternative function to runRoute'
leaving runRoute'
still intact.
We could also implement a debugRoute
function that shows us what path we tried to match at each step and whether it succeeded or not. We are not going to implement these functions quite yet though.
As we saw, the Route'
type is essentially building a specialized
tree with the values at the leaves. As Haskell users, we like to
abstract and reuse things. What if we could get rid of the explicit
recursion in the Route'
type and get a valid Monad
instance with
out having to do any real work? That should simplify our code, and
reduce the chances of introducing a bug. This is where the Free
monad comes into play. The Free
type is defined as:
data Free f a = Pure a | Free (f (Free f a))
If we look at that type we can see how we might be able to use the Pure
constructor for the values in the leaves, and the Free
constructor to provide the recusion. So, now we can define a non-recursive type that just operates on a single path segment.
Using the GADT syntax makes things a bit prettier, because the constructor types look like the related function types:
> data Segment a where > Match :: String -> a -> Segment a > Capture :: (String -> Maybe a) -> Segment a > Choice :: [a] -> Segment a > Zero :: Segment a > deriving (Functor, Show)
we can be extra lazy and derive the Functor
instance automatically (and correctly!).
Compared to the Route'
type, we see that the constructors are a little simpler now. For example, Match
takes the String
to match on and the value to return on success. But, we do not have to explicitly spell out the recursion. And, because Capture
does not have a forall b.
anymore, we can use the DeriveFunctor
extension to derive the Functor
instance automatically.
While we do not explicitly have recursion in the Segment
type -- we do leave holes where recursion can happen. For example we can write:
Match "foo" (Match "bar" Zero)
Free
already has a Monad
instance, so to make a Monad
out of Segment
we can just use a type alias:
> type Route = Free Segment
So, this is pretty nice! We got valid Functor
and Monad
instances
for free! You might think that is why it is called the Free
monad --
and it sort of is. The term free actually comes from abstract
algebra and category theory -- and they have some other idea about
what the free part is.
All sorts of things like monoids, functors, monads, etc can be
free. Something is free if it satisfies exactly the required laws but
nothing extra. In our example, we created the Route
monad by just
making the type alias type Route = Free Segment
. By design the
Haskell Free
monad doesn't do anything except satisfy the monad
laws. And adding the type alias type Route = Free Segment
is clearly
not going to suddenly make it do more things. So, presumably Route
is free as well. Yes, it really is that simple.
There are a bunch of other blog posts and wiki pages about the underlying theory, so we are just going to move on. We are aiming for gaining an hands-on understanding in this post, not a theoretical one.
We define the routing combinators similar to how we did for Route'
:
> -- | match on a static path segment > match :: String > -> Route () > match p = liftF (Match p ())
liftF
has the type:
liftF :: Functor f => f a -> Free f a
we could have written match as:
match p = Free (Match p (Pure ()))
but liftF gets rid of some of the noise for us. The other combinators are pretty much the same:
> -- | match on a path segment and attempt to convert it to a type > capture :: (String -> Maybe a) > -> Route a > capture convert = > liftF (Capture convert)
> -- | try several routes, using the first that succeeds > choice :: [Route a] > -> Route a > choice a = join $ liftF (Choice a)
> -- | a route that will always fail > zero :: Route a > zero = liftF Zero
To perform the routing, we create a runRoute
function like before:
> -- | run a route, full backtracking on failure > runRoute :: Route a -> [String] -> Maybe a > runRoute (Pure a) _ = Just a > runRoute _ [] = Nothing > runRoute (Free (Match p' r)) (p:ps) > | p == p' = runRoute r ps > | otherwise = Nothing > runRoute (Free (Capture convert)) (p:ps) = > case convert p of > Nothing -> Nothing > (Just r) -> runRoute r ps > runRoute (Free (Choice choices)) paths = > msum $ map (flip runRoute paths) choices > runRoute (Free Zero) _ = > Nothing
You'll note that this runRoute
function looks quite a bit like the
previous runRoute'
function. It does contain a bit of extra noise
because we of the Free
constructors.
As before, the API remains unchanged (aside from renames to avoid name clashes):
> route1Free :: Route String > route1Free = > choice [ do match "foo" > i <- capture readMaybe > return $ "You are looking at /foo/" ++ show (i :: Int) > , do match "bar" > i <- capture readMaybe > return $ "You are looking at /bar/" ++ show (i :: Double) > , do match "foo" > match "cat" > return $ "You are looking at /foo/cat" > ]
And our test results remain unchanged:
> testRoute :: (Eq a) => Route a -> [([String], Maybe a)] -> Bool > testRoute r tests = > all (\(paths, result) -> (runRoute r paths) == result) tests
> route1Free_tests = > testRoute route1Free route1_results
As with Route'
, we are just building a tree. For example if we have `route2' again:
> route2 :: Route () > route2 = > do choice [ match "foo" > , match "bar" > ] > match "baz"
and we show it using the show
function we get:
Free (Choice [ Free (Match "foo" (Free (Match "baz" (Pure ()))))
, Free (Match "bar" (Free (Match "baz" (Pure ()))))
]
)
That looks nearly identical to route2'Expanded'7
except for with Pure
instead of Term
and with a bunch of Free
constructors inserted. The same basic tree structure still remains.
Now, let's look at actually implementing the alternative interpreters we mentioned earlier.
We can implement debugRoute
like this:
> -- | run a route, also returning debug log > debugRoute :: Route a -> [String] -> (Doc, Maybe a) > debugRoute (Pure a) _ = (text "Pure", Just a) > debugRoute _ [] = (text "-- ran out of path segments before finding 'Pure'", Nothing) > debugRoute (Free (Match p' next)) (p:ps) > | p == p' = > let (doc, ma) = debugRoute next ps > in (text "dir" <+> text p' <+> text "-- matched" <+> text p $+$ doc, ma) > | otherwise = > (text "dir" <+> text p' <+> text "-- did not match" <+> text p $+$ text "-- aborted", Nothing) > debugRoute (Free (Capture convert)) (p:ps) = > case convert p of > Nothing -> (text "path <func>" <+> text "-- was not able to convert" <+> text p $+$ text "-- aborted", Nothing) > (Just r) -> > let (doc, ma) = debugRoute r ps > in (text "path <func>" <+> text "-- matched" <+> text p $+$ doc, ma) > debugRoute (Free (Choice choices)) paths = > let debugs (doc, Nothing) (docs, Nothing) = (doc:docs, Nothing) > debugs (doc, Just a) (docs, Nothing) = (doc:docs, Just a) > debugs _ r@(docs, Just a) = r > (docs, ma) = foldr debugs ([], Nothing) $ reverse $ map (flip debugRoute paths) choices > in (text "choice" <+> showPrettyList (map (\d -> text "do" <+> d) $ reverse docs), ma) > debugRoute (Free Zero) _ = > (text "zero", Nothing)
> showPrettyList :: [Doc] -> Doc > showPrettyList [] = text "[]" > showPrettyList [x] = char '[' <+> x $+$ char ']' > showPrettyList (h:tl) = char '[' <+> h $+$ (vcat (map showTail tl)) $+$ char ']' > where > showTail x = char ',' <+> x
debugRoute
is pretty straight-forward. Feel free to skip over the implementation. The interesting part is that with out modifying route2
we can next get a debug log:
*GHCi> let (d, r) = debugRoute route2 ["foo","bar"] in (print d >> print r)
choice [ do dir foo -- matched foo
dir baz -- did not match bar
-- aborted
, do dir bar -- did not match foo
-- aborted
]
Nothing
Because the routing is now represented by a data-type, we can also write a simple optimizing function for it:
> optimize :: Route a -> Route a > optimize (Free (Match p n)) = Free (Match p (optimize n)) > optimize (Free (Choice cs)) = optimize' cs > optimize r = r > > optimize' :: [Route a] -> Route a > optimize' cs = > case map flatten $ groupBy sameDir cs of > [] -> zero > [x] -> x > xs -> choice xs > where > flatten :: [Route a] -> Route a > flatten [] = zero > flatten [x] = x > flatten xs@(Free (Match p _) : _) = Free (Match p (optimize' [ next | (Free (Match _ next)) <- xs])) > flatten _ = error "flatten assertion failed." > sameDir (Free (Match p _)) (Free (Match p' _)) = p == p' > sameDir _ _ = False
And a helper function that shows the Route
type as if it was Haskell code:
> prettyRoute :: (Show a) => Route a -> Doc > prettyRoute (Pure a) = text "return" <+> text (show a) > prettyRoute (Free (Match p next)) = text "match" <+> doubleQuotes (text p) $+$ prettyRoute next > prettyRoute (Free (Capture f)) = text "capture <func>" <> text (show (fmap prettyRoute (f ""))) > prettyRoute (Free (Choice cs)) = text "choice" <+> (showPrettyList $ map (\r -> text "do" <+> (nest 4 $ prettyRoute r)) cs) > prettyRoute (Free Zero) = text "zero"
Consider this route table:
> route4 :: Route String > route4 = > choice [ do match "foo" > match "bar" > match "one" > return "foo/bar/one" > , do match "foo" > match "bar" > match "two" > return "foo/bar/two" > , do match "foo" > match "baz" > match "three" > return "foo/baz/three" > ]
It has a bunch of overlapping in the patterns -- they all start with
match "foo"
which means that the router is going to have to do a
linear search of all the patterns to make sure that none
match. Additionally it will have to keep rematching on "foo" even
though it already has.
If we call prettyRoute
on route4
we get the original route map:
prettyRoute route4
choice [ do match "foo"
match "bar"
match "one"
return "foo/bar/one"
, do match "foo"
match "bar"
match "two"
return "foo/bar/two"
, do match "foo"
match "baz"
match "three"
return "foo/baz/three"
]
And if we optimize the route:
prettyRoute (optimize route4)
Then we see that the overlapping prefixes have been combined:
match "foo"
choice [ do match "bar"
choice [ do match "one"
return "foo/bar/one"
, do match "two"
return "foo/bar/two"
]
, do match "baz"
match "three"
return "foo/baz/three"
]
This version should run a bit faster than the original version because it will only need to match on "foo" once. And the embedded lists are shorter than the original. So when it does need to try all the alternatives, there are fewer to try.
One remaining problem is that for a url like "/foo/bar/apple"
, runRoute
is going to backtrack and try the "baz"
branch. But, that is pointless, because the optimizer ensures that backtracking is never going to be needed. (Actually that is not true, but let's pretend for a second that it is).
So, we can instead use this non-backtracking variant to run the route:
> runOptRoute :: Route a -> [String] -> (Bool, Maybe a) > runOptRoute (Pure a) _ = (False, Just a) > runOptRoute _ [] = (False, Nothing) > runOptRoute (Free (Match p' next)) (p:ps) > | p == p' = (True, snd $ runOptRoute next ps) > | otherwise = (False, Nothing) > runOptRoute (Free (Capture convert)) (p:ps) = > case convert p of > (Just r) -> (True, snd $ runOptRoute r ps) > Nothing -> (False, Nothing) > runOptRoute (Free (Choice choices)) paths = > tryChoices paths choices > runOptRoute (Free Zero) _ = > (False, Nothing)
> tryChoices :: [String] -> [Route a] -> (Bool, Maybe a) > tryChoices [] _ = (False, Nothing) > tryChoices _ [] = (False, Nothing) > tryChoices paths [r] = runOptRoute r paths > tryChoices paths (r:rs) = > case runOptRoute r paths of > (False, Nothing) -> tryChoices paths rs > x -> x
Unfortunately, this won't actually work correctly. We said that we would never have to backtrack once a match
succeeds because the optimizer has combined all the other branches that matched on the same path into a single branch of the tree. However, the optimizer has no way of knowing what the capture
clauses are matching on because capture
just takes an arbitrary function to do the matching.
Also, it is valid for there to be more than one possible match for a particular URL. The route that matches first is the correct route. However, the optimizer does not take that into consideration. So, it is possible that after optimization a different route will start matching.
These problems can be addressed, but are outside the scope of this blog post. We just wanted to see that the possibility exists. A correct, and more powerful, solution will likely appear in Happstack 8.
Instead of using the Free
monad we could use the operational
monad.
The operational
monad was designed from the ground up to be used for
defining programs which are run by interpreters -- like what we have
been doing in the last couple sections.
In the original Route
type we had explicit recursive types. In the
Free
monad section, we simplified that and had the Segment
type
which had polymorphic places where you could use recursion, but you
were not forced to. But, that made the type a bit odd -- looking at
the Segment
type by itself, it is not really clear what the point of
a
type variable is supposed to be.
A much more natural way of encoding a program that matches on routes would be something like this (using GADTs):
> data SegmentCommand a where > MatchOp :: String -> SegmentCommand () > CaptureOp :: (String -> Maybe a) -> SegmentCommand a > ChoiceOp :: [RouteOp a] -> SegmentCommand a > FailOp :: SegmentCommand a >
the monad provided by the operational
package is actually called
Program
not Operational
. As with the Free
monad, we create our
route monad via a simple type alias.
> type RouteOp = Program SegmentCommand
The SegmentCommand
type now shows no signs of recursion at all. And
the constructor types look just like the corresponding function types.
To turn a single command (like SegmentCommand
) into a program (like
RouteOp
) we use the singleton
function.
singleton :: instr a -> ProgramT instr m a
Using that we can define our familiar routing combinators very trivially:
> matchOp :: String -> RouteOp () > matchOp = singleton . MatchOp > > captureOp :: (String -> Maybe a) -> RouteOp a > captureOp = singleton . CaptureOp > > choiceOp :: [RouteOp a] -> RouteOp a > choiceOp = singleton . ChoiceOp > > failOp :: RouteOp a > failOp = singleton FailOp
Next we can define an interpreter for our program. The Program
monad does not expose its internals directly. Instead we use the view
function:
view :: Program instr a -> ProgramView instr a
to produce a ProgramView
:
type ProgramView instr = ProgramViewT instr Identity
data ProgramViewT instr m a where
Return :: a -> ProgramViewT instr m a
(:>>=) :: (instr b)
-> (b -> ProgramT instr m a)
-> ProgramViewT instr m a
We see that the ProgramViewT
data-type looks almost exactly like the Monad
type-class.
Knowing that, we can now define an interpreter for our RouteOp
:
> interpretRouteOp :: RouteOp a -> [String] -> Maybe (a, [String]) > interpretRouteOp router' paths = go paths router' > where > go :: [String] -> RouteOp a -> Maybe (a, [String]) > go paths router = > case view router of > Return a -> Just (a, paths) > > (MatchOp p :>>= k) -> > case paths of > (p':ps) | p == p -> > go ps (k ()) > _ -> Nothing > > (CaptureOp pat :>>= k) -> > case paths of > (p':ps) -> > case pat p' of > Nothing -> Nothing > (Just a) -> go ps (k a) > _ -> Nothing > (ChoiceOp [] :>>= k) -> > Nothing > > (ChoiceOp choices :>>= k) -> > let tryChoiceOps cs = > case cs of > [] -> Nothing > (c:cs') -> > case go paths c of > (Just (a,paths')) -> go paths' (k a) > Nothing -> tryChoiceOps cs' > in tryChoiceOps choices > > (FailOp :>>= _) -> > Nothing
> route1Op :: RouteOp String > route1Op = > choiceOp [ do matchOp "foo" > i <- captureOp readMaybe > return $ "You are looking at /foo/" ++ show (i :: Int) > , do matchOp "bar" > i <- captureOp readMaybe > return $ "You are looking at /bar/" ++ show (i :: Double) > , do matchOp "foo" > matchOp "cat" > return $ "You are looking at /foo/cat" > ] > > > testRouteOp :: (Eq a) => RouteOp a -> [([String], Maybe a)] -> Bool > testRouteOp r tests = > all (\(paths, result) -> (fmap fst $ interpretRouteOp r paths) == result) tests > > route1Op_tests = > testRouteOp route1Op route1_results
Clearly, we could make an interpreter that included debug information
as well. What is a little less obvious is how to create an optimizer. With the Free
monad approach, the optimize function had the type:
optimize :: Route a -> Route a
where Route
had the type:
type Route = Free Segment
We were able to directly inspect the Free
monad structure and
transform it. However, the operational
monad does not directly
expose its internals to us. Instead we have to use the view
function
to turn the Program
into a ProgramView
.
We also need to convert a ProgramView
back into a Program
. The operational
library does not provide an unview/unviewT
function, but we can define it ourselves as:
> unviewT :: (Monad m) => ProgramViewT instr m a -> ProgramT instr m a > unviewT (Return a) = return a > unviewT (instr :>>= k) = singleton instr >>= k
Once that is done, we can then create a route optimizer:
> > optimizeOp :: RouteOp a > -> RouteOp a > optimizeOp route = > case view route of > (ChoiceOp cs :>>= k) -> > do a <- optimizeOp' (map view cs) > k a > _ -> route > > optimizeOp' :: [ProgramView SegmentCommand a] > -> RouteOp a > optimizeOp' cs = > case map flatten $ groupBy sameDir cs of > [] -> failOp > [x] -> x > xs -> choiceOp xs > where > flatten :: [ProgramView SegmentCommand a] -> RouteOp a > flatten [] = failOp > flatten [x] = unviewT x > flatten xs@((MatchOp p :>>= _) : _) = > do matchOp p > optimizeOp' [ view (next ()) | (MatchOp _ :>>= next) <- xs ] > > sameDir (MatchOp p1 :>>= _) (MatchOp p2 :>>= _) = p1 == p2 > sameDir _ _ = False
As with the Free
monad solution, we could define a function to run the optimized route. However, it would have the same issues outlined before. So we will skip it.
In the original RouteMT
based around monad transformers, the set of
primitives can easily be extended by adding new functions. For
example, we could add a primitive that reverses the order of all the
remaining path segments:
> reverseMT :: RouteMT () > reverseMT = modify reverse
In Route'
and Route
, we would need to extend the data-type and
also the interpreter functions. But we get the ability to inspect,
rewrite, or interpret the actions differently.
As we saw, we don't need the Free
or operational
monads to reap these benefits. So, a big question is, what do those monads actually buy us?
In this example, we saw that it allowed us to derive the Functor
instance and we got the Monad
instance automatically too. So, we can be sure that our Functor
and Monad
instances actually follow the laws.
But, we also paid a price. In the Free
implementation, our interpreter functions are cluttered up by a bunch of Free
and Pure
constructors.
The operational
monad provided benefits similar to the Free
monad, but with a nicer interface. The operational
monad is reportedly nearly isomorphic to the Free
monad. So, it should be able to do almost everything the Free
monad can.
I've heard rumors that the Free
monad can have quadratic runtime, while the operational
monad does stuff to avoid that? Though there is also some way to use the codensity
monad to fix the Free
monad?
So, experts, which implementation should I use for this example? And what are the benefits of that solution over the other alternatives? My inclination is to use the operational
. The interpretive solution clearly provides the most flexibility. Of the three intepretive solutions, the operational
monad seems like the easiest solution to implement and to understand. I am not sure what (if anything) I am missing out on by using operational
instead of Free
...
Discussion can be found here
Haskell is slowly moving onto the browser -- and that is very exciting. We have Fay, GHCJS, UHCJS, Haskell-like languages such as Elm, and more!
In this post I want to demonstrate two ways in which this is awesome.
we can now define a data-type once and use it everywhere
we can now define a type-checked AJAX interface between the browser and the server.
In this post we will be using:
happstack-server
- a modern Haskell based web application serveracid-state
- a native Haskell database systemfay
- a compiler which compiles a subset of Haskell to JavascriptLet's first consider a more traditional system where we use
happstack-server
, a SQL database, and Javascript. If we have a value
we want to store in the database, manipulate on the server, and send
to the client, we need to manually create several representations of
this value and manually write code to marshal to and from the various
representations.
In the SQL database, the value needs to be represented via columns and relations in one or more tables. On the server-side we need to create an algebraic data type to represent the value. To transmit the value to the client we need to the convert the value into a JSON object. And then on the client-side we may then need to convert the JSON value into a Javascript object.
To switch between each of these representations we need to manually write code. For example, we need to write the SQL statements to update and retrieve the value from the database.
So in addition to the four representations of our data, we have 3 bidirectional conversions to manage as well:
SQL <=> ADT <=> JSON <=> JAVASCRIPT
Now let's say we need to make a change to our datatype -- we have to correctly update 10 different aspects of our code.
Because SQL and Javascript are outside of the domain of Haskell, we don't even get the help of the typechecker to make sure we have keep all the types in-sync.
A popular mantra in computer programming is DRY
- "Don't repeat
yourself". Yet, here we have to repeat ourselves 10 times!
In addition to keeping everything in sync, we still have the problem of having to think about the same data in 4 different ways:
The picture when using happstack-server
, acid-state
, and fay
is
radically different. In this system we define our data type as a nice
algebraic data type which can be stored in acid-state, manipulated on
the server, and sent to the client, where it is also treated as the
same ADT. This definition occurs in once in a normal Haskell file
that is shared by all three pieces of the system.
The data does still need to be serialized by acid-state
and for
communication to/from the client (via AJAX), however, this serialization
is done entirely automatically via template haskell and generics.
I have created a simple example of using happstack-server
,
acid-state
, and fay
to implement an interactive web 2.0 mastermind
clone. The board updates all occur client side and communication is
done over a typed AJAX communication channel.
You can find all the source code here:
http://hub.darcs.net/stepcut/mastermind
A demonstration of the game play is shown is this video:
http://www.youtube.com/watch?v=K2jdUlhX_E8
There are some bugs in the code, unimplemented features, etc. It seems to display correctly in Chrome, but not Firefox (and possibly others). If any of these things bother you, feel free to submit patches. These issues, do not get in the way of the interesting things we want to demonstrate, and so they will likely remain unfixed.
The tree is organized as follows:
MasterMind.Client.*
- client-side Fay codeMasterMind.Server.*
- server-side Haskell codeMasterMind.Shared.*
- code that is shared between the client and serverMasterMind.Shared.Core
contains the datatypes needed to define the state of the game
board. There is not much to say about these types -- they are
basically what you would expect to see for a game like mastermind.
In
MasterMind.Server.Acid
those types are stored persistently using acid-state
. All played
games are retained in the database, though there is currently no code
implemented to browse them.
In
MasterMind.Client.Main
those same types (such as Color
, Guess
, and Board
) are imported and used for the client-side
interactions.
By virtue of the fact that everything fits together so seamlessly -- there isn't much to say. It looks like we just defined some normal Haskell datatypes and used them in normal Haskell code -- just like any other Haskell program. The interesting part is really what is missing! We've managed to eliminate all that manual conversion, having to think about multiple representation of the same data, javascript, SQL, etc, and left ourselves with nice, simple Haskell code! When we want to change the type, we just change the type in one place. If we need to update code, the type-checker will complain and let us know!
Best of all, we do not need to rely on special syntax introduced via QuasiQuotation. We define the types using normal Haskell data declarations.
There is a bit of Template Haskell code in the acid-state
portions
of the code. To create SafeCopy
instances we use
deriveSafeCopy
. In principle this is not much different from the
standard deriving Data, Typeable
mechanism. However, for those that
eschew Template Haskell, there is work on allowing SafeCopy
to use
the new Generics
features in GHC 7.2.
There is also a Template Haskell function makeAcidic
which would be
a bit more difficult to remove.
Now that we have a way to share types between the client and server, it is relatively straight-forward to use those types to build a type-safe communication channel between the client and server.
At the end of MasterMind.Shared.Core
there is a type:
> data Command > = SendGuess Guess (ResponseType (Maybe Row)) > | FetchBoard (ResponseType (Maybe Board)) > deriving (Read, Show, Data, Typeable)
The Command
type defines the AJAX interface between the server and
the client. The constructors 'SendGuess' and 'FetchBoard' are commands
that the client wants to send, and the ResponseType a
is what the
server will return.
It would be far more sensible to declare Command
as a GADT
:
> data Command r where > SendGuess :: Guess -> Command (Maybe Row) > FetchBoard :: Command (Maybe Board)
Unfortuantely, Fay does not support GADTs
at this time, so we have
to use a series of hacks to get the type safety we are hoping
for. Language.Fay.AJAX
(from happstack-fay
) defines a type:
> data ResponseType a = ResponseType
This gives us a phantom type variable that we can use to encode the type of the response.
Looking at the Command
type again, you will see that the last argument to every constructor is a ResponseType
value:
> data Command > = SendGuess Guess (ResponseType (Maybe Row)) > | FetchBoard (ResponseType (Maybe Board)) > deriving (Read, Show, Data, Typeable)
On the client-side we can use call
to send an AJAX command:
> call :: (Foreign cmd, Foreign res) => > String -- ^ URL to @POST@ AJAX request to > -> (ResponseType res -> cmd) -- ^ AJAX command to send to server > -> (res -> Fay ()) -- ^ callback function to handle response > -> Fay ()
For example:
> call "/ajax" (SendGuess guess) $ \mRow -> > case mRow of > Nothing -> alert "Invalid game id" > (Just row) -> updateBoard row
You will note that the type signature for call
is a bit funny. The type for the cmd
argument is:
> (ResponseType res -> cmd)
instead of just
> cmd
But on closer examination, we see that is how the type-checker is able
to enforce that command and response handler types match. When we
actually use call
we just leave off the last argument to the
constructor, and the code is quite readable.
Also, note that call
is asynchronous -- meaning that call
we
return immediately, and the handler will be called after the server
sends back a response. That is why we pass in a callback function
instead of just doing:
> mRow <- call "/ajax" (SendGuess guess) -- this is not how it actually works
We could create a synchronous version of call, however the underlying
javascript engine is single-threaded and that could result in the UI
blocking. We could probably give the appearance of a blocking call
by using continuations in some fashion, but we will consider that
another time.
On the server-side we use a pair of functions:
> handleCommand :: (Data cmd, Show cmd, Happstack m) => > (cmd -> m Response) > -> m Response > > fayResponse :: (Happstack m, Show a) => > ResponseType a > -> m a > -> m Response
handleCommand
decodes the AJAX request and passes it to a
handler. fayResponse
is used to convert a return value into a valid
Fay response. The ResponseType a
parameter is used to enforce type
safety. So in the code we are going to have something like this in our
top-level route:
> , dir "json" $ handleCommand (commandR acid)
where commandR
looks like:
> commandR :: AcidState Games > -> Command > -> ServerPart Response > commandR acid cmd = > case cmd of > (SendGuess guess rt) -> fayResponse rt $ sendGuessC acid guess > (FetchBoard rt) -> fayResponse rt $ fetchBoardC acid
We see that we pull the ResponseType
value from the constructor and
pass it to fayResponse
, so that the type checker will enforce that
constraint.
The command handlers have types like:
> sendGuessC :: AcidState Games > -> Guess > -> ServerPart (Maybe Row)
> fetchBoardC :: AcidState Games > -> ServerPart (Maybe Board)
Hopefully we can add GADTs
to Fay soon, which will remove some of
the boilerplate.
If we want to use cabal to build and install our web application, then we need to tell cabal how to compile the Fay code to Javscript. I believe the long term plan is for Cabal to somehow directly support Fay packages. But in the meantime, this custom Setup.hs seems to do the trick:
Setup.hs for building Fay code
Note that in mastermind.cabal
we have build-type: Custom
instead
of that standard build-type: Simple
. You need to specify Custom
or
cabal will ignore the Setup.hs
.
Fay is still very raw and buggy. In order to get this simple application working I had to file four bugs against Fay and commit several other patches myself. When the developers say that Fay is still alpha they mean it.
On the other hand, the Fay team was very responsive and fixed my issues quickly!
If you want to experiment with Fay, I highly recommend it -- but be prepared to run into some issues.
Programming in Fay is far nicer than Javascript. But, ultimately we still have to deal with the DOM model that the browser is based around. And, even in Fay, that still sucks (even with bootstrap and jQuery to help). However, now that we have a nice language to work with, we can hopefully create a nice Fay-based library for client-side UI development.
Fay (and friends) are definitely a huge step in the right
direction. Things are still just getting started, but they are
definitely set to revolution Haskell web programming. We already have
mature solutions for web 1.0 programming such as happstack-server
and reform
. But, technologies like Fay are making it far easier to
provide web 2.0 solutions with rich client-side functionality.
I have released happstack-fay on hackage which provides the glue code needed for AJAX communication.
In future blog posts I hope to cover three additional topics:
how to incorporate type-safe routing using web-routes
how to add client-side validation to reform using Fay
how to use HSX for client-side HTML generation
We would love to hear your feedback!
Hello! It is I, your editor Jeremy Shaw. I am pleased to bring you Happstack Irregular News Issue #2. Some exciting things have happened since the last issue!
The biggest news since the last issue is the release of clckwrks:
clckwrks is a Haskell-based blog and CMS framework with support for editing pages via the browser plus downloadable themes and plugins.
clckwrks is now powering happstack.com and clckwrks.com.
We are currently focusing on making the clckwrks blogging portion solid. We have moved the official Happstack blog to clckwrks in order to encourage us to make it better :)
If you want to help out, you can browse our bug list and find something to take action on. We are more than happy to provide guidance and other assistance.
The other new big release was
reform. reform
is a form
generation library that continues in the footsteps of formlets
and
digestive-functors <= 0.2
. digestive-functors 0.3
has gone off to
explore a different direction, and we wanted to continue pushing the
development in this direction. There are still many ideas we can share
between the two libraries. Two changes we want to make in the next
release include:
switch to Bifunctors
package instead of homebrewed
IndexedApplicative
(thanks to Leonid Onokhov for pointing that
out). (Another alternative might be index-core
, though it does not
yet export the Applicative
instances).
consider using a Free Applicative
/ Operational Applicative
for
implementing the reform
applicative instances. digestive functors
0.3
does something like this and Jasper Van der Jeugt said it was
very beneficial and we should try it in reform
as well.
Dag Odenhall has released happstack-yui
, which makes it easy to use
YUI with Happstack. According the YUI website:
"YUI is a free, open source JavaScript and CSS framework for building richly interactive web applications."
Niklas Broberg and I (Jeremy Shaw) did some work on HSX. It now builds
with GHC 7.4 and we also fixed some hidden bugs in
HSX.Transform
. One thing we have been experimenting with is a
QuasiQuoter
for HSX. A demo version can be found here:
darcs get http://src.seereason.com/hsx-qq/
The QQ provides an alternative to the trhsx
preprocessor and
allows you to write things like:
html :: (XMLGenerator m) => XMLGenT m (XMLType m)
html = [hsx| <p class="foo"><% map toUpper "hello, world!" %></p> |]
This should be included in the next release of HSX.
The next release of HSX will also contain a major refactoring of the
packages. Mostly we are just planning to move modules into different
packages and divide things up differently. One major benefit of the
new arrangement is that you will no longer be required to install
HJavaScript
and HJScript
even though you probably never use them.
changed types in happstack-lite
so that serveFile
and asContentType
work better together, and added guessContentType
, MimeMap
, mimeType
patched happstack-jmacro
to work with older versions of template haskell
tweaks to ixset.cabal
so that it does not require the latest Cabal
to build
I have started research into why hackage2 requires so much RAM to run. I will be blogging about that separately. I do expect that we can substantially reduce that amount of RAM it requires. So far I have uncovered two minor issues:
it turns out that mapM Lazy.readFile fileList
returns the file
contents lazily but opens all the files immediately. This means you
can run out of file descriptors if you have a lot of checkpoints or
event files. A patch has been submitted for acid-state
and it will
be fixed in the next release.
acid-state
reads the entire checkpoint file into RAM before
decoding it. There are a couple places in the code that cause this to
happen. The first place is in cereal
. The getLazyByteString
function does return a lazy ByteString
.. but it does it by first
reading a strict ByteString
of the required length and then
converting it into a lazy ByteString
. Changing the behavior of
getLazyByteString
is actually quite difficult, as cereal
was
designed to allow for value-level error handling, instead of throwing
async exceptions.
We can probably work around this by using runGetState
to get
one-chunk at a time and build the lazy ByteString
that way. That
might actually be a lot less hackish than it sounds at first, because
it allows us to explicity detect and handle failure cases and control
how much and when things are read into RAM. Though, at that point, it
starts to feel a bit like enumerators/iteratee/etc. Perhaps we will
switch to pipes
at some point in time. pipes
provides streaming for
pure (non-IO) values -- which is probably what we want here.
Evan Czaplicki has been doing a ton of work on ELM recently. As described on the ELM Language Homepage:
"Elm is a type-safe, functional reactive language that compiles to HTML, CSS, and JavaScript."
It is easy to use ELM with Happstack -- no special support is
required. (i.e., we do not need happstack-elm
). Vincent Ambo has
created a simple demo here:
Vincent also wrote a nice blog post showing how to combine web-routes
(type-safe URL routing) with Hamlet
(a QuasiQuoter
for generating
blaze-html
from HTML-like syntax):
Best of both worlds: Using Hamlet and web-routes with Happstack
JMacro
is great for creating JavaScript, but we still have a hole
when it comes to generating CSS. The
language-css
library already contains combinators and a syntax ADT for CSS3.
If it had a parser, then we could also create a syntax-checking
[css| |]
QuasiQuoter
.
I have discussed the idea with Anton Kholomiov, and he is interested -- but we could use some one else to help write the parser. If you love writing parsers, this should be a fun little project.
Finally, if you could suggest one thing that would make the happstack.com website nicer that would be awesome. There are four things we already plan to change:
use black on white text instead of gray on white
fix the paragraph width so that paragraphs are around 45em wide.
fix the grid alignment so that things are properly aligned to the grid
add more dates to the pages so that it clear that the site and project is still active
If you have other suggestions, we would love to hear them! If you want to hack on the theme directly, that is even better!
Until next time, happy hacking.
Jeremy Shaw
I am pleased to annouce the release of 'reform'. A full tutorial is available here:
http://www.happstack.com/docs/crashcourse/Reform.html
Reform is an HTML form generation and validation library. It follows in the footsteps of formlets and digestive-functors <= 0.2. In fact, much of the code in reform comes from the digestive-functors-0.2 code base.
Reform is designed to be usuable with a wide variety of Haskell web servers and templating libraries. You can find the following packages on hackage:
The source code is available via darcs at:
http://patch-tag.com/r/stepcut/reform
The darcs repo also includes proof-of-concept support for 'Heist'.
Reform will feel very familiar to formlets and digestive-functors <= 0.2 users.
The primary motivation behind this library is to provide a supported alternative to digestive-functors 0.2 for users that prefer 0.2 over 0.3.
The key new feature in reform is the ability to separate the validation code from the view generation code. This allows library authors to provide validators (known as Proofs) which the users can use when constructing their forms. The proof names appear in the type-signatures. This allows the library author to ensure that the value returned by a user created form is not merely the correct type, but has also passed validation.
The reform-happstack package also provides simple and transparent Cross-Site Request Forgery (CSRF) protection, using the double-submit method. This method has some weaknesses. For example, I believe it can be circumvented if your site is vulnerable to cross-site scripting (XSS) attacks. If you have suggestions on how to improve the CSRF protection -- please let us know!
I hope to do a full comparison of reform vs digestive-functors 0.3 vs yesod forms in a few weeks.
I am pleased to announce that the Happstack blog is now hosted on happstack.com.