This article was first published 2007 March 22, and is converted from a doctools document. Some translation errors may remain. I've tested viewing this page under both Linux and Macintosh environments, with Chrome and Firefox. (I do not have access to Internet Explorer, sorry.)
When I first started coding in Haskell, not one month ago, the concept of monads just thoroughly confused the heck out of me. Sure, like any clueless newbie, I could code in Haskell without knowledge of monads. But who wants to use a gun without knowing how to clear the chamber first? Who wants to cook without knowing how the stove works? Who wants to learn Haskell without knowing about monads? After all -- it's only natural!
After much frustration, I finally came to know them well enough to write this article.
2 Monads as Constructors
To grossly over-simplify, monadic expressions are constructors. You read that right -- constructors. As in object constructors. To understand why this is the case, we need to look at the simplest possible monad: the State Monad.
OK, scenario one: you are trying to LR-parse data from some input stream, and you need to maintain some kind of state while iterating over the input. Let's conveniently ignore unfoldr for the time being; you're inexperienced, and you are trying to apply your knowledge of how to code something in C towards the Haskell solution. At least, that is, for this example.
Scenario two: you're writing a window manager for the X Window System, and you need to maintain the list of windows currently visible on the screen. How do you do this in a purely functional environment, especially when you also need to process events from a multitude of different sources, parse (ahem) configuration files, and so on?
Because the problem of maintaining state is so generic, I won't get into specific data structures for such state. Instead, I will concentrate on the plumbing required to make the magic of maintaining state happen in a functional language.
3 Observation on Sequential Evaluation
In any functional language, you cannot guarantee the order of execution of functions. At least, you're not supposed to be able to. Just think: according to all the math texts out there, the results of a function magically appear only once all its inputs are satisfied. Functions are supposed to "just work." There's no real way to impose an order of evaluation on them. Or is there?
A function cannot be evaluated meaningfully without providing all the parameter values it needs to produce a result. Thus, it follows that you can artificially impose a sequential evaluation order by threading the output of one function into the input of another, like this:
> let hello w = w ++ "!"
> in "hello" : (hello "world") : []
The above code looks like a normal list construction, but remember that you can't build the list until you have the nodes to stuff into it in the first place. Hence, it must evaluate the hello function before the list can be fully evaluated.
4 Observation on Threading State
Now that we know how to thread the execution of functions properly, we turn our attention to the obvious problem of maintaining state from one function to the next. Well, this is actually pretty simple -- simply pass, and return, that state as parameters and/or tuples. For example:
> data Stack a = Stack a (Stack a)
>
> push :: a -> Stack a -> Stack a
> push a stk = Stack a stk
>
> pop :: Stack a -> (a, Stack a)
> pop (Stack a stk) = (a, stk)
Let's ignore errors for the time being. What is important to observe in the above code is that (a) we're passing explicit state at all times (though it's not strictly the same thing, you can think of stk as an object, and its use as analogous to self or this in more traditional object-oriented languages), and (b) the stack "methods" always return an updated state as a result. The latter is implicit in object-oriented languages, where state is updated in place; this isn't necessarily true in functional languages, though optimizations usually produce equivalent code.
Well, as you can see, having to maintain all that state, manually threading the output of one function to the input of another, can get substantially tedious! This can lead to code that is harder to read, even harder to debug, and all but impossible to modify at some future date.
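To see just how tedious the hand-threading gets, here is a small runnable sketch of my own (the Empty constructor and the addTop function are additions not in the article's fragment, included so the example is self-contained):

```haskell
-- The Stack from above, extended with an Empty constructor
-- (my addition) so a stack can be built from scratch.
data Stack a = Stack a (Stack a) | Empty

push :: a -> Stack a -> Stack a
push a stk = Stack a stk

pop :: Stack a -> (a, Stack a)
pop (Stack a stk) = (a, stk)

-- Pop two items and push their sum: every intermediate stack
-- (stk0, stk1, stk2) must be named and threaded along by hand.
addTop :: Num a => Stack a -> Stack a
addTop stk0 =
  let (x, stk1) = pop stk0
      (y, stk2) = pop stk1
  in  push (x + y) stk2

main :: IO ()
main =
  let stk    = push 3 (push 4 Empty)
      (r, _) = pop (addTop stk)
  in  print r
```

Running this prints 7; note how stk0 through stk2 each appear exactly once, in exactly the right order, and misnumbering any one of them silently corrupts the computation.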
There has to be some way of factoring this hand-threaded code out.
It turns out that there is. But first, we need one more observation about the nature of software before we can put all the pieces together.
5 Observation on Threading of Flow Control
In any given imperative programming language, like Python for instance, it's generally accepted that when you write something like:
>>> a = 5
>>> b = 4
>>> c = a+b
>>> print c
it is pretty clear that a is assigned a value, then b is assigned a value, then c is assigned their sum, which is finally printed to the screen. In other words, order of execution is determined exclusively by order of listing. Control flow proceeds strictly from the top down, unless told otherwise, usually by a for loop or some such. But those are minor details.
It is pretty clear, in the above code, that we cannot evaluate the value of c meaningfully without first having evaluated both a and b first. Sound familiar? Yes -- that's right -- nested functions. In this case, if you'll allow me some literary license, we can rewrite the above code using lambda expressions in a continuation passing notation, like so:
>>> (lambda a: (lambda b: (lambda c: print(c))(a + b))(4))(5)
Yikes! Maybe Python isn't such a good language to express lambda substitutions with bound variables in. Indeed, lambdas are going away in Python in the future, so we might as well look at the equivalent code in a language that directly supports the notion:
> show $ (\a -> (\b -> (\c -> c) (a+b)) 4) 5
What, this still isn't clear? OK, let me show you again, only this time we'll take it statement by statement. Or, rather, their equivalents.
> show $ (\a -> ...) 5
If this looks like normal function calling syntax, that's because it is. What we're doing is using a lambda expression to assign the value 5 to the formal variable a. Pretty slick, eh? Now, let's look at the next "statement":
> show $ (\a -> (\b -> ...) 4) 5
Notice a pattern yet? Note how "the rest of the program" is treated as a single function, expressed in terms of the preceding variable assignments? Note how the only way to successfully evaluate the functions is to evaluate them in the proper order? Yes; that's right -- a program, mathematically speaking, is just a nested set of functions. Evaluate them in the proper order, and you will get the same results as an imperative language program.
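To check this for yourself, the nested-lambda expression from above can be wrapped in a main and run as-is:

```haskell
-- Each lambda plays the role of one "statement"; evaluating the
-- whole expression forces them to run in order: a = 5, b = 4, c = a+b.
main :: IO ()
main = putStrLn (show ((\a -> (\b -> (\c -> c) (a + b)) 4) 5))
```

It prints 9, the same result the imperative Python version computes.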
6 Observation: Thinking Algebraically
By now, you're likely to have had at least one epiphany on where we're going. If not, I'll now make things more explicit by making one more observation: at any point in an imperative program, you can always split it between statements, such that evaluation of the top part is a necessary precondition to the execution of the bottom part. It seems pretty obvious at first glance, but it's a critical observation to make explicit. Because, after all, if we can manipulate functions like anything else in higher-order languages, and we can, then it should be possible to build some function which returns a properly composed sequence of other functions, that does precisely that. I mean, execute A before B, that is. For the sake of argument, let's call this >>. We can therefore rewrite our Python example like so:
(a = 5) >> (b = 4) >> (c = a+b) >> (print c)
The associativity of >> really doesn't matter; it can be shown that:
(a = 5) >> ((b = 4) >> ((c = a+b) >> (print c)))
is equal to:
(((a = 5) >> (b = 4)) >> (c = a+b)) >> (print c)
Hence, we don't usually bother drawing parentheses around such constructs. Haskell operators are left-associative by default, so Haskell likewise treats >> as left-associative. But, strictly speaking, it doesn't have to be. Anyway, drawn yet another way:
a = 5
>>         -- Note how these operators are "between" statements,
b = 4      -- conceptually.
>>
c = a+b
>>
print c
By the definition of >>, whatever is on the left-hand (top) side of the operator must evaluate before the right-hand (bottom) side.
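The associativity claim is easy to check in a real monad. Here is a quick sketch using the standard Maybe monad as a stand-in (my choice of example, not the article's):

```haskell
-- Both groupings of >> produce the same result.
main :: IO ()
main = do
  let a = Just 1
      b = Just 2
      c = Just 3
  print ((a >> b) >> c)   -- grouped to the left
  print (a >> (b >> c))   -- grouped to the right
```

Both lines print Just 3, regardless of how the parentheses fall.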
The >> operator properly arranges for sequenced code execution, but it certainly doesn't address that sticky issue of transferring state from one statement to the next. After all, how does one bind variables to values? Well, remember that dirty trick of nested lambda expressions? >> creates those but, as shown above, doesn't thread state from one portion of a program to the next. Fortunately, we have >>= to do that for us:
> 5 >>= \a -> 4 >>= \b -> a+b >>= \c -> show c
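The chain above is schematic -- bare numbers like 5 aren't monadic values. In a real monad, each value must be wrapped; here is the same shape in the standard Maybe monad (my adaptation):

```haskell
-- Each >>= binds one intermediate result, exactly as in the
-- schematic chain above: a = 5, b = 4, c = a + b.
main :: IO ()
main = print (Just 5 >>= \a ->
              Just 4 >>= \b ->
              Just (a + b) >>= \c ->
              Just (show c))
```

This prints Just "9": the same top-to-bottom "assignments," expressed entirely as nested lambdas under >>=.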
Remember the "we don't care about associativity because it just works out" rule for >>? It's the same for >>= too, since it, too, sequences bits of the program correctly. In fact:
> a >> b = a >>= \_ -> b
In other words, >> is just a special case of >>=, where we don't particularly care about the result of a at all.
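This identity can be checked directly, again using the standard Maybe monad as a stand-in:

```haskell
-- a >> b and a >>= \_ -> b always agree; the bound value is
-- simply ignored by the lambda.
main :: IO ()
main = do
  print (Just 1 >> Just 2)
  print (Just 1 >>= \_ -> Just 2)
  print ((Nothing :: Maybe Int) >> Just 2)
```

The first two lines print Just 2; the last prints Nothing, since a failed left-hand side short-circuits either form.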
7 Putting it Together: The Not So Obvious
So, we now have all the pieces we need to finally explain just what the heck monads are. So, let's get to work building our very own state monad!
7.1 Data Types
We need something to stuff into our state, but we don't know precisely what. Hence, we're going to use parametric types to describe it. But one thing is for sure: we know that proper sequencing of functions depends on nested lambda expressions. Therefore:
> newtype ExState s a = ExS (s -> (s,a))
That's right; our data type basically is itself a function. It takes some state as input, and returns (hopefully) some value of interest, along with a modified state. Remember the observation about how we threaded state as input and got another state as output? The plumbing is right there, but is abstracted in a newtype. Note that a data type could work here just as well.
Next, we need some function to compose pieces of our program together -- we need to define the >>= operator. It's pretty clear that a program takes raw input and provides a transformed version of those inputs in some capacity (otherwise, what's the point of writing the program?). Even if a program provides no useful value for a, the fact that it updated some kind of state somewhere is of great value. Hence, the result of a program must itself be of some ExState type.
> (>>=) :: ExState s a -> (a -> ExState s b) -> ExState s b
> top >>= btm = ExS (\initialState ->
>     let (sTop, vTop) = perform top initialState
>         (sBtm, vBtm) = perform (btm vTop) sTop
>         perform (ExS f) = f
>     in (sBtm, vBtm))
Look at what it's doing. The result of >>= is a kind of function, just like we said it should be before. It takes an initialState, and returns the pair (sBtm, vBtm), where vBtm is the return value from the complete computation, and sBtm is the new state. And, as both names and intuition would suggest, these values correspond to the results you'd get by finishing the program at the bottom of its listing; just like in any other programming language. We see that perform (btm vTop) sTop is used to compute these values. Note that btm vTop must, by definition, be some kind of ExState; we use the helper function perform to yank the function out of it. That allows us to invoke the bottom part of the program's functionality.
Remember in our earlier discussion how >>= bound only a single variable? Well, it turns out that by doing so, and requiring that function to return a monadic value itself, we get the following algebraic substitutions:
> let (sBtm, vBtm) = perform (btm vTop) sTop
> let (sBtm, vBtm) = (\someState -> (anotherState, anotherValue)) sTop
> let (sBtm, vBtm) = (finalState, finalValue)
Pretty slick, eh? Note the middle line where substitution produces a function. But where does sTop come from? Yes; it comes from the first let-binding:
> let (sTop, vTop) = perform top initialState
So, in order to properly evaluate the bottom, we must first evaluate the top portion of the program. The result of the binding is itself a function which, upon evaluation, evaluates these "sub-programs" in the proper order. And, as you might expect, sub-programs can consist of sub-sub-programs, and sub-sub-sub-programs, ad infinitum. The associativity rules of >>= allow us to just keep on threading and threading and threading.
And since we're passing around all these crazy data structures with functions containing functions containing functions, and states being carefully threaded from function to function, now you see why I said, earlier in this article, that monadic expressions are constructors. What we're building, literally, is a direct-threaded representation of a program. The name direct-threaded isn't accidental; invented in the 1960s and first used successfully by the Forth programming language, its purpose was to thread together a bunch of functions, passing state from function to function. Just like what we're doing here!
But, there is one last issue involved with all this: once we are working inside a monad, how do we actually make use of the state we're maintaining?
7.2 State Accessors
There are generally two kinds of accessors: getters and setters. Haskell makes this patently clear when working with monads, especially State monads, because the only way to access the "hidden" state is through such accessors.
> getState >>= \st -> doSomethingWith st >>= ...
It's clear that getState must be a function that returns ExState in this case, since executing it must produce the pair (sTop, vTop) inside the >>= function.
> getState = ExS (\s -> (s,s))
Yes, it really is as simple as that. What we're doing is we're taking our state, and returning it as the next return value; we're also leaving it unchanged in the next state as well.
Changing state is accomplished with a similar, but complementary, function:
> putState x = ExS (\s -> (x,x))
Note that we "return" x as well; it's not strictly necessary, since 99.999% of the time, putState would be used with >> rather than >>=, thus discarding the result anyway. But, no matter what, it's patently clear that the state half of the pair returned is, in fact, being set to x; precisely what we want.
One final "accessor" that really isn't is return. This is a kind of hybridized putState; its purpose is simple: return a value, leaving the state otherwise unaltered:
> return x = ExS (\s -> (s, x))
7.3 Running "Programs" in the State Monad
So, let's recap. We have a set of operators that construct structures in memory representing what has to be performed (at least conceptually; I should point out that compilers designed to work with monads often optimize this step out), at what time, and with what state. Speaking of state, we have operators which grant us access to it, for both alteration and query purposes. We could construct the cure for world hunger with these basic primitives, if only we could actually invoke the programs we create and get the results!
With the facilities for all that plumbing in place, and the concept of "program execution" now clearly defined in terms of simple algebraic functions, we can turn our attention to actually doing something useful with it. Like incrementing a counter, for example:
> inc = getState >>= \s -> (putState (s+1))
Looks pretty straight-forward; admittedly, it's a far cry from curing world hunger, but it is a start! In fact, we can thread several "invocations" of this function along too:
> counterIncrementByThree = inc >> inc >> inc
Before you know it, we'll be curing AIDS. But this syntax, while useful at times, is pretty ugly and hard to manage overall, so Haskell allows us to use a conventional imperative-style programming construct, called do-notation, to write clearer code:
> inc = do
>     s <- getState
>     putState (s+1)
See what's happening? The <- operator maps to the >>= operator, complete with lambda variable binding. Quite convenient indeed! But, we needn't always use variables:
> counterIncrementByThree = do
>     inc
>     inc
>     inc
In this case, because we're not assigning variables, the Haskell compiler knows to use the >> operator. But, remember how >> was defined in terms of >>= earlier? That's why we didn't need to explicitly define our own >> operator.
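To make the mapping concrete, here is a small hypothetical example of my own (using the standard Maybe monad) written both ways; the compiler desugars the do-block into exactly the explicit version:

```haskell
-- do-notation version: the <- arrow becomes a >>= with a lambda.
incr :: Maybe Int -> Maybe Int
incr mx = do
  x <- mx
  return (x + 1)

-- The explicit desugaring of the same function.
incr' :: Maybe Int -> Maybe Int
incr' mx = mx >>= \x -> return (x + 1)

main :: IO ()
main = do
  print (incr  (Just 41))
  print (incr' (Just 41))
```

Both calls print Just 42; the two definitions are interchangeable.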
This is all fine, but, the observant reader will point out that we still have yet to explain how to actually reap the rewards of our programming. We need to extract the results of the computation. This is usually done with a runner. In fact, we have already seen such a runner:
> perform (ExS f) = f
That's the one. Previously, we defined it to be local to the >>= operator, and to not bother with the details of passing initial state. A more complete runner, however, does exactly that -- deal with the state, I mean:
> runExS (ExS prg) initialState = snd $ prg initialState
Since the result of running a monadically constructed function is a pair containing (state', result), we use the snd function to extract only the result. In some cases, you'll be more interested in the state; in this case, you can define another function if you like:
> stateExS (ExS prg) initialState = fst $ prg initialState
If both are of interest to you, then instead of evaluating prg twice (a gratuitous waste of time if you're computing the digits of pi to a billion places), you can forgo the pair selection functions and just return the whole pair itself:
> valueAndStateExS (ExS prg) initialState = prg initialState
Typically, you'd use it something like this in a real-world program:
> main = do
>     let (someState, someValue) = valueAndStateExS myCounter 0
>     putStrLn $ "Resulting value: " ++ (show someValue)
>     putStrLn $ "Resulting state: " ++ (show someState)
Doesn't look like much, does it? But note that 0 hanging off to the right on the call to valueAndStateExS? That is the program's initial state -- in C or C++, this is equivalent to the global state, usually established by numeric constants in main(), or through global (static, hopefully!) variables in some module.
Now that you know how the state monad works, you can apply this concept to any other monad, including the IO monad. There are other kinds of monads that are not stateful or state-like, though. These include the list monad ([]) and specialized data types like Maybe. But, for now, I'll let these specializations of the concept rest. You've already been through a lot, and I'm itching to get this paper online. And, besides, having gone through all this work, you can now finally appreciate the Control.Monad.State and related libraries.
Well, there you have it; I must bring this essay to a close now, knowing that you have read yet another tutorial on monads and what they actually are. My essay differs from most of the others in that it doesn't invoke the concept of containers, which apparently has confused a lot of people. It also doesn't invoke any complex mathematics (like category theory, where monads come from). Instead, I resort to normal, day-to-day programming experience, possessed by any programmer of any imperative programming language. I hope that this explanation has proven as useful to you, as it has to me.
The software contained in the previous sections consists of valid fragments of Haskell code. However, they don't tell the whole story; some details I've had to leave out for brevity's sake. But fear not -- contained herein is the complete program that allowed me to write this essay. It's pretty short and dense, but by following along with the earlier parts of this essay, hopefully you'll be able to see how all the data flows fit together with the monadic plumbing. I should also warn you that this code is far from optimal -- it's written so that it works, and is clear. If you look at, e.g., the Control.Monad.State library, you'll find a substantially terser definition that does the same basic things.
> newtype ExState s a = ExS (s -> (s,a))
>
> instance Monad (ExState s) where
>     top >>= btm = ExS (\initialState ->
>         let (sTop, vTop) = perform top initialState
>             (sBtm, vBtm) = perform (btm vTop) sTop
>             perform (ExS f) = f
>         in (sBtm, vBtm))
>
>     return x = ExS (\initialState -> (initialState, x))
>
> getState = ExS (\initialState -> (initialState, initialState))
> putState x = ExS (\initialState -> (x, x))
>
> stateExS (ExS prg) initialState = fst $ prg initialState
> runExS (ExS prg) initialState = snd $ prg initialState
> valueAndStateExS (ExS prg) initialState = prg initialState
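For readers trying the listing on a modern GHC: the Monad class now has Functor and Applicative superclasses, so the instance needs two extra boilerplate declarations. Here is a self-contained version that compiles today; the Functor/Applicative instances, the type signatures, and the little inc demonstration at the bottom are my additions:

```haskell
import Control.Monad (ap, liftM)

newtype ExState s a = ExS (s -> (s, a))

-- Boilerplate required by modern GHC; both follow mechanically
-- from the Monad instance below.
instance Functor (ExState s) where
  fmap = liftM

instance Applicative (ExState s) where
  pure x = ExS (\s -> (s, x))   -- this is the article's return
  (<*>)  = ap

instance Monad (ExState s) where
  top >>= btm = ExS (\initialState ->
    let (sTop, vTop) = perform top initialState
        (sBtm, vBtm) = perform (btm vTop) sTop
        perform (ExS f) = f
    in (sBtm, vBtm))

getState :: ExState s s
getState = ExS (\s -> (s, s))

putState :: s -> ExState s s
putState x = ExS (\_ -> (x, x))

stateExS :: ExState s a -> s -> s
stateExS (ExS prg) initialState = fst $ prg initialState

runExS :: ExState s a -> s -> a
runExS (ExS prg) initialState = snd $ prg initialState

-- The counter from section 7.3, run from an initial state of 0.
inc :: ExState Int Int
inc = getState >>= \s -> putState (s + 1)

main :: IO ()
main = print (stateExS (inc >> inc >> inc) 0)
```

Running this prints 3, the final state after three increments, confirming that the hand-rolled monad threads its state correctly.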
I would like to take this time to thank the folks in the #Haskell IRC channel for opening my eyes to this topic. I could not have come to this understanding without their input. I especially would like to single out Cale, Dolio, Kowey, and SamB, whose patience with me has been the only thing keeping my interest in understanding monads alive.
Special thanks to Cale and Dolio for reviewing this document. Their contributions were invaluable at helping to clarify various points.
And, oddly and finally, thanks to James Burke, for providing a literary style that I continuously endeavor to match. It's highly conversational, and very engaging; it's not the dreary doldrums you'd expect from a paper on, of all things, stuff called "Monads." I mean, c'mon, who ever heard of monads? They sound like medical conditions! Anyway, his creative use of suspense and his masterful touch of humor in educational literature (as distinct from academic literature) lures the reader into wanting to learn more, which is precisely the effect I'm looking for. I hope I've succeeded.