Software design patterns were forged at a time when object oriented programming (OOP) reigned. This led to "design patterns" becoming somewhat synonymous with "OOP design patterns". But design patterns are solutions to problems and problems are relative to the context in which they occur. A design problem in OOP is not necessarily one in functional programming (FP), and vice versa.
From a Haskell perspective, many (but not all) of the well known "Gang of Four" patterns [Design patterns, Gamma et al.] become so easy to solve that it is not worth going to the trouble of treating them as patterns.
However, we still want to identify patterns of problems and solutions, so that when we have a dejavu moment of having experienced a problem before, we are somewhat prepared. Design patterns remain relevant for Haskell, after all, "dejavu is language neutral" (Erich Gamma).
Modularity means more than modules. Our ability to de-compose a problem into parts depends directly on our ability to glue solutions together. To support modular programming, a language must provide good glue." | ||
--Why Functional Programming Matters - John Hughes |
In order to have a meaningful conversation about Haskell design patterns, we'll begin our exploration by looking at the three primary kinds of "glue" in Haskell: first-class functions, the Haskell type system, and lazy evaluation. This chapter revisits the Haskell you already know through the lens of design patterns, and looks at:
Higher-order functions
Currying
Recursion
Types, pattern matching, polymorphism
Lazy Evaluation
Monads
Functions are our first kind of "glue" in Haskell.
Haskell functions are first-class citizens of the language. This means that:
We can name a function just as we can name any primitive value:
square = \x -> x * x
We can pass functions to other functions:
map square [1, 3, 5, 7]
(Here,
map
is a higher-order function.)Functions can produce other functions (here, by currying the
foldr
function):sum = foldr (+) 0
Functions can form part of other data structures:
let fs = [(* 2), (* 3), (* 5)] zipWith (\f v -> f v) fs [1, 3, 5]
This places Haskell functions on an equal footing with primitive types.
Let's compose these three functions, f
, g
, and h
, in a few different ways:
f, g, h :: String -> String
The most rudimentary way of combining them is through nesting:
z x = f (g (h x))
Function composition gives us a more idiomatic way of combining functions:
z' x = (f . g . h) x
Finally, we can abandon any reference to arguments:
z'' = f . g . h
This leaves us with an expression consisting of only functions. This is the "point-free" form.
Programming with functions in this style, free of arguments, is called tacit programming.
It is hard to argue against the elegance of this style, but in practice, point-free style can be more fun to write than to read: it can become difficult to infer types (and, therefore, meaning). Use this style when ease of reading is not overly compromised.
Haskell allows for both curried and uncurried functions:
greetCurried :: String -> String -> String greetCurried title name = "Greetings " ++ title ++ " " ++ name greetUncurried :: (String, String) -> String greetUncurried (title, name) = "Greetings " ++ title ++ " " ++ name
Let's suppose that we need a function with the first argument fixed:
greetCurried' :: String -> String greetCurried' = greetCurried "Ms" greetUncurried' :: String -> String greetUncurried' name = greetUncurried ("Ms", name)
In both cases, we have applied one of the arguments and thereby specialized our original function. For the uncurried function we needed to mention all parameters in the reshaped function, while for the curried one we could just ignore subsequent arguments.
Since it is fairly easy to translate a curried function to an uncurried function (and vice versa) the question arises: why and when would one want to use uncurried functions?
Consider a function that returns a few pieces of data, which we choose to express as a tuple:
g n = (n^2, n^3)
Then suppose we want to find the maximum value in that tuple:
max (g 11)
This would not work because the max
function is curried, but we can easily align the types by uncurrying:
uncurry max (g 11)
Whenever we have a function returning a tuple and we want to consume that tuple from a curried function, we need to uncurry that function. Alternatively, if we are writing a function to consume an output tuple from another function, we might choose to write our function in uncurried (tuple arguments) form so that we don't have to later uncurry our function or unpack the tuple.
It is idiomatic in Haskell to curry by default. There is a very important reason for this. Thanks to currying, we can do this:
map (map square) [[1], [2,2], [3,3,3]]
We cannot, however, do this:
let map' = uncurry map map' (map' square) [[1], [2,2], [3,3,3]]
We need to explicitly curry map'
in order to compose it with other functions:
(curry map') (curry map' square) [[1], [2,2], [3,3,3]]
Curried functions are composable, whereas uncurried functions are not.
If we can apply one function argument at a time, nothing stops us from doing so at entirely different places in our codebase. For instance, we might "wire in" some authentication-related information into our function at one end of the codebase and use the specialized function in another part of the codebase that has no cognizance of the authentication argument and its related types.
This can be a powerful tool for decoupling, the site of decoupling being the function argument list!
Recursion is even more fundamental than functions and types, in the sense that we can have recursive functions and types. Moreover, recursion
can refer to syntax (a function or type referring to itself) or to the execution process.
Recursion can be viewed as a pattern for avoiding mutable state:
sumNonTail [] = 0 sumNonTail (x:xs) = x + (sumNonTail xs)
Without recursion, we would need to iterate through the list and keep adding to an intermediary sum until the list is exhausted, consider the example:
sumNonTail [2, 3, 5, 7]
This first expands into a nested chain of deferred operations, and when there are only primitives left in the expression, the computation starts folding back on itself:
-- 2 + sumNonTail [3, 5, 7] -- 2 + (3 + sumNonTail [5, 7]) -- 2 + (3 + (5 + sumNonTail [7])) -- 2 + (3 + (5 + (7 + sumNonTail []))) -- 2 + (3 + (5 + (7 + 0))) -- 2 + (3 + (5 + 7)) -- 2 + (3 + 12) -- 2 + 15 -- 17
The sumNonTail
function is non-tail-recursive. Because the recursion is "trapped" by the +
operator, we need to hold the entire list in memory to perform the sum.
Tail recursion addresses the exorbitant use of space we have with non-tail-recursive processes:
sumTail' acc [] = acc sumTail' acc (x:xs) = sumTail' (acc + x) xs sumTail xs = sumTail' 0 xs
This form of recursion looks less like mathematical induction than the sumNonTail
function did, and it also requires a helper function sumTail'
to get the same ease of use that we had with sumNonTail
. The advantage is clear when we look at the use of "constant space" in this process:
-- sumTail [2, 3, 5, 7] -- sumTail' 0 [2, 3, 5, 7] -- sumTail' 2 [3, 5, 7] -- sumTail' 5 [5, 7] -- sumTail' 10 [7] -- sumTail' 17 [] -- 17
Even though sumTail
is a recursive function, it expresses an iterative process. sumNonTail
is a recursive function that expresses a recursive process.
Tail recursion is captured by the foldl
function:
foldlSum = foldl (+) 0
The foldl
function expands in exactly the same way as sumTail'
. In contrast, foldrSum
expands in the same way as sumNonTail
:
foldrSum = foldr (+) 0
One can clearly see the tail recursion in the definition of foldl
, whereas in the definition of foldr
, recursion is "trapped" by f
:
foldr _ v [] = v foldr f v (x:xs) = f x (foldr f v xs) foldl _ v [] = v foldl f v (x:xs) = foldl f (f v x) xs
Algebraic types give us a very concise way to model composite types, even recursive ones. Pattern matching makes it easy to work with algebraic types. Type classes enable both the fundamental types of polymorphism: parametric and ad-hoc.
Together, these capabilities allow us to easily express many of the Gang of Four patterns.
Algebraic data types can express a combination of types, for example:
type Name = String
type Age = Int
data Person = P String Int -- combination
They can also express a composite of alternatives:
data MaybeInt = NoInt | JustInt Int
Here, each alternative represents a valid constructor of the algebraic type:
maybeInts = [JustInt 2, JustInt 3, JustInt 5, NoInt]
Type combination is also known as "product of types" and type alternation as "sum of types". In this way, we can create an "algebra of types", with sum and product as operators, hence the name Algebraic data types.
By parameterizing algebraic types, we create generic types:
data Maybe' a = Nothing' | Just' a
Algebraic data type constructors also serve as "deconstructors" in pattern matching:
fMaybe f (Just' x) = Just' (f x) fMaybe f Nothing' = Nothing' fMaybes = map (fMaybe (* 2)) [Just' 2, Just' 3, Nothing']
On the left of the =
sign we deconstruct; on the right, we construct. In this sense, pattern matching is the complement of algebraic data types.
We can capture the Composite pattern very succinctly with recursive algebraic types, for example:
data Tree a = Leaf a | Branch (Tree a) (Tree a)
This pattern describes the need to sometimes unify a composite structure with individual members of that structure. In this case, we're unifying Leaf
(a leaf being a part of a tree) and Tree
(the composite structure). Now we can write functions that act on trees and leaves:
size :: Tree a -> Int size (Leaf x) = 1 size (Branch t u) = size t + size u + 1
Functions over recursive types are typically recursive themselves.
Polymorphism refers to the phenomenon of something taking many forms. In Haskell, there are two kinds of polymorphism: parametric and ad-hoc (first described by Strachey in Fundamental Concepts in Programming Languages, 1967).
In the following code, we have a function defined on a list of any type. The function is defined at such a high level of abstraction that the precise input type simply never comes into play, yet the result is of a particular type:
length' :: [a] -> Int length' [] = 0 length' (x:xs) = 1 + length xs
The length
function exhibits parametric polymorphism because it acts uniformly on a range of types that share a common structure, in this case, lists:
length' [1,2,3,5,7] length' ['1','2','3','5','7']
In this sense, length
is a generic function. Functions defined on parametric data types tend to be generic.
The canonical example of adhoc polymorphism (also known as overloading) is that of the polymorphic + operator, defined for all types that implement the Num
typeclass:
class Num a where (+) :: a -> a -> a instance Int Num where (+) :: Int → Int → Int x + y = intPlus x y instance Float Num where (+) :: Float → Float → Float x + y = floatPlus x y
In fact, the introduction of type classes into Haskell was driven by the need to solve the problem of overloading numerical operators and equality.
When we call (+
) on two numbers, the compiler will dispatch evaluation to the concrete implementation, based on the types of numbers being added:
let x_int = 1 + 1 -- dispatch to 'intPlus' let x_float = 1.0 + 2.5 -- dispatch to 'floatPlus' let x = 1 + 3.14 –- dispatch to 'floatPlus'
In the last line, we are adding what looks like an int
to a float
. In many languages, we'd have to resort to explicit coercion (of int
to float
, say) to resolve this type of "mismatch". In Haskell, this is resolved by treating the value of 1
as a type-class polymorphic value:
ghci> :t 1 -- Num a => a
1
is a generic value; whether 1
is to be considered an int
or a float
value (or a fractional, say) depends on the context in which it will appear.
There are two kinds of ad-hoc polymorphism. We've seen the first type already in this chapter:
data Maybe' a = Nothing' | Just' a fMaybe f (Just' x) = Just' (f x) fMaybe f Nothing' = Nothing'
The fMaybe
function is polymorphically defined over the alternations of Maybe
. In order to directly contrast the two kinds of polymorphism, let's carry this idea over into another example:
data Shape = Circle Float | Rect Float Float area :: Shape -> Float area (Circle r) = pi * r^2 area (Rect length width) = length * width
The area
function is dispatched over the alternations of the Shape
type.
We could also have achieved a polymorphic area
function over shapes in this way:
data Circle = Circle Float data Rect = Rect Float Float class Shape a where area :: a -> Float instance Shape Circle where area (Circle r) = pi * r^2 instance Shape Rect where area (Rect length' width') = length' * width'
Tip
Downloading the example code
You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Instead of unifying shapes with an algebraic "sum of types", we created two distinct shape types and unified them with the Shape
type-class. This time the area
function exhibits class-based ad-hoc polymorphism.
It is tempting to ask "which approach is best?" Instead, let's explore the important ways in which they differ:
While exploring adhoc polymorphism, we found 2 ways to express static
type dispatch in Haskell (where static means that the dispatch is resolved at compile time, as opposed to dynamic dispatch
, which is resolved only at runtime). Let's return to our area
function:
area (Circle 10)
The compiler will dispatch to the appropriate overloaded area
function, this can happen in two ways:
In the case of alternation-based adhoc polymorphism, the dispatch is done by matching on the
Shape
alternation (Circle
orRect
)for class-based adhoc polymorphism, dispatch is achieved by matching on the typeclass implemented by
Circle
(in this caseShape
)
We've referred to this as "dispatching on type" but, strictly speaking, type dispatch would have to resemble the following invalid Haskell:
f v = case (type v) of Int -> "Int: " ++ (show v) Bool -> "Bool" ++ (show v)
So far, we have only seen dispatching on one argument or single dispatch
. Let's explore what double-dispatch
might look like:
data CustomerEvent = InvoicePaid Float | InvoiceNonPayment data Customer = Individual Int | Organisation Int payment_handler :: CustomerEvent -> Customer -> String payment_handler (InvoicePaid amt) (Individual custId) = "SendReceipt for " ++ (show amt) payment_handler (InvoicePaid amount) (Organisation custId) = "SendReceipt for " ++ (show amt) payment_handler InvoiceNonPayment (Individual custId) = "CancelService for " ++ (show custId) payment_handler InvoiceNonPayment (Organisation custId) = "SendWarning for " ++ (show custId)
The payment_handler
function defines behavior for all four permutations of CustomerEvent
and Customer
. In an OOP language, we would have to resort to the visitor pattern
to achieve multiple dispatch.
On the one hand, we have parametric polymorphism, where a single generic function acts on a variety of types. This is in contrast to adhoc polymorphism, where we have an overloaded function that is resolved to a particular function implementation by the compiler. Put another way, parametric polymorphism allows us to lift the level of abstraction, whereas ad-hoc polymorphism gives us a powerful tool for decoupling.
In this sense, parametric polymorphism is considered to be "true polymorphism", while ad hoc is only "apparent" (or "synthetic").
Haskell blurs the distinction between ad hoc (specialized) and parametric (generic) polymorphism. We can see this clearly in the definition of the type class for equality:
class Eq a where (==), (/=) :: a -> a -> Bool x == y = not (x /= y) x /= y = not (x == y)
(==
) and (/=
) are both given mutually recursive default implementations. An implementer of the Eq
class would have to implement at least one of these functions; in other words, one function would be specialized (ad-hoc polymorphism), leaving the other defined at a generic level (parametric polymorphism). This is a remarkable unification of two very different concepts.
Functions and types intersect in several ways. Functions have a type, they can act on types, they can belong to type classes, and they can be specific or generic in type. With these capabilities, we can express several more Gang of Four patterns.
Thanks to higher-order functions (HOF), we can easily inject behavior:
strategy fSetup fTeardown = do setup -– fullfil this function's purpose teardown
Here, we are defining an abstract algorithm by letting the caller pass in functions as arguments, functions that complete the detail of our algorithm. This corresponds to the strategy pattern, also concerned with decoupling an algorithm from the parts that may change.
"Strategy lets the algorithm vary independently from clients that use it." | ||
--Design Patterns, Gamma et al. |
In "OOP speak", the strategy pattern uses delegation to vary an algorithm, while the template pattern uses inheritance to vary parts of an algorithm. In Haskell, we don't have OOP inheritance, but we have something far more powerful: type classes. We might easily abstract an algorithm with the following typeclass that acts as an abstract class:
class TemplateAlgorithm where setup :: IO a → a teardown :: IO a → a doWork :: a → a fulfillPurpose = do setup doWork teardown
"Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation" | ||
--Design Patterns, Gamma et al. |
The map
function takes care of navigating the structure of the list, while the square
function only deals with each element of the list:
map square [2, 3, 5, 7]
We have decoupled flow control from function application, which is akin to the iterator pattern.
The history of Haskell is deeply entwined with the history of lazy evaluation.
"Laziness was undoubtedly the single theme that united the various groups that contributed to Haskell's design... Once we were committed to a lazy language, a pure one was inescapable." | ||
--History of Haskell, Hudak et al |
Thanks to lazy evaluation, we can still consume the undoomed part of this list:
doomedList = [2, 3, 5, 7, undefined] take 0 xs = [] take n (x:xs) = x : (take (n-1) xs) main = do print (take 4 doomedList)
The take
function is lazy because the cons operator (:
) is lazy (because all functions in Haskell are lazy by default).
A lazy cons evaluates only its first argument, while the second argument, the tail, is only evaluated when it is selected. (For strict lists, both head and tail are evaluated at the point of construction of the list.)
The proxy pattern has several motivations, one of which is to defer evaluation; this aspect of the proxy pattern is subsumed by lazy evaluation.
The simple idea of laziness has the profound effect of enabling self-reference:
infinite42s = 42 : infinite42s
Streams (lazy lists) simulate infinity through "the promise of potential infinity" [Why Functional Programming Matters, Hughes]:
potentialBoom = (take 5 infinite42s)
A stream is always just one element cons'ed to a tail of whatever size. A function such as take
consumes its input stream but is decoupled from the producer of the stream to such an extent that it doesn't matter whether the stream is finite or infinite (unbounded). Let's see this in action with a somewhat richer example:
generate :: StdGen -> (Int, StdGen) generate g = random g :: (Int, StdGen) -- import System.Random main = do gen0 <- getStdGen let (int1, gen1) = (generate g) let (int2, gen2) = (generate gen1)
Here we are generating a random int
value and returning a new generator, which we could use to generate a subsequent random int
value (passing in the same generator would yield the same random number).
Carrying along the generator from one call to the next pollutes our code and makes our intent less clear. Let's instead create a producer of random integers as a stream:
randInts' g = (randInt, g) : (randInts' nextGen) where (randInt, nextGen) = (generate g)
Next, we suppress the generator part of the stream by simply selecting the first part of the tuple:
randInts g = map fst (randInts' g) main = do g <- getStdGen print (take 3 (randInts g))
We still pass in the initial generator to the stream, but now we can consume independently from producing the numbers. We could just as easily now derive a stream of random numbers between 0 and 100:
randAmounts g = map (\x -> x `mod` 100) (randInts g)
This is why it is said that lazy lists decouple consumers from producers. From another perspective, we have a decoupling between iteration and termination. Either way, we have decoupling, which means we have a new way to modularize and structure our code.
Consider a banking system, where we want to record the current balance of a customer's account. In a non-pure functional language, we would typically model this with a mutable variable for the balance: for each debit and credit in the "real world", we would mutate the balance variable.
"Can we avoid identifying time in the computer with time in the modeled world?" | ||
--[Structure and Interpretation of Computer Programs, p. 316], Abelson and Sussman |
Yes, we can describe the evolution of a variable as a function of time. Instead of a mutable variable for bank balance, we have a sequence of balance values. In other words, we replace a mutable variable with the entire history of states:
bankAccount openingB (amt:amts) = openingB : bankAccount (openingB + amt) amts (take 4 (bankAccount 0 [-100, 50, 50, 1]))
Here we have modeled the bank account as a process, which takes in a stream of transaction amounts. In practice, amounts are more likely to be an unbounded stream, which we can easily simulate with our randAmounts
stream from earlier:
(take 4 (bankAccount 0 (randAmounts g)))
"if we concentrate on the entire time history, we do not emphasize change" | ||
--[Structure and Interpretation of Computer Programs, p. 317], Abelson and Sussman |
Streams provide an antidote to mutation, but as with all powerful medicine, streams create new problems. Because streams pretend to express a complete list while only incrementally materializing the list, we cannot know exactly when evaluation of a list element happens. In the presence of side effects, this ignorance of the order of events becomes a serious problem. We will devote the next chapter to dealing with mutation in the presence of laziness.
The Monad typeclass is best understood by looking at it from many perspectives. That is why this book has no definitive section or chapter on Monad. Instead, we will successively peel off the layers of this abstraction.
Let's begin by looking at a simple example of interpreting expressions:
data Expr = Lit Int | Div Expr Expr eval :: Expr -> Int eval (Lit a) = a eval (Div a b) = eval a `div` eval b
The eval
function interprets expressions written in our Expr
data type:
(eval (Lit 42)) –- 42 (eval (Div (Lit 44) (Lit 11))) -- 4
Stripped of real-world concerns, this is very elegant. Now let's add (naive) capability to deal with errors in our interpreter. Instead of the eval
function returning integers, we'll return a Try
data type, which caters for success (Return
) and failure (Err
):
data Try a = Err String | Return a
The refactored evalTry
function is now much more syntactically noisy with case statements:
evalTry :: Expr -> Try Int evalTry (Lit a) = Return a evalTry (Div a b) = case (evalTry a) of Err e -> Err e Return a' -> case (evalTry b) of Err e -> Err e Return b' -> divTry a' b' -- helper function divTry :: Int -> Int -> Try Int divTry a b = if b == 0 then Err "Div by Zero" else Return (a `div` b)
The reason for the noise is that we have to explicitly propagate errors. If (evalTry a
) fails, we return Err
and bypass evaluation of the second argument.
We've used the Try
data type to make failure more explicit, but it has come at a cost. This is precisely where monads come into play. Let's make our Try
data type an instance of Monad
:
instance Monad Try where return x = Return x fail msg = Err msg Err e >>= _ = Err e Return a >>= f = f a
Next, we refactor evalTry
by using the bind
operator (>>=
):
evalTry' :: Expr -> Try Int evalTry' (Lit a) = Return a evalTry' (Div a b) = (evalTry' a) >>= \a' -> (evalTry' b) >>= \b' -> divTry a' b' -– evalTry' (Div (Lit 44) (Lit 0))
The bind operator enables error propagation:
Err e >>= _ = Err e
Whenever we encounter an Err
, the subsequent argument to the bind
operator will be ignored and thereby propagate the error. While this gets rid of our case statements, it is hardly very friendly. Let's rewrite it using the do
notation:
evalTry'' (Lit a) = Return a evalTry'' (Div a b) = do a' <- (evalTry' a) b' <- (evalTry' b) divTry a' b'
The Try
data type helped us make failure more explicit, while making it an instance of Monad
made it easier to work with. In this same way, Monad
can be used to make many other "effects" more explicit.
"Being explicit about effects is extremely useful, and this is something that we believe may ultimately be seen as one of Haskell's main impacts on mainstream programming" | ||
--History of Haskell, Hudak et al. |
The IO monad is particularly interesting and played an important role in the development of Haskell. When Haskell was first conceived, there were no monads and also no clear solution to the "problem of side effects". In 1989, Eugenio Moggi used monads, from Category theory, to describe programming language features. Phillip Wadler, a then member of the Haskell Committee, recognized that it was possible to express Moggi's ideas in Haskell code:
"Although Wadler's development of Moggi's ideas was not directed towards the question of input/output, he and others at Glasgow soon realised that monads provided an ideal framework for I/O" | ||
--History of Haskell, Hudak et al |
Because Haskell is a purely functional language, side effects call for special treatment. We will devote a whole chapter to exploring this topic in Chapter 2, Patterns for I/O.
As useful as monads are for capturing effects, we also need to compose them, for example, how do we use monads for failure, I/O, and logging together?
In the same way that functional composition allows us to write more focused functions that can be combined together, monad composition allows us to create more focused monads, to be recombined in different ways. In Chapter 3: Patterns for Composition, we will explore monad transformers and how to create "monad stacks" of transformers to achieve monad composition.
In this chapter, we explored the three primary kinds of "glues" that Haskell provides: functions, the type system, and lazy evaluation. We did so by focusing on composability of these building blocks and found that wherever we can compose, we are able to decompose, decouple, and modularize our code.
We also looked at the two main kinds of polymorphism (parametric and ad hoc) as they occur in Haskell.
This chapter set the scene for starting our study of design patterns for purely functional programming in Haskell.
The next chapter will focus on patterns for I/O. Working on I/O in the face of lazy evaluation is a minefield, and we would do well with some patterns to guide us.