Vincent Zoonekynd's Blog

Sat, 01 Jun 2013: Haskell

Since I had not learnt a new programming langage for years (the previous one was Ruby, three years ago), I decided to try a new one: Haskell. It is a functionnal language, i.e., one keeping the good ideas of Lisp and forgetting the awful syntax -- Lisp is still being used, in domains where, in spite of its excessive use of parentheses, it keeps an edge on easier programming languages -- for instance in computer music. Haskell often wins the ICFP contest, which is not limited to functional languages.

http://www.amazon.com/Notes-Metalevel-Studies-Music-Research/dp/9026519753
http://en.wikipedia.org/wiki/International_Conference_on_Functional_Programming_Contest
http://linuxfr.org/2007/01/11/21899.html

Other marginal (i.e., not yet mainstream) languages one might want to learn are OCaml (another functional language, as fast as C or C++ according to the programming languages shootout and used in some real-world applications, e.g., in finance), Erlang (another functional language, developped and used by Ericson, that allows you to write multi-threaded or even distributed applications without really thinking about it) or Squeak (a Smalltalk offspring, that prominently features in the OLPC (One laptop per child) initiative).

http://www.galois.com/cufp/slides/2006/YaronMinsky.pdf
http://shootout.alioth.debian.org/
http://www.algorithm.com.au/downloads/talks/Concurrency-and-Erlang-LCA2007-andrep.pdf
http://www.squeak.org/
http://linuxfr.org/2007/01/06/21860.html

Here are the notes I took while learning Haskell.

Programming should be a declarative task

When I write a function (usually in Perl or R, sometimes in C), I start to write a few assertions to check that the arguments given to my function make sense, I then write a few more assertions to check that the result of the function makes sense, and finally add the body of the function, in between, so that my assertions stop failing.

My first functions in Haskell were like that: I write the properties my function should satisfy and -- well, that is all. Those properties usually uniquely define the function: the interpreter can then figure out how to compute it.

Indeed, this is how computers should be working: doing our job, with only minimal input from us.

That "minimal input" can be troublesome, though: if the computer chooses an inefficient path to perform the computations, the programmer will have to figure out what it is doing and how to tweak it into being more efficient. With some languages (Prolog), this is very unnatural, but I have not had any problem so far with Haskell: it usually makes the right decisions and the path it follows is simply dictated by rule rewriting.

Experiments in mathematics

Computer languages can be used in mathematics classes, especially for numerical computations, linear algebra, formal computations (derivatives, integrals) or geometry, mainly as an experimental tool.

Haskell is such a tool, but it also applies to formal computations (logics) and axiomatic systems.

For instance, the following is an axiomatic definition of the natural numbers, saying that a natural number is either zero or the successor of a natural number.

data Natural = Zero | Successor Natural

As an exercise, you can define the arithmetic operations from that, and even start to play with number theory.

http://www.amazon.com/Haskell-Road-Logic-Maths-Programming/dp/0954300696

Category theory

Haskell documentation, and perhaps even Haskell code, is littered with references to category theory. Here are a few examples.

Most of the time, we consider the category whose objects are types (integer, double, strings, list of integers, tree, etc.) and whose morphisms are functions between those types.

http://golem.ph.utexas.edu/category/2006/11/a_categorical_manifesto.html
http://golem.ph.utexas.edu/category/2006/08/cartesian_closed_categories_an.html

Internal homomorphisms

The fact that functions are "first class citizens" translates into the existence of internal homsets, i.e., given two types (objects) a and b, there exists a type a -> b.

Curry isomorphism

The Curry isomorphism states that giving a function (a,b) -> c is equivalent to giving a function a -> (b -> c): in category theory, this is the adjunction (the equal sign should be understood as a bijection, Hom(x,y) is the set of functions from x to y, HOM(x,y) is the type of functions from x to y, i.e., the internal homset mentionned above, i.e., x -> y)

Hom(a*b,c) = Hom(a,HOM(b,c))

In Haskell:

-- From Prelude.hs
curry          :: ((a,b) -> c) -> (a -> b -> c)
curry f x y     = f (x,y)

uncurry        :: (a -> b -> c) -> ((a,b) -> c)
uncurry f p     = f (fst p) (snd p)

fst            :: (a,b) -> a
fst (x,_)       = x

snd            :: (a,b) -> b
snd (_,y)       = y

"map" and functors

The "map" function, that turns a function a -> b into a function between lists [a] -> [b] illustrates the notion of a functor: "List" not only creates types (from a type a you get the type [a] of lists of a) but also functions (from a function a -> b, you get a function [a] -> [b]), in a way compatible with function composition.

This is only one example of a functor: you can define your own, for instance with trees or graphs instead of lists.

-- From Prelude.hs
class Functor f where
    fmap :: (a -> b) -> (f a -> f b)

-- "List" is a functor
instance Functor [] where
    fmap = map

-- "Maybe" is a functor
-- ("Maybe a" is a type whose values are those of "a" plus "Nothing",
-- that can be used to represent missing values or failure.)
data Maybe a = Nothing | Just a
instance Functor Maybe where
    fmap f Nothing  = Nothing
    fmap f (Just x) = Just (f x)

Version control and coproducts

Darcs is a version control system (similar to CVS or Subversion), i.e., a software to keep track of all the changes you make to your files (when you are writing a book or a software). It is written in Haskell and based on the "theory of patches": it does not really store the successive versions of your work, but rather the changes, i.e., the repository is the catogory whose objects are the different versions of your work and whose morphisms are the changes you make.

If several people work on the same project, they may contribute different patches, say A -> B and A -> C. In order to reconcile ("merge") those patches, Darcs has to augment the repository by their coproduct.

http://en.wikibooks.org/wiki/Understanding_darcs:Patch_theory

Monads

One of the trickiest (and most central) Haskell notions is that of a monad -- I am not sure that knowing the categorical notion, formerly called a "triple", helps -- on the contrary.

http://www.case.edu/artsci/math/wells/pub/pdf/ttt.pdf

The "Maybe" class, introduced above, is a monad. From a categorical point of view, it means we have morphisms

Maybe Maybe a -> Maybe a      (product)
a -> Maybe a                  (unit)

The code is:

-- From Prelude.hs
instance Monad Maybe where
    Just x  >>= k = k x
    Nothing >>= k = Nothing
    return        = Just
    fail s        = Nothing

The unit is called "return":

Prelude> :type return
return :: (Monad m) => a -> m a

The product is not defined explicitely but can be obtained from the >>= operator as follows.

Prelude> :type (>>=)
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
Prelude> :type (>>= (\x -> x))
(>>= (\x -> x)) :: (Monad m) => m (m b) -> m b

In the case of the "Maybe" monad, it is simply

p :: Maybe (Maybe a) -> Maybe a
p Just Just x = Just x
p _ = Nothing

The reason for the apparently complicated type of the >>= (monadic) operator is that when using monads one often wants to compose functions f :: a -> m b and g :: b -> m c. (In the following, function composition is denoted ".".)

Main> :type f
f :: a -> Maybe b
Main> :type g
g :: b -> Maybe c
Main> :type (fmap g) . f
fmap g . f :: a -> Maybe (Maybe c)
Main> :type p . (fmap g) . f
p . fmap g . f :: a -> Maybe c

Main> :type (\x -> (f x) >>= g)
\x -> f x >>= g :: a -> Maybe c

The >>= operator allows one to chain such functions:

Main> :type (\x -> return x >>= f >>= g)
\x -> return x >>= f >>= g :: a -> Maybe c

Monads and actions

The main use of monads is the encapsulation of side effects: since Haskell is a pure functional programming language, a function, given the same arguments, should always return the same value (Haskell can even choose to remember this value and avoid recomputing it), and is not allowed to have side effects -- in particular, a Hello World program looks impossible.

Side effects can be achieved by using a monad, here, the IO monad. For instance, the type of the "print" function is a -> IO (), which means it take an argument of type "a" (e.g., a string, an integer, etc.) and returns nothing "()" with IO side effects.

A function with such side effects is called an action.

The >>= operator can then be used to chain actions: imperative programming becomes possible. (To enhance readability, actions can equivalently be chained with a "do" construct -- you do not really need to know about monads to use them.)

Since actions are just functions, and since functions are just a type like any other, actions can be manipulated, transformed, reordered before being executed. This is similar to the "eval" mechanism in some languages, but does not require any parsing and therefore avoids any syntax error in the "generated code".

Monads can also be used to store intermediate results (e.g., a "state") or to implement "continuations".

http://en.wikipedia.org/wiki/Continuation

More about Haskell

Installation

There are two main Haskell environments: hugs (simple, lightweight) and ghc (more complete); the corresponding interpreters are called "hugs" and ghci". On Gentoo, you can install them as follows.

emerge hugs ghc

Documentation

Tutorials:

http://en.wikibooks.org/wiki/Haskell/YAHT
http://www.haskell.org/haskellwiki/Hitchhikers_Guide_to_the_Haskell

Books:

http://www.cs.vu.nl/~ralf/JoLLI06/
http://www.haskell.org/soe/

Meta documentation:

http://www.haskell.org/haskellwiki/Learning_Haskell
http://www.haskell.org/haskellwiki/Books_and_tutorials
http://www.haskell.org/haskellwiki/MetaTutorial

More general ressources on functional programming and Haskell

http://lambda-the-ultimate.org/
http://sequence.complete.org/

posted at: 19:17 | path: /Linux | permanent link to this entry