Write Yourself a Scheme in 48 Hours

A Haskell Tutorial

By Jonathan Tang

Download in .tar.gz format
Download in .zip format

Contents

  1. Overview
  2. First Steps: Compiling and running
  3. Parsing
    1. A Simple Parser
    2. Whitespace
    3. Literal Numbers and Strings: Return Values
    4. Lists, Dotted Lists, and Quoted Data: Recursive Parsers
  4. Evaluation, Part 1
    1. Displaying Values: Show and Typeclasses
    2. Evaluating Primitive Values: Pattern Matching
    3. Evaluating Primitive Functions: First-class Functions
  5. Intermezzo: Error Checking & Exceptions
  6. Evaluation, Part 2
    1. Additional Primitives: Partial Application
    2. Conditionals: Pattern Matching 2
    3. List Primitives: car, cdr, and cons
    4. Equal? and Weak Typing: Heterogenous Lists
  7. Building a REPL: Basic I/O
  8. Adding Variables and Assignment: Mutable State in Haskell
  9. Defining Scheme Functions: Closures and Environments
  10. Creating IO Primitives: File I/O
  11. Towards a Standard Library: Fold and Unfold
  12. Conclusion & Further Resources

5. Intermezzo: Error Checking

Currently, there are a variety of places within the code where we either ignore errors or silently assign "default" values like #f or 0 that make no sense. Some languages - like Perl and PHP - get along fine with this approach. However, it often means that errors pass silently throughout the program until they become big problems, which means rather inconvenient debugging sessions for the programmer. We'd like to signal errors as soon as they happen and immediately break out of execution.

First, we need to import Control.Monad.Error to get access to Haskell's built-in error functions:

import Control.Monad.Error
Then, we should define a data type to represent an error:
data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               | Parser ParseError
               | BadSpecialForm String LispVal
               | NotFunction String String
               | UnboundVar String String
               | Default String
This is a few more constructors than we need at the moment, but we might as well forsee all the other things that can go wrong in the interpreter later. Next, we define how to print out the various types of errors and make LispError an instance of Show:
showError :: LispError -> String
showError (UnboundVar message varname) = message ++ ": " ++ varname
showError (BadSpecialForm message form) = message ++ ": " ++ show form
showError (NotFunction message func) = message ++ ": " ++ show func
showError (NumArgs expected found) = "Expected " ++ show expected 
                                  ++ " args; found values " ++ unwordsList found
showError (TypeMismatch expected found) = "Invalid type: expected " ++ expected
                                       ++ ", found " ++ show found
showError (Parser parseErr) = "Parse error at " ++ show parseErr

instance Show LispError where show = showError
Our next step is to make our error type into an instance of Error. This is necessary for it to work with GHC's built-in error handling functions. Being an instance of error just means that it must provide functions to create an instance either from a previous error message or by itself:
instance Error LispError where
     noMsg = Default "An error has occurred"
     strMsg = Default
Then we define a type to represent functions that may throw a LispError or return a value. Remember how parse used an Either data type to represent exceptions? We take the same approach here:
type ThrowsError = Either LispError
Type constructors are curried just like functions, and can also be partially applied. A full type would be "Either LispError Integer" or "Either LispError LispVal", but we want to say "ThrowsError LispVal" and so on. We only partially apply Either to LispError, creating a type constructor ThrowsError that we can use on any data type.

Either is yet another instance of a monad. In this case, the "extra information" being passed between Either actions is whether or not an error occurred. Bind applies its function if the Either action holds a normal value, or passes an error straight through without computation. This is how exceptions work in other languages, but because Haskell is lazily-evaluated, there's no need for a separate control-flow construct. If bind determines that a value is already an error, the function is never called.

The Either monad also provides two other functions besides the standard monadic ones:

  1. throwError, which takes an Error value and lifts it into the Left (error) constructor of an Either
  2. catchError, which takes an Either action and a function that turns an error into another Either action. If the action represents an error, it applies the function, which you can use to eg. turn the error value into a normal one via return or re-throw as a different error.
In our program, we'll be converting all of our errors to their string representations and returning that as a normal value. Let's create a helper function to do that for us:
trapError action = catchError action (return . show)
The result of calling trapError is another Either action which will always have valid (Right) data. We still need to extract that data from the Either monad so it can passed around to other functions:
extractValue :: ThrowsError a -> a
extractValue (Right val) = val
We purposely leave extractValue undefined for a Left constructor, because that represents a programmer error. We intend to use extractValue only after a catchError, so it's better to fail fast than to inject bad values into the rest of the program.

Now that we have all the basic infrastructure, it's time to start using our error-handling functions. Remember how our parser had previously just return a string saying "No match" on an error? Let's change it so that it wraps and throws the original ParseError:

readExpr :: String -> ThrowsError LispVal
readExpr input = case parse parseExpr "lisp" input of
    Left err -> throwError $ Parser err
    Right val -> return val
Here, we first wrap the original ParseError with the LispError constructor Parser, and then use the built-in function throwError to return that in our ThrowsError monad. Since readExpr now returns a monadic value, we also need to wrap the other case in a return function.

Next, we change the type signature of eval to return a monadic value, adjust the return values accordingly, and add a clause to throw an error if we encounter a pattern that we don't recognize:

eval :: LispVal -> ThrowsError LispVal
eval val@(String _) = return val
eval val@(Number _) = return val
eval val@(Bool _) = return val
eval (List [Atom "quote", val]) = return val
eval (List (Atom func : args)) = mapM eval args >>= apply func
eval badForm = throwError $ BadSpecialForm "Unrecognized special form" badForm
Since the function application clause calls eval (which now returns a monadic value) recursively, we need to change that clause. First, we had to change map to mapM, which maps a monadic function over a list of values, sequences the resulting actions together with bind, and then returns a list of the inner results. Inside the Error monad, this sequencing performs all computations sequentially but throws an error value if any one of them fails - giving you Right [results] on success, or Left error on failure. Then, we used the monadic "bind" operation to pass the result into the partially applied "apply func", again returning an error if either operation failed.

Next, we change apply itself so that it throws an error if it doesn't recognize the function:

apply :: String -> [LispVal] -> ThrowsError LispVal
apply func args = maybe (throwError $ NotFunction "Unrecognized primitive function args" func)
                        ($ args)
                        (lookup func primitives)
We didn't add a return statement to the function application ($ args). We're about to change the type of our primitives, so that the function returned from the lookup itself returns a ThrowsError action:
primitives :: [(String, [LispVal] -> ThrowsError LispVal)]
And, of course, we need to change the numericBinop function that implements these primitives so it throws an error if there's only one argument:
numericBinop :: (Integer -> Integer -> Integer) -> [LispVal] -> ThrowsError LispVal
numericBinop op singleVal@[_] = throwError $ NumArgs 2 singleVal
numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op
We use an at-pattern to capture the single-value case because we want to include the actual value passed in for error-reporting purposes. Here, we're looking for a list of exactly one element, and we don't care what that element is. We also need to use mapM to sequence the results of unpackNum, because each individual call to unpackNum may fail with a TypeMismatch:
unpackNum :: LispVal -> ThrowsError Integer
unpackNum (Number n) = return n
unpackNum (String n) = let parsed = reads n in 
                          if null parsed 
                            then throwError $ TypeMismatch "number" $ String n
                            else return $ fst $ parsed !! 0
unpackNum (List [n]) = unpackNum n
unpackNum notNum = throwError $ TypeMismatch "number" notNum
Finally, we need to change our main function to use this whole big error monad. This can get a little complicated, because now we're dealing with two monads (Error and IO). As a result, we go back to do-notation, because it's nearly impossible to use point-free style when the result of one monad is nested inside another:
main :: IO ()
main = do
    args <- getArgs
    evaled <- return $ liftM show $ readExpr (args !! 0) >>= eval
    putStrLn $ extractValue $ trapError evaled

Here's what this new function is doing:
  1. args is the list of command-line arguments
  2. evaled is the result of:
    1. taking first argument (args !! 0)
    2. parsing it (readExpr)
    3. passing it to eval (>>= eval; the bind operation has higher precedence than function application)
    4. calling show on it within the Error monad.
    Note also that the whole action has type IO (Either LispError String), giving evaled type Either LispError String. It has to be, because our trapError function can only convert errors to Strings, and that type must match the type of normal values
  3. caught is the result of
    1. calling trapError on evaled, converting errors to their string representation.
    2. calling extractValue to get a String out of this Either LispError String action
    3. printing the results through putStrLn
Compile and run the new code, and try throwing it a couple errors:
jdtang@debian:~/haskell_tutorial/code$ ghc -package parsec -o errorcheck listing5.hs
jdtang@debian:~/haskell_tutorial/code$ ./errorcheck "(+ 2 \"two\")"
Invalid type: expected number, found "two"
jdtang@debian:~/haskell_tutorial/code$ ./errorcheck "(+ 2)"
Expected 2 args; found values 2
jdtang@debian:~/haskell_tutorial/code$ ./errorcheck "(what? 2)"
Unrecognized primitive function args: "what?"

Some readers have reported that you need to add a --make flag to build this example, and presumably all further listings. This tells GHC to build a complete executable, searching out all depedencies listed in the import statements. The command above works on my system, but if it fails on yours, give --make a try.

Next: Evaluation, Part 2