August 10, 2025

Building Lisp from the Ground Up

A follow-up to "The Quiet Elegance of Lisp"

In my previous post, I wrote about Lisp's quiet elegance—how its uniformity and minimalism create something surprisingly powerful. But perhaps the most remarkable thing about Lisp isn't just its philosophical beauty; it's how that beauty translates into practical reality.

John McCarthy's original 1960 paper showed that you could build an entire programming language from just seven primitive operations. Today, I want to show you exactly how that works by building a working Lisp interpreter from scratch.

What We're Building

Before diving into code, let's understand what components we need for a complete Lisp system:

1. Data Representation - How to represent atoms and lists in our host language
2. Core Operations - McCarthy's seven primitives plus practical additions
3. Evaluator - The heart that transforms data into computation
4. Parser - Converting text into our internal data representation

That's it. Four components that bootstrap an entire programming language. The elegance is that each component is simple, but their combination creates infinite expressive power.

The Foundation: Seven Primitives Plus Two

McCarthy identified these fundamental operations as computationally complete:

atom - test if something is an atom (not a list)
eq - test if two atoms are equal
car - get the first element of a list
cdr - get the rest of a list
cons - construct a new list by prepending an element
cond - conditional evaluation
quote - prevent evaluation of an expression

Historical note: Why car and cdr instead of first and rest? These names come from the IBM 704 computer where Lisp was first implemented. car stood for "Contents of the Address part of Register" and cdr for "Contents of the Decrement part of Register"—specific hardware instructions that happened to be perfect for list operations. The names stuck, becoming one of computing's most enduring examples of historical accident becoming universal convention.

We'll add practical necessities for dynamic programming:

lambda - create anonymous functions at runtime
define - bind names to values in the environment

Plus basic arithmetic and boolean operations for readable examples. The key insight: everything beyond these primitives can be built using these primitives.

Why lambda and define? McCarthy's original model assumed functions would be mathematical definitions handled outside the evaluator. But for interactive programming, we need to create and name functions dynamically:

// lambda creates anonymous functions:
['lambda', ['x'], ['*', 'x', 2]]  // A function that doubles its input

// define binds names in the environment:
['define', 'double', ['lambda', ['x'], ['*', 'x', 2]]]  // Now we can call (double 5)

// This enables recursion - functions can call themselves by name:
['define', 'factorial', ['lambda', ['n'], 
    ['cond', 
        [['eq', 'n', 0], 1], 
        [true, ['*', 'n', ['factorial', ['-', 'n', 1]]]]]
]]

Implementation: Building the Interpreter

Now let's implement each component. The complete interpreter is surprisingly compact:

Data Representation:

function isAtom(x) { return typeof x !== 'object' || x === null; }
function isList(x) { return Array.isArray(x); }

Parser (Converting Text to Data):

function parse(input) {
    // Better tokenization that handles multi-line input
    let tokens = input.replace(/\(/g, ' ( ')
                      .replace(/\)/g, ' ) ')
                      .replace(/\n/g, ' ')          // normalize newlines
                      .replace(/\s+/g, ' ')         // normalize whitespace
                      .trim()
                      .split(' ')
                      .filter(t => t.length > 0);   // remove empty tokens
    
    function parseExpr(tokens) {
        if (tokens.length === 0) {
            throw new Error("Unexpected end of input");
        }
        
        let token = tokens.shift();
        if (token === '(') {
            let expr = [];
            while (tokens.length > 0 && tokens[0] !== ')') {
                expr.push(parseExpr(tokens));
            }
            if (tokens.length === 0) {
                throw new Error("Missing closing parenthesis");
            }
            tokens.shift(); // consume ')'
            return expr;
        } else if (token === ')') {
            throw new Error("Unexpected closing parenthesis");
        } else {
            return isNaN(token) ? token : Number(token);
        }
    }
    return parseExpr(tokens);
}

Core Operations:

function atom(x) { return isAtom(x); }
function eq(x, y) { 
    // Handle empty lists specially
    if (Array.isArray(x) && Array.isArray(y)) {
        return x.length === 0 && y.length === 0;
    }
    return x === y; 
}
function car(list) { return list[0]; }
function cdr(list) { return list.slice(1); }
function cons(element, list) { return [element, ...list]; }

The Evaluator (Where Magic Happens):

function evaluate(expr, env = {}) {
    // Atoms: look up in environment or return value
    if (isAtom(expr)) {
        return (typeof expr === 'string' && expr in env) ? env[expr] : expr;
    }
    
    // Lists: function calls
    if (isList(expr) && expr.length > 0) {
        let [fn, ...args] = expr;
        
        // Special forms (don't evaluate arguments automatically)
        if (fn === 'quote') return args[0];
        if (fn === 'cond') return evalCond(args, env);
        if (fn === 'lambda') return { type: 'lambda', params: args[0], body: args[1], env };
        if (fn === 'define') { 
            env[args[0]] = evaluate(args[1], env); 
            return env[args[0]]; 
        }
        
        // Primitives
        if (fn === 'atom') return atom(evaluate(args[0], env));
        if (fn === 'eq') return eq(evaluate(args[0], env), evaluate(args[1], env));
        if (fn === 'car') return car(evaluate(args[0], env));
        if (fn === 'cdr') return cdr(evaluate(args[0], env));
        if (fn === 'cons') return cons(evaluate(args[0], env), evaluate(args[1], env));
        
        // Arithmetic (practical additions)
        if (fn === '+') return args.reduce((sum, arg) => sum + evaluate(arg, env), 0);
        if (fn === '*') return args.reduce((prod, arg) => prod * evaluate(arg, env), 1);
        if (fn === '-') {
            if (args.length === 1) return -evaluate(args[0], env);
            let values = args.map(arg => evaluate(arg, env));
            return values.reduce((a, b) => a - b);
        }
        if (fn === '/') {
            let values = args.map(arg => evaluate(arg, env));
            return values.reduce((a, b) => a / b);
        }
        
        // Boolean operations (practical additions)
        if (fn === 'and') return args.every(arg => evaluate(arg, env));
        if (fn === 'or') return args.some(arg => evaluate(arg, env));
        if (fn === 'not') return !evaluate(args[0], env);
        
        // Function calls
        let func = evaluate(fn, env);
        if (func?.type === 'lambda') {
            let newEnv = { ...func.env };
            func.params.forEach((param, i) => newEnv[param] = evaluate(args[i], env));
            return evaluate(func.body, newEnv);
        }
    }
    throw new Error(`Cannot evaluate: ${JSON.stringify(expr)}`);
}

function evalCond(clauses, env) {
    for (let [condition, result] of clauses) {
        if (evaluate(condition, env)) return evaluate(result, env);
    }
    throw new Error("No cond clause was true");
}

Testing Our Lisp

Now we have a complete Lisp system! The addition of define transforms our interpreter from a theoretical demonstration into a genuinely useful programming environment. We can:

Create named functions that persist across evaluations
Write recursive functions that reference themselves
Build libraries of reusable functions
Create complex programs by composing simpler definitions

Let's see it in action:

let env = {};

// Basic operations
evaluate(parse("(+ 1 2 3)"), env);                       // 6
evaluate(parse("(car (quote (1 2 3)))"), env);           // 1
evaluate(parse("(cons 1 (quote (2 3)))"), env);          // [1, 2, 3]

// Boolean operations
evaluate(parse("(and true true false)"), env);           // false
evaluate(parse("(or false true)"), env);                 // true
evaluate(parse("(not false)"), env);                     // true

// Define and call functions
evaluate(parse("(define double (lambda (x) (* x 2)))"), env);
evaluate(parse("(double 5)"), env);                      // 10

// Recursion works!
evaluate(parse(`
    (define factorial (lambda (n)
        (cond
            ((eq n 0) 1)
            (true (* n (factorial (- n 1)))))))
`), env);
evaluate(parse("(factorial 5)"), env);                   // 120

// Building complexity from simple parts
evaluate(parse("(define square (lambda (x) (* x x)))"), env);
evaluate(parse("(define sum-squares (lambda (x y) (+ (square x) (square y))))"), env);
evaluate(parse("(sum-squares 3 4)"), env);               // 25

// Higher-level data structures and operations
evaluate(parse(`
    (define assoc (lambda (key alist)
        (cond
            ((eq alist (quote ())) (quote ()))
            ((eq (car (car alist)) key) (car alist))
            (true (assoc key (cdr alist))))))
`), env);

evaluate(parse(`
    (define map (lambda (func list)
        (cond
            ((eq list (quote ())) (quote ()))
            (true (cons (func (car list)) 
                       (map func (cdr list)))))))
`), env);

// Using our higher-level functions
evaluate(parse(`
    (define phonebook (quote 
        ((alice 555-1234) 
         (bob 555-5678) 
         (charlie 555-9012))))
`), env);
evaluate(parse("(assoc bob phonebook)"), env);           // ["bob", "555-5678"]

evaluate(parse("(map square (quote (1 2 3 4)))"), env);  // [1, 4, 9, 16]
evaluate(parse("(map double (quote (10 20 30)))"), env); // [20, 40, 60]

Here is a link to a gist with the full sourcecode that you can play with.

Theoretical vs Practical

Why add arithmetic and booleans if seven primitives are complete? Because there's a difference between computational completeness and practical usability.

Theoretically, even numbers could be Church numerals and booleans could be selector functions. The seven primitives prove Lisp is theoretically minimal. The practical additions make it humanly usable. Everything still reduces to the core primitives when needed.

The Emergence of Complexity

From these simple foundations, any data structure or control pattern emerges. Here are some mind-bending examples:

Numbers as Pure Functions (Church Numerals):

Similarly, even numbers could be represented as pure functions using Church numerals. This sounds weird, but here's the key insight: numbers are just "how many times you do something."

// Church numerals: numbers as repetition counters
let ZERO = ['lambda', ['f', 'x'], 'x'];                    // do nothing to x
let ONE = ['lambda', ['f', 'x'], ['f', 'x']];              // apply f once to x
let TWO = ['lambda', ['f', 'x'], ['f', ['f', 'x']]];       // apply f twice to x

Let's see this with a concrete example. Imagine f is "add an exclamation mark" and x is "hello":

// What ZERO does:
// Input: f="add !", x="hello" 
// ZERO applies f zero times: "hello" (unchanged)

// What ONE does:  
// Input: f="add !", x="hello"
// ONE applies f once: "hello!"

// What TWO does:
// Input: f="add !", x="hello" 
// TWO applies f twice: "hello!!"

So ['lambda', ['f', 'x'], 'x'] is ZERO because it takes a function f and value x, then applies f zero times (just returns x unchanged).

Booleans as Choice Functions:

Even more surprisingly, booleans can be pure functions that make choices:

// Booleans as selector functions
let TRUE = ['lambda', ['x', 'y'], 'x'];    // always pick the first option
let FALSE = ['lambda', ['x', 'y'], 'y'];   // always pick the second option

Think of TRUE and FALSE as answering "which one do you want?" TRUE always picks the first thing you offer, FALSE always picks the second:

// TRUE('apple', 'banana') = 'apple'  (TRUE picks first)
// FALSE('apple', 'banana') = 'banana' (FALSE picks second)

// Now logical operations become natural:
let IF = ['lambda', ['condition', 'then-branch', 'else-branch'],
    ['condition', 'then-branch', 'else-branch']];

// IF(TRUE, 'yes', 'no') = TRUE('yes', 'no') = 'yes'
// IF(FALSE, 'yes', 'no') = FALSE('yes', 'no') = 'no'

// AND: only TRUE if both inputs are TRUE
let AND = ['lambda', ['p', 'q'], ['p', 'q', FALSE]];
// If p is TRUE: pick q (could be TRUE or FALSE)  
// If p is FALSE: pick FALSE

// OR: TRUE if either input is TRUE  
let OR = ['lambda', ['p', 'q'], ['p', TRUE, 'q']];
// If p is TRUE: pick TRUE
// If p is FALSE: pick q (could be TRUE or FALSE)

Data Structures and Control Flow:

// Association lists (dictionaries)
evaluate(parse(`
    (define assoc (lambda (key alist)
        (cond
            ((eq alist (quote ())) (quote ()))
            ((eq (car (car alist)) key) (car alist))
            (true (assoc key (cdr alist))))))
`), env);

// Map function  
evaluate(parse(`
    (define map (lambda (func list)
        (cond
            ((eq list (quote ())) (quote ()))
            (true (cons (func (car list)) 
                       (map func (cdr list))))))
`), env);

Every programming construct—objects, modules, exception handling—can be built this way.

The Profound Simplicity

What we've built in around 150 lines of code is a complete programming language foundation. From McCarthy's seven primitives plus two practical additions, we've created a system that can:

Define and call functions
Handle recursion and complex data structures
Express any control flow pattern
Even modify its own code (since code is data)

This isn't just academic beauty—it's practical power. When you understand computation at this level, you gain an almost supernatural ability to see through complexity to the essential patterns underneath. You recognize that most "advanced" language features are just different arrangements of these same basic building blocks.

Beyond the Bootstrap

Real Lisp systems build upward from here, adding efficient data structures, advanced macro systems, module systems, and rich libraries. But remarkably, these additions don't change the fundamental model—they're all built using the same primitives we started with, just arranged in more sophisticated patterns.

The Meta-Circular Magic: Perhaps most remarkably, our JavaScript Lisp interpreter could be rewritten in Lisp itself. Since Lisp code is just lists, and our interpreter manipulates lists, we could define evaluate as a Lisp function:

(define evaluate (lambda (expr env)
    (cond
        ((atom expr) (lookup expr env))
        ((eq (car expr) (quote quote)) (car (cdr expr)))
        ((eq (car expr) (quote cond)) (eval-cond (cdr expr) env))
        ; ... and so on
    )))

True Homoiconicity: This reveals why Lisp has true homoiconicity while other languages only approximate it. Many languages let you manipulate strings of code or abstract syntax trees, but in Lisp, the runtime data structure is the code structure. When you write (+ 1 2), you're not writing text that represents addition—you're writing a list that is addition.

There's no parsing step between code and data, no AST transformation, no separate representation. A Lisp program can examine, modify, and generate other Lisp programs using exactly the same operations (car, cdr, cons) that it uses for any other data. This is why macros in Lisp are so powerful—they're just functions that manipulate lists, and those lists happen to be executable code.

This self-hosting property—where a language can implement itself—reveals something profound about computation. Lisp doesn't just process lists; it is lists processing themselves. The distinction between program and data doesn't just blur—it disappears entirely.

Conclusion: The Quiet Revolution

When you first encounter (+ 1 2 3), you might think it's just a different way to write 1 + 2 + 3. But now you know better. Those parentheses aren't just syntax—they're the visible expression of a profound insight about computation itself.

In a world where programming languages grow ever more complex, Lisp remains a reminder that sometimes the most powerful approach is also the simplest. Seven primitives. Two practical additions. Infinite possibilities. Quiet elegance.

The next time you encounter seemingly intractable complexity, remember McCarthy's insight: somewhere in that complexity is a simple pattern waiting to be discovered—a few basic building blocks that, when properly understood and composed, can express anything you need.

↑ Back to top