What is a programming language?

This post is my highly condensed take on Abelson and Sussman’s classic Structure and Interpretation of Computer Programs (SICP) and Krishnamurthi’s Programming Languages: Application and Interpretation (PLAI), which I think do a nice (if incomplete) job of answering the question “what is a programming language?” by implementing several of them.

Just as in the reference texts, the example languages I’ll give are fragments of Lisp, implemented in Racket (a dialect of Lisp in the Scheme family). Lisp syntax is unusual in that it forces the programmer to explicitly write out parse trees for code, e.g. (+ 2 (* 3 4)) instead of 2 + 3 * 4.

The benefit of this is that I don’t have to write a parser just to experiment with languages: code written in Lisp-like syntax can be read literally as a data structure with the Lisp builtin quote (usually abbreviated with a single tick mark ').

My goal in this post is to implement a programming language which can run the following two test programs with minimal syntactic revision.

The first test is a recursively defined factorial function; that is, I want our language to support recursion.

(define (fact n)
  (if (= n 0)
      1
      (* n (fact (- n 1)))))

The second test is Paul Graham’s accumulator function, specified in the appendix to his article Revenge of the Nerds. Graham thinks a good test of a language’s expressiveness is the length of its shortest accumulator implementation, and who am I to argue with him? The idea of an accumulator: one calls (make-accumulator 100), assigning the result to a variable a; then a is a procedure, and evaluating (a 10) once gives 110, evaluating (a 10) again gives 120, and evaluating (a -34.1) at this point gives 85.9. Different accumulator objects produced by make-accumulator should be independent.

(define (make-accumulator n)
  (lambda (x)
    (begin (set! n (+ n x))
           n)))

The above Racket implementation relies on the following three language features, all of which I intend to capture in the toy language of this post:

  • Functions are first-class values: they are valid inputs to and return values from other functions.
  • Mutable data: variable values can be reassigned after they’ve been defined.
  • Lexical variables: I want the ability to both read and write to intermediate scopes.1

With our goal in mind, we start by defining a very simple little programming language which, while Turing complete, isn’t sufficiently powerful to easily do what we want.
Continue reading

Advertisements