24.03.09

On Scheme, Python, and 6.001

Tags: , , , , , — Jeff @ 09:10

From a day at a Lisp conference (emphasis in original):

Costanza asked Sussman why MIT had switched away from Scheme for their introductory programming course, 6.001. This was a gem. He said that the reason that happened was because engineering in 1980 was not what it was in the mid-90s or in 2000. In 1980, good programmers spent a lot of time thinking, and then produced spare code that they thought should work. Code ran close to the metal, even Scheme — it was understandable all the way down. Like a resistor, where you could read the bands and know the power rating and the tolerance and the resistance and V=IR and that’s all there was to know. 6.001 had been conceived to teach engineers how to take small parts that they understood entirely and use simple techniques to compose them into larger things that do what you want.

But programming now isn’t so much like that, said Sussman. Nowadays you muck around with incomprehensible or nonexistent man pages for software you don’t know who wrote. You have to do basic science on your libraries to see how they work, trying out different inputs and seeing how the code reacts. This is a fundamentally different job, and it needed a different course.

Really? I can’t ever ever ever remember a time when Mozilla code hit a bug in Uniscribe, or Pango, or Cairo, or the Microsoft CRT (but only under some versions), or Core Text (but only under older, almost-unsupported versions) and had to determine exactly what caused the bug so it could be fixed or avoided (this latter in the case of non-free and open source libraries). Every time we’ve interacted with external libraries it’s been fairies and ponies all the way down.

A feral pony in Mount Rogers National Recreation Area in southern Virginia
FERAL PONY

Oh yeah. Except those ponies are feral and will quickly eat sugary things out of your hand with their sharp, pointy teeth if you put them right in front of their mouths, or they’ll nibble on salty backpack belts if you leave them at ground level and unattended. And those fairies? They’re really just cardboard cutouts.

Continuing on:

And why Python, then? Well, said Sussman, it probably just had a library already implemented for the robotics interface, that was all.

How droll. I suspect it was more than that (more human-readable syntax and structure, almost pseudocode, without lots of insidious silly parentheses, for the obvious starter), but having a library ready is a substantial boost up.

Sussman has a point that we don’t construct programs from the simplest parts to the complex whole. My first substantial code patch for Mozilla was perhaps large, but much of it was bits-and-bytes hacking with an occasional foray into almost-trivial data structures with no non-trivial behavior. Much of the patch could be written in isolation of knowledge of the surrounding code, except to the minimal extent needed to modify the right XPCOM objects to get the desired behavior. In contrast my work on httpd.js often requires interaction and knowledge of foreign components, even beyond their meager-to-nonexistent documentation. A case in point: I needed a simple way to copy data from one stream to another, so I used an nsIAsyncStreamCopier to do it. I later returned to this code to implement keep-alive support to discover that our only such copier in Mozilla closes the target stream when the copy completes, not that you could tell that from its documentation; that interface will probably change in the Firefox after 3.5 to make it possible to change this behavior. Sometimes you work at the low levels, but there’s so much less to do there; we engineers fit puzzle pieces together more often than we make them.

I was fortunate (more fortunate than I thought at the time) to take 6.001 before its demise and replacement with 6.01, the Python-based class. Scheme was a new and foreign concept to me, and it forced certain methods of thought and attack to be used to their fullest. Consider (of course) recursion:

(define (factorial n)
  (if (or (= n 0) (= n 1))
    1
    (* n (factorial (- n 1)))
  )
)

The recursion and its concept are precisely clear. The function returns a value. In the base case that value is 1. In the recursive case that value is n times the factorial of one less than n. Can recursion be made any sparer or more precise? An imperative language requires some understanding of an entire method to grasp its recursion (if only to examine every return throughout it); in an expression-based language entire trees of expressions and method calls may be ignored completely.

All that said I still prefer Python to Scheme, not least because forcing the expression of iteration as recursion is an excellent way to obscure complexity costs. That said, 6.001 and CS152 were two of the best classes I ever took for forcing me to think differently, making me a more flexible programmer.