Back to index

Under the hood

This chapter describes some internals of LispMe. You don't need this to work with LispMe, but some people asked me about the implementation, so here you are:

General

LispMe was developed with GCC 2.7.2.2, PilRC 1.7 and prc-tools-0.5.0 on a PC running OS/2 Warp and tested with CoPilot. All tools have been ported to OS/2 by Thomas Johler.

Altogether LispMe consists of about 10000 lines of highly compact C code. All error handling is done using Pilot's ErrCatch and ErrThrow macros/functions and not a single function from the C library has been used.

I finally found a way to create executables >32k code with GCC without having to use shared libraries! This involves some tricks with object code and runtime library order. If you're interested in these topics, feel free to email me.

SECD machine model

LispMe is based on the SECD virtual machine described in
Peter Henderson
Functional Programming - Application and Implementation
Prentice Hall International
ISBN 0-13-331579-7
The name stems from the 4 main registers this machine operates on:
Sstackused to hold intermediate values while evaluating an expression
Eenvironmentused to hold the current environment
Ccontrolthe machine-language program being executed
Ddumpa stack the other registers are pushed on when a new function is called

SECD virtual machine

In fact, all registers hold pointers to list structures in the heap, so for example going to the next machine instruction is accomplished by an assignment of the form C = cdr(C);

The virtual machine currently has 137 different instructions, most of them dealing with LispMe's primitive procedures. The main extensions to Henderson's machine are

The runtime environment is represented as a list of frames, where each frame is a list of values. Variables are always accessed by lexical adresses (a pair indicating the frame and the offset of the variable). This is exactly the form you see when tapping the
names button. In fact, the first step in each evaluation is assigning this list to the E register.

SECD compiler

In contrast to Henderson's approach, the compiler is not bootstrapped, it's implemented in C like the rest of LispMe. The reasons for that are compilation speed and saving heap memory for user programs.

A LispMe expression is compiled by traversing its tree structure and consing up a list of machine instructions backwards to avoid using append. There's a simple peephole optimizer mainly used for removing unnecessary stack operations and for propagating returns through conditionals to provide proper tail recursion elimination.

There's also a compile time environment (represented as a list of lists, mirroring the structure of the runtime environment) to determine lexical adresses for all variables. You can view the SECD code of a closure with disasm.

Macros are expanded by calling the VM with the expansion procedure and the original source expression as argument.

Parsing expressions

Parsing is divided into 2 steps:
  1. The scanner groups characters into tokens by using a finite automaton consisting of 22 states. Most of the states deal with real and complex numbers and escape sequences in strings.
  2. The parser is recursive-descent
Scanning and formatting floating point was particularly nasty, as the Pilot ROM routines FlpFToA and FlpAToF are unusable (limited precision, limited exponent range, no error support), so all this stuff had to be written from scratch.

Memory management

LispMe uses 3 blocks of memory, where each block size is adjustable in the Preferences dialog. Starting with version 1.7, LispMe mo more uses the dynamic heap for its memory, instead it works directly on DB blocks, gaining write access via MemSemaphoreReserve. This is a much discussed technique, as accessing memory out of its bounds can overwrite other databases and even require a hard reset. On the other hand, LispMe has been out for some months now and several hundred people have tested it and I never got a report of an "invalid pointer access". Furthermore, there's no way in LispMe to create an invalid pointer by will or accident (like in other Scheme implementations, too). All pointers are created in the memory subsystem which almost certainly is bugfree. So I finally decided to use this technique to make a bigger heap available on all Pilots.

Garbage collection

LispMe uses mark/scan garbage collector for heap cells and floating point cells. The reasons for preferring mark/scan to copying garbage collection are: All in all, a typical garbage collection of 16k heap takes about 0.2 seconds.