Extending LispMe

Beginning with version 3.0, LispMe allows extension with modules written in C.

Overview

Currently, extension modules are statically linked into the LispMe executable, so you have to add them to the makefile as usual. On the other hand, extension modules can use all functions and (global) data from other LispMe object modules. Future versions may provide dynamic linking of extension modules, too.

Extension modules allow definitions of

symbols
variables
primitive operations in terms of existing VM opcodes
native operations by calling a C function
new data types
new port types
compilation function for new special forms
actions to be taken on session creation/activation etc.

General conventions for extension modules

LispMe reserves the PTR bit patterns ffff ffss sssp 1010 for primitive symbols. Thus, 2048 builtin symbols/operations are possible. This range is partitioned into 64 frames of (max.) 32 symbols each. You can use the MKPRIMSYM(frame,sym) macro to create a PTR value.
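
The bit pattern above can be made concrete with a small sketch. The function names below are hypothetical stand-ins, not the real MKPRIMSYM macro from LispMe's headers. The sketch assumes a 16 bit PTR with the pattern read most significant bit first, packs the frame into the six f bits and the symbol index into the five s bits, and leaves the p bit (whose meaning isn't explained here) at 0.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for LispMe's PTR; assumed to be 16 bits wide here. */
typedef uint16_t PTR;

#define PRIMSYM_TAG 0x000Au   /* low nibble 1010 */

/* Hypothetical equivalent of MKPRIMSYM(frame,sym), with the p bit left 0. */
PTR mkPrimSymSketch(unsigned frame, unsigned sym)
{
  assert(frame < 64 && sym < 32);          /* 64 frames of 32 symbols */
  return (PTR)((frame << 10) | (sym << 5) | PRIMSYM_TAG);
}

/* Extract the frame (bits 15..10) and the symbol index (bits 9..5). */
unsigned primSymFrame(PTR p) { return (p >> 10) & 0x3Fu; }
unsigned primSymIndex(PTR p) { return (p >> 5) & 0x1Fu; }

/* True if the low nibble carries the primitive-symbol tag. */
int isPrimSym(PTR p) { return (p & 0x000Fu) == PRIMSYM_TAG; }
```

With this layout, frame 63 and symbol 31 give 0xFFEA, the largest value in the reserved range when p is 0.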

LispMe maintains a 2-dimensional (ragged) table of primitive symbols and their associated values which is defined in builtin.h:

typedef struct 
{
  char*      name;
  UInt8      kind;
  UInt8      stypes[5];
  nativeFct* fun;
  char*      doc;
} BuiltIn;

typedef BuiltIn BuiltInModule[];

BuiltInModule (a single frame of max. 32 symbols together with their definitions) is the primary interface between LispMe and the extension modules. Usually, your extension module defines and exports an array of BuiltIn (aka BuiltInModule) like this:

BuiltInModule sampleBuiltins = 
{
  // list your builtins here
};

To add this frame to LispMe, modify extend.h as follows:

Invent a symbolic name for your module
Add #include for your module header
Add a line to the ADD_ALL_MODULES macro
Add the object to the makefile at the LispMe: dependency

The first entry in the BuiltInModule table must be the module control function; use MODULE_FUNC(NULL) if you don't need one.

The end of the table is indicated by an entry with a NULL symbol name like this:

{NULL}

There are several macros in builtin.h for creating BuiltIn entries:

Defining symbols

LispMe supports both ordinary symbols (which can be used as variable names) and keywords (which can't). Use either the BUILTINKEYWORDQ(name) or the BUILTINSYMBOLQ(name) macro to define them (or use the version without the suffix Q when defining names that include C token separators like :).

The symbols defined here behave like ordinary LispMe symbols, but they don't need any space in the atom store.

To give the names a help string (displayed when pressing the Info button), use the macros with the suffix _H.

Defining builtin operations

These come in two flavours:

Primitives

To define a VM opcode sequence. Its use for extension modules is quite limited since for defining new opcodes you would have to directly modify vm.c and vm.h. There are many macros (all defined in builtin.h) for defining various kinds of operations:

PRIMn(arity,opcode₁,...) where n is the number of opcodes following. These are used for builtins with a fixed number of arguments arity. (Arguments in LispMe are evaluated from right to left, so the first argument is on top of the stack S.)
PRIM12OP(opcode₁,opcode₂) The operation takes either one or two arguments and uses a different opcode in each case.
PRIMDEF(arity,opcode,default) The operation takes either arity or arity+1 arguments and uses default when the first argument is omitted.
PRIMFOLD(arity,opcode,neutral) These are used for 'folding' operations (like +, append) with an arbitrary number of arguments (but at least arity). opcode is the opcode to combine two values, neutral is the neutral element for the operation (either a small integer or a 'magic' value, see builtin.h for details).
PRIMLIST(opcode) All arguments are gathered into a list which is given to opcode.

Since LispMe can't infer the types for primitive operations, you should always provide a help string for them, so the macros don't end with _H.

Natives

Please note that the calling convention for native functions has changed from V3.0 to V3.1. In V3.0, arguments were passed as a list, but in V3.1 and later, arguments are passed in an array.

Using natives, you can associate a Scheme function with a plain C function. The C function is called with all arguments in an array, which is a copy from the actual runtime stack S. This means that you can modify it to pass it on to other functions. The leftmost argument is the first item in the array (args[0]), etc.

For your convenience, all arguments are typechecked at runtime before calling the C function, therefore the arguments' types must be declared.

It's also possible to define natives with a variable number of arguments. In this case, the actual number of arguments is given to the C function. Additional arguments are not typechecked, but you can do this yourself. The last array element contains a list of the additional elements.

The C function should return its result as a single PTR.

NATIVEn(function,type₁,...) where n is the arity of the builtin. This declares a fixed arity native with parameter types type_i. The signature for the associated C function is
```
typedef PTR (nativeFct)(PTR* args);
```
The parameters can be accessed as standard LispMe PTR values in args[0] up to args[n-1].
NATIVEVn(function,type₁,...) This declares a variable arity native taking at least n arguments with parameter types type_i. The signature for the associated C function is
```
typedef PTR (nativeVarFct)(int argc, PTR* args);
```
The fixed parameters can be accessed as standard LispMe PTR values in args[0] up to args[n-1]. args[n] contains a list of the optional parameters. In fact, it contains a longer list (some parts of the S register), but only the first argc - n list elements are actual function parameters.
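
The two calling conventions can be sketched as follows. This is only an illustration under loud assumptions: PTR, Cons, car() and cdr() are simplified stand-ins (the real PTR is a tagged value and integers have to be boxed/unboxed), and the function names are made up. What the sketch does show is the argument order for fixed arity and the rule that only the first argc - n elements of the list in args[n] belong to the call.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins: integers are stored untagged, lists are plain C structs. */
typedef intptr_t PTR;
typedef struct { PTR car, cdr; } Cons;
#define NIL ((PTR)0)

static PTR car(PTR p) { return ((Cons*)p)->car; }
static PTR cdr(PTR p) { return ((Cons*)p)->cdr; }

/* Fixed arity 2: args[0] is the leftmost argument. */
PTR nativeSub(PTR* args)
{
  return args[0] - args[1];
}

/* Variable arity with one fixed parameter (n = 1): args[1] holds the
   optional parameters as a list that may be longer than needed. */
PTR nativeSumFrom(int argc, PTR* args)
{
  PTR sum  = args[0];
  PTR rest = args[1];
  int i;
  assert(argc >= 1);
  for (i = 0; i < argc - 1; i++) {   /* only argc - n elements are ours */
    sum += car(rest);
    rest = cdr(rest);
  }
  return sum;
}
```

Note that the loop is bounded by argc and not by the end of the list, since the list can continue with unrelated stack contents.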

Implementing a builtin operation as a native C function is much simpler than defining new VM opcodes, but needs some more runtime overhead, so I'd recommend native C functions especially for wrapping PalmOS API calls when top performance is not needed.

LispMe creates a default help string for natives by listing its parameters' types. However, you can override the default help string by specifying one in the corresponding macros ending with a _H suffix.

Defining variables

To create variables usable both from Lisp and from the extension module, 4 steps are required:

Define the variable's name as a symbol
Declare a C variable PTR varCell pointing to the Lisp variable's cell
In the module control function, on message INIT_HEAP: Call this function:
```
createGlobalVar(MKPRIMSYM(MODULE, VARIABLE), initVal());
```
In the module control function, on message SESS_CONNECT: Cache the cell of your variable (not necessary, but improves performance)
```
varCell = symLoc(MKPRIMSYM(MODULE, VARIABLE));
```

These steps ensure that your variable appears in LispMe's global environment frame, is accessible from LispMe like any other variable, and is protected from garbage collection.

On the C side, you should access the variable with locVal:

PTR oldVal = locVal(varCell);
locVal(varCell) = foo(oldVal);
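
Put together, the steps might look like the sketch below. Everything above the module part is a stand-in so the example can be compiled on its own: the real createGlobalVar(), symLoc(), locVal, MKPRIMSYM and ModuleMessage come from LispMe, and their actual definitions differ. MY_MODULE, MY_COUNTER and bumpCounter() are hypothetical names.

```c
#include <assert.h>

typedef int PTR;                       /* stand-in for LispMe's PTR */
typedef enum { INIT_HEAP, SESS_CONNECT, OTHER_MESSAGE } ModuleMessage; /* subset */

/* --- minimal stand-ins for LispMe's environment functions --- */
static PTR cells[8];                   /* fake global variable cells */
static PTR cellSym[8];
static int nCells;

#define MKPRIMSYM(frame, sym) (((frame) << 10) | ((sym) << 5) | 0xA)
#define locVal(cell) (cells[(cell)])

static void createGlobalVar(PTR sym, PTR val)
{
  assert(nCells < 8);
  cellSym[nCells] = sym;
  cells[nCells++] = val;
}

static PTR symLoc(PTR sym)
{
  int i;
  for (i = 0; i < nCells; i++)
    if (cellSym[i] == sym) return i;
  return -1;
}

/* --- the module itself --- */
#define MY_MODULE  40                  /* hypothetical frame number */
#define MY_COUNTER 1                   /* hypothetical symbol index */

static PTR varCell;

static void init(ModuleMessage mess)
{
  switch (mess) {
    case INIT_HEAP:                    /* step 3: create the variable */
      createGlobalVar(MKPRIMSYM(MY_MODULE, MY_COUNTER), 0);
      break;
    case SESS_CONNECT:                 /* step 4: cache its cell */
      varCell = symLoc(MKPRIMSYM(MY_MODULE, MY_COUNTER));
      break;
    default:
      break;
  }
}

/* C-side access through locVal. */
static PTR bumpCounter(void)
{
  PTR oldVal = locVal(varCell);
  locVal(varCell) = oldVal + 1;
  return locVal(varCell);
}
```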

The module control function

It is often necessary to perform some actions in your extension module when events like starting LispMe, creating a new session etc. occur. LispMe lets you define a module control function for each extension module, which gets called on those events.

The module control function is the first entry in each BuiltInModule array. It is put into the table by

MODULE_FUNC(init)

and is declared like this

static void init(ModuleMessage mess);

The enumeration ModuleMessage (defined in store.h) defines some action codes which are sent to all modules on several occasions:

Code	Sent on	Purpose
`APP_START`	LispMe has just been launched	Open libraries, read profiles, register hooks for datatypes
`SESS_CREATE`	A new session has been created	All standard session database records have been created, but the heap is not yet initialized and write protection is not yet turned off.
`INIT_HEAP`	The heap is being initialized	Create your global variables here.
`SESS_CONNECT`	A session has been activated	The heap is already writable and all user-defined objects have been unpickled already. Cache pointers to your global variables here.
`SESS_DISCONNECT`	A session is about to be deactivated	The heap is still writable and no user-defined objects have been pickled yet.
`SESS_DELETE`	A session is about to be deleted	The heap is no longer accessible, free your session data here
`APP_STOP`	LispMe is shutting down	Close libraries, write modified profiles

"Constructive" events are sent to modules in ascending order, "destructive" events in descending order.

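The ordering rule for module messages can be pictured with a small dispatcher. This is a hypothetical sketch, not LispMe's actual code: the enum order, the sendToModules() function and the classification of SESS_DISCONNECT, SESS_DELETE and APP_STOP as the "destructive" messages are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

typedef enum {
  APP_START, SESS_CREATE, INIT_HEAP, SESS_CONNECT,
  SESS_DISCONNECT, SESS_DELETE, APP_STOP
} ModuleMessage;          /* order assumed; the real enum is in store.h */

typedef void (*ModuleFct)(ModuleMessage mess);

static char trace[32];    /* records which module was called, in order */

static void modA(ModuleMessage mess) { (void)mess; strcat(trace, "A"); }
static void modB(ModuleMessage mess) { (void)mess; strcat(trace, "B"); }
static void modC(ModuleMessage mess) { (void)mess; strcat(trace, "C"); }

static ModuleFct modules[] = { modA, modB, modC };
#define N_MODULES ((int)(sizeof modules / sizeof modules[0]))

/* Assumed classification: the shutdown-side messages are destructive. */
static int isDestructive(ModuleMessage mess)
{
  return mess == SESS_DISCONNECT || mess == SESS_DELETE || mess == APP_STOP;
}

/* Ascending order for constructive messages, descending otherwise. */
static void sendToModules(ModuleMessage mess)
{
  int i;
  assert(strlen(trace) + (size_t)N_MODULES < sizeof trace);
  if (isDestructive(mess))
    for (i = N_MODULES - 1; i >= 0; i--) modules[i](mess);
  else
    for (i = 0; i < N_MODULES; i++) modules[i](mess);
}
```

The practical consequence is that a module can rely on modules registered before it being set up already when it starts, and still being alive when it shuts down.
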
Types

The type system used in LispMe is very simple: each type is represented by a small integer (see @@ty in builtin.h) and has a symbolic name starting with ty.

User-defined types are represented by small integers starting with 0, too, but when they're used in a context where builtin types are permitted as well (for example in a signature declaration), these numbers are offset by tyFOREIGN to distinguish them from builtin types. Use the FOREIGN(ftype) macro in this case.

Currently, only MAX_FOREIGN_TYPES foreign types are allowed (see @@maxFT in extend.h) to keep the callback tables small, but this is no fixed limit.

Defining your own data types

General

LispMe reserves some PTR bit patterns for foreign or user-defined types. An object of foreign type consists of two cons cells on the heap. The first cell contains the type id (which is an integer in the range from 0 to MAX_FOREIGN_TYPES-1) and a PTR to another cell, which holds an arbitrary 32 bit value whose meaning is user-defined. By allocating the descriptor cell at a lower address than the 32 bit value, LispMe can ensure that the contents of the value cell aren't misinterpreted as PTR values.

For many user-defined types, 32 bits are sufficient to store the data itself, for example a timestamp (see date.h) or a reference to an open PalmOS database (see dm.h) fit nicely in 32 bits. However, if you need more space, simply put a 32 bit pointer here. The type declared for these 32 bit values is void*, so you have to cast accordingly.

To define your own type, nothing special is required, just use the function

allocForeign(void* val, UInt8 ftype)

to create an object of your type ftype. Use the macro IS_FOREIGN(ptr) to check if a PTR is a foreign value, FOREIGNTYPE(ptr) to extract the type id and FOREIGNVAL(ptr) to extract the value from a foreign value.

You should define a symbolic name for your type id in extend.h to avoid using the same type id twice.

The structure containing your type's value can perfectly well contain pointers (PTR) to other LispMe values on the heap (see the implementation of ports in port.c, which uses LispMe strings as I/O buffers and port-specific data). You just have to remember to take care of those PTRs in your foreign type's call-back functions.

Default behavior

LispMe provides some default behavior for foreign types:

The type name prints as foreign type id in error messages
Values are printed as [foreign type value] where value is the hexadecimal content of the value cell
Two foreign values are considered eqv? when both their type id and their values are equal.
Nothing is done when garbage collecting foreign types or when connecting/disconnecting a session

You can replace the default behavior by registering type-specific hooks:

Registering type-specific hooks

In general, all registration functions should be called in your module control function, on message APP_START, to make sure that those hooks are active all the time.

Registering the type name

To make LispMe print the name of your type (e.g. in error messages), announce it by calling

registerTypename(UInt8 ftype, char* name)

Otherwise it will be printed foreign type n.

Registering the printing function

You can define a custom printing function for your type, which will be called by display, write, object->string, or the REP loop. Do this by calling

registerPrinter(UInt8 ftype, printerType fp)

where printerType is the type of the printing function:

void myPrinter(Boolean machineFormat, void* value)

Your printing function is called with two parameters, the first indicating if the output is caused by write (true) or by display (false), the second is the value itself.

In your printing function use outStr(char*) to output a string and writeSEXP(PTR) to recursively output subobjects (if your type contains a PTR). Both functions are defined in io.c and take care of buffer length and print depth checking automatically. To prepare parts of your output (e.g., with StrIToA()) use the character buffer token, which is defined in io.c, too.
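
A printing function might look like the following sketch. The Point type and pointPrinter() are hypothetical, and outStr() and token are replaced by simple stand-ins here so the example compiles on its own; sprintf() takes the place of the PalmOS call StrIToA(). The two output formats are just one possible choice.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int Boolean;                 /* stand-in for the PalmOS type */

/* Stand-ins for io.c: an output sink and the shared scratch buffer. */
static char output[128];
static char token[32];

static void outStr(char* s)
{
  assert(strlen(output) + strlen(s) < sizeof output);
  strcat(output, s);
}

/* Hypothetical foreign type: a point stored behind the 32 bit value. */
typedef struct { int x, y; } Point;

/* write gets a bracketed form, display a shorter one. */
static void pointPrinter(Boolean machineFormat, void* value)
{
  Point* p = (Point*)value;
  outStr(machineFormat ? "[point " : "(");
  sprintf(token, "%d", p->x);        /* on PalmOS: StrIToA() */
  outStr(token);
  outStr(machineFormat ? " " : ",");
  sprintf(token, "%d", p->y);
  outStr(token);
  outStr(machineFormat ? "]" : ")");
}
```

Such a function would then be announced with registerPrinter() under your type id when the module receives APP_START.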

Registering the comparison function `eqv?`

To implement your type's idea of being equivalent to another value, call

registerEQV(UInt8 ftype, eqvType fp)

where eqvType is the type of the equivalence function:

Boolean myEQV(void* value1, void* value2)
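
A custom equivalence function matters most when the value cell holds a pointer: the default compares the 32 bit values, so two separately allocated blocks with the same contents would not be eqv?. A minimal sketch, assuming a hypothetical Point type stored behind the pointer:

```c
#include <assert.h>

typedef int Boolean;                  /* stand-in for the PalmOS type */

typedef struct { int x, y; } Point;   /* hypothetical foreign type */

/* Compare the contents of the blocks, not their addresses. */
static Boolean pointEQV(void* value1, void* value2)
{
  Point* a = (Point*)value1;
  Point* b = (Point*)value2;
  assert(a != 0 && b != 0);
  return a->x == b->x && a->y == b->y;
}
```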

Registering the memory hook

This hook allows customizing the actions performed when values of your type are garbage collected or pickled/unpickled (this is explained in detail in the next section). To install the hook, call

registerMemHook(UInt8 ftype, memHookType fp)

where memHookType is the type of the memory hook function:

void myMemHook(MemMessage mess, PTR obj)

The enumeration MemMessage (defined in store.h) defines the action codes which are sent to all values of a foreign type when a memory hook function has been registered. See here for some of these occasions.

Code	Sent on	Purpose
`MEM_MARK`	mark phase of GC	The value is still in use and has just been marked to avoid deleting it during the scan phase. If your type contains LispMe `PTR`s to the heap, you must `mark()` them here, too. (See `sample.c` for an example)
`MEM_DELETE`	value is about to be deleted	Either the value is no more accessible and thus deleted during the scan phase of garbage collection, or the entire heap is cleared due to a user command. In any case you can be sure that the value is unpickled and the session is connected. Free/release any resources associated with your type here.
`MEM_PICKLE`	pickle value	The current session has been disconnected and each foreign value has the chance to turn into its external representation.
`MEM_UNPICKLE`	unpickle value	Before connecting to the session, each foreign value has the chance to (re-)create its internal from its external representation.
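
The overall shape of a memory hook is a switch over the message. The sketch below is not taken from sample.c; Foo, fooMemHook() and the stand-ins for mark() and FOREIGNVAL() are assumptions made so the example is self-contained (in particular, the real FOREIGNVAL may not be assignable, and on PalmOS you would free with MemPtrFree() instead of free()).

```c
#include <assert.h>
#include <stdlib.h>

typedef int PTR;                       /* stand-in for LispMe's PTR */
typedef enum { MEM_MARK, MEM_DELETE, MEM_PICKLE, MEM_UNPICKLE } MemMessage;

/* Hypothetical foreign type holding a PTR to another heap value. */
typedef struct { PTR name; int size; } Foo;

/* Stand-ins for the value cells and the collector's mark function. */
static void* valueCell[4];
#define FOREIGNVAL(obj) (valueCell[(obj)])
static int marked[16];
static void mark(PTR p) { assert(p >= 0 && p < 16); marked[p] = 1; }

static void fooMemHook(MemMessage mess, PTR obj)
{
  Foo* foo = (Foo*)FOREIGNVAL(obj);
  switch (mess) {
    case MEM_MARK:                     /* keep referenced values alive */
      mark(foo->name);
      break;
    case MEM_DELETE:                   /* release our own memory block */
      free(foo);
      FOREIGNVAL(obj) = NULL;
      break;
    case MEM_PICKLE:
    case MEM_UNPICKLE:                 /* omitted in this sketch */
      break;
  }
}
```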

Internal/external values (aka pickling/unpickling)

Motivation

Often 32 bits for a foreign value are sufficient (e.g. for a timestamp), but when more data is needed and the foreign type is implemented by a 32 bit pointer to a memory block (e.g. allocated by MemPtrNew()), there's a problem when exiting and re-launching LispMe: The pointer is no longer valid, since the block has been freed by PalmOS on exiting LispMe.

This problem is unique to LispMe due to the fact that the heap in LispMe itself is a persistent object surviving program exits. You should try to make your data types behave the same way. This can be done by implementing the pickling/unpickling protocol.

A session can be in two states:

connected (=unpickled): an active session (possibly including memory blocks from the dynamic heap) while LispMe is running
unconnected (=pickled): either an inactive session or LispMe is not running at all

Pickling a session means saving all needed data from the dynamic heap to a more permanent location, for example by writing it to a PalmOS database (the session database itself is here the first choice to keep all relevant session data in one place if possible). Unpickling is the reverse operation. One should also consider what to do when pickling foreign types containing PalmOS resources like open databases, sockets, or files. Remember that session databases can be backed up or beamed to other devices!

Sample scenarios

Simple data fitting in 32 bit

Example: Timestamp, date and time types in date.c. Here nothing has to be done at all.

Larger data structures

Example: The FooType datatype in sample.c. These objects are allocated on the dynamic heap with MemPtrNew() (in fact, MemHandleNew() would be better to allow relocation of the memory blocks by the OS) in nativeMakeFoo() (note how LispMe strings are accessed here). Handling MEM_MARK and MEM_DELETE is straightforward; however, the pickling/unpickling is more interesting:

For each FooType to be pickled a record is created in the current session database (which is always accessible by the variable dbRef) and the memory block is simply copied. The value cell is updated: it no longer contains the address of the memory block, but the record number in the database!

On unpickling, the inverse is done: Retrieving the record by number, allocating memory from the dynamic heap and removing the record. One subtle point is important here: The record numbers must stay constant (remember that PalmOS shifts record numbers when inserting in the middle!), so new records have to be put at the end of the database. But how can we be sure that on unpickling (when the record of the value being unpickled is removed) record numbers don't shift? LispMe guarantees that objects are unpickled in the reverse order they have been pickled, so the record to be deleted is always the last one in the database and thus no record numbers shift when deleting this last record.

There are utility functions standardPickle() and standardUnpickle() in util.c implementing the behavior described above.
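
The record number argument can be checked with a small simulation. This is not the code of standardPickle()/standardUnpickle(); the array standing in for the session database, the fixed block size and all names are assumptions. It only demonstrates that appending on pickle and removing in reverse order on unpickle keeps every stored record number valid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RECS   8
#define BLOCK_SIZE 16

/* Stand-in for the session database: records are appended at the end. */
static char records[MAX_RECS][BLOCK_SIZE];
static int  nRecords;

/* The value cell: a memory pointer while unpickled, a record number
   while pickled. */
typedef union { char* mem; long recNo; } ValueCell;

static char* newBlock(const char* text)
{
  char* mem = calloc(1, BLOCK_SIZE);
  assert(mem != NULL && strlen(text) < BLOCK_SIZE);
  strcpy(mem, text);
  return mem;
}

/* Copy the block into a new last record and store its number. */
static void pickleSketch(ValueCell* v)
{
  assert(nRecords < MAX_RECS);
  memcpy(records[nRecords], v->mem, BLOCK_SIZE);
  free(v->mem);
  v->recNo = nRecords++;
}

/* Copy the record back and remove it; it must be the last one. */
static void unpickleSketch(ValueCell* v)
{
  long  recNo = v->recNo;
  char* mem;
  assert(recNo == nRecords - 1);
  mem = malloc(BLOCK_SIZE);
  assert(mem != NULL);
  memcpy(mem, records[recNo], BLOCK_SIZE);
  nRecords--;
  v->mem = mem;
}
```

Unpickling in any other order would trip the assertion in unpickleSketch(), which is exactly the situation the reverse-order guarantee rules out.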

Instead of allocating the memory block on the dynamic heap, you can also allocate it in a database record (similar to how LispMe allocates strings and vectors) which has both advantages and disadvantages:

Advantage: much more memory available, simpler pickling/unpickling
Disadvantage: slower access

Even when allocating as a database record, you have to pickle/unpickle these types: Since creation and deletion of db records will happen in an unforeseeable order, you can't store the record number (which will shift) in the value cell. Instead you must store the record's unique id (which will stay constant during the whole lifetime of a record). On the other hand, beaming and restoring a session db can change the unique ids, so a pickled session should reference records by number, not by unique id.

Open resources

If at all possible, try to make foreign types wrapping PalmOS resources transparent to session switching by dumping them to a database and reopening them on reconnect. See dm.c for how this is done with open PalmOS databases. If the reopen fails for any reason, you can take advantage of a special NULL/invalid indicator value of the PalmOS type, which you can also use as a last resort when the foreign type doesn't allow reopening the resource at all.

You should keep in mind that anything can happen to the databases between disconnecting and reconnecting a session, the session can even be continued on another device!

State transition

The following state transition diagram shows all commands/events relevant for the extension mechanism. The commands are displayed in bold, after that the messages sent to the extension modules are listed in order (fixed font).

Successful pickling is recorded in the session database, so a potentially damaged session caused by a system crash will be recognized at the next reconnect attempt, which is then denied.

Compiler extensions

Use the BUILTINCOMPKEY macro to announce a compilation function for a new special form to the compiler. A compilation function is declared as

typedef PTR (nativeCompFct)(PTR expr, PTR names, PTR* code);

Whenever the compiler encounters an expression whose car is the keyword specified in BUILTINCOMPKEY, the associated compiler function is called with these parameters:

expr: The entire expression to be compiled (including the keyword itself, just like LispMe's macro)
names: The compile-time environment, a two-level list of symbols relative to which lexical addresses are computed.
code: The continuation of expr, already compiled. Prefix the VM code to this list.

If the compiler function returns an expression != BLACK_HOLE, the entire expression is considered a tail call and the compiler is invoked on the returned expression.

To compile sub-expressions, use any of the compiler helper functions comp(), compseq(), complist(), compVar(), or compConst() (see comp.h for definitions of these).

You can use a compiler function to define macros in C. To do this just create an expansion expression and pass it to comp():

  newExpr = ...;
  comp(newExpr, names, code);
  return BLACK_HOLE;

The compilation function for let* (see misc.c) works this way.

But a compilation function is not restricted to that, it can generate any code necessary, like, for example compileWhile().

Creating new VM opcodes

If your special form needs special handling at run-time, you can get this effect by generating an EXEC VM instruction in the compilation function. EXEC has one parameter, which should be the keyword of your special form. When the VM executes an EXEC keyword instruction, it calls the compilation function (sic!) with keyword as first argument, a NIL names list, and a NULL continuation. Now modify the VM registers (S, E, C, D) as required. Have a look at compExecEval() in misc.c to see how this works in detail.

Implementing your own ports

There is a generic mechanism for ports, which is built on top of the foreign type mechanism. The port interface encapsulates the device-specific differences and allows uniform access to them via the standard Scheme I/O calls display, write, newline, read, read-line, read-char, peek-char, close-input-port, and close-output-port.

The generic port functions care for parsing, printing, buffer handling etc. and call port-specific functions to transfer data to/from the actual devices. There are generally two different mechanisms for buffering I/O data:

Memo-like ports (Memo and Memo32 files) transfer the data directly to the memo's memory chunk without using an additional buffer. Those ports don't have to be closed, too.
Buffered ports (all others) use a LispMe string in the port structure for buffering data. These ports actually read and write from their buffers and fill/empty the buffers on demand. These ports have to be closed explicitly to ensure all pending buffers are flushed.

In the following sections, only buffered ports are described, since memo-like ports are already implemented :-)

Defining port types

Add symbolic names for your port in extend.h at the section Input/output port subtypes (names starting with PT_). Currently, there are 16 different port types allowed (to keep the dispatch tables small), but this is extensible. Note that no distinction between input and output ports is made here; in fact most of the ports exist in both variations.

Creating ports

Registering call-back functions

Just like foreign types, all registration functions should be called in your module control function, on message APP_START, to make sure that those hooks are active all the time. For an input port, use

registerReadFct(UInt8 ptype, readFctType fp);

where readFctType is the type of the reader function.

For an output port use

registerWriteFct(UInt8 ptype, writeFctType fp);

where writeFctType is the type of the writer function.

Note that in contrast to foreign types registering call-backs is not optional for ports, there is no default mechanism (how could there be one?).

Input ports

Structure

typedef struct {
  UInt8     type; 
  UInt8     status;
  UInt32    uid;    // == 0 to use stringBuffer instead
  UInt16    pos;    // current read position from beginning of buffer
  UInt16    next;   // next record (DOC)
  UInt16    nRec;   // number of records (DOC)
  UInt16    bufLen; // length of buffer          
  DmOpenRef dbRef;  // MEMO, MEMO32, or DOC database
  PTR       impl;   // implementation specific
  PTR       buf;    // a LispMe string used as buffer
} InPort;

Creation

Use makeBufInPort(UInt8 ptype, PTR impl, PTR buf) to create a buffered input port. You have to provide (or create) the string used to buffer input yourself (you can use the empty string here).

Reader function

The reader function Boolean myReader(ReaderCmd cmd, InPort* port) is called with one of two commands:

RC_READ: Fill the read buffer buf with data actually read from the source and set both bufLen and pos accordingly. It's possible to allocate a new buffer string on each call or reuse an existing one. Return whether the operation was successful.
RC_CLOSE: Set status to INP_STATUS_CLOSED to make subsequent read attempts fail and release any resources you needed. Setting buf to NIL is also a good idea to allow the buffer to be garbage collected.
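
A reader function could be structured like this sketch. It uses a trimmed stand-in for InPort in which buf is a plain C array rather than a LispMe string and impl is a C pointer to the remaining source data. The enum values, the value of INP_STATUS_CLOSED, reporting an exhausted source as failure, and the return value for RC_CLOSE are all assumptions.

```c
#include <assert.h>
#include <string.h>

typedef int Boolean;                           /* stand-in for PalmOS */
typedef enum { RC_READ, RC_CLOSE } ReaderCmd;  /* values assumed */
#define INP_STATUS_CLOSED 1                    /* value assumed */

/* Trimmed stand-in for InPort. */
typedef struct {
  unsigned char  status;
  unsigned short pos;
  unsigned short bufLen;
  const char*    impl;     /* here: the rest of the data source */
  char           buf[4];   /* here: a C array, not a LispMe string */
} InPort;

static Boolean chunkReader(ReaderCmd cmd, InPort* port)
{
  size_t n;
  assert(port != NULL);
  switch (cmd) {
    case RC_READ:
      n = strlen(port->impl);
      if (n == 0) return 0;                    /* source exhausted */
      if (n > sizeof port->buf) n = sizeof port->buf;
      memcpy(port->buf, port->impl, n);        /* refill the buffer */
      port->impl  += n;
      port->bufLen = (unsigned short)n;        /* bytes now available */
      port->pos    = 0;                        /* restart at the front */
      return 1;
    case RC_CLOSE:
      port->status = INP_STATUS_CLOSED;
      port->impl   = "";                       /* drop the source */
      return 1;
  }
  return 0;
}
```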

Output ports

Structure

typedef struct {
  UInt8     type; 
  UInt8     flags;
  UInt32    uid;    // uid of memo-like database
  UInt16    pos;    // current write position from beginning of buffer
  UInt16    bufLen; // length of buffer          
  DmOpenRef dbRef;  // MEMO or MEMO32 database reference
  PTR       impl;   // depending on port type
  PTR       buf;    // a LispMe string used as buffer
  char*     mem;    // cache the actual buffer
} OutPort;

Creation

Use makeBufOutPort(UInt8 ptype, PTR impl, UInt16 bufLen, UInt8 flags) to create a buffered output port. This function allocates a buffer string of length bufLen. Possible flags (OR'ed together) are

OPF_SUPPRESS_NOPRINT: Don't print a top-level #n to the port.
OPF_AUTOFLUSH: Automatically flush the buffer (i.e., call the writer function) to the device after every port write command.
OPF_NUL_TERMINATE: Append a \0 byte after each write operation.

Writer function

The writer function void myWriter(WriterCmd cmd, OutPort* port) is called with one of two commands:

WC_WRITE: Flush the write buffer buf to the destination. Only the substring from position 0 to pos is relevant. To reuse the buffer for subsequent write operations, simply set pos to 0.
WC_CLOSE: First flush the write buffer as above, then set the closed bit in flags to make subsequent write attempts fail, and release any resources you needed.
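
A writer function following these two commands could look like the sketch below. OutPort is trimmed to the fields used, the destination is a plain array, and the enum values as well as the name and value of the closed bit (OPF_CLOSED_SKETCH) are hypothetical; check the LispMe headers for the real ones.

```c
#include <assert.h>
#include <string.h>

typedef enum { WC_WRITE, WC_CLOSE } WriterCmd;   /* values assumed */
#define OPF_CLOSED_SKETCH 0x80                   /* hypothetical closed bit */

/* Trimmed stand-in for OutPort. */
typedef struct {
  unsigned char  flags;
  unsigned short pos;      /* current write position in the buffer */
  unsigned short bufLen;
  char*          mem;      /* the actual buffer */
} OutPort;

/* Stand-in for the device the port writes to. */
static char   device[64];
static size_t deviceLen;

/* Only the substring from 0 to pos is relevant; reset pos for reuse. */
static void flushSketch(OutPort* port)
{
  assert(deviceLen + port->pos <= sizeof device);
  memcpy(device + deviceLen, port->mem, port->pos);
  deviceLen += port->pos;
  port->pos = 0;
}

static void memWriter(WriterCmd cmd, OutPort* port)
{
  switch (cmd) {
    case WC_WRITE:
      flushSketch(port);
      break;
    case WC_CLOSE:
      flushSketch(port);                 /* flush pending data first */
      port->flags |= OPF_CLOSED_SKETCH;  /* then mark the port closed */
      break;
  }
}
```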

General programming tips

Coding under PalmOS and especially within LispMe context has several pitfalls:

There's only about 2 KB of stack space. Keep local variables to a minimum, especially in potentially recursive functions like the mem handler.
There's not too much global memory, either, and LispMe already uses a lot of it.
LispMe disables write protection to database memory, so be extra careful with pointers, you can easily overwrite Memory Manager or system structures requiring a cold reboot!
For the same reason, you must wrap user interface calls in ReleaseMem() and GrabMem()
