Extending LispMe
Beginning with version 3.0, LispMe allows extension with modules
written in C.
Overview
Currently extension modules are statically linked into the LispMe
executable, so you have to add them to the makefile as
usual. On the other hand, extension modules can use all functions
and (global) data from other LispMe object modules. Maybe in future
versions we'll provide dynamic linking of extension modules, too.
Extension modules allow definitions of
- symbols
- variables
- primitive operations in terms of existing VM opcodes
- native operations by calling a C function
- new data types
- new port types
- compilation function for new special forms
- actions to be taken on session creation/activation etc.
General conventions for extension modules
LispMe reserves the PTR bit patterns
ffff ffss sssp 1010 for primitive symbols. Thus, 2048
builtin symbols/operations are possible. This range is partitioned into
64 frames of (max.) 32 symbols each. You can use the
MKPRIMSYM(frame,sym) macro to create a
PTR value.
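As a rough illustration of this bit layout (not the actual macro from builtin.h), the sketch below packs a frame number and a symbol index into the 16 bits shown above. The names mk_prim_sym, prim_frame and prim_sym are made up for this example, and the p bit is simply left at 0 because its meaning is not described here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for MKPRIMSYM; the real macro lives in builtin.h. */
enum {
    PRIM_TAG    = 0xA,  /* low nibble 1010 */
    SYM_SHIFT   = 5,    /* 5 symbol bits sit above the p bit */
    FRAME_SHIFT = 10    /* 6 frame bits sit on top */
};

/* Pack frame (0..63) and sym (0..31); the p bit stays 0 in this sketch. */
static uint16_t mk_prim_sym(unsigned frame, unsigned sym)
{
    assert(frame < 64 && sym < 32);
    return (uint16_t)((frame << FRAME_SHIFT) | (sym << SYM_SHIFT) | PRIM_TAG);
}

/* Extract the two fields again. */
static unsigned prim_frame(uint16_t p) { return (unsigned)p >> FRAME_SHIFT; }
static unsigned prim_sym(uint16_t p)   { return ((unsigned)p >> SYM_SHIFT) & 0x1F; }
```

With this layout, frame 1 and symbol 2 give the pattern 0000 0100 0100 1010.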
LispMe maintains a 2-dimensional (ragged) table of primitive
symbols and their associated values which is defined in
builtin.h:
typedef struct
{
char* name;
UInt8 kind;
UInt8 stypes[5];
nativeFct* fun;
char* doc;
} BuiltIn;
typedef BuiltIn BuiltInModule[];
BuiltInModule (a single frame of max. 32 symbols together
with their definitions) is the primary interface between LispMe and
the extension modules. Usually, your extension module defines
and exports an array of BuiltIn aka BuiltInModule
like this:
BuiltInModule sampleBuiltins =
{
// list your builtins here
};
To add this frame to LispMe, modify extend.h as follows:
- Invent a symbolic name for your module
- Add #include for your module header
- Add a line to the ADD_ALL_MODULES macro
- Add the object to the makefile at the LispMe:
dependency
The first entry in the BuiltInModule table must be
the module control function, use MODULE_FUNC(NULL) if you
don't need one.
The end of the table is indicated by an entry with a NULL symbol
name like this:
{NULL}
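The table layout (control slot first, NULL name last) can be sketched with a reduced stand-in structure. DemoBuiltIn, demoBuiltins and find_builtin are invented for this illustration; the real BuiltIn struct has more fields, and the real control entry is produced by MODULE_FUNC, whose expansion is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for BuiltIn; only the fields needed for the walk. */
typedef struct {
    const char *name;   /* NULL marks the end of the table */
    const char *doc;
} DemoBuiltIn;

/* Slot 0 plays the role of the MODULE_FUNC(...) entry (assumed to have an
   empty name in this sketch). */
static const DemoBuiltIn demoBuiltins[] = {
    { "",        "module control slot" },
    { "foo-ref", "(foo-ref foo index)" },
    { "foo-len", "(foo-len foo)" },
    { NULL,      NULL }
};

/* Return the index of name within the frame, or -1 if it is absent. */
static int find_builtin(const DemoBuiltIn *frame, const char *name)
{
    assert(frame != NULL && name != NULL);
    for (int i = 1; frame[i].name != NULL; ++i)   /* skip the control slot */
        if (strcmp(frame[i].name, name) == 0)
            return i;
    return -1;
}
```

The index found this way corresponds to the sym part of a primitive symbol within the frame.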
There are several macros in builtin.h for creating
BuiltIn entries:
LispMe supports both ordinary symbols (which can be used as variable
names) and keywords (which can't). Use either the
BUILTINKEYWORDQ(name) or the
BUILTINSYMBOLQ(name) macro to define them (or use
the version without suffix Q when defining names including
C token separators like :)
The symbols defined here behave like ordinary LispMe symbols, but
they don't need any space in the atom store.
To give the names a help string (displayed when pressing the
Info button), use the macros with the suffix _H.
Defining builtin operations
These come in two flavours:
Primitives
To define a VM opcode sequence. Its use for
extension modules is quite limited since for defining
new opcodes you would have to directly modify vm.c and
vm.h. There are many macros (all defined in
builtin.h) for defining various kinds of operations:
- PRIMn(arity,opcode1,...)
where n is the number of opcodes following. These
are used for builtins with a fixed number of arguments
arity. (Arguments in LispMe are evaluated from right
to left, so the first argument is on top
of the stack S.)
- PRIM12OP(opcode1,opcode2)
The operation takes either one or two arguments and uses
a different opcode in each case.
- PRIMDEF(arity,opcode,default)
The operation takes either arity or
arity+1 arguments and uses default
when the first argument is omitted.
- PRIMFOLD(arity,opcode,neutral)
These are used for 'folding' operations (like +,
append) with an arbitrary number
of arguments (but at least arity)
opcode is the opcode to combine two values,
neutral is
the neutral element for the operation (either a small integer
or a 'magic' value, see builtin.h for details)
- PRIMLIST(opcode)
All arguments are gathered into a list which is given to
opcode.
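The role of the neutral element in a folding operation can be shown in plain C. This is only a sketch of the semantics for the arity 0 case; it does not model VM opcodes, and fold_args, add and mul are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*BinOp)(long, long);

static long add(long a, long b) { return a + b; }
static long mul(long a, long b) { return a * b; }

/* Combine n arguments pairwise, starting from the neutral element.
   With no arguments the neutral element itself is the result. */
static long fold_args(BinOp op, long neutral, const long *args, size_t n)
{
    assert(op != NULL);
    long acc = neutral;
    for (size_t i = 0; i < n; ++i)
        acc = op(acc, args[i]);
    return acc;
}
```

Calling fold_args(add, 0, ...) mirrors how (+) can yield 0 and (+ 1 2 3) can yield 6 with the same two-argument combining step.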
Since LispMe can't infer the types for primitive operations, you
should always provide a help string for them, which is why these macros
have no _H suffix.
Natives
Please note that the calling convention for native functions
has changed from V3.0 to V3.1. In V3.0, arguments were passed
as a list, but in V3.1 and later, arguments are passed in an array.
Using natives, you can associate a Scheme function with a
plain C function. The C function
is called with all arguments in an array, which is a
copy from the actual runtime stack S. This means that you
can modify it to pass it on to other functions.
The leftmost argument is the first item in the array (args[0]),
etc.
For your convenience, all arguments are typechecked at runtime
before calling the C function, therefore the arguments'
types must be declared.
It's also possible to define natives with a variable number of
arguments. In this case, the actual number of arguments is given
to the C function. Additional arguments are not typechecked,
but you can do this yourself. The last array element contains a
list of the additional elements.
The C function should return its result as a single PTR.
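The calling convention for fixed-arity natives can be sketched with toy types. Everything here (DemoVal, DemoEntry, call_native, native_scale) is a stand-in invented for the example; real natives receive and return PTR values and are type-checked by LispMe itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy tagged value standing in for LispMe's PTR. */
typedef enum { T_INT, T_STR } DemoType;
typedef struct { DemoType type; long num; const char *str; } DemoVal;

typedef DemoVal (*DemoNative)(DemoVal *args);

typedef struct {
    int        arity;
    DemoType   stypes[5];   /* declared parameter types */
    DemoNative fun;
} DemoEntry;

/* Toy native: args[0] is the leftmost argument. */
static DemoVal native_scale(DemoVal *args)
{
    DemoVal r = { T_INT, args[0].num * args[1].num, NULL };
    return r;
}

/* Check arity and declared types, then call the native with a copy of the
   arguments. Returns 0 on success, -1 on an arity or type error. */
static int call_native(const DemoEntry *e, const DemoVal *actual, int n,
                       DemoVal *result)
{
    DemoVal copy[5];
    assert(e != NULL && e->fun != NULL && result != NULL);
    if (n != e->arity || n > 5)
        return -1;
    for (int i = 0; i < n; ++i) {
        if (actual[i].type != e->stypes[i])
            return -1;
        copy[i] = actual[i];   /* the native may modify its copy freely */
    }
    *result = e->fun(copy);
    return 0;
}
```

Because the check happens before the call, the native body can use its arguments without testing their types again.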
Implementing a builtin operation as a native C function is much
simpler than defining new VM opcodes, but needs some more runtime
overhead, so I'd recommend native C functions especially for
wrapping PalmOS API calls when top performance is not needed.
LispMe creates a default help string for natives by listing its
parameters' types. However, you can override the default help
string by specifying one in the corresponding macros ending
with a _H suffix.
Defining variables
To create variables usable both from Lisp and from the extension
module, 4 steps are required:
- Define the variable's name as a symbol
- Declare a C variable PTR varCell pointing to the
Lisp variable's cell
- In the module control function, on message INIT_HEAP:
Call this function:
createGlobalVar(MKPRIMSYM(MODULE, VARIABLE), initVal());
- In the module control function, on message SESS_CONNECT:
Cache the cell of your variable (not necessary, but improves
performance)
varCell = symLoc(MKPRIMSYM(MODULE, VARIABLE));
These steps ensure that your variable appears in LispMe's global
environment frame and is normally accessible from LispMe and is
protected from garbage collection.
On the C side, you should access the variable with locVal:
PTR oldVal = locVal(varCell);
locVal(varCell) = foo(oldVal);
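Put together, the four steps might look like the following sketch. createGlobalVar, symLoc and locVal are mocked here with a tiny array so the example is self-contained; their real definitions come from LispMe. COUNTER_SYM stands for a hypothetical MKPRIMSYM(MODULE, VARIABLE) value, and the ModuleMessage enum is a local stand-in for the one in store.h.

```c
#include <assert.h>

typedef int PTR;   /* stand-in for LispMe's PTR */
typedef enum { APP_START, SESS_CREATE, INIT_HEAP, SESS_CONNECT,
               SESS_DISCONNECT, SESS_DELETE, APP_STOP } ModuleMessage;

/* --- mocks of the LispMe side ------------------------------------- */
#define DEMO_CELLS 8
static PTR demoHeap[DEMO_CELLS];   /* value stored in each global cell */
static PTR demoSyms[DEMO_CELLS];   /* symbol owning each cell */
static int demoUsed;

#define locVal(cell) (demoHeap[(cell)])

static void createGlobalVar(PTR sym, PTR val)
{
    if (demoUsed < DEMO_CELLS) {
        demoSyms[demoUsed] = sym;
        demoHeap[demoUsed] = val;
        ++demoUsed;
    }
}

static PTR symLoc(PTR sym)   /* cell index, or -1 if unknown */
{
    for (int i = 0; i < demoUsed; ++i)
        if (demoSyms[i] == sym)
            return i;
    return -1;
}

/* --- the extension module ------------------------------------------ */
#define COUNTER_SYM 0x044A   /* hypothetical MKPRIMSYM(MODULE, VARIABLE) */
static PTR varCell = -1;

static void init(ModuleMessage mess)
{
    switch (mess) {
    case INIT_HEAP:    createGlobalVar(COUNTER_SYM, 0); break;
    case SESS_CONNECT: varCell = symLoc(COUNTER_SYM);   break;
    default:           break;
    }
}

/* Read and update the variable through its cached cell. */
static void bumpCounter(void)
{
    assert(varCell >= 0);
    PTR oldVal = locVal(varCell);
    locVal(varCell) = oldVal + 1;
}
```

The variable is created once per heap, while the cell is looked up again on every session connect.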
The module control function
It is often necessary to perform some actions in your extension
module when events like starting LispMe, creating a new session etc.
occur. LispMe lets you define a module control function
for each extension module, which gets called on those events.
The module control function is the first entry in each
BuiltInModule array. It is put into the table by
MODULE_FUNC(init)
and is declared like this
static void init(ModuleMessage mess);
The enumeration ModuleMessage (defined in store.h)
defines some action codes which are sent to all modules on several
occasions:
Code | Sent on | Purpose
---|---|---
APP_START | LispMe has just been launched | Open libraries, read profiles, register hooks for datatypes.
SESS_CREATE | A new session has been created | All standard session database records have been created, but the heap is not yet initialized and write protection is not yet turned off.
INIT_HEAP | The heap is being initialized | Create your global variables here.
SESS_CONNECT | A session has been activated | The heap is already writable and all user-defined objects have been unpickled already. Cache pointers to your global variables here.
SESS_DISCONNECT | A session is about to be deactivated | The heap is still writable and no user-defined objects have been pickled yet.
SESS_DELETE | A session is about to be deleted | The heap is no longer accessible; free your session data here.
APP_STOP | LispMe is shutting down | Close libraries, write modified profiles.
"Constructive" events are sent to modules in ascending order,
"destructive" events in descending order.
The type system used in LispMe is very simple, each type is represented
by a small integer (see @@ty in builtin.h) and
has a symbolic name starting with ty.
User-defined types are represented by small integers starting with
0, too, but when they're used in a context where builtin types are
permitted as well (for example in a signature declaration), these
numbers are offset by tyFOREIGN to distinguish them from
builtin types. Use the FOREIGN(ftype) macro in this
case.
Currently, only MAX_FOREIGN_TYPES foreign types are allowed (see
@@maxFT in extend.h) to keep the callback tables
small, but this is not a fixed limit.
General
LispMe reserves a PTR bit pattern for foreign or
user-defined types. An object of foreign type consists of two
cons-cells on the heap, the first cell contains the type id (which is
an integer in the range from 0 to MAX_FOREIGN_TYPES-1)
and a
PTR to another cell which holds an arbitrary 32 bit value whose
meaning is user-defined. By allocating the descriptor cell at a lower
address than the 32 bit value LispMe can assure that the contents of
the value cell aren't misinterpreted as PTR values.
For many user-defined types, 32 bits are sufficient to store the data
itself, for example a timestamp (see date.h) or a reference
to an open PalmOS database (see dm.h) fit nicely in 32 bits.
However, if you need more space, simply put a 32 bit pointer here.
The type declared for these 32 bit values is void*, so you have
to cast accordingly.
To define your own type, nothing special is required, just use the
function
allocForeign(void* val, UInt8 ftype)
to create an object of your type ftype. Use the macro
IS_FOREIGN(ptr) to check if a PTR is a foreign value,
FOREIGNTYPE(ptr) to extract the type id and
FOREIGNVAL(ptr) to extract the value from a foreign value.
You should define a symbolic name for your type id in extend.h
to avoid using the same type id twice.
The structure containing your type's value can perfectly contain
pointers (PTR) to other LispMe values on the heap (see the
implementation of ports in port.c which uses LispMe strings
as I/O buffers and port-specific data). You just have to remember to
care for those PTRs in your foreign types' call-back functions.
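The pattern of wrapping a C structure in a foreign value and checking the type id before casting can be sketched as follows. DemoForeign is a toy model of the descriptor/value cell pair, and demoAllocForeign, FT_POINT, makePoint and pointX are names invented for this example; real code would use allocForeign, IS_FOREIGN, FOREIGNTYPE and FOREIGNVAL on PTR values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model: type id plus a value cell holding a void* payload. */
typedef struct {
    uint8_t ftype;   /* type id, 0 .. MAX_FOREIGN_TYPES-1 in LispMe */
    void   *val;     /* 32 bit payload in the real system */
} DemoForeign;

enum { FT_POINT = 3 };   /* hypothetical type id */
typedef struct { int x, y; } Point;

/* Mock of allocForeign(): returns NULL when out of memory. */
static DemoForeign *demoAllocForeign(void *val, uint8_t ftype)
{
    DemoForeign *f = malloc(sizeof *f);
    if (f != NULL) {
        f->ftype = ftype;
        f->val = val;
    }
    return f;
}

/* The payload is too big for the value cell, so store a pointer to it. */
static DemoForeign *makePoint(int x, int y)
{
    Point *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->x = x;
    p->y = y;
    DemoForeign *f = demoAllocForeign(p, FT_POINT);
    if (f == NULL)
        free(p);
    return f;
}

/* Check the type id before casting the payload. Returns 1 on success. */
static int pointX(const DemoForeign *f, int *out)
{
    assert(out != NULL);
    if (f == NULL || f->ftype != FT_POINT)
        return 0;
    *out = ((const Point *)f->val)->x;
    return 1;
}
```

Checking the type id first is what makes the cast from void* safe.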
Default behavior
LispMe provides some default behavior for foreign types:
- The type name prints as foreign type id in error
messages
- Values are printed [foreign type value]
where value is the hexadecimal content of the value
cell
- Two foreign values are considered eqv? when both their
type id and their values are equal.
- Nothing is done when garbage collecting foreign types or when
connecting/disconnecting a session
You can replace the default behavior by registering type-specific hooks:
Registering type-specific hooks
In general, all registration functions should be called in your
module control function, on message APP_START, to make sure
that those hooks are active all the time.
Registering the type name
To make LispMe print the name of your type (e.g. in error messages),
announce it by calling
registerTypename(UInt8 ftype, char* name)
Otherwise it will be printed as foreign type n.
Registering the printing function
You can define a custom printing function for your type, which will
be called by display,
write,
object->string,
or the REP loop. Do this by calling
registerPrinter(UInt8 ftype, printerType fp)
where printerType is the type of the printing function:
void myPrinter(Boolean machineFormat, void* value)
Your printing function is called with two parameters, the first
indicating if the output is caused by write (true)
or by display (false), the second is the value
itself.
In your printing function use outStr(char*) to output a string
and writeSEXP(PTR) to recursively output subobjects (if your
type contains a PTR). Both functions are defined in io.c
and care for buffer length and print depth checking automatically.
To prepare parts of your output (e.g., with StrIToA()) use
the character buffer token, which is defined in
io.c, too.
Registering the comparison function eqv?
To implement your type's idea of being equivalent to another value, call
registerEQV(UInt8 ftype, eqvType fp)
where eqvType is the type of the equivalence function:
Boolean myEQV(void* value1, void* value2)
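How a registered hook replaces the default comparison can be sketched with a small dispatch table. The table, demoRegisterEQV, demoEqv and nameEqv are assumptions made for this example and only mimic the behavior described above; LispMe's own tables and registerEQV are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_MAX_FOREIGN = 16 };   /* hypothetical table size */
typedef int (*DemoEqv)(void *v1, void *v2);

static DemoEqv eqvTable[DEMO_MAX_FOREIGN];   /* NULL means: use the default */

/* Mock of registerEQV(): remember the hook for one type id. */
static void demoRegisterEQV(unsigned ftype, DemoEqv fp)
{
    if (ftype < DEMO_MAX_FOREIGN)
        eqvTable[ftype] = fp;
}

/* Default rule: same type id and same value cell contents.
   A registered hook overrides the value comparison. */
static int demoEqv(unsigned t1, void *v1, unsigned t2, void *v2)
{
    if (t1 != t2)
        return 0;
    if (t1 < DEMO_MAX_FOREIGN && eqvTable[t1] != NULL)
        return eqvTable[t1](v1, v2);
    return v1 == v2;
}

/* Custom rule for a type whose value cell points to a C string. */
static int nameEqv(void *v1, void *v2)
{
    assert(v1 != NULL && v2 != NULL);
    return strcmp((const char *)v1, (const char *)v2) == 0;
}
```

Without the hook, two distinct blocks with equal contents compare as different because only the pointers are compared.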
Registering the memory hook
This hook allows customizing the actions performed when values of your
type are garbage collected or pickled/unpickled (this is explained in
detail in the next section). To install the hook,
call
registerMemHook(UInt8 ftype, memHookType fp)
where memHookType is the type of the memory hook function:
void myMemHook(MemMessage mess, PTR obj)
The enumeration MemMessage (defined in store.h)
defines the action codes which are sent to all values of a foreign
type when a memory hook function has been registered. See
here for some of these occasions.
Code | Sent on | Purpose
---|---|---
MEM_MARK | mark phase of GC | The value is still in use and has just been marked to avoid deleting it during the scan phase. If your type contains LispMe PTRs to the heap, you must mark() them here, too. (See sample.c for an example.)
MEM_DELETE | value is about to be deleted | Either the value is no longer accessible and thus deleted during the scan phase of garbage collection, or the entire heap is cleared due to a user command. In any case you can be sure that the value is unpickled and the session is connected. Free/release any resources associated with your type here.
MEM_PICKLE | pickle value | The current session has been disconnected and each foreign value has the chance to turn into its external representation.
MEM_UNPICKLE | unpickle value | Before connecting to the session, each foreign value has the chance to (re-)create its internal from its external representation.
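A memory hook for a type that owns a memory block and refers to another heap value could be structured like the sketch below. DemoObj, Blob, demoMark and blobMemHook are invented for the example: the real hook receives a PTR, reads the payload with FOREIGNVAL and calls LispMe's mark(); see sample.c for the actual code.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { MEM_MARK, MEM_DELETE, MEM_PICKLE, MEM_UNPICKLE } MemMessage;

/* Toy heap object with a mark bit; stands in for a LispMe value. */
typedef struct DemoObj { int marked; void *val; } DemoObj;

/* Payload of a hypothetical foreign type: it owns a memory block and
   refers to another heap object. */
typedef struct { char *bytes; DemoObj *label; } Blob;

static void demoMark(DemoObj *o) { if (o != NULL) o->marked = 1; }  /* mock of mark() */

static void blobMemHook(MemMessage mess, DemoObj *obj)
{
    assert(obj != NULL);
    Blob *b = obj->val;
    if (b == NULL)
        return;
    switch (mess) {
    case MEM_MARK:
        demoMark(b->label);   /* keep the referenced value alive */
        break;
    case MEM_DELETE:
        free(b->bytes);       /* release everything the value owns */
        free(b);
        obj->val = NULL;
        break;
    default:                  /* pickling is not handled in this sketch */
        break;
    }
}
```

Forgetting the MEM_MARK branch would let the collector reclaim the referenced value while the foreign object still points to it.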
Motivation
Often 32 bits for a foreign value are sufficient (e.g. for a timestamp),
but when more data is needed and the foreign type is implemented by a
32 bit pointer to a memory block (e.g. allocated by MemPtrNew()),
there's a problem when exiting and re-launching LispMe: The pointer is
no longer valid, since the memory has been freed by PalmOS on exiting LispMe.
This problem is unique to LispMe due to the fact that the heap in LispMe
itself is a persistent object surviving program exits. You should try
to make your data types behave the same way. This can be done by
implementing the pickling/unpickling protocol.
A session can be in two states:
- connected (=unpickled): an active session (possibly including
memory blocks from the dynamic heap) while LispMe is running
- unconnected (=pickled): either an inactive session or LispMe is
not running at all
Pickling a session means saving all needed data from the dynamic heap
to a more permanent location, for example by writing it to a PalmOS
database (the session database itself is here the first choice to keep all
relevant session data in one place if possible).
Unpickling is the reverse operation. One should also consider what to
do when pickling foreign types containing PalmOS resources like open
databases, sockets, or files. Remember that session databases can be
backed up or beamed to other devices!
Sample scenarios
Simple data fitting in 32 bit
Example: Timestamp, date and time types in date.c. Here
nothing has to be done at all.
Larger data structures
Example: The FooType datatype in sample.c. These
objects are allocated on the dynamic heap with MemPtrNew()
(in fact, MemHandleNew() would be better to allow
relocation of the memory blocks by the OS) in nativeMakeFoo()
(note how LispMe strings are accessed here). Handling MEM_MARK
and MEM_DELETE is straightforward, however the
pickling/unpickling is more interesting:
For each FooType to be pickled a record is created in the
current session database (which is always accessible by the
variable dbRef) and the memory block is simply copied.
The value cell is updated: it no longer contains the address
of the memory block, but the record number in the database!
On unpickling, the inverse is done: Retrieving the record by number,
allocating memory from the dynamic heap and removing the record.
One subtle fact is important here: The record numbers must stay
constant (remember that PalmOS shifts record numbers when inserting
in the middle!), so new records have to be put at the end
of the database. But how can we be sure that on unpickling (when the
record of the value being unpickled is removed) record numbers don't
shift? LispMe guarantees that objects are unpickled in the
reverse order they have been pickled, so the record to be
deleted is always the last one in the database and thus no
record numbers shift when deleting this last record.
There are utility functions standardPickle() and
standardUnpickle() in util.c implementing the
behavior described above.
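The append-on-pickle, remove-last-on-unpickle scheme can be simulated in a few lines. This is not the code of standardPickle() or standardUnpickle(): the array demoDb stands in for the session database, ValueCell for the value cell, and the order check in demoUnpickle is an extra guard added for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { DEMO_MAX_REC = 8, FOO_SIZE = 16 };

/* Toy session database: records are only appended or removed at the end. */
static char demoDb[DEMO_MAX_REC][FOO_SIZE];
static int  demoRecs;

/* Value cell: a pointer while connected, a record number while pickled. */
typedef union { char *mem; uintptr_t recNo; } ValueCell;

/* Copy the block into a new last record and store its number in the cell. */
static int demoPickle(ValueCell *cell)
{
    assert(cell != NULL);
    if (demoRecs >= DEMO_MAX_REC)
        return 0;
    memcpy(demoDb[demoRecs], cell->mem, FOO_SIZE);
    free(cell->mem);
    cell->recNo = (uintptr_t)demoRecs++;
    return 1;
}

/* Must run in the reverse order of demoPickle, so that the record being
   removed is the last one and no other record number shifts. */
static int demoUnpickle(ValueCell *cell)
{
    assert(cell != NULL);
    if (demoRecs == 0 || cell->recNo != (uintptr_t)(demoRecs - 1))
        return 0;
    char *mem = malloc(FOO_SIZE);
    if (mem == NULL)
        return 0;
    memcpy(mem, demoDb[demoRecs - 1], FOO_SIZE);
    --demoRecs;               /* remove the last record */
    cell->mem = mem;
    return 1;
}
```

Pickling two values gives them record numbers 0 and 1; unpickling the second one first leaves record 0 untouched.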
Instead of allocating the memory block on the dynamic heap, you can
also allocate it in a database record (similar to how LispMe allocates
strings and vectors) which has both advantages and disadvantages:
- Advantage: much more memory available,
simpler pickling/unpickling
- Disadvantage: slower access
Even when allocating as a database record, you have to
pickle/unpickle these types: Since creation and deletion of db records
will happen in an unforeseeable order, you can't store the record
number (which will shift) in the value cell, instead you must store
the record unique id (which will stay constant during the
whole lifetime of a record). On the other hand, beaming and restoring
a session db can change the unique ids, so a pickled session should
reference records by number, not by unique id.
Open resources
If at all possible, try to make foreign types wrapping PalmOS
resources transparent to session switching by dumping them to
a database and reopening them on reconnect. See dm.c for how
this is done with open PalmOS databases. If the reopen fails for
any reason, you can take advantage of a special NULL/invalid
indicator value of the PalmOS type, which you can also use as a
last resort when the foreign type doesn't allow reopening the
resource at all.
You should keep in mind that anything can happen to
the databases between disconnecting and reconnecting a session,
the session can even be continued on another device!
The following state transition diagram shows all commands/events
relevant for the extension mechanism. The commands are displayed
in bold, after that the messages sent to the extension
modules are listed in order (fixed font).
The successful pickling is recorded in the session database, so
a potentially damaged session caused by a system crash will be
recognized at the next reconnect attempt, which is denied.
Use the BUILTINCOMPKEY macro to announce a compilation
function for a new special form to the compiler. A compilation
function is declared as
typedef PTR (nativeCompFct)(PTR expr, PTR names, PTR* code);
Whenever the compiler encounters an expression whose car is
the keyword specified in BUILTINCOMPKEY, the associated
compiler function is called with these parameters:
- expr: The entire expression to be compiled
(including the keyword itself, just like LispMe's
macro)
- names: The compile-time environment, a two-level
list of symbols relative to which lexical addresses are computed.
- code: The continuation of expr, already
compiled. Prefix the VM code to this list.
If the compiler function returns an expression != BLACK_HOLE,
the entire expression is considered a tail call and the compiler is
invoked on the returned expression.
To compile sub-expressions, use any of the compiler helper functions
comp(), compseq(), complist(),
compVar(), or compConst() (see comp.h for
definitions of these).
You can use a compiler function to define macros in C. To do this
just create an expansion expression and pass it to comp():
newExpr = ...;
comp(newExpr, names, code);
return BLACK_HOLE;
The compilation function for let* (see misc.c) works
this way.
But a compilation function is not restricted to that, it can generate
any code necessary, like, for example compileWhile().
Creating new VM opcodes
If your special form needs special handling at run-time, you can get
this effect by generating an EXEC VM instruction in the
compilation function. EXEC has one parameter, which should
be the keyword of your special form. When the VM executes an
EXEC keyword instruction, it calls the compilation
function (sic!) with keyword as first argument,
a NIL names list, and a NULL continuation.
Now modify the VM registers (S, E, C, D)
as required. Have a look at compExecEval() in misc.c
to see how this works in detail.
Implementing your own ports
There is a generic mechanism for ports, which is built on top of
the foreign type mechanism. The port interface encapsulates the
device-specific differences and allows uniform access to them via the
standard Scheme I/O calls
display,
write,
newline,
read,
read-line,
read-char,
peek-char,
close-input-port, and
close-output-port.
The generic port functions care for parsing, printing, buffer handling
etc. and call port-specific functions to transfer data to/from
the actual devices. There are generally two different mechanisms for
buffering I/O data:
- Memo-like ports (Memo and Memo32 files) transfer the data directly
to the memo's memory chunk without using an additional buffer.
Those ports don't have to be closed, either.
- Buffered ports (all other) use a LispMe string in the port
structure for buffering data. These ports actually read and write
from their buffers and fill/empty the buffers on demand. These ports
have to be closed explicitly to ensure all pending buffers are
flushed.
In the following sections, only buffered ports are described, since
memo-like ports are already implemented :-)
Defining port types
Add symbolic names for your port in extend.h at the section
Input/output port subtypes (names starting with PT_).
Currently, there are 16 different port types allowed (to keep the
dispatch tables small), but this is extensible. Note that no
distinction between input and output ports is made here, in fact most
of the ports exist in both variations.
Creating ports
Registering call-back functions
Just like foreign types, all registration functions should be called in your
module control function, on message APP_START, to make sure
that those hooks are active all the time. For an input port, use
registerReadFct(UInt8 ptype, readFctType fp);
where readFctType is the type of the
reader function.
For an output port use
registerWriteFct(UInt8 ptype, writeFctType fp);
where writeFctType is the type of the
writer function.
Note that, in contrast to foreign types, registering call-backs is
not optional for ports; there is no default mechanism
(how could there be one?).
Input ports
Structure
typedef struct {
UInt8 type;
UInt8 status;
UInt32 uid; // == 0 to use stringBuffer instead
UInt16 pos; // current read position from beginning of buffer
UInt16 next; // next record (DOC)
UInt16 nRec; // number of records (DOC)
UInt16 bufLen; // length of buffer
DmOpenRef dbRef; // MEMO, MEMO32, or DOC database
PTR impl; // implementation specific
PTR buf; // a LispMe string used as buffer
} InPort;
Creation
Use makeBufInPort(UInt8 ptype, PTR impl, PTR buf)
to create a buffered input port. You have to provide (or create) the
string used to buffer input yourself (you can use the empty string here).
The reader function Boolean myReader(ReaderCmd cmd, InPort* port)
is called with one of two commands:
- RC_READ: Fill the read buffer buf with data
actually read from the source and set both bufLen and
pos accordingly. It's possible to allocate a new buffer
string on each call or reuse an existing one. Return whether the
operation was successful.
- RC_CLOSE: Set status to INP_STATUS_CLOSED
to make subsequent read attempts fail and release any resources you
needed. Setting buf to NIL is also a good idea
to allow the buffer to be garbage collected.
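A reader function following this protocol might be structured as in the sketch below. DemoInPort is a reduced stand-in for InPort with a plain C buffer instead of a LispMe string, the source field plays the role of the device, and DEMO_STATUS_CLOSED stands in for INP_STATUS_CLOSED, whose actual value is not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { RC_READ, RC_CLOSE } ReaderCmd;
enum { DEMO_STATUS_OPEN = 0, DEMO_STATUS_CLOSED = 1 };   /* stand-in values */

/* Reduced stand-in for InPort. */
typedef struct {
    unsigned char  status;
    unsigned short pos;      /* read position within buf */
    unsigned short bufLen;   /* number of valid bytes in buf */
    char          *buf;
    const char    *source;   /* hypothetical device: text still to deliver */
} DemoInPort;

enum { DEMO_CHUNK = 4 };
static char demoChunk[DEMO_CHUNK];   /* buffer reused on every RC_READ */

/* Returns 1 on success, 0 on failure or end of input. */
static int demoReader(ReaderCmd cmd, DemoInPort *port)
{
    assert(port != NULL);
    switch (cmd) {
    case RC_READ: {
        if (port->status == DEMO_STATUS_CLOSED)
            return 0;
        size_t n = strlen(port->source);
        if (n == 0)
            return 0;                     /* nothing left to read */
        if (n > DEMO_CHUNK)
            n = DEMO_CHUNK;
        memcpy(demoChunk, port->source, n);
        port->source += n;
        port->buf = demoChunk;
        port->bufLen = (unsigned short)n;
        port->pos = 0;                    /* restart at the buffer's beginning */
        return 1;
    }
    case RC_CLOSE:
        port->status = DEMO_STATUS_CLOSED;
        port->buf = NULL;   /* in LispMe: set buf to NIL so it can be collected */
        return 1;
    }
    return 0;
}
```

Each RC_READ delivers at most one chunk, so the generic port code would call the reader again whenever pos reaches bufLen.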
Output ports
Structure
typedef struct {
UInt8 type;
UInt8 flags;
UInt32 uid; // uid of memo-like database
UInt16 pos; // current write position from beginning of buffer
UInt16 bufLen; // length of buffer
DmOpenRef dbRef; // MEMO or MEMO32 database reference
PTR impl; // depending on port type
PTR buf; // a LispMe string used as buffer
char* mem; // cache the actual buffer
} OutPort;
Creation
Use makeBufOutPort(UInt8 ptype, PTR impl, UInt16 bufLen, UInt8 flags)
to create a buffered output port. This function allocates a buffer string
of length bufLen. Possible flags (OR'ed together) are
- OPF_SUPPRESS_NOPRINT: Don't print a top-level #n
to the port.
- OPF_AUTOFLUSH: Automatically flush the buffer
(i.e., call the writer function) to the device
after every port write command.
- OPF_NUL_TERMINATE: Append a \0 byte after each
write operation.
The writer function void myWriter(WriterCmd cmd, OutPort* port)
is called with one of two commands:
- WC_WRITE: Flush the write buffer buf to the
destination. Only the substring from position 0 to pos is
relevant. To reuse the buffer for subsequent write operations,
simply set pos to 0.
- WC_CLOSE: First flush the write buffer as above and set
the closed bit in flags
to make subsequent write attempts fail and release any resources you
needed.
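The two writer commands can be sketched in the same style. DemoOutPort is a reduced stand-in for OutPort, demoDevice plays the role of the destination, and DEMO_OPF_CLOSED is a made-up name and value for the closed bit in flags.

```c
#include <assert.h>
#include <string.h>

typedef enum { WC_WRITE, WC_CLOSE } WriterCmd;
enum { DEMO_OPF_CLOSED = 0x80 };   /* hypothetical closed bit */

/* Reduced stand-in for OutPort. */
typedef struct {
    unsigned char  flags;
    unsigned short pos;      /* bytes 0 .. pos-1 of mem are pending */
    unsigned short bufLen;
    char          *mem;      /* the buffer contents */
} DemoOutPort;

static char   demoDevice[64];   /* hypothetical destination */
static size_t demoDeviceLen;

/* Append the pending bytes to the device and reset the buffer. */
static void demoFlush(DemoOutPort *port)
{
    size_t n = port->pos;
    size_t room = sizeof demoDevice - demoDeviceLen;
    if (n > room)
        n = room;            /* this toy device silently truncates */
    memcpy(demoDevice + demoDeviceLen, port->mem, n);
    demoDeviceLen += n;
    port->pos = 0;           /* reuse the buffer for later writes */
}

static void demoWriter(WriterCmd cmd, DemoOutPort *port)
{
    assert(port != NULL && port->mem != NULL);
    switch (cmd) {
    case WC_WRITE:
        demoFlush(port);
        break;
    case WC_CLOSE:
        demoFlush(port);     /* flush first, then mark the port closed */
        port->flags |= DEMO_OPF_CLOSED;
        break;
    }
}
```

Resetting pos in one place keeps WC_WRITE and WC_CLOSE consistent about which bytes have already reached the device.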
General programming tips
Coding under PalmOS
and especially within LispMe context has several pitfalls:
- There's only about 2 KB of stack space. Keep local variables to a minimum,
especially in potentially recursive functions like the mem handler.
- There's not too much global memory, either, and LispMe already
uses a lot of it.
- LispMe disables write protection to database memory,
so be extra careful with pointers, you can easily overwrite
Memory Manager or system structures requiring a cold reboot!
- For the same reason, you must wrap user interface calls in
ReleaseMem() and GrabMem().