$$$$$$$$$$$$$$$ ANNOUNCING THE ONE, THE ONLY $$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$ >>> BIBOP LISP <<< $$$$$$$$$$$$$$$ December 1973; updated March 1974 The (in)famous "Bibop" (pronounced "bee-bop") LISP scheme has been available for some time now and seems to be more or less reliable. Bibop means "BIg Bag Of Pages", a reference to the method of memory management used to take advantage of the memory paging features of ITS. The average LISP user should not be greatly affected in converting to this new LISP (which very eventually will become the standard LISP, say in a few months). This document is divided into three sections: I. What the average user will need to know. This section is very short (about four pages) and should be read by everyone who wants to use Bibop LISP. II. What sophisticated users (e.g. those who construct and dump systems like MACSYMA and CONNIVER) will need to know. This section details general storage strategies and other features of Bibop, and describes at length how to exploit them. III. What only LISP system hackers and the insanely curious will need to know. This section describes some of the internal workings of Bibop LISP, and is intended primarily as an aid to LISP system hackers. Hopefully most people will need to read only the first section to use Bibop successfully. Much effort has been expended to keep Bibop and non-Bibop LISPs compatible. In particular, if LISP code written for non-Bibop LISP also works "as is" in Bibop LISP, then so will the corresponding LAP and FASL files; that is, the compiler output is completely compatible (and will be kept so). Unless you must alter source code for Bibop LISP, there is no need to recompile. I apologize in advance for the confusion this document will undoubtedly cause. Questions, comments, and complaints may be registered by saying :BUG BIBOP ^C or :FEATURE BIBOP ^C to DDT on either the Mathlab (ML) or the AI PDP-10. Good luck! -- Guy Steele (GLS) Bibop LISP - Section I. - For the Average LISP User - page 2 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$ I. INFORMATION FOR THE AVERAGE USER $$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$ [1] $$$$$ Atomic Symbols Are Different $$$$$ Atomic symbols in Bibop LISP are very different from those in non-Bibop LISP. The PNAME, ARGS, and VALUE "properties" are not stored on the property list, but in a secret hidden place. The ARGS property can be examined and altered with the function ARGS as in non-Bibop LISP, the VALUE property with EVAL, SET, SETQ, and BOUNDP, and the PNAME property with EXPLODE, GETCHAR, etc. The car of an atomic symbol is not -1 (#777777) in Bibop LISP. The cdr of an atomic symbol is still the property list, however; using RPLACD on an atom, though hackish, also works as before. EXPR and SUBR properties are kept on the property list as in non-Bibop LISP. Note well: (CONS (CAR NIL) ) is most definitely not a good way to construct a new atomic symbol (in Bibop, the result won't be an atom at all, but a dotted pair or whatever). If it is necessary to create a new atomic symbol in this manner, use the function COPYSYMBOL, which works as described in LISP ARCHIV in both Bibop and non-Bibop LISPs. $$$$$ [2] $$$$$ The Allocator Is Different $$$$$ When a Bibop LISP starts up, it says: BIBOP LISP ALLOC? All the usual responses have their usual effects; ^Q, ^W, space, Y, N, etc. work as in non-Bibop LISP. Some questions asked are different, however, and they are presented in a different order. Questions and their default values look something like: Bibop LISP - Section I. - For the Average LISP User - page 3 CORE = 51 LIST = 20000 FIXNUM = 5000 FLONUM = 2000 BIGNUM = 2000 SYMBOL = 2000 ARRAY = 200 REGPDL = 10000 SPECPDL = 4000 FXPDL = 10000 FLPDL = 2000 The first parameter controls the total amount of memory desired, in units of 1024. (1k) PDP-10 words. This is not an absolute constraint as it is in non-Bibop lisp, but merely a specification of the initial size desired for the Bibop lisp. The next six parameters control the maximum size for each of six "spaces" used to contain various objects. They are: LIST lists and dotted pairs FIXNUM fixnums (like FXS in non-Bibop LISP) FLONUM flonums (like FLS in non-Bibop LISP) BIGNUM bignums (actually, just the bignum headers, which hold the sign and point to the rest) SYMBOL atomic symbols (this is why you can't use CONS to create symbols; they are not in LIST space!) ARRAY array pointers (seen primarily on property lists as values of ARRAY properties; the allocation for these need not be large) Initially, you start off with much less core than you specify, but Bibop will "grow" each space to the specified size as soon as it is convenient and/or necessary. If the total amount of core you specify is greater than the total of the individual parts plus the size of the initial LISP system, Bibop will portion out the remainder to deserving spaces (like list and fixnum) in order to make the total size of the LISP be what you asked for. The four pdl parameters are upper limits to the sizes (in PDP-10 words) of the four push-down stacks in LISP. Don't be afraid to give large values for these allocations (or for others, either) since you are not really given as much core as you ask for, but only one page of core per pdl; as the pdls grow, more pages of core are added. On the other hand, the default allocations for the pdls are already quite large, and should be sufficient for most programs. As in non-Bibop LISP, if you type ^Q or other magic characters at the ALLOC? question, your .LISP. (INIT) file Bibop LISP - Section I. - For the Average LISP User - page 4 will be read. The first form should be a comment, as usual. The names of the spaces should be the same as the names used by ALLOC given above. Note that supplying nonexistent space names in the comment doesn't hurt, so you can use the same comment in non-Bibop LISPs as well as Bibop LISPs; for example, the comment (COMMENT FXS 4000 FIXNUM 5000 FLS 2000 SYMBOL 4000 FLONUM 2000 BIGNUM 1400) will for non-Bibop set the size for FXS=4000, FLS=2000 and for Bibop will set FIXNUM=5000, FLONUM=2000, SYMBOL=4000, BIGNUM=1400. $$$$$ [3] $$$$$ The ALLOC Function $$$$$ The primary feature of Bibop is that it allows you to expand memory dynamically. It does this automatically to a certain extent; however, the user may explicitly alter the sizes of memory spaces with the function ALLOC. This is a subr whose single argument should look like the comment form given in a .LISP. (INIT) file. Example: (ALLOC '(LIST 40000 FIXNUM 6000 SYMBOL 5000)) will dynamically re-allocate LIST space to be 40000 words, etc. (Note that LIST space will not actually expand to the specified size until the next time a garbage collection for LIST space is invoked.) At present Bibop can only expand memory, not shrink it, so if you specify a size for a space smaller than its current actual size, your request is ignored for that space. ALLOC returns T. (The ALLOC function is actually considerably more complicated, but the above description should suffice for the purposes of most users. An extended description of ALLOC may be found in section II.[3].) $$$$$ [4] $$$$$ A Few New Messages $$$$$ If you should run out of memory for any reason, you will see a message similar to that in non-Bibop LISP, e.g.: FIXNUM STORAGE CAPACITY EXCEEDED ;BKPT GC-LOSSAGE This indicates that you have run out of address space and therefore cannot add any more memory. However, in Bibop LISP another kind of break may occur: Bibop LISP - Section I. - For the Average LISP User - page 5 ;BKPT GC-OVERFLOW this indicates that some space has reached the maximum specified by ALLOC; the name of the space which got too large is the value of the variable ARGS, as usual. Now all is not lost at this point in a Bibop LISP; you may call the function ALLOC to make FIXNUM space bigger, then say $P to continue the computation. (Note that this is not really an error break, but more like a daemon; therefore $P does not error out to top level the way it would for, say, a FAIL-ACT break). To abort the computation, use ^G. If you see a message that looks something like ;ADDING A NEW FIXNUM SEGMENT - FIXNUM SPACE NOW 2000 WORDS it merely means that the garbage collector has decided to add more memory. You will only see this if you have typed ^D to see the garbage collector statistics (as in non-Bibop). Yet another message you may see (which also does not exist in non-Bibop LISP) is: ;BKPT PDL-OVERFLOW This means that you have used excessive amounts of pdl (by exceeding the specified PDLMAX parameter - see section II.6), and therefore are probably stuck in infinite recursion or something. If you want to keep going, just say $P and Bibop will try to extend the pdl if it can. To abort the computation, use ^G. (This procedure is similar to that for GC-OVERFLOW - see above.) Bibop LISP - Section II. - For Sophisticated LISP Users - page 6 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$ II. INFORMATION FOR SOPHISTICATED USERS $$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$ [1] $$$$$ The Garbage Collector Explained $$$$$ For each memory space (presently these are LIST, FIXNUM, FLONUM, BIGNUM, SYMBOL, and ARRAY) there are three parameters controlling when and how to expand it. These are called the GCSIZE, GCMIN, and GCMAX parameters. The GCSIZE parameter indicates the size that space is normally expected to be; GCMIN indicates the minimum number of words free after a garbage collection; and GCMAX indicates the maximum size to which the space is permitted to expand. The garbage collector is invoked whenever no free cells are left for a space. (We shall ignore the case where the user explicitly invokes it.) The algorithm for getting free cells is roughly as follows: [a] If the total size of the space under question is less than its GCSIZE parameter, then add some more memory to that space and return immediately, without doing a full-blown garbage collection. (If possible, enough memory will be added to bring the total size of the space up to its GCSIZE.) However, if the total size already exceeds the GCSIZE, or if adding more memory would cause the space to exceed the GCMAX parameter, do not add any memory yet, but go on to step [b] instead. [b] Do a garbage collection. If the number of words free is less than the GCMIN parameter, then add enough core to satisfy GCMIN (see below). (However, if satisfying the GCMIN would cause GCMAX to be exceeded for that space, then only as much as GCMAX allows will be added. On the other hand, at least "some" (as described above) will be added.) [c] If GC can't get more memory (the LISP is huge, or ITS won't give LISP any core), then a GC-LOSSAGE interrupt occurs. [d] If the total size of the space exceeds the GCMAX parameter, then a GC-OVERFLOW interrupt occurs. Note that this is not an error interrupt; it is rather more like the GC-DAEMON interrupt, and will not be trapped by ERRSET. The purpose of the GCSIZE parameter is to allow storage spaces to swell quickly to a predetermined size, e.g. while loading a package of functions. When more memory is needed, it is added immediately, without a time-consuming garbage collection. Bibop LISP - Section II. - For Sophisticated LISP Users - page 7 The GCMIN parameter takes over when a storage space becomes larger than GCSIZE. GCMIN may be either a fixnum or a flonum. If it is a fixnum, then more memory is added whenever the number of words reclaimed is less than that. If the GCMIN is a flonum, then more memory is added whenever the number of words reclaimed is less than the GCMIN times the total size of the space. Thus a GCMIN of 2000 for some space says to add more memory if fewer than 2000 words are reclaimed for that space; while a GCMIN of 0.25 says the space should be kept 25% free, adding more memory when necessary to maintain that percentage. Values of 0.2 to 0.4 are recommended for LIST, FIXNUM, and FLONUM spaces; 100. to 500. is recommended for BIGNUM, SYMBOL, and SAR spaces. (If your application involves creating and throwing away large numbers of SYMBOLs, you might be well-advised to use 0.3 or so for the SYMBOL GCMIN.) The GCMAX parameter is used primarily to keep tabs on programs which might go out of control and use up unlimited amounts of memory; this is a problem only because a space, once expanded, may not be shrunk again. A user-supplied GC-OVERFLOW interrupt function may be used to keep track of memory expansions (see section II.[4] below). $$$$$ [2] $$$$$ Dynamic Pdl Expansion $$$$$ The parameters you give to ALLOC for pdl sizes (see above) are very similar to the GCMAX parameters for free storage spaces: the pdls do not actually occupy that much core, but are rather permitted to grow that large if necessary. There is one difference, however: enough address space for the specified pdl sizes is reserved to accomodate the pdls, and at present this address space may not be used for anything else. Thus, if you allot 100K for pdls, the rest of your LISP may never grow larger than 156K. On the other hand, the initial allocations for pdls are quite reasonable (total about 12K), and few programs require more than that. Now the same remark made above about free storage applies to pdls also: one may not want an infinite recursion to gobble up 8K of pdl before giving a pdl overflow message. Each pdl therefore has a PDLMAX parameter, similar to GCMAX for storage spaces. The PDLMAX parameter for a given pdl specifies that when the number of words on the pdl exceeds PDLMAX, a PDL-OVERFLOW user interrupt (number 12.) should be given. The user interrupt function receives as its argument one of the four atomic symbols REGPDL, FLPDL, FXPDL, or SPECPDL. When this happens, the PDLMAX for that pdl is pushed up enough to give the user interrupt a little room to work in; but if the user interrupt function uses too much pdl itself without explicitly setting the PDLMAX up even Bibop LISP - Section II. - For Sophisticated LISP Users - page 8 further, it may be recursively re-invoked. If the absolute limit on the size of the pdl is reached, an uncorrectable error occurs. If this happens, the PDLMAX parameters will be reset to what they were before the PDL-OVERFLOW interrupt pushed them up. $$$$$ [3] $$$$$ The ALLOC Function Extended $$$$$ An extended form of the ALLOC function may be used to set the various parameters for spaces. Essentially, one may replace the number for a space by a 3-list of the GCSIZE, GCMAX, and GCMIN parameters (in that order). A NIL for any parameter means "don't change the current setting". Example: (ALLOC '(LIST (40000 60000 0.2) FIXNUM (4000 7000 NIL))) means to set the LIST GCSIZE to 40000 words, the LIST GCMAX to 60000 words, and the LIST GCMIN to 0.2 (meaning "keep LIST space 20% free"); and to set the FIXNUM GCSIZE and GCMAX to 4000 and 7000, and don't change the FIXNUM GCMIN. A typical call for a LISP with a maximum size of about 100K words might be: (ALLOC '(LIST (30000. 50000. 0.25) FIXNUM (6000. 11000. 0.2) FLONUM (1500. 2500. 0.2) BIGNUM (1000. 2000. 500.) SYMBOL (4000. 5000. 0.15) ARRAY (100. 600. 40.) REGPDL 8000. FLPDL 1000. FXPDL 5000. SPECPDL 8000.)) As a special hack, saying (ALLOC T) will get you back a list of this form, such that you could give it back to ALLOC to set the parameters. Thus, if you forget how to give arguments to ALLOC, just type (ALLOC T) at a Bibop LISP and figure it out! Specifying a value for a pdl sets that pdl's PDLMAX parameter. Note that you may specify only a fixnum for a pdl, not a 3-list. The following example of the use of ALLOC simulates the mode of typein at initial allocation time, and calls the ALLOC function to set parameters for the various spaces: Bibop LISP - Section II. - For Sophisticated LISP Users - page 9 (DEFUN ALLOC! NIL (TERPRI) (PRINC 'BIBOP/ LISP/ ) (PRINC (STATUS LISPVERSION)) (TERPRI) (PRINC 'ALLOC!) (TERPRI) (DO ((Q (ALLOC T) (CDDR Q)) (Y)) ((OR (NULL Q) (EQ Y '$)) (TERPRI) '*) (TERPRI) (PRINC (CAR Q)) (PRINC '/ =/ ) (PRINC (CADR Q)) (DO NIL ((NOT (< (- LINEL CHRCT) 40))) (PRINC '/ )) (AND (MEMQ (TYPEP (SETQ Y (READ))) '(LIST FIXNUM)) (ALLOC (LIST (CAR Q) Y))))) The interaction when this function is invoked would look like this (note that the user must type space after a number or $, and that the user may type a 3-list instead of a number when appropriate): BIBOP LISP 676 ALLOC! LIST = (20000 20263 0.25) 44444 FIXNUM = (5000 5046 0.2) (4000 6000 .4) FLONUM = (1000 1007 0.15) NIL BIGNUM = (1000 1000 100) (NIL NIL .2) SYMBOL = (4000 4036 40) NIL ARRAY = (300 400 10) 350 REGPDL = 13160 20000 FLPDL = 2620 $ * This function is in the file ML:GLS;BIBOP ALLOC! if you want to play with it. $$$$$ [4] $$$$$ Use of GC-OVERFLOW and GC-DAEMON Interrupts $$$$$ If you are concerned with building a system like MACSYMA or CONNIVER, you may want to be more friendly to the user than LISP is when some memory space overflows. For example, you might want to write a function which decides whether or not to add more memory, or asks the user politely, or something. This can be done by supplying a GC-OVERFLOW interrupt function (user interrupt number 13.). This function will be run whenever GC can't add more memory by itself (an explanation of this condition is given in section II.[1] above). It receives an argument which is the name of the space that overflowed. Within this interrupt function more memory probably can be obtained if desired (this is Bibop LISP - Section II. - For Sophisticated LISP Users - page 10 explained below). Here is an example of what a GC-OVERFLOW function might look like: (SETQ GC-OVERFLOW '(LAMBDA (X) (IOG NIL (TERPRI) (PRINC 'YOU/ HAVE/ RUN/ OUT/ OF/ ) (PRINC X) (PRINC '/ SPACE/./ MORE?:/ ) ;Ask if more memory desired. (COND ((MEMQ (READ) '(NIL N NO NEIN NYET)) (ERROR 'SPACE/ CAN/'T/ BE/ EXPANDED X 'GC-LOSSAGE)) (T (PRINC '/ OK/./ /() ;If so, allocate some. (ALLOC (LIST X (LIST NIL (PRINC (+ (CDR (SASSQ X '((LIST . 1400) (FIXNUM . 1400) (FLONUM . 600) (BIGNUM . 400) (SYMBOL . 400) (SAR . 100)) '(LAMBDA NIL '(NIL . 400)))) (CADR (GET (CONS NIL (ALLOC T)) X)))) NIL))) (PRINC '/ WORDS/)) (TERPRI)))))) This function tells the user what's happening and asks whether to go on or not. If the answer is affirmative, it increases the amount of core and returns to continue; otherwise it calls the ERROR function to cause a GC-LOSSAGE error. Note that SASSQ is used so that in case some new kind of space is implemented (and this may happen!) the function will still work properly. The interaction might look like this: YOU HAVE RUN OUT OF FIXNUM SPACE. MORE?: YES OK. (5000 WORDS) YOU HAVE RUN OUT OF LIST SPACE. MORE?: JA OK. (22000 WORDS) YOU HAVE RUN OUT OF FLONUM SPACE. MORE?: NO FLONUM SPACE CAN'T BE EXPANDED ;BKPT GC-LOSSAGE where the user typed "YES " to the first question, "JA " to the second, and "NO " to the third. Naturally, even more imaginative things may be done. This function is in the file ML:GLS;BIBOP GCOVER if you want to play with it. Bibop LISP - Section II. - For Sophisticated LISP Users - page 11 The GC-DAEMON user interrupt (number 20.) may be used to monitor the results of garbage collection as in a non-Bibop LISP. The argument to the GC-DAEMON function is a list of space items, where a space item is of the form ( . ) where is the name of the space, is a fixnum denoting the number of cells free just before the garbage collection, and is a fixnum denoting the number of cells free after garbage collection. (This is the same GC-DAEMON argument format now used by non-Bibop LISP.) $$$$$ [5] $$$$$ Pure Free Storage $$$$$ Note: this section is probably worth reading only if you are building a large system to be purified and run by many users. (MACSYMA people take note!) Bibop allows for four more kinds of storage spaces besides the six we have mentioned up to now. These new kinds are called pure LIST, pure FIXNUM, pure FLONUM, and pure BIGNUM. They are pure because (1) once a cell is consed up it can never be reclaimed, and (2) because this is true, such list structure can reside on read-only pages, and thus be shared between jobs in the same way pure executable code can be shared. A restriction is made that list structure pointers in pure spaces cannot point anywhere except to atomic symbols, subrs, sars, and other pure cells; this implies that (3) not only does the garbage collector not have to reclaim cells for pure spaces, but it need never mark through them (trace them) during garbage collections either; thus a time savings is realized as well as a space savings. Another restriction which follows from the above is that (4) one may not do RPLACAs, NCONCs, etc. on list structure residing in pure spaces (one might get a pure page violation trap). There are two ways to use pure free storage. The first is to make the variable *PURE non-nil before calling FASLOAD. (This is not the same as PURE, which is still used for its original purpose of making executable code pure, and to invoke the so-called "XCT hack".) In a non-Bibop LISP, *PURE is ignored. In a Bibop LISP, any constant list structure referred to by compiled code, and possibly other pieces of list structure (such as the PNAMEs of INTERNed symbols) are made purifiable. (Exactly what gets purified is up to FASLOAD, and is subject to change as this feature is experimented with.) There is thus an implicit assumption that compiled code will not contain such idiotic constructs as Bibop LISP - Section II. - For Sophisticated LISP Users - page 12 (DEFUN FOO (X) (NCONC '(A B) X)) [Think about it.] This all will happen automatically: the list structure will be consed into pure spaces instead of regular spaces, and magic bits will be set for the PURIFY function (see below). The variable *PURE also controls another hack in the function PUTPROP. If the value of *PURE is non-nil, then the value of the variable PUTPROP is a list of property names whose corresponding values may be purified. When *PURE is non-nil, the function PUTPROP checks its third argument against this list; if it is on the list, then the function PURCOPY (see below) is applied to the second argument before it is placed on the property list of the first argument. Furthermore, the two cells of the property list itself are purified if it is a new property being added (as opposed to changing the value of an already present property); in order to do this, PUTPROP may install the property somewhere in the middle of the property list, instead of at the beginning as it usually does. Note that this purification of property lists doesn't hurt, since if, after it is purified, the property is changed again, the system will copy the property list in order to make the change. (The property lists of most system symbols are initially pure in all versions of LISP!) Initially the value of PUTPROP is (SUBR FSUBR LSUBR), these properties may always be safely purified. Warning: ARRAY is definitely not a purifiable property! Example: suppose that the values of the property DATUM are always purifiable, i.e. are either always atomic or are list structure which is never RPLACA'd or otherwise clobbered. Then one might say: (SETQ PUTPROP (APPEND '(DATUM) PUTPROP)) When *PURE is set to non-nil, then PUTPROP will automatically purify all DATUM properties. Another way to use pure free storage is to call the function PURCOPY, a subr of one argument. In a non-Bibop LISP, this will merely return its argument (or possibly a copy thereof, if a number). In a Bibop LISP, the entire s-expression will be copied as much as possible into pure spaces, and the copy returned. Note that reasonable things are done for symbols and sars, though there are no pure SYMBOL or pure ARRAY spaces. This use of PURCOPY is intended for such things as (SETQ HUMUNGOUS-DATA-BASE (PURCOPY HUMUNGOUS-DATA-BASE)) or (PUTPROP 'FOO (PURCOPY (GET 'FOO 'PROP)) 'PROP) where HUMUNGOUS-DATA-BASE and FOO's PROP property will not be RPLACA'd into or anything nasty like that. Of course, there is little point in doing all this unless the job is to be purified and :PDUMPed, or unless the structures involved Bibop LISP - Section II. - For Sophisticated LISP Users - page 13 are very large. Furthermore, there is no point in PURCOPYing an object which is likely to be very temporary. In any case this is primarily an efficiency hack; it does not hurt not to use it. $$$$$ [6] $$$$$ New STATUS/SSTATUS Calls $$$$$ There are many STATUS functions in Bibop for examining and altering the various parameters described above. In the following, must evaluate to one of LIST, FIXNUM, FLONUM, BIGNUM, SYMBOL, or ARRAY, unless otherwise specified. (Actually, some Bibop LISPs may not have all these spaces; in particular BIGNUM may be missing.) To determine what spaces are available, and thus what space names are valid arguments, say: (STATUS SPCNAMES) to get a list of them. Recall that the space parameters described in section II.[1] do not control the actual size of a storage space, but merely specify how to expand it. To find the actual current size of a storage space, say: (STATUS SPCSIZE ) To find the total size of some pure space, say: (STATUS PURSIZE ) where in this latter context must evaluate to one of LIST, FIXNUM, FLONUM, or BIGNUM. Note that the initial LISP system has some pure LIST and FIXNUM storage. To find the current number of words on some pdl, say: (STATUS PDLSIZE ) where should evaluate to REGPDL, FLPDL, FXPDL, or SPECPDL. Note that some random variation is introduced by the fact that the call to STATUS uses some pdl, but this is only a few words. To find the absolute maximum for the size of a pdl (this is the quantity defined once and for all at initial allocation time), say: (STATUS PDLROOM ) where is as above. This quantity will generally be larger than the number given to the initial ALLOC; the Bibop LISP - Section II. - For Sophisticated LISP Users - page 14 latter number is made the initial PDLMAX, and PDLROOM is made larger so that the PDL-OVERFLOW user interrupt handler will have some extra pdl to work in. $$$$$ [7] $$$$$ Use of the PDL-OVERFLOW User Interrupt $$$$$ This is an example of what a handler for the PDL-OVERFLOW user interrupt might look like. It is similar to the one previously shown for the GC-LOSSAGE interrupt. If the PDL-OVERFLOW function returns at all, it means that the pdl should be extended and the computation continued. To abort the computation an error must be signaled explicitly; this is done here with the function ERROR (see LISP ARCHIV). (SETQ PDL-OVERFLOW '(LAMBDA (X) (TERPRI) (PRINC 'YOU/ HAVE/ USED/ ) (PRINC (STATUS PDLSIZE X)) (PRINC '/ WORDS/ OF/ ) (PRINC X) (PRINC '/./ MORE?:/ ) (COND ((MEMQ (READ) '(NIL N NO NEIN NYET)) (ERROR 'OVERFLOW/ LOSSAGE X)) (T (PRINC '/ OK/.) (TERPRI) (ALLOC (LIST X (MIN (STATUS PDLROOM X) (+ (GET (CONS NIL (ALLOC T)) X) 400)))))))) This function is in the file ML:GLS;BIBOP PDLOSS if you want to play with it. $$$$$ [8] $$$$$ The Purify Function $$$$$ The PURIFY function operates in the usual manner when its third argument is T or NIL. If, however, the third argument is BPORG (which means to do clever things when purifying compiled code), Bibop assumes that it knows more than you ever will about the kludgy things it's doing, and therefore more or less ignores the first two arguments!!! (Actually, they should both be fixnums; 0 and 0 will do, and calculating numbers as you would for non-Bibop doesn't hurt.) In any case, all the correct things should happen, including purification of sharable code and of pure free storage. It is, of course, as good an idea in Bibop LISP as in non-Bibop LISP to call the function PAGEBPORG before calling PURIFY with BPORG as its third argument. Bibop LISP - Section II. - For Sophisticated LISP Users - page 15 Another way to purify the LISP is to say (MACDMP 'PURIFY$G) where $=altmode. (Naturally, other stuff may be included in the valret string as well, such as a :PDUMP or whatever.) Unlike non-Bibop LISP, this will purify all possible pages, not just system pages. Bibop LISP - Section III. - For LISP System Hackers - page 16 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$ III. INFORMATION FOR THE SUPER-HACKERS $$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ The information in this section is primarily for the use of system hackers, and is for the most part subject to change without notice when implementation requires. This section makes no pretense at being self-explanatory; it should be read in conjunction with a listing of LISP. $$$$$ [1] $$$$$ The Segment Table $$$$$ At the heart of the Bibop scheme is the Segment Table (ST). The entire address space is divided into "segments" which are some fraction of 1K (typically 1/2K; the size is SEGSIZ=<1_SEGLOG>, where SEGLOG is an assembly parameter). The Segment Table contains one word for each segment; the contents of this word indicate what that segment may contain. A Segment Table entry is of this form: BIT # BIT NAME BIT DESCRIPTION 4.9 LS 1=list structure, 0=atomic 4.8 $FS free storage (bit 4.9 should be on also) 4.7 $FX fixnum storage (but not fixnum pdl) 4.6 $FL flonum storage (but not flonum pdl) 4.5 BN bignum header storage 4.4 SY symbol header storage 4.3 SA sar (array pointer) storage 4.2 VC value cell storage (bit 4.9 should be on also) 4.1 $FXP fixnum pdl area 3.9 $FLP flonum pdl area 3.8 $XM existent (random) area 3.7 $NXM nonexistent (random) area 3.6 PUR pure space (one of bits 4.8-4.5 should be on) 3.5-3.1 unused 2.9-1.1 address of one of these seven items: QLIST, QFIXNUM, QFLONUM, QBIGNUM, QSYMBOL, QRANDOM, QARRAY Note that these items occupy consecutive memory locations and thus numerically encode the page type. The basic algorithm for getting a Segment Table entry and testing it (given an address in, say, accumulator A) is: MOVEI TT,(A) ;in case we must save A LSH TT,-SEGLOG ;look at top few bits of address MOVE TT,ST(TT) ;fetch Segment Table entry TLNN TT, ;or maybe TLNE, or something Instead of the TLNN, one may also have something like: Bibop LISP - Section III. - For LISP System Hackers - page 17 JRST @JTABLE-QLIST(TT) ;n-way JRST on TYPEP where JTABLE is a table of addresses. Since bignums are not always present in the system, one ordinarily writes somthing like: JTABLE: JLIST ;jump to JLIST if TYPEP=LIST, etc. JFIXNUM JFLONUM IFN BIGNUM, JBIGNUM JSYMBOL JRANDOM JARRAY Various efficiencies can be introduced in special cases. The above sequence of code is so common that there is a macro called SKOTT (SKip On TT) which produces the above sequence, with a TLNN. (If is merely LS, however, then the MOVE and TLNN are collapsed into a SKIPL.) There is another Segment Table which is used by the garbage collector and related routines; it is called GCST. Its format is quite hairy: the positions and sizes of the bits and bytes in each entry are dependent on SEGLOG (for the sake of an efficiency hack in the GC mark phase). Basically, it indicates the address of the bit table, if any, for marking items in that segment, and also has a link field so that GCST entries may be strung together; there are also random bits telling GC how to mark items in the segment. If a segment requires no bit table, then its GCST entry has 0 as the bit table address. Exception: a segment which contains bit tables uses its GCST entry to aid allocation and initialization of the bit tables within the segment. As an example, if SEGLOG=11 (octal), the usual case, then the format of GCST entries is as follows: BIT # BIT NAME BIT DESCRIPTION 4.9 GCBMRK items in this page are markable 4.8 GCBVC this page contains value cells 4.7 GCBSYM this page contains symbol headers 4.6 GCBSAR this page contains array pointers 4.5 GCBCDR mark from cdrs of items 4.4 GCBCAR mark from cars (only if 4.5 also on) 3.4-2.5 SEGBYT field for chaining entries 2.4-1.1 ---- segment table address, lsh'ed Bibop additionally maintains a table called PURTBL, which has a two-bit byte for every page (that's right, not segment, but page) of possible address space. A zero means the page should be non-existent; a one, that the page should be impure; a two, that it should be pure; and a Bibop LISP - Section III. - For LISP System Hackers - page 18 three, that the page is some special case like a pdl page. These bits reflect what the pages should be, not what they are; indeed, what PURIFY$G does is to force the actual pages to conform to PURTBL! $$$$$ [2] $$$$$ Storage Spaces in Bibop $$$$$ Since Segment Table entries are used to determine TYPEP of objects, clearly things like atomic symbols may not reside in LIST space, as in a non-Bibop LISP. Furthermore, pure and impure objects must be not only in different segments, but on different pages. Thus Bibop distinguishes many kinds of segments and pages. All segments of a given type comprise a storage space. Non-Bibop LISP has five primary storage spaces: LIST, FIXNUM, FLONUM, ARRAY, and Binary Program Space (arrays are kept in BPS also, as opposed to array pointers, which are kept in ARRAY space.) Bibop LISP has many: LIST (pure and impure), FIXNUM (pure and impure), FLONUM (pure and impure), BIGNUM (pure and impure), SYMBOL, ARRAY, and Binary Program Space. The LIST, FIXNUM, FLONUM, and ARRAY spaces are analogous to those in non-Bibop LISP. The BIGNUM space actually holds only bignum headers, which contain the sign of a bignum in the sign bit, and a pointer to a list of fixnums, identical to that in non-Bibop LISP, in the right half. (Actually the left half must contain 0 or 777777; much code depends on it!) Note that with this representation, the sign of any LISP number, be it FIXNUM, FLONUM, or BIGNUM, may be determined by testing the sign of the word addressed by the LISP pointer to that number. SYMBOL space has a very peculiar format: it is actually sectioned into two spaces. One is called symbol header space, and is the space referred to by the garbage collector statistics. One word of symbol header space is required per symbol; thus when GC says 500 SYMBOL words free, it is possible to create 500 more symbols, even though symbols occupy more than one word in Bibop LISP. The other is called symbol block space, and is allocated in two-word chunks. Each symbol block is therefore at an even address. These symbol blocks are not garbage collected, but rather follow an explicit return discipline. Symbol blocks may be purified, while symbol headers may never be purified. Thus there are actually two symbol block spaces, pure and impure, though this is transparent to the user. Whenever a new symbol is created, a fresh symbol header is taken from symbol header space, and a fresh symbol block from impure symbol block space, unless the variable *PURE is non-nil, in which case a pure symbol block is used. Impure symbol blocks are also used when the contents of a pure symbol block must be altered; the contents of the pure symbol block are copied to a fresh impure one and the pure one forgotten forever. Pure symbol blocks are never returned Bibop LISP - Section III. - For LISP System Hackers - page 19 once used; impure symbol blocks are returned when their corresponding symbols are garbage collected. The symbol header has the property list in its right half (thus the cdr of an atomic symbol is its property list, as usual), and a pointer to the symbol block in its left half. The symbol block has two words. The left half of the first word contains random bits, including one which is on iff the symbol block is pure, and one which says that compiled code needs the symbol (and that therefore the symbol may never be garbage collected). This bit cannot be turned on in a pure symbol block, so whenever a pure symbol block is copied, the compiled-code-need-me bit is turned on in the copy - better to be safe than sorry! The index and indirect bits are zero (see below). The right half of the first word points to the value cell for that symbol. If the symbol does not have a value cell of its own, it points to SUNBOUND, the Standard UNBOUND value cell. (Such symbols are heliocentric.) In this way, one can indirect through the symbol block to get the symbol's value, saving an instruction. The left half of the second word contains the ARGS property for the symbol, in the format used by FASLOAD, i.e. as two nine-bit bytes. The PNAME of the symbol is in the right half of the second word. Note that the garbage collector does not mark symbol blocks per se, but only symbol headers; thus keeping a pointer to a symbol block in a marked accumulator does not guarantee that the symbol block will not go away! Another garbage collector hack is that one cannot use the symbol block for the gc mark bit, since it may be pure. However, the pointer in the left half of the symbol header must always be even, since symbol blocks have even addresses. Thus bit 3.1 of the symbol header is used as the mark bit. Of course, GC must reset these bits as it does the final sweep through symbol header space. Before the mark phase it sets a switch (MUNGP) and resets it after the sweep; if somehow ERINIT is reached before GC finishes, it will notice the switch and reset the mark bits. Bibop stores array pointers (also known as sars) in the same manner as non-Bibop lisp; the relevant information appears here for convenience. Sars consist of two words; the first is called the ASAR, and the second the TTSAR. A pointer to a sar points to the ASAR. The ASAR has the following format: Bibop LISP - Section III. - For LISP System Hackers - page 20 BIT # BIT NAMER BIT DESCRIPTION 4.3 AS.FIL this is a file array (for new i/o) 4.2 AS.RDT this is a readtable 4.1 AS.OBA this is an obarray 3.9 AS.SX this array holds s-expressions, 2/wd 3.8 AS.FX this array holds fixnums 3.7 AS.FL this array holds flonums 3.6 AS.GCP GC should mark array entries 3.5-3.1 ---- must be zero 2.9-1.1 ---- address of array header Bits 3.9-3.7 should be mutually exclusive - they define the type of array access mechanism to be used. Bit 3.6 is independent of this consideration; for example, a readtable has AS.FX set, but it also wants its macro character list garbage collected. GC only marks as many words of entries as specified by the aobjn pointer just before the array header. Bits 3.5-3.1 must be zero because compiled code does indirect PUSHJs through the ASAR to reach the array header, which computes subscript information. The format of the TTSAR is as follows: BIT # BIT NAME BIT DESCRIPTION 4.9-4.7 TTSDIM number of dimensions (1-5) 3.7 TTS.CN compiled code needs this sar 3.6 TTS.GC mark bit used by GC 3.5-3.1 ---- must contain (TT) 2.9-1.1 ---- address of first word of array data The indirect and index bits are such that indirecting through the TTSAR causes indexing by accumulator TT. Compiled code uses this fact to open-access arrays. Pure spaces are identical in function to impure spaces, but are allocated on separate pages so that they may be purified. Value cells are also kept in a space of their own for various reasons. They are also kept contiguous, but if the region reserved for value cells (a hole in memory of about 3k words) is ever exhausted, cells from LIST space will be used to create value cells. TYPEP of a value cell is indeed LIST. Value cells in the value cell region follow an explicit return discipline. The function MAKUNBOUND will return the specified symbol's value cell unless the symbol's compiled-code-needs-me bit is set (which might indicate that compiled code references the value cell). The garbage collector will return a value cell if the symbol pointing to it is swept up. Bibop LISP - Section III. - For LISP System Hackers - page 21 Bibop LISP - Section III. - For LISP System Hackers - page 22 $$$$$ [3] $$$$$ Storage Layout and Strategy $$$$$ The initial Bibop LISP system is laid out something like this: ---------------------------------------------------- < 0 | low page - impure (acs, temps, etc.) | ---------------------------------------------------- < BSYSSG | | | initial LISP system | | | | (executable code) | | | ---------------------------------------------------- < BSARSG | initial SARs (for READTABLE, OBARRAY, etc.) | ---------------------------------------------------- < BVCSG | initial value cells (plus expansion room) | ---------------------------------------------------- < BSYMSG | initial atomic symbols | ---------------------------------------------------- < BPFSSG | initial list structure and numbers - pure | ---------------------------------------------------- < BIFSSG | initial list structure and numbers - impure | ---------------------------------------------------- < BSTSG | segment tables (ST and GCST) | ---------------------------------------------------- < BBPSSG | Binary Program Space (expands upward) | < V(BPORG) ---,----,----,----,----,----,----,----,----,----,--- < V(BPEND) | arrays (float at top of BPS) | ---,----,----,----,----,----,----,----,----,----,--- < C(BPSH) | / / / / / / / / / /| | / / / / / / / / / / | | / / big hole of nonexistent memory / / | | / / / (the "big bag of pages") / / | |/ / / / / / / / / / | | / / / / / / / / / /| < C(HINXM) ---'----'----'----'----'----'----'----'----'----'--- | new storage space segments (expands downward) | ---------------------------------------------------- < C(FXC2) | push-down lists (FXP, FLP, P, SP) | | (sizes allocated by initial ALLOC) | ---------------------------------------------------- < BSCRSG | scratch pages (for PDP-6 slave, etc.) | ---------------------------------------------------- < 776000 | / / / high page - never used / / / | ---------------------------------------------------- < 777777 ------- Legend: C(foo) = contents of location foo Address V(foo) = fixnum value of symbol foo Note that "downward" means toward lower addresses, and thus up the diagram (too bad if you don't like it!) Bibop LISP - Section III. - For LISP System Hackers - page 23 The basic strategy is to allocate new storage segments from the top of memory downward, so that this area and Binary Program Space grow toward each other. Extra hair is involved to ensure that pure and impure segments stay on separate pages. The Bibop memory management routines maintain a table of two-bit bytes called PURTBL. Each byte describes the state of one page. For a further description, see TBLPUR$X below. Bibop only allocates pdl pages when necessary. Whenever ERINIT is called ($G startup, ^G quit, error to top level) all pdl pages are flushed, and the pdl pointers are all made to point to a one-word pdl called FAKPDL. Whenever pdl overflow occurs, the internal pdl overflow interrupt handler extends the pdl, getting a new page if necessary and possible. (If it is necessary but not possible, an uncorrectable error occurs). If the pdl's PDLMAX has been exceeded, the PDLMAX is temporarily pushed up some and a PDL-OVERFLOW use interrupt is signaled. The value returned from this interrupt is ignored; the mere fact of its returning indicates that the computation should be continued. The internal pdl overflow handler goes to some trouble to make sure a little pdl (about 8 words) is always available, even if the pdl overflow interrupt is temporarily inhibited (e.g. inside the internal interrupt dispatcher). It also uses two very small pdls called FAKP and FAKFXP for some purposes of its own. $$$$$ [4] $$$$$ Hairy STATUS Calls $$$$$ These STATUS calls are to ease debugging, and may be flushed in the future. BEWARE!!! The GCSIZE, GCMAX, and GCMIN parameters for storage spaces may be hacked en masse by the ALLOC function (see above), or individually by these calls: (STATUS GCSIZE ) (STATUS GCMAX ) (STATUS GCMIN ) (SSTATUS GCSIZE ) (SSTATUS GCMAX ) (SSTATUS GCMIN

) Note that if a flonum GCMIN is specified, it should be between 0.5 and 0.005. The SSTATUS versions return T if they succeed in setting the parameter, otherwise NIL. Bibop may fiddle with the parameters to keep them consistent. Bibop LISP - Section III. - For LISP System Hackers - page 24 To examine and set the PDLMAX parameter for some pdl, say: (STATUS PDLMAX ) (SSTATUS PDLMAX ) where is a fixnum. $$$$$ [5] $$$$$ Super Random Items about Bibop $$$$$ The following STATUS calls do not exist in Bibop: (STATUS FS) (STATUS FWS) (STATUS BT) The following additional status calls exist in Bibop: (STATUS SEGLOG) This is the log base 2 of the amount of memory by which a storage space may be incremented. Typically this is 10 or 11 octal. (STATUS HINXM) This is the highest address of the "hole" in memory the top of Binary O->RMS {GLS} 160453 NO COMPLETION REPLY, R=3 CONN. ABORTED, LAST CODE=350.CLI sent to ->ECC 015726 TO->EAK 015726 CLI sent to ->EAK 021951 INPUT={MAIL 1} TIM=2:19 AM 021958 SPECS:{NET-MAIL-FROM: MIT-DMS},CBF@306{TL=312.} 021958 TO->CBF 022353 INPUT={MAIL 1} TIM=2:23 AM 022356 SPECS:{EAK},ecc@306{TL=133.} 022356 TO->ecc 022356 CLI sent to ->ECC 022453 INPUT={MAIL 1} TIM=2:24 AM 022456 SPECS:{ECC},eak@306{TL=95.} 022456 TO->eak 022456 CLI sent to ->EAK 022941 INPUT={MAIL 1} TIM=2:29 AM 022946 SPECS:{EAK},ecc@306{TL=139.} 022946 TO->ecc 022946 CLI sent to ->ECC 023106 INPUT={MAIL 1} TIM=2:31 AM 023110 SPECS:{ECC},eak@306{TL=249.} 023110 TO->eak 023110 CLI sent to ->EAK 023329 INPUT={MAIL 1} TIM=2:33 AM 023335 SPECS:{ECC},eak@306{TL=168.} 023335 TO->eak 023336 CLI sent to ->EAK 031158 INPUT={MAIL 1} TIM=3:11 AM 031158 SPECS:{NET-MAIL-FROM: MIT-AI},PDP11@306::=(,EAK@306,ECC@306,PDP11@306){TL=209.} 031159 TO->PDP11 031201 TO->ECC 031201 CLI sent to ->ECC 031201 TO->EAK 032251 INPUT={MAIL 1} TIM=3:22 AM 032252 SPECS:{ECC},eak@306{TL=30.} 032252 TO->eak 032252 CLI sent to ->EAK 032359 INPUT={MAIL 1} TIM=3:23 AM 032359 SPECS:{EAK},ecc@306{TL=78.} 032359 TO->ecc 032359 CLI sent to ->ECC 033235 INPUT={MAIL 1} TIM=3:32 AM 033235 SPECS:{ECC},eak@306{TL=170.} 033235 TO->eak 033236 CLI sent to ->EAK 042104 INPUT={MAIL 1} TIM=4:21 AM 042105 SPECS:{EAK},ECC@306{TL=61.} 042105 TO->ECC 042105 CLI sent to ->ECC 042236 INPUT={MAIL 1} TIM=4:22 AM 042239 SPECS:{ECC},eak@306{TL=163.} 042239 TO->eak 042240 CLI sent to ->EAK 042253 INPUT={MAIL 1} TIM=4:22 AM 042302 SPECS:{ECC},eak@306{TL=28.} 042302 TO->eak 042302 CLI sent to ->EAK 042808 INPUT={MAIL 1} TIM=4:28 AM 042809 SPECS:{EAK},ecc@306{TL=58.} 042809 TO->ecc 042809 CLI sent to ->ECC 043021 INPUT={MAIL 1} TIM=4:30 AM 043021 SPECS:{EAK},ecc@306{TL=43.} 043021 TO->ecc 043022 CLI sent to ->ECC 043620 INPUT={MAIL 1} TIM=4:36 AM 043620 SPECS:{NET-MAIL-FROM: MIT-AI},EAK@306{TL=108.} 043620 TO->EAK 043722 INPUT={MAIL 1} TIM=4:37 AM 043722 SPECS:{NET-MAIL-FROM: MIT-AI},MOON@306{TL=512.} 043722 TO->MOON 044034 INPUT={MAIL 1} TIM=4:40 AM 044035 SPECS:{NET-MAIL-FROM: MIT-AI},EAK@306{TL=371.} 044035 TO->EAK 144512 INPUT={MAIL 1} TIM=2:45 PM 144513 SPECS:{EAK},ecc@306{TL=113.} 144513 TO->ecc 172321 INPUT={MAIL 1} TIM=5:23 PM 172321 SPECS:{ECC},eak@306{TL=125.} 172321 TO->eak 173228 INPUT={MAIL 1} TIM=5:32 PM 173228 SPECS:{ECC},ecc@306{TL=56.} 173228 TO->ecc 174542 INPUT={MAIL 1} TIM=5:45 PM 174542 SPECS:{ECC},eak@306{TL=372.} 174542 TO->eak 175210 INPUT={MAIL 1} TIM=5:52 PM 175210 SPECS:{ECC},PDP11@306::=(,EAK@306,ECC@306,PDP11@306){TL=367.} 175211 TO->PDP11 175213 TO->ECC 175213 TO->EAK 180852 INPUT={MAIL 1} TIM=6:08 PM 180852 SPECS:{ECC},eak@306{TL=168.} 180852 TO->eak 181826 INPUT={MAIL 1} TIM=6:18 PM 181828 SPECS:{RMS},EAK@306{TL=369.} 181828 TO->EAK 182735 INPUT={MAIL 1} TIM=6:27 PM 182736 SPECS:{MOON},HOWIE@306{TL=98.} 182736 TO->HOWIE 195848 INPUT={MAIL 1} TIM=7:58 PM 195848 SPECS:{MINSKY},papert@306{TL=169.} 195849 TO->papert 195849 CLI sent to ->PAPERT 203715 INPUT={MAIL 1} TIM=8:37 PM 203715 SPECS:{ECC},bug-tmacs@306::=(,CBF@306,EAK@306,ECC@306,MOON@306){TL=55.} 203715 TO->MOON 203716 TO->ECC 203716 TO->EAK 203716 TO->CBF 215038 INPUT={MAIL 1} TIM=9:50 PM 215041 SPECS:{HQM},raf@306{TL=238.} 215041 TO->raf 220940 INPUT={MAIL 1} TIM=10:09 PM 220942 SPECS:{RAF},HQM@306{TL=15.} 220942 TO->HQM 222920 INPUT={MAIL 1} TIM=10:29 PM 222921 SPECS:{RMF},POPLE@70{TL=91.} 22 214159 ICP->MIT-AI... 214204 QTO->RMS {GLS} 214309 NO COMPLETION REPLY, R=3 CONN. ABORTED, LAST CODE=350.@306::=(,CBF@306,EAK@306,ECC@306,MOON@306){TL=283.} 230542 TO->MOON 230542 TO->ECC 230542 TO->EAK 230542 TO->CBF 000121 INPUT={MAIL 1} TIM=12:01 AM 000125 SPECS:{NET-MAIL-FROM: MIT-MULTICS},DMW@306{TL=190.} 000125 TO->DMW 000154 INPUT={MAIL 1} TIM=12