Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 19 Mar 86 19:47:25 EST Received: from tektronix by csnet-relay.csnet id ag01727; 19 Mar 86 18:47 EST Received: by tektronix (5.31/5.14) id AA17743; Wed, 19 Mar 86 12:37:42 PST Received: by tekchips (5.31/5.14) id AA15513; Wed, 19 Mar 86 12:40:20 PST Message-Id: <8603192040.AA15513@tekchips> To: RRRS-AUTHORS%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA, JINX%OZ.AI.MIT.EDU@xx.lcs.mit.edu Subject: Re: S&I's idea of EQ? In-Reply-To: Your message of 17 Mar 1986 21:18 EST (Mon). Date: 19 Mar 86 12:40:12 PST (Wed) From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA JINX's messages have helped, but I still do not understand three of his points: > By first class environments I do not mean that they can be passed > around, but that they can be manipulated. What's the use if they can > only be passed around? As numbers demonstrate, it is both possible and useful to manipulate objects without side effecting them. As I recall, S&ICP uses only the make-environment and eval operations on environments, neither of which mutates an environment. I can imagine many other non-side-effecting operations on environments, such as: (lookup env id) (extend env1 env2) (bind-to-fresh-location env id) (hide env id) Someone might want to argue that these operations would be more useful if they had an exclamation point in their names, but it would take a real argument; the non-side-effecting versions shouldn't be dismissed out of hand. > A cons in MIT-Scheme consists of 4 (move) instructions in compiled > code (without garbage collection). It would be wonderful if compiled code could be made to run without garbage collection. As Kent Dybvig, David Bartley, and I have pointed out, a cons is usually more expensive in gc time than in allocation time. When last I heard, MIT Scheme on the HP 9836 was using a simple stop and copy collector with essentially the same performance as the one in Scheme 312--that is, on a 12 MHz 68000 without wait states the garbage collector will run for 3 to 4 seconds per megabyte of live storage. If the heap is at equilibrium (i.e., storage is being dropped as fast as it is being allocated) and the active semispace is half full, then the cost of allocating 8 bytes of storage cannot be less than 24 to 32 microseconds, no matter how few instructions are used to allocate the storage. It follows that if a variable reference takes 1 microsecond, then creating a closure is about 25 times as expensive as a variable reference. If the active semispace is 75% full, then creating a closure is 75 times as expensive as a variable reference. The HP 9836 does not support virtual memory. Caching or paging would make consing significantly more expensive than the above calculations indicate. The only way to escape from these calculations is to use a more sophisticated garbage collection algorithm such as the Hewitt-Lieberman algorithm (or its little brother, generation scavenging). The worst case behavior of these algorithms appears to be slightly worse than that of straight stop and copy, so it is an empirical question how they will behave in a real Lisp environment. The only data that I have seen appears in a 1984 paper by David Moon; he found that the cost of consing was reduced but remained very significant. Clearly there is still a shortage of good data. If MIT Scheme has changed to use a more sophisticated collector, then I hope someone will post numbers characterizing its performance. > ...once we > have accepted that there is no splitting, it seems that the only > consistent model left is the one that requires every "evaluation" of a > lambda expression to be consed....Either procedures have associated > locations or they do not. A few months ago I proposed the following consistent semantics: if eq? is given procedures as arguments, it returns #!true if the associated locations are the same. If the associated locations are different but the "functional parts" (E* --> K --> C) are equal, then eq? may return #!true or #!false at its whim. If the functional parts are not equal, then eq? must return #!false. Whether the functional parts are equal is of course noncomputable, but a semantics doesn't have to be computable. The fact that eq? can just return #!false if the associated locations are unequal proves the existence of a model for this semantics. The possibility of coalescing optimizations as performed by compilers like Rabbit and the T3 compiler prove the existence of other interesting and useful models for this semantics. Peace, Will  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 17 MAR 86 21:26:58 EST Date: 17 Mar 1986 21:18 EST (Mon) Message-ID: From: Bill Rozas To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA cc: rrrs-authors@MC.LCS.MIT.EDU Subject: S&I's idea of EQ? In-reply-to: Msg of 17 Mar 1986 16:30-EST from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA I disagree that an "optimizing" garbage collector implies an "optimizing" compiler. We have one, but not the other to the same degree. The former is considerably easier to write since it is a much smaller program. By first class environments I do not mean that they can be passed around, but that they can be manipulated. What's the use if they can only be passed around? Clearly it is incremental definition that causes the problem, but I consider them an essential feature of first class environments. A cons in MIT-Scheme consists of 4 (move) instructions in compiled code (without garbage collection). A variable reference consists of 1 to 2 in most cases (the largest exception is when they must be left to the interpreter, because of potential incremental definitions, and this occurs rarely in compiled code). It seems to me that the extra performance is not worth the effort, since it makes the language at lot harder to use because it puts the user at the mercy of the compiler. I believe that a declaration allowing the optimization is appropriate since then any confusion would be caused by the user, not by the compiler doing something unexpected. Again, I believe that this optimization can be obtained by the user almost all of the time. I don't accept the argument that users do not have control over macro expansion, or over imbedded languages. They certainly do not have control, but the macro writer or "imbedder" does and should be careful about the code being generated in the same way that a compiler writer must be careful about using registers. A declaration would help here too since the macro could expand into code that contained it by default. Supposedly the user of the macro could not use any "hidden" lambdas since he would not know the implementation of the macro. I am not necessarily advocating for operational semantics, but once we have accepted that there is no splitting, it seems that the only consistent model left is the one that requires every "evaluation" of a lambda expression to be consed. And this means no coalescing in the absence of declarations or proof. Either procedures have associated locations or they do not. I think that this "solution" of allowing coalescing but not splitting is the worst of both worlds since it breaks both models. It will probably not confuse me (although it might), but I think it would confuse a naive user. PS: How come people object to the declaration allowing the optimization? It seems inocuous to me and would give everybody what they wanted. Given that the semantics are no longer "clean" since splitting is not allowed, we may as well be consistent. Note that some implementations could advertise that the optimization was on by default, and the other behaviour could be obtained by a "negative" declaration.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 17 Mar 86 19:44:39 EST Received: from tektronix by csnet-relay.csnet id av06092; 17 Mar 86 19:04 EST Received: by tektronix (5.31/5.14) id AA15558; Mon, 17 Mar 86 13:27:56 PST Received: by tekchips (5.31/5.14) id AA08962; Mon, 17 Mar 86 13:30:45 PST Message-Id: <8603172130.AA08962@tekchips> To: RRRS-AUTHORS%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA Subject: Re: S&I's idea of EQ? Date: 17 Mar 86 13:30:37 PST (Mon) From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA Rozas is correct that closing a lambda expression can be as cheap as consing, and that the closure may not be necessary if the procedure receiving it is known to the compiler. (My example wasn't very good for another reason as well, namely that the map procedure is going to cons anyway.) That's not to say that closing a lambda expression is as cheap as a variable reference. Consider an 8 MHz 68000 without wait states. Assuming a stop and copy garbage collector that can traverse and copy 1 megabyte of live storage in 5 seconds, a heap at equilibrium, and a 50% load factor, it takes 5 microseconds to reclaim each byte of storage. Hence a cons would take 40 microseconds in gc time alone. In interpreted Scheme 312 consing or closing takes an additional 30 microseconds in the storage allocator; compiling would reduce that to 20 microseconds. Overall that's 70 microseconds in interpreted code, 60 in compiled code. A variable reference takes 12 to 30 microseconds in interpreted Scheme 312, and would take .5 to 6 microseconds in compiled code. Thus variable references in compiled code are an order of magnitude faster than closing a lambda expression. You can reduce the storage allocation and gc time by using a more sophisticated garbage collector, but if you're going to go to the trouble of doing that then you're probably going to use an optimizing compiler and the average variable reference should be closer to 1 microsecond than to 6 microseconds. If variable references are slow in MIT Scheme, I suspect the problem lies not with environments as first class objects but with the fact that MIT Scheme allows the environment structure to be side effected by internal definitions that do not appear at the head of the lambda body in which they appear. It appears to me that the authors of S&ICP considered such side effects to be an MIT extension to Scheme that other implementations do not have to support; the relevant footnotes are in chapter 1, footnote 23, page 28; in chapter 5, footnote 21, page 440; in chapter 5, footnote 40, page 487; in chapter 5, footnote 42, page 490. One of the problems with using an operational definition such as a meta-circular interpreter to specify the semantics of a programming language is that it is never clear whether a feature is incidental to that particular interpreter or a universal feature that must be supported by all interpreters. It is clear, for example, that the interpreter in chapter 4 of S&ICP creates a new closure every time it encounters a lambda expression, but it is equally clear that if you take the car of such a closure you get the symbol PROCEDURE. That's why I prefer a denotational semantics. Peace, Will  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 14 Mar 86 15:18:27 EST Received: from indiana by csnet-relay.csnet id a008939; 14 Mar 86 15:03 EST Date: Fri, 14 Mar 86 12:56:33 est From: Kent Dybvig To: JAR@mc.lcs.mit.edu Subject: Re: S&I's idea of EQ? (OOPS!) Cc: JINX@oz.ai.mit.edu, RRRS-AUTHORS@mc.lcs.mit.edu Yes, I see what you mean. Here is my real answer to the EQ-Quiz. Sorry for the slip up... I went back to edit lower case t's and f's (because lower case i didn't look right) into upper case T's and F's and apparently screwed up. F (eq? (lambda (x) x) (lambda (y) y)) F (eqv? (lambda (x) x) (lambda (y) y)) T (let ((x (lambda (z) z))) (eq? x x)) T (let ((x (lambda (z) z))) (eqv? x x)) T (let ((x ... any expression evaluating to an exact number ...)) (eq? x x)) I (eq? #\x #\x) There is one important one to add, by the way: F (let ([f (lambda () (lambda (y) y))]) (eq? (f) (f))) Which right now returns true in Chez Scheme. Sigh... Kent  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 14 Mar 86 12:37:52 EST Received: from ti-csl by csnet-relay.csnet id aa06829; 14 Mar 86 11:48 EST Received: by tilde id AA08235; Fri, 14 Mar 86 09:40:12 cst Date: Fri 14 Mar 86 09:27:17-CST From: David Bartley Subject: Re: S&I's idea of EQ? To: JINX%OZ.AI.MIT.EDU%XX.LCS.MIT.EDU@CSNET-RELAY.ARPA, JAR%MC.LCS.MIT.EDU@CSNET-RELAY.ARPA Cc: RRRS-AUTHORS%MC.LCS.MIT.EDU@CSNET-RELAY.ARPA, Bartley%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: Message-Id: <12190651239.68.BARTLEY@CSL60> From JINX: > Besides, coalescing is not really necessary, since it is an > optimization which can be trivially accomplished by the user by > slightly rearranging his/her code. On the other hand, if the > implementation is free to coalesce, the other behaviour can only be > obtained thorugh very obscure and unintuitive coding. I'm not saying > that I agree completely with the philosophy that if an optimization > can be done by the user himself the implementation should not do it, > but it is something worth considering. I would agree with your first sentence except for three problems: (1) Not all programmers have been properly trained in good programming style using Scheme. Let's hope this improves drastically with time, but I'm constantly amazed at the influence of styles from other Lisp dialects, Pascal, and even FORTRAN on Scheme programmers I've run into! (2) Many Scheme systems implement most of the ``special forms'' in RRRS as well as other syntactic extensions as macros. One reason for optimization is to overcome the mish-mash that often results from macro expansion. For example, the programmer who uses DO doesn't have the option of placing the LAMBDA which results in the best possible place in the overall program. (3) Similarly, Scheme programs that are created by other programs often can be substantially improved by an optimizing compiler. > that I agree completely with the philosophy that if an optimization > can be done by the user himself the implementation should not do it, > but it is something worth considering. So, I think it's irrelevant whether an implementation should do an optimization that the users can do themselves, since they frequently can't. Regards, David Bartley -------  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 14 Mar 86 06:10:23 EST Received: from tektronix by csnet-relay.csnet id a003806; 14 Mar 86 5:57 EST Received: from csnet-relay by tektronix.tektronix.CSNET id as23113; 14 Mar 86 2:23 PST Received: from ti-csl by csnet-relay.csnet id aa23215; 13 Mar 86 12:33 EST Received: by tilde id AA05556; Thu, 13 Mar 86 10:05:18 cst Date: Thu 13 Mar 86 09:57:21-CST From: David Bartley Subject: Re: S&I's idea of EQ? (long) To: JINX%OZ.AI.MIT.EDU%xx.lcs.mit.edu@CSNET-RELAY.ARPA, willc%tekchips%tektronix@CSNET-RELAY.ARPA Cc: GJS%MIT-MC%tektronix@CSNET-RELAY.ARPA, JAR%MIT-MC%tektronix <@CSNET-RELAY.ARPA:JAR%MIT-MC%tektronix@csnet-relrrs-author-mc%tektronix>, Bartley%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: Message-Id: <12190394569.68.BARTLEY@CSL60> JINX's analysis is valid, but I disagree with the conclusion that since closures are (roughly) ``no more expensive than conses'' that they are not worth optimizing. It is easy to construct examples of ``inner loops'' in which such consing dominates the execution time. I've investigated these optimizations precisely because of my experience with real programs that needed such help. One such program worked exclusively with numbers we are able to represent as immediates (no number consing), yet mystified its author because of occasional GCs. (The GCs were unexpected because the generated code managed to allocate all variables to registers, avoiding use of the heap.) We'll never supplant Pascal that way! My goal is to develop implementations of Scheme that rival other languages in performance without sacrificing those things that make Scheme so desirable a language in the first place. In this case, I feel that the balance tips in favor of allowing useful optimizations to take place. It seems that any final decision on this point will come down to trading off potential performance for certain semantic ideals. I guess I'm as unconvinced by some of the arguments against coalescing as others are by my concerns about performance! Regards, David Bartley -------  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 13 MAR 86 19:15:36 EST Date: 13 Mar 1986 19:15 EST (Thu) Message-ID: From: Bill Rozas To: Jonathan A Rees Cc: RRRS-AUTHORS@MC.LCS.MIT.EDU Subject: S&I's idea of EQ? In-reply-to: Msg of 13 Mar 1986 18:46-EST from Jonathan A Rees I agree that the advantages of coalescing are implementation dependent, but that is part of what I tried to point out in my message: closures can be implemented in such a way that it is not so clear that coalescing is necessary. The reason some implementations want to coalesce is precisely because closing is too expensive for them (BTW, what's the difference between C and Scheme if closures are so expensive that users avoid them like the plague?). Note that the optimization I suggested does not imply a high variable lookup cost, its only added cost is the "cons", but it is more applicable than coalescing and satisfies the procedures-have-locations semantics. Besides, coalescing is not really necessary, since it is an optimization which can be trivially accomplished by the user by slightly rearranging his/her code. On the other hand, if the implementation is free to coalesce, the other behaviour can only be obtained thorugh very obscure and unintuitive coding. I'm not saying that I agree completely with the philosophy that if an optimization can be done by the user himself the implementation should not do it, but it is something worth considering. Again, I don't care that much, but I have not seen powerful reasons to make the decision go that way.  Date: Thu, 13 Mar 86 18:46:39 EST From: Jonathan A Rees Subject: S&I's idea of EQ? To: JINX@OZ.AI.MIT.EDU cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of 12 Mar 1986 10:11 EST (Wed) from Bill Rozas Message-ID: <[MC.LCS.MIT.EDU].849914.860313.JAR> Date: 12 Mar 1986 10:11 EST (Wed) From: Bill Rozas If map is not known, the time to look its value up at runtime is comparable to the time which it takes to close the procedure. In particular, in MIT-Scheme (because of first class environments, etc), looking up map can take considerably longer than closing the lambda-expression, and the latter time is usually negligible. I think that the small time difference which can be gained in this case is not very interesting. This is an empirical question for which I don't have any data, but my intuition is that the way procedures are used and implemented in T, the consing overhead here would be unacceptable, probably high enough to make T want to be incompatible with Scheme in yet one more way, if Scheme were changed. The fact that the T implementation "coalesces" procedures is exploited very heavily in the implementation and, I suspect, in the way some users write code. Not all Scheme implementations have MIT scheme's high variable lookup overhead; consing and space are not as cheap in most implementations as in MIT scheme; and not all compilers or users choose to do as much analysis and procedure integration as you believe is appropriate.  Date: Thu, 13 Mar 86 18:08:36 EST From: Jonathan A Rees Subject: S&I's idea of EQ? To: dyb%indiana@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MC.LCS.MIT.EDU, JINX@OZ.AI.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].849893.860313.JAR> What would you want the following to return? (eq? (lambda (x) (let ((y x)) y)) (lambda (z) (let ((w (lambda () z))) (w))))  Date: Thu, 13 Mar 86 17:24:24 EST From: Jonathan A Rees Subject: survey results (long message) To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].849840.860313.JAR> Here's a summary of the 16 completed surveys I received: 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) 14 - I, 2 - F Winner: I This question, which I think is what the recent Bartley/JINX debate has been about, was the only one on which there seemed to be consensus. Everyone except Sussman and Halstead said that the result should be implementation-dependent but not an error. (Maybe JINX wants to change his vote now?) The dissenters definitely wanted the answer to be false. I interpret the dissenting view as follows: evaluation of a LAMBDA-expression MUST be a side-effect; a denotational semantics MUST have locations associated with procedures; we must change existing implementations (like T, and maybe TI's compilers); we must change the RRRS, which is mute on this point; and Scheme must differ from Common Lisp here. My editorial opinion: Scheme should be compatible with the published RRRS, with Common Lisp, with the majority, and with its current implementations. People who teach from S&I might want to note that some scheme and CL implementations perform "coalescing" optimizations. 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) 8 - I, 7 - E, 1 - F. Winner: I Half the respondents want EQV? to have the same implementation-dependent behavior as EQ?. The other half want EQV? to be defined in an implementation-independent way. Given the lack of consensus for change, I would again suggest the conservative stance of leaving things as they are, compatible with the RRRS and with Common Lisp's EQL. I would be happy, however, if someone continued this crusade and managed to convince all those people who want an ugly EQV? that they're wrong. 3. (LET ((P (LAMBDA (X) X))) (EQ? P P)) 9 - I, 6 - T, 1 - U. Winner: T Half want particular evaluations of lambda-expressions to "have identity," thus again requiring a more complicated semantics involving locations, and forbidding indiscriminate beta-conversion. The other half want simpler semantics, and the conceptual improvements and implementation freedom which result. (Please pardon my biased interpretation, it's supposed to be mildly humorous.) Again, there's no consensus, so I would recommend staying with the RRRS and Common Lisp status quo, even though it's a "minority" position (among those who turned in surveys) and in opposition to mine. 4. (LET ((P (LAMBDA (X) X))) (EQV? P P)) 5 - T, 7 - E, 3 - I, 1 - U. Winner: T This was asking, among other things, whether the semantics of Scheme needed to associate locations with procedures, even if EQ? were excised. To be consistent with everything else, especially question number 2, I'd say again we'd have to stick with the status quo. 5. (LET ((N )) (EQ? N N)) 5 - T, 9 - I, 1 - E, 1 - U. Winner: I Hard to interpret these results. One issue that hasn't been raised is that Common Lisp and RRRS disagree on this question; RRRS says that the answer should be T, and Common Lisp says I (implementation-dependent - numbers don't "have identity", can be "copied freely"). I think that Sussman says that he and S&I don't care about numbers having identity, and wonders why anyone would really care. This topic might need more discussion; e.g. one could use my argument (previously advanced to an opposite end) that procedures and numbers shouldn't be treated differently, with the conclusion this time that since procedures "have identity", so should numbers. Given that Scheme's EQ? is otherwise so close to Common Lisp's EQ, and that a majority want to change scheme to be compatible with CL, it would be a shame for them to disagree here, so I would be inclined to change the report. E.g. the way it is, it will be quite difficult to embed scheme implementations in common lisp, and vice versa. Note that this change in the language definition would not require anyone's implementation to change, although it might require some people's programs to change (speak up if this is the case!). 6. (EQ? #\X #\X) 7 - I, 2 - T, 7 - X Winner: I I saw this question as asking more or less whether characters should be "interned". I think the consensus is that they don't need to be. That would be consistent with the RRRS and with Common Lisp. [One respondent originally answered E on the grounds that characters shouldn't be expressions, so this would be a syntax error; but I think the consensus on that question is that characters SHOULD self-evaluate.] ---------- I guess I mounted this failed little crusade for pure location-less procedures in an attempt to make Scheme have cleaner, more abstract semantics, with the implication that this would have many practical benefits. I do believe that having a way to denote pure objects (behaviors) would have far-reaching consequences. But maybe Scheme is an inapprioriate vehicle for this kind of progress; the "ideal programming language" would probably differ from Scheme in many other ways as well. My advice to the Yale and TI efforts, which seem to agree with me on the need for location-less procedures, is to implement both kinds of procedures, and support two different dialects (or two different LAMBDA special forms). I know that the "native" T dialect already has to be incompatible with Scheme in various ways (its DEFINE and WRITE and a few other things differ incompatibly), and that semantically, at least, impure procedures can be simulated in T by pure ones, so this is probably not such a big deal - maybe involving a flag in the compiler's LAMBDA-nodes, or maybe something more general. I hope other people who want to work on issues like this one which involve incompatibility with Scheme will come up with similar solutions, implementing both crufty, backwards Scheme AND the wonderful language of their choice, and not give up on either line of work. I'll try to come up with an objective, concise summary of this EQ? debate to include in the next edition of the report. Jonathan  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 13 Mar 86 16:41:45 EST Received: from indiana by csnet-relay.csnet id af25787; 13 Mar 86 16:20 EST Date: Thu, 13 Mar 86 15:37:09 est From: Kent Dybvig To: JINX@OZ.AI.MIT.EDU Subject: Re: S&I's idea of EQ? (long) Cc: rrrs-authors@mit-mc The cost of closing in the very case you describe is indeed large in Chez Scheme, as in most later Scheme systems. Compared with other costs, the cost of allocation in general is quite high. Allocation incurs several direct and indirect costs including (1) the several instructions to do the allocation, (2) the potential for page faults, and ultimately (3) garbage collection overhead. By comparison, all identifier lookups, whether global, local, or through the closure, are one instruction (more correctly, one operand of an instruction). Function call is only a few instructions after pushing the arguments. Very little allocation is performed by the system, and that is why it ends up being so fast. In fact, the only things ever allocated are closures for lambda expressions that are not directly invoked, boxes for assigned identifiers, and stack objects for continuations. Many programs do no allocation at all and most do little outside explicit calls to cons and the like. Chez Scheme always returns the same closure for a lambda with no free variables, thereby avoiding even the cost of a cons. When this closure occurs in a loop, the savings in page faults alone is significant. However, I do not feel that this emphasis on speed is more important than emphasis on functionality. In light of the discussion about using closures as abstract objects and what I feel to be cleanest, my belated answer to Jonathan's EQ-Quiz is as follows (no coalescing, no splitting): T (eq? (lambda (x) x) (lambda (y) y)) T (eqv? (lambda (x) x) (lambda (y) y)) T (let ((x (lambda (z) z))) (eq? x x)) T (let ((x (lambda (z) z))) (eqv? x x)) T (let ((x ... any expression evaluating to an exact number ...)) (eq? x x)) I (eq? #\x #\x) I am all in favor of having switches in the system that trade functionality (flexibility, anyway) for speed, as long as the users knows pretty much what's going on. The user can decide if the extra speed is worth the price. And, as you point out, it is perfectly valid even in the absense of such a switch for the compiler to do anything it cannot be caught at, including coalescing or splitting objects. If anyone cares, I can write a less coherent response related to EQ on non Von-Neumann architectures, such as the cellular computer of Mago, where pointers do not exist. (The earlier remarks about MultiLisp are relevant only to large-grain architecture, which are essentially parallel variants of the Von-Neumann architecture.) My ideas for EQ along these lines are not completely formed but relate to immedidate (pointer-encoded) objects also. Question: On a machine supporting virtual memory, If EQ is based on the (virtual) address, what if two pieces of the virtual address space are mapped to the same physical address?  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 13 Mar 86 07:38:27 EST Received: from tektronix by csnet-relay.csnet id ao19748; 13 Mar 86 6:22 EST Received: from csnet-relay by tektronix.tektronix.CSNET id cr12105; 13 Mar 86 2:59 PST Received: from ti-csl by csnet-relay.csnet id aa15081; 12 Mar 86 20:32 EST Received: by tilde id AA17522; Wed, 12 Mar 86 18:40:14 cst Date: Wed 12 Mar 86 18:28:09-CST From: David Bartley Subject: Re: S&I's idea of EQ? To: willc%tekchips%tektronix@CSNET-RELAY.ARPA, RRRS-AUTHORS%MIT-MC%tektronix@CSNET-RELAY.ARPA Cc: JAR%MIT-MC%tektronix@CSNET-RELAY.ARPA, GJS%MIT-MC%tektronix@CSNET-RELAY.ARPA, Bartley%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: <8603112106.AA26230@tekchips> Message-Id: <12190225414.14.BARTLEY@CSL60> >From: willc%tekchips@tektronix.CSNET > >As I saw it, the result of the earlier EQ? discussion was a consensus that > > (1) it is not an error for EQ? to be called with procedural arguments. > (2) EQ? as used in S&ICP ought to work. > (3) "splitting" was a no-no. > (E.g. (let ((x (lambda () 0))) (eq? x x)) --> #!false is a no-no.) > (4) "coalescing" was ok. > (E.g. (eq? (lambda () 0) (lambda () 0)) --> #!true is ok.) > >These points are in decreasing order of consensus, and in particular >the consensus for (4) was not as strong as for (1) through (3). Thus >(4) is still a major topic for discussion. I've been too preoccupied with other things to give this debate the attention it deserves, but I feel somewhat uncomfortable with this declaration of a consensus, even though I am largely in agreement with Will's sentiments. My concern is that splitting can occur in suprising places in an optimizing compiler, and that more than just LAMBDA expressions are involved. Is there a consensus on coalescing and splitting for bignums and reals? Prohibiting splitting seems to be equivalent to prohibiting constant propagation (beta substitution). Regards, David Bartley -------  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 13 Mar 86 07:38:04 EST Received: from tektronix by csnet-relay.csnet id am19748; 13 Mar 86 6:20 EST Received: from csnet-relay by tektronix.tektronix.CSNET id aw12105; 13 Mar 86 2:27 PST Received: from mc.lcs.mit.edu by CSNET-RELAY.ARPA id a009197; 12 Mar 86 10:13 EST Date: 12 Mar 1986 10:11 EST (Wed) Message-ID: From: Bill Rozas To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA Cc: GJS%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA, JAR%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA, RRRS-AUTHORS%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA Subject: S&I's idea of EQ? (long) In-reply-to: Msg of 11 Mar 1986 16:06-EST from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA While I think that you are right in assuming that no code written depends on coalescing not being used, and I understand the position that being able to coalesce is a desirable goal, your example does not strike me as particularly good. You claim that an optimizing compiler will obtain considerably more perfomance out of the coalesced case than out of the "split" case, but I'm not certain about that: Clearly if map is known (eg. by the use of declarations, like in MIT-Scheme), it can be known that eq? (eqv?) is not used on the procedure values, so coalescing is permitted, but in this case it is irrelevant since you would want to inline code map into a loop, and this would hopefully cause the argument to map (the lambda expression) to be inline coded also, giving you considerably more (time) efficiency. I think the issue is moot in this case. If map is not known, the time to look its value up at runtime is comparable to the time which it takes to close the procedure. In particular, in MIT-Scheme (because of first class environments, etc), looking up map can take considerably longer than closing the lambda-expression, and the latter time is usually negligible. I think that the small time difference which can be gained in this case is not very interesting. It seems from your comment that you are assuming that closing is inordinately expensive. This is not necessarily so (except potentially in space). There are 2 implementations of closures I'm familiar with, and in both cases the above closing can be made very efficient (in time), although admittedly not as efficient as the coalesced case, but probably efficient enough: The "linked environment frame" implementation: Every time a procedure is applied, a new frame with the arguments is created and the environment where the body is evaluated is just the "cons" of this new-frame and the environment where the procedure was closed. To close a lambda expression, a "cons" is made of its code, and the environment object ("list" of frames) where the lambda expression is evaluated. Thus closing is just as expensive as a cons. The "grab what you need" implementation: every time a lambda expression is "evaluated", the locations of its free variables (or the values if it can be proven that no assignments occur) are collected into a closure frame, and the closure is the "cons" of the code corresponding to the lambda expression and this "closure frame". Closing in this model is potentially more expensive since variables may have to be referenced at closing time, and thus a strong desire to "close eagerly" may occur. In your example, the lambda-expression has no free variables (except +, but allow me to assume that this is a known operator), thus closing in this case becomes no more expensive than a cons. Assume now that it does have free variables as in (define (foo x) (map (lambda (n) (+ n y)) x)) where y is defined in some outer context. Then the code can be transformed into (define foo (let ((closure-frame (location-of y))) (lambda (x) (map (enclose (lambda (n) (+ n y)) closure-frame) 3)))) Where enclose and location-of have the obvious meanings. Again closing is just as expensive as consing (which is hopefully cheap). This sort of thing can obviously be extended to more complicated cases, where the general case would be something like (define foo (let ((constant-part (locations-of * + - y w))) (lambda (x z) (map (enclose (lambda (n) (* w (+ y (- n z)))) (frame-cons (location-of z) constant-part)) x)))) for (define (foo x z) (map (lambda (n) (* w (+ y (- n z)))) x)) So the parts that can be done eagerly are done eagerly, and the others (plus a cons) are done dynamically. Note that this solution breaks down if the environment of the lambda-expression is used as a first class environment, but if this is the case, coalescing is obviously disallowed also. This intermediate implementation has wider applicability than coalescing since in the last example the lambda expression could not trivially be coalesced. Thus coalescing seems (to me) like a cute little "trick" which may give you a few instructions (whatever a cons takes) of efficiency in very limited situations, while a very simple optimization that has no impact on the semantics gives you most of it in those cases and more in others. Again, I understand your position, but it seems that the arguments for it are not very strong. There is an argument against coalescing which carries at least as much weight (with me), and has not been presented before: Coalescing breaks the environment model of evaluation explained in S&ICP. This model (which students are shown how to implement in chapter 4) implies that coalescing is not possible, since a "different" procedure object is created every time that a lambda expression is "evaluated". If adherence to this model is important (which it is to a lot of people involved with teaching S&ICP around here), coalescing becomes an optimization which can only be used when the optimizer can prove that the procedures will not be passed to eq? (eqv?), and not usable by default. PS: While I don't feel very strongly about it, I think that the arguments against coalescing should also be heard, expecially since some of the people who think it is a bad idea are not on the mailing list, and the ones who are feel that it is unfriendly and refuse to send messages to it.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 12 Mar 86 06:39:35 EST Received: from tektronix by csnet-relay.csnet id as06699; 12 Mar 86 6:16 EST Received: by tektronix (5.31/5.14) id AA16877; Tue, 11 Mar 86 13:03:48 PST Received: by tekchips (5.31/5.14) id AA26230; Tue, 11 Mar 86 13:06:47 PST Message-Id: <8603112106.AA26230@tekchips> To: RRRS-AUTHORS%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA Cc: JAR%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA, GJS%MIT-MC%tektronix.csnet@CSNET-RELAY.ARPA Subject: S&I's idea of EQ? Date: 11 Mar 86 13:06:43 PST (Tue) From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA [We've had some trouble with our network phone lines to the East Coast, so the following may seem out of date.] I, too, am tired of moving targets and would like to have a stable definition of Scheme, but I don't think Gerry should be unhappy with what's going on because the consensus, at least as I perceive it, is consistent with S&ICP and with the various existing implementations. As I saw it, the result of the earlier EQ? discussion was a consensus that (1) it is not an error for EQ? to be called with procedural arguments. (2) EQ? as used in S&ICP ought to work. (3) "splitting" was a no-no. (E.g. (let ((x (lambda () 0))) (eq? x x)) --> #!false is a no-no.) (4) "coalescing" was ok. (E.g. (eq? (lambda () 0) (lambda () 0)) --> #!true is ok.) These points are in decreasing order of consensus, and in particular the consensus for (4) was not as strong as for (1) through (3). Thus (4) is still a major topic for discussion. As I perceived it, Jonathan's poll was primarily intended to determine the semantics of EQV?, and secondarily to determine the status of (4) above. So far as I know, nothing in S&ICP will break if EQV? is made implementation-independent, or if "coalescing" is permitted for EQ?. Jonathan's insight is that although the semantics of EQ? is hopelessly implementation-dependent (primarily because of numbers, and secondarily because of coalescing if coalescing is allowed), it is still possible to make EQV? implementation-independent even in the presence of coalescing. The price to be paid is that it would be an error to call EQV? with procedure arguments -- not "signal an error", but "be an error", so no one would have to change their implementation. If we don't allow coalescing for procedures and EQ?, then there is no price at all. That's why the issue of coalescing for procedures and EQ? is inseparable from the semantics of EQV?. So far as I can tell, however, nothing being considered will break S&ICP or force anyone to change their implementation. If anyone can show me a truly useful piece of existing code that will break if coalescing is allowed, I'd sure like to see it. I would hate to have TWO forms for creating procedures. Yes, I dearly wish that procedure values did not have to be associated with pointers, just as I dearly wish that numbers did not have to be associated with pointers, but our earlier discussion convinced me that it is too late to fix either problem. (In Scheme, that is. I think we can warn designers of new languages not to make the same mistakes.) What Yale, TI, and Tek want in Scheme is not the utopia of pointer-free procedures, but sanction for coalescing. Why do we want coalescing? It allows optimizing compilers to generate more efficient code. Examples have been given before, but some people remain unconvinced so I will repeat one of them. Consider the following procedure: (define (foo x) (map (lambda (n) (+ n 3)) x)) If coalescing is not allowed, then a new closure must be created every time foo is called, and any optimizing compiler that transforms the above into (define foo (let ((bar (lambda (n) (+ n 3)))) (lambda (x) (map bar x)))) is indisputably incorrect. (Unless, of course, the compiler happens to know the value of map and can prove that map will never be assigned, which is pretty unlikely in practice.) I think this sort of transformation ought to be allowed in Scheme, because (1) it makes a fairly big difference in how fast your programs run; (2) it has been allowed in Scheme historically, so disallowing it would be a change in the language; for example, Guy Steele's Rabbit compiler is based on such transformations; and (3) the opponents of coalescing have yet to give any examples to show why coalescing is bad. I didn't understand Jonathan's remark to the effect that Common Lisp requires two different kinds of LAMBDA, one that associates a pointer and another one that doesn't. I also can't tell from CLtL whether the optimization illustrated above is supposed to be legal in Common Lisp. (I assume it is supposed to be legal, but CLtL isn't precise enough to give much guidance and there is no formal semantics for CL that I can turn to for answers to such questions.) At any rate, I hardly think the designers of Common Lisp have devoted more thought to these issues than we have, so I don't see why we should look to Common Lisp for guidance. Thanks for hearing me out. Peace, Will  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 12 Mar 86 06:26:19 EST Received: from tektronix by csnet-relay.csnet id ao06699; 12 Mar 86 6:07 EST Received: by tektronix (5.31/5.14) id AA16207; Tue, 11 Mar 86 12:43:18 PST Received: by tekchips (5.31/5.14) id AA25873; Tue, 11 Mar 86 12:46:16 PST Message-Id: <8603112046.AA25873@tekchips> To: JAR@mc.lcs.mit.edu Cc: RRRS-AUTHORS@mc.lcs.mit.edu, willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA Subject: Re: NIHIL EX NIHIL - DON'T SETQ NIL In-Reply-To: Your message of Mon, 10 Mar 86 16:28:47 EST. <[MC.LCS.MIT.EDU].845702.860310.JAR> Date: 11 Mar 86 12:46:13 PST (Tue) From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA An additional argument for making the initial value of NIL be #!false instead of the empty list: The Second Edition of The Little LISPer by Dan Friedman and Matthias Felliesen uses NIL as the expression that evaluates to #!false and () [not '(), unfortunately] as the expression that evaluates to the empty list. Aside from the use of square brackets instead of parentheses and the missing quote before (), the Second Edition looks like straight Scheme to me. If SRA would market that book right, it would introduce more people to Scheme than has S&ICP. In other words: Don't SETQ NIL, (SET! NIL #!FALSE) instead. Peace, Will  Date: Mon, 10 Mar 86 16:28:47 EST From: Jonathan A Rees Subject: NIHIL EX NIHIL - DON'T SETQ NIL To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].845702.860310.JAR> The RRRS says that the value of the (non-essential) variable NIL should be the empty list. But I claim that this is not in accordance with practice, and is ugly besides. My evidence: 1. S&ICP - NIL is used in both ways in the book, but is introduced on page 17 as being false, and not until page 91 as the empty list. Just leafing through the book I found many places where nil was used for false and no places where it was used as empty list. 2. The Art of The Interpreter - I think Steele started using NIL as false and '() as empty list about the time this paper came out. 3. T - liking Steele's convention, we used it in T source and documentation starting in 1981 (and I was using it in my Maclisp code even before that), so it's pretty deeply wired into T "culture" that NIL is false and NOT the empty list. When The T implementation starts distinguishing () from #!false, as I hope it will soon, the sources, and user's code, won't have to change, because this practice has been followed religiously. 4. Having T and NIL as names for true and false is symmetrical with having #!true and #!false, especially once #!null goes away (which no one has complained about, so it will). The manual looks very peculiar the way it is because T and NIL are described together but have apparently unrelated purposes. So unless I hear from someone who has more than about 3 megabytes of source code making the opposite assumption (that was supposed to be funny - other arguments will be entertained of course), I'll change the RRRS to describe NIL as being false, not empty list. Jonathan  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 5 MAR 86 20:57:18 EST Date: Wed, 5 Mar 1986 20:56 EST Message-ID: From: CPH%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU To: Jonathan A Rees Cc: RRRS-AUTHORS@MC.LCS.MIT.EDU Subject: #!null In-reply-to: Msg of 5 Mar 1986 18:15-EST from Jonathan A Rees I agree, let's flush it. Another of my ideas, retained despite the fact that I thought '() quite sufficient. I never use #!null at all. I guess I'm going to get a bad rep for thinking up crummy ideas and then not fighting hard enough to make sure they didn't propagate.  Date: Wed, 5 Mar 86 18:15:55 EST From: Jonathan A Rees Subject: #!null To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].840068.860305.JAR> Is anyone out there particularly enamored of the non-essential feature #!NULL? I would like to remove it for these reasons: - It unnecessarily complicates the grammar of 's and 's: - As a , it is redundant with (), and users will be confused about when to use it in preference to (). - As an , it is redundant with '() and (LIST). - If it is the same as () for all purposes, and it is implemented, then we must change the RRRS so that either [a] () is an a legitimate expression, which it is not either according to consensus or according to the RRRS; or [b] #!null is NOT a legitimate expression, which would be inconsistent with the syntactically similar #!true and #!false. [If #!null is not the same as (), then either the parser must differ from the parser, or implementations must be able to represents two different empty lists (one which self-evaluates and another which doesn't).] If you think that you want the #!NULL to exist so that people have a quote-less way to write an expression producing an empty list, I would argue that there are nicer ways to rewrite '() without using '. E.g., (LIST). If this is your ONLY reason and you don't like (LIST), then I would argue to replace the syntax #!NULL with a variable binding, say NULL or THE-EMPTY-LIST, whose value is an empty list.  Received: from ASPEN.LCS.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 3 MAR 86 15:24:01 EST Date: Mon, 3 Mar 86 15:23 EST From: Robert Halstead Subject: "splitting" in multiprocessors To: gls@THINK-AQUINAS.ARPA cc: JAR@MIT-MC.ARPA, RRRS-AUTHORS@MIT-MC.ARPA In-Reply-To: <860303105236.3.GLS@GUIDO.THINK.COM> Message-ID: <860303152359.6.RHH@ASPEN.LCS.MIT.EDU> Sure. Imagine a multiprocessing extension of SCHEME. (Call it, just for laughs, MULTILISP.) It might be crucial to efficiency to allow making copies of code for distribution to various processors. I wonder what Halstead has to say about this. Well, currently in Multilisp no "splitting" of procedures is discernible -- but in part this is due to the fact that the current implementation doesn't do any local copying of code of the sort you suggest. It is clear that such copying will be a useful strategy in the future, but I have always felt that it would be best to think of it as a form of caching that doesn't affect the naming or identity of objects. In other words, when a processor would look for a piece of code that it has a pointer to, it would first consult a CAM or hash table or some such to see if this is a piece of code of which there is a locally available copy. This is probably more efficient on a suitably designed processor than on a plain-vanilla 68020 or 32032, but I guess that doesn't bother me much. The motivation for this is that, although it is easy enough to talk about copying *code*, it is not so easy to make multiple copies of *environments* or other potentially mutable objects. Since generally the path to a piece of code will be through an environment (and then a closure) for which multiple copies are less likely (unless we have either a low-level snoopy-cache type of protocol or some analysis that proves particular environments to be immutable), I feel it is better for environments to contain object references in a system-wide address space. Processors may then interpret these references differently according to what selection of information they have cached locally, rather than having a collection of distinct references to different copies of objects. If a given object has just one reference (or "address"), no matter how many copies have been made of the object, then it is easy for EQ? to operate as though no splitting has occurred. -b.  Received: from SCRC-STONY-BROOK.ARPA by MC.LCS.MIT.EDU 3 Mar 86 15:23:50 EST Received: from RIO-DE-JANEIRO.SCRC.Symbolics.COM by SCRC-STONY-BROOK.ARPA via CHAOS with CHAOS-MAIL id 429074; Mon 3-Mar-86 13:31:42-EST Date: Mon, 3 Mar 86 13:34 EST From: Kent M Pitman Subject: P? To: gls@THINK-AQUINAS.ARPA, KMP@SCRC-STONY-BROOK.ARPA cc: JINX@MIT-AI.ARPA, RRRS-AUTHORS@MIT-MC.ARPA In-Reply-To: <860303105850.4.GLS@GUIDO.THINK.COM> Message-ID: <860303133443.5.KMP@RIO-DE-JANEIRO.SCRC.Symbolics.COM> Date: Mon, 3 Mar 86 10:58 EST From: Guy Steele Personally, I am content to say that "?" does not get attached to any predicate which has a root which contains no alphabetic characters. That is, that ">" is allowed to be a contraction for "greater?" rather than just "greater", so ">?" is like saying "greater??". We used a similar reasoning for not putting "!" on the end of "SET" in T, since "SET!" is redundant. Any assignment is by definition destructive and setting an "!" after SET is effectively redundant and just makes code look ugly. I think the proponents of the name "SET!" would say that assignments -should- look ugly, and to that I can only say that I disagree. I think that SET! should be an assignment primitive, and SET should construct a set (maybe meaning the same as the Common Lisp REMOVE-DUPLICATES). This ambiguity about the meaning of "set" in Lisp has always bothered me, and "!" is the best means of disambiguation I have seen yet. Anyway, if that was your only concern, I would say you should call set something else -- BIND, ASSIGN, ATTACH, NAME, or whatever. I guess I kinda like ASSIGN. There are plenty of names that could be chosen. The "!" is ugly and inconsistent on SET because it suggests that it is closely related to the other operators which have "!" in their names. I don't think it's very closely related since its side-effect occurs a level up (in 3-lisp style) from the other "!" functions. I believe that the authors of SETF did us a great disservice by suggesting that SETF should be able to both assign variables and modify structures. Anyone who's ever tried to code SETF knows full well that the variable situation has to be handled by a special case. That special case is the clue that tells you that something is wrong in the model which tries to unify the two ideas.  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 3 MAR 86 12:31:00 EST Date: 3 Mar 1986 10:39 EST (Mon) Message-ID: From: Bill Rozas To: Dan Friedman Cc: rrrs-authors@MC.LCS.MIT.EDU Subject: rec In-reply-to: Msg of 2 Mar 1986 22:59-EST from Dan Friedman I understand perfectly well what you can do with REC. What I disagree with is precisely the usage that you are advocating. It is hard to explain, once your example has been shown, why things like (rec list-of-as (cons 'a list-of-as)) don't work, and any explanation will seem like a "cop-out". I'd much rather say that the only essential behavior is when beta is a lambda-expression, and let individual implementations do whatever they please. A similar objection (with which I agree) was raised against internal DEFINE. The simplest (only?) way to guarantee that a procedure with internal DEFINEs behaves as if they were all done in parallel is to disallow anything but procedural DEFINEs internally. While the implementation may allow for further "fancy" manipulations, requiring them in the standard only opens a whole new can of worms. Since I disagree that (rec box (cons a (lambda () (let ([v b]) (set-car! box (lambda () v)) v))) should work in the standard, I see no difference between REC and NAMED-LAMBDA.  Received: from GODOT.THINK.COM by MC.LCS.MIT.EDU 3 Mar 86 11:21:36 EST Received: from guido by GODOT.THINK.COM via CHAOS; Mon, 3 Mar 86 10:50:44 est Date: Mon, 3 Mar 86 10:52 EST From: Guy Steele Subject: EQ?, again To: JAR@MC.LCS.MIT.EDU, gls@THINK-AQUINAS.ARPA Cc: RRRS-AUTHORS@MC.LCS.MIT.EDU, gls@THINK-AQUINAS.ARPA In-Reply-To: <[MC.LCS.MIT.EDU].834417.860228.JAR> Message-Id: <860303105236.3.GLS@GUIDO.THINK.COM> Date: Fri, 28 Feb 86 18:26:57 EST From: Jonathan A Rees Date: Fri, 28 Feb 86 16:40 EST From: Guy Steele Date: Wed, 19 Feb 86 21:37:49 EST From: Jonathan A Rees I 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" I 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) I 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" T 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) I 5. (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X X)) I 6. (EQ? #\X #\X) It took me a while to figure out how to reconcile your answer to 3 with your answer to 4. I take it you want procedures to know their identity, but that you imagine an implementation would still want the liberty to make (identity-preserving) copies, so EQ? might fail to recognize two procedures being the same, even though EQV? could succeed? Sure. Imagine a multiprocessing extension of SCHEME. (Call it, just for laughs, MULTILISP.) It might be crucial to efficiency to allow making copies of code for distribution to various processors. I wonder what Halstead has to say about this.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 3 Mar 86 00:26:15 EST Received: from indiana by csnet-relay.csnet id af11085; 2 Mar 86 23:37 EST Date: Sun, 2 Mar 86 22:59:36 est From: Dan Friedman To: rrrs-authors@mc.lcs.mit.edu Subject: rec I do not understand what all the confusion is about. (rec alpha beta) is a special form which says evaluate beta in an environment which associates the symbol alpha with the meaning of beta. This allows for such familiar expressions (rec ! (lambda (n) (if (zero? n) 1 (* n (! (sub1 n)))))). This builds a recursive object. We can then associate another symbol such as fact with this object. Hence we may define fact as (define fact (rec ! (lambda (n) (if (zero? n) 1 (* n (! (sub1 n))))))). Now if that were all we could do with rec then I'd just as soon use the applicative-order Y combinator and be done with it. However, because rec does not know that beta is a function we can do fancier things. For example, we can define lazy-cons (just in the cdr) as (lazy-cons a b) ==> (rec box (cons a (lambda () (let ([v b]) (set-car! box (lambda () v)) v))) as a macro. We have created a recursive box. By the time we need for the box identifier to be bound to the right piece it will be. As far as I can tell this is what Will was trying to communicate but it did not seem to be getting through. For those who need reminding the current marco for rec is (rec alpha beta) ==> (let ([alpha '*]) (set! alpha beta) alpha) Dan  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 2 Mar 86 02:33:16 EST Received: from ti-csl by csnet-relay.csnet id a003250; 2 Mar 86 2:34 EST Received: by tilde id AA22080; Sat, 1 Mar 86 21:32:58 cst Received: from mc.lcs.mit.edu by CSNET-RELAY.ARPA id a001625; 1 Mar 86 21:43 EST Date: 1 Mar 1986 21:36 EST (Sat) Message-Id: From: Bill Rozas To: David Bartley Cc: RRRS-AUTHORS%MC.LCS.MIT.EDU%ti-csl.csnet@CSNET-RELAY.ARPA Subject: EQ? non-essential? In-Reply-To: Msg of 26 Feb 1986 11:34-EST from David Bartley Received: from csnet-relay by ti-csl; 1 Mar 86 21:31:40-CST (Sat) I also disagree strongly with JAR's latest suggestion that EQ? be made non-essential. Doing so doesn't solve the problem, it just passes the buck! I disagree. The point of the agreement that JAR and I reached (I am the unnamed flamer) was that implementors would have the choice of having EQ? or a compiler which can split without thinking about the consequences. The essence of my (and GJS's) position is that if splitting is allowed, the language has changed to the point we don't feel comfortable with it anymore. The difference in our positions is that I'm willing to do away with EQ? as long as if it is implemented it has the properties that I consider essential. GJS disagrees since he argues that there is no way to obtain that power if it is not provided. When I originally answered JAR's poll, I answered I to the relevant questions. After rhh sent his message, I revised my position after thinking about it again. One of the things which I've always liked about Scheme is the ease with which one can do things like that, which I believe are intuitive and natural. Adding locations explicitely into the code just to make EQ? work (as was suggested to me) seems artificial and no better than emulating closures in languages which lack them (like C).  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 1 MAR 86 02:23:13 EST Date: 1 Mar 1986 02:21 EST (Sat) Message-ID: From: Bill Rozas To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA Cc: rrrs-authors@MC.LCS.MIT.EDU Subject: named-lambda, etc In-reply-to: Msg of 26 Feb 1986 19:16-EST from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA The only advantages I see to REC over NAMED-LAMBDA are the ability to say things like (REC SELF (VECTOR (LAMBDA () (LET ((X (FOO))) (VECTOR-SET! SELF 0 (LAMBDA () X)) X)))) I hope that we are not requiring implementations to support the above. It gives the illusion that REC (LETREC) can do something (solving simultaneous equations) which in fact it (they) cannot do (at least I do not know how to make them do that, short of changing to normal order evaluation). If we are not requiring this behavior, the argument is moot. and the mnemonically useful relationship between REC and LETREC. (On a close vote we chose the names REC and LETREC over LABEL and LABELS.) REC sounds like WRECK and RECORD to me, not like LETREC. Note that since LETREC expects a list as its second subform, it syntax could be trivially extended to support the common case of just one name, and REC would become unnecessary (also helping GJS's goal of reducing the number of special forms). Now really, I don't see why people object to NAMED-LAMBDA (which is not essential) so violently. I don't love it, but I don't dislike it, and it seems to me that people have just decided to pick on it for no good reason, when REC is just as random and considerably less intuitive.  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 1 MAR 86 02:07:58 EST Date: 1 Mar 1986 02:06 EST (Sat) Message-ID: From: Bill Rozas To: Jonathan A Rees Cc: JMiller%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU, RRRS-AUTHORS@MC.LCS.MIT.EDU Subject: named-lambda, etc In-reply-to: Msg of 28 Feb 1986 14:56-EST from Jonathan A Rees Date: Friday, 28 February 1986 14:56-EST From: Jonathan A Rees To: JMiller at OZ.AI.MIT.EDU cc: RRRS-AUTHORS at MC.LCS.MIT.EDU Re: named-lambda, etc Why not NAMED-CONS, NAMED-VECTOR, NAMED-+, NAMED-READ-CHAR, NAMED-CALL-WITH-CURRENT-CONTINUATION, NAMED-WITH-INPUT-FILE, etc.? Come on, now. These objects have no good reason to have names for themselves since they could use them in no way that I can see. The case is obviously different for procedures.  Received: from SCRC-STONY-BROOK.ARPA by MC.LCS.MIT.EDU 28 Feb 86 19:08:15 EST Received: from RIO-DE-JANEIRO.SCRC.Symbolics.COM by SCRC-STONY-BROOK.ARPA via CHAOS with CHAOS-MAIL id 428075; Fri 28-Feb-86 18:19:15-EST Date: Fri, 28 Feb 86 18:20 EST From: Kent M Pitman Subject: P? To: gls@THINK-AQUINAS.ARPA cc: JINX@MIT-AI.ARPA, RRRS-AUTHORS@MIT-MC.ARPA In-Reply-To: <860228145630.7.GLS@GUIDO.THINK.COM> Message-ID: <860228182054.2.KMP@RIO-DE-JANEIRO.SCRC.Symbolics.COM> For what it's worth, in the original spec for the T dialect that I worked up in 1981, "?" had an assigned pronunciation. It was explicitly stated that it should be pronounced "pee". That way, when we said "zero?" out loud in conversations with non-T-programmers, they would `accidentally' understand us. After I left, someone (Jonathan?) took this out, probably thinking it was gratuitous silliness, but in fact I agree with you that pronounceability is very important to a language and that definition wasn't put there on a whim. Along the same lines, "!" was given the pronunciation " destructively". Those who doubt that pronunciation is important should notice that most natural languages are transmitted primarily verbally (that is, that changes in language are frequently best described at the acoustic rather than written level), that psychologists seem to believe that indexing of objects in the brain is done more on the basis of auditory rather than visual keys (based on studies of how people mis-remember things), etc. Those who insist that the morpheme "pee" is somehow less meaningful than some other token that could have been chosen should ask themselves not about how English words relate to Latin, but how the creators of Latin came up with their morphemes. I believe that a language which cannot conveniently be discussed allowed [... uh, "aloud" I mean -- i guess I'll leave that typo in to support my claim two paragraphs above] is doing its prospective programmers a great disservice... but I think ZERO? looks lots nicer than ZEROP, so my vote is for changing the pronunciation and leaving the written word as it is. By the way, T also originally defined very carefully the rules of contraction so that users would be able to choose consistent function names. It specified, for example that a function xxx which did (define (xxx . args) (apply fn EQ? args)) should be called fnQ, and that one that did (define (xxx . args) (apply fn EQV? args)) should be called fnV and that one that did (define (xxx . args) (true? (apply fn args))) should be called fn? . In the case of a "?" ending up in the middle of a token, it was to migrate to the end, and in the case of a "!" and "?" ending up in a token (ie, a destructive predicate), the "?" was to precede the "!". Among other things, this meant that the function which did (MEM EQ? x y) could be written (MEMQ x y) and the function which did (MEM EQV? x y) could be written (MEMV x y). Further, (TRUE? (MEM EQV? x y)) = (TRUE? (MEMQ x y)) = (MEM? EQ? (MEMQ? x y)) = (MEMQ? x y). Personally, I am content to say that "?" does not get attached to any predicate which has a root which contains no alphabetic characters. That is, that ">" is allowed to be a contraction for "greater?" rather than just "greater", so ">?" is like saying "greater??". We used a similar reasoning for not putting "!" on the end of "SET" in T, since "SET!" is redundant. Any assignment is by definition destructive and setting an "!" after SET is effectively redundant and just makes code look ugly. I think the proponents of the name "SET!" would say that assignments -should- look ugly, and to that I can only say that I disagree. Sorry this message got as long as it did, but this is the sort of topic people don't seem to be able to vocalize their feelings about, so I figured it was worth doing so just for the record. -kmp  Date: Fri, 28 Feb 86 19:01:52 EST From: Jonathan A Rees Subject: S&I's idea of EQ? To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].834462.860228.JAR> I know Sussman is too shy to send out messages, so I'll repeat what he told me recently, which is that being able to use EQ? to recognize particular procedures (i.e. (let ((p (lambda ...))) (eq? p p)) => true) is deeply and explicitly wired into his book, and into the way he teaches and programs. For example, see the constraint system, pages 234-240 of Structure & Interpretation, and in particular the definition of the subprocedure ME of MULTIPLIER, page 237. Therefore he would be very sad if Scheme changed out from underneath him. He is sick of moving targets and would like a stable language compatible with what he's been calling Scheme all these years, even if it has serious problems. As another source of argument against the LAMBDA-as-behavior position, I should say that my reading of the Common Lisp manual is that (let ((p (lambda ...))) (eq? p p)) must definitely return true. Only numbers are permitted to fail to be EQ to themselves. So if you believe that Scheme shouldn't gratuitously differ from CL, then all objects besides numbers should be "EQ to themselves". As both Sussman and Halstead have suggested, one option is to have TWO forms for creating procedures, one which creates a callable pair, like the current Scheme and Common Lisp LAMBDA, and another which only returns the operational behavior, which is the change to LAMBDA that the Yale, TI, and Tek contingents want to institute. This would be needed anyhow in those implementations interested in emulating Common Lisp or a subset of it. I would guess that this would be easy to do in most existing implementations. If this is agreed to, then we have a naming problem on our hands, argument over which, unfortunately, might do this idea in (which one gets the name LAMBDA, and what should the other one be called?). Jonathan  Date: Fri, 28 Feb 86 18:26:57 EST From: Jonathan A Rees Subject: EQ?, again To: gls@AQUINAS.THINK.COM cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of Fri 28 Feb 86 16:40 EST from Guy Steele Message-ID: <[MC.LCS.MIT.EDU].834417.860228.JAR> Date: Fri, 28 Feb 86 16:40 EST From: Guy Steele Date: Wed, 19 Feb 86 21:37:49 EST From: Jonathan A Rees I 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" I 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) I 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" T 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) I 5. (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X X)) I 6. (EQ? #\X #\X) It took me a while to figure out how to reconcile your answer to 3 with your answer to 4. I take it you want procedures to know their identity, but that you imagine an implementation would still want the liberty to make (identity-preserving) copies, so EQ? might fail to recognize two procedures being the same, even though EQV? could succeed?  Date: Fri, 28 Feb 86 17:11:21 EST From: Alan Bawden Subject: numeric predicates To: gls@AQUINAS.THINK.COM cc: RRRS-AUTHORS@MC.LCS.MIT.EDU, JINX@OZ.AI.MIT.EDU In-reply-to: Msg of Fri 28 Feb 86 14:56 EST from Guy Steele Message-ID: <[MC.LCS.MIT.EDU].834331.860228.ALAN> Date: Fri, 28 Feb 86 14:56 EST From: Guy Steele ... I think that putting "?" in the names of all the predicates is a TERRIBLE idea. Why? Because they make it much harder to read code out loud.... Hmmm... Perhaps we could make it an official convention that the trailing "?" in many predicate names is -pronounced- "pea"!  Received: from GODOT.THINK.COM by MC.LCS.MIT.EDU 28 Feb 86 16:39:31 EST Received: from guido by GODOT.THINK.COM via CHAOS; Fri, 28 Feb 86 16:39:05 est Date: Fri, 28 Feb 86 16:40 EST From: Guy Steele Subject: EQ?, again To: JAR@MC.LCS.MIT.EDU, RRRS-AUTHORS@MC.LCS.MIT.EDU Cc: gls@THINK-AQUINAS.ARPA In-Reply-To: <[MC.LCS.MIT.EDU].823678.860219.JAR> Message-Id: <860228164046.1.GLS@GUIDO.THINK.COM> Date: Wed, 19 Feb 86 21:37:49 EST From: Jonathan A Rees I 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" I 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) I 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" T 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) I 5. (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X X)) I 6. (EQ? #\X #\X) Let me qualify those five "I" answers. If two things are EQ?, then they should be operationally identical. (One consequence is that two things that are EQ? should also be EQV?.) If they are not EQ?, then they might or might not be operationally identical. The contrapositive is: if two things are proved in the course of a program execution to be operationally distinct, then EQ? must consistently be false of them (in both the future and the past). --Guy  Received: from GODOT.THINK.COM by MC.LCS.MIT.EDU 28 Feb 86 14:55:09 EST Received: from guido by GODOT.THINK.COM via CHAOS; Fri, 28 Feb 86 14:54:43 est Date: Fri, 28 Feb 86 14:56 EST From: Guy Steele Subject: numeric predicates To: JINX%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU, rrrs-authors@MC.LCS.MIT.EDU Cc: gls@THINK-AQUINAS.ARPA In-Reply-To: Message-Id: <860228145630.7.GLS@GUIDO.THINK.COM> Date: 24 Feb 1986 20:56 EST (Mon) From: Bill Rozas How come we have 2 versions of every possible predicate (> and >?, etc.)? It seems silly to me. Could we drop one of the two? Does anybody feel strongly about this? This is a good excuse for me to put in my two cents' worth, a few years too late: I think that putting "?" in the names of all the predicates is a TERRIBLE idea. Why? Because they make it much harder to read code out loud. The "-P" convention is at least pronounceable, if visually bizarre (but I would do away with that too if I could). I will not entertain counterarguments of the form "But English uses `?' and not `P'" because English does NOT lump the "?" in with the spelling of the verb; it is a marker on the entire sentence. --Guy  Received: from GODOT.THINK.COM by MC.LCS.MIT.EDU 28 Feb 86 14:45:34 EST Received: from guido by GODOT.THINK.COM via CHAOS; Fri, 28 Feb 86 13:47:48 est Date: Fri, 28 Feb 86 13:49 EST From: Guy Steele Subject: publication of RRRS in SIGPLAN Notices To: RRRS-AUTHORS@mc.lcs.mit.edu Cc: gls@THINK-AQUINAS.ARPA In-Reply-To: <8602201836.AA08527@tekchips> Message-Id: <860228134934.2.GLS@GUIDO.THINK.COM> If RRRS is to be published in SIGPLAN Notices, I beg that a short introduction be added explaining to the general readership why it is important. I am extremely annoyed at SIGPLAN Notices for publishing lots of little language definitions that spend two pages explaining the syntax of arithmetic expressions that turns out to be the same as everyone else's syntax but spend no prose explaining why I should be interested in this language. --Guy  Date: Fri, 28 Feb 86 14:56:34 EST From: Jonathan A Rees Subject: named-lambda, etc To: JMiller@OZ.AI.MIT.EDU cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of 28 Feb 1986 06:55-EST from JMILLER at MIT-OZ Message-ID: <[MC.LCS.MIT.EDU].834126.860228.JAR> Why not NAMED-CONS, NAMED-VECTOR, NAMED-+, NAMED-READ-CHAR, NAMED-CALL-WITH-CURRENT-CONTINUATION, NAMED-WITH-INPUT-FILE, etc.?  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 28 FEB 86 06:54:34 EST Date: 28 Feb 1986 06:55-EST Sender: JMILLER@MIT-OZ Subject: Re: named-lambda, etc From: JMILLER@MIT-OZ Reply-To: JMiller%OZ@MC To: rrrs-authors@MC Message-ID: <[MIT-OZ]28-Feb-86 06:55:45.JMILLER> In-Reply-To: <8602270016.AA15130@tekchips> I vowed never to send a message to this forum, but Will finally got me. I have been teaching Scheme out of S&ICP for 3 years now and find that introducing NAMED-LAMBDA simultaneously with LAMBDA in fact makes most people feel far more comfortable. I introduce it by explaining that it is sometimes nice for a procedure to have "a name for itself" as opposed to "a name other things call it" and use the analogy of nicknames. Most students sigh a large sigh of relief at this point, since it seems that it fulfills some need they have for procedures to have names -- I don't know why they are so firmly convinced that a nameless procedure is execrable, but it is so! This is not an argument either way about REC, etc. -- although the notion of a set of names for each other shared by a group is just a tad harder to explain. --Jim  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 27 Feb 86 18:22:21 EST Received: from tektronix by csnet-relay.csnet id aj22696; 27 Feb 86 17:54 EST From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: JINX%OZ.AI.MIT.EDU@xx.lcs.mit.edu Cc: rrrs-authors@mc.lcs.mit.edu Received: from tekchips by tektronix with smtp ; 26 Feb 86 16:15:36 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA15130; Wed, 26 Feb 86 16:16:47 PST Message-Id: <8602270016.AA15130@tekchips> Subject: Re: named-lambda, etc In-Reply-To: Your message of 24 Feb 1986 13:32 EST (Mon)., Date: 26 Feb 86 16:16:42 PST (Wed) The only advantages I see to REC over NAMED-LAMBDA are the ability to say things like (REC SELF (VECTOR (LAMBDA () (LET ((X (FOO))) (VECTOR-SET! SELF 0 (LAMBDA () X)) X)))) and the mnemonically useful relationship between REC and LETREC. (On a close vote we chose the names REC and LETREC over LABEL and LABELS.) I see no such advantage to NAMED-LAMBDA over REC. I use REC fairly often myself, more often than I use CASE, LET*, or internal DEFINE, but I'd be willing to give it up and use LETREC instead if that's what it takes to get NAMED-LAMBDA out of the RRRS. ---------------- I support the idea of collecting stuff specific to S&ICP in an appendix. ---------------- I am probably the person who most likes What I object to is having define by default use REC or NAMED-LAMBDA > rather than LAMBDA (p 18. MIT AI-Memo 848 version). This is a real problem when introducing someone to Scheme, because people who see the (DEFINE (FOO X) ...) syntax quickly infer that it is the same as (DEFINE FOO (LAMBDA (X) ...)), and you have to tell them that the two are slightly different but that you can't explain the difference until they've learned more of the language. This defeats the very positive feelings of understanding and mastery that beginning Scheme programmers would otherwise have. The main argument that I see for having the alternative DEFINE syntaxes use REC is that it makes it easier for people to define procedures in such a way that self-recursive calls can be compiled as simple jumps. I don't consider this to be a very good argument. Peace, William Clinger  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 26 Feb 86 16:12:30 EST Received: from ti-csl by csnet-relay.csnet id ag08802; 26 Feb 86 14:40 EST Received: by tilde id AA11416; Wed, 26 Feb 86 10:40:12 cst Date: Wed 26 Feb 86 10:34:34-CST From: David Bartley Subject: Re: EQ? non-essential? To: JAR%MC.LCS.MIT.EDU%ti-csl.csnet@CSNET-RELAY.ARPA, RRRS-AUTHORS%MC.LCS.MIT.EDU%ti-csl.csnet@CSNET-RELAY.ARPA Cc: Bartley%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: <[MC.LCS.MIT.EDU].830587.860225.JAR> Message-Id: <12186469182.45.BARTLEY@CSL60> Here are some comments on recently raised issues... (1) I agree with JAR that OBJECT-HASH should be flushed. (2) I still believe strongly that EQ? should amount to a pointer test and that the standard should specify the domain over which it is defined---symbols, booleans, (), and mutable objects (pairs, vectors, strings, and ports). It should be undefined (implementation-dependent but not an error) for numbers, characters, closures, and continuations. Bert asks whether the optimization this allows is worth the loss of the ability to compare procedures which represent abstract data. I have to say yes for two reasons. First, some other predicate than EQ? makes better sense for such tests. Second, the inhibition on the optimizing compiler is pretty strong since the EQ? test may not be lexically visible. The compiler can't always know when it is safe to split and when it isn't. Thus, I oppose the idea that (EQ? X X) should always be true. I also disagree strongly with JAR's latest suggestion that EQ? be made non-essential. Doing so doesn't solve the problem, it just passes the buck! >The effect of this is that while all existing implementations would stay >the same and implement EQ? as always, [...] Not so! I've explained why I want EQ? and why ``implementing EQ? as always'' is inconsistent with the semantics that (EQ? X X) ==> #!TRUE when you have an optimizing compiler. (3) I agree with JAR and others that EQV? should be independent of the implementation. As I understand it, this amounts to saying the use of EQV? is an error except for a specified domain. My previous position was that it should be implementation-dependent over all but the (same?) domain. I subscribe to the statement ``(EQ? X Y) implies (EQV? X Y) whenever (EQV? X Y) is defined.'' I'd like to see EQV? extend EQ? to encompass exact numbers and characters but not closures and continuations. (4) Ramsdell has raised the notion that (BEGIN ...) is redundant with (LET () ...). This is not true for those of us that support the MIT-style embedded DEFINE and first-class environments. Regards, David Bartley -------  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 26 Feb 86 15:29:24 EST Received: from ti-csl by csnet-relay.csnet id ab08802; 26 Feb 86 14:36 EST Received: by tilde id AA08980; Wed, 26 Feb 86 09:00:17 cst Date: Wed 26 Feb 86 08:49:44-CST From: David Bartley Subject: Re: Things used in S&ICP To: JAR%MC.LCS.MIT.EDU%ti-csl.csnet@CSNET-RELAY.ARPA, GJS%OZ.AI.MIT.EDU%ti-csl.csnet@CSNET-RELAY.ARPA Cc: RRRS-AUTHORS%MC.LCS.MIT.EDU%ti-csl.csnet@CSNET-RELAY.ARPA, Bartley%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: <[MC.LCS.MIT.EDU].830307.860225.JAR> Message-Id: <12186450098.45.BARTLEY@CSL60> > Mail-From: MAIL created at 26-Feb-86 02:08:59 > From: Jonathan A Rees > > Date: Mon 24 Feb 86 22:22:37-EST > From: Gerald Jay Sussman > > I feel strongly that we should have >, <, =, because I used them in > S&I. I see no reasons why we should not also have other people's > favorites. > > Speaking of what's in S&I, there certainly are many things used in the > book that AREN'T described in the RRRS. Namely, these special forms: > ... > Suggestion: document them all, each one very briefly, in an appendix, as > non-essential features. I like this suggestion. Like most implementors, we will want to have a ``compatibility package'' for use with the book, but it seems unnecessary to ``grandfather'' all of its forms and functions in the standard itself. Recognizing the book's variants in a non-binding appendix will give notice that those names have unofficial but widely observed meanings. I also sense that the Scheme community may be ready to eliminate some of the historical oddities in the compromise worked out at Brandeis. I worry, though, that other dialects may feel slighted. Do we want to list Scheme84's or T's variants in an appendix as well? I don't think so -- I feel that S&ICP is a unique case -- but there may be disagreement. Thus, I vote (mildly) for the `=?' names rather than the `=' names. I also intend to continue to support the S&ICP names. Regards, David Bartley -------  Date: Tue, 25 Feb 86 19:57:37 EST From: Jonathan A Rees Subject: EQ? non-essential? To: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of 20 Feb 1986 00:20 EST (Thu) from Bill Rozas Message-ID: <[MC.LCS.MIT.EDU].830587.860225.JAR> We are still deadlocked on the 2 or 3 questions surrounding EQ?. No consensus is arising, except that implementations may decide on the result of (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)), and that characters should have the same status as numbers. A long argument between myself and a unnamed proponent of the ability to test object identity resulted in the following proposal which would make us both happy: make EQ? be a non-essential feature of the language. Require that implementations which have it guarantee that it behave as described in the current RRRS (e.g. (EQ? X X) would always hold). Implementations which wish to do clever things with procedures in violation of the implied invariant, and documentors who do not want to be embarrassed by having something so unclean and ill-specified in such a prominent position in the language, may simply choose to not implement (describe) EQ?. A denotational semantics of the essential subset would not have to adjoin locations to numbers and procedures, although one which wanted to describe EQ? invariants would. If, as Sussman once pointed out to me, an "object model" of the world is deeply embedded in S&ICP and the way Scheme is taught, then this would be better than making (EQ? X X) be implementation-dependent, although I would still favor the latter if EQ? were essential. The effect of this is that while all existing implementations would stay the same and implement EQ? as always, experimental implementations of the language which take liberties with procedures will still be allowed to call themselves "scheme" rather than "a variant of scheme not having EQ?". 6.001-style courses using such implementations will have to get by without it, somehow. (Obviously this would also make MEMQ and ASSQ also non-essential.)  Date: Tue, 25 Feb 86 15:16:56 EST From: Jonathan A Rees Subject: Things used in S&ICP To: GJS@OZ.AI.MIT.EDU cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of Mon 24 Feb 86 22:22:37-EST from Gerald Jay Sussman Message-ID: <[MC.LCS.MIT.EDU].830307.860225.JAR> Date: Mon 24 Feb 86 22:22:37-EST From: Gerald Jay Sussman I feel strongly that we should have >, <, =, because I used them in S&I. I see no reasons why we should not also have other people's favorites. Speaking of what's in S&I, there certainly are many things used in the book that AREN'T described in the RRRS. Namely, these special forms: COLLECT CONS-STREAM DELAY DEFINE-MACHINE MAKE-ENVIRONMENT and these bindings: ATOM? EMPTY-STREAM? ERROR EVAL EXPLODE FORCE GET HEAD MAPC IMPLODE MAPCAR PRINC PRINT PUT TAIL THE-EMPTY-STREAM USER-INITIAL-ENVIRONMENT I would like to hear a principled argument for including or not including these various things in the RRRS. (Not a separate argument for each one, please!) I don't much care what position is taken so long as it's pretty much consistent. The current RRRS is not consistent, since it includes SEQUENCE (non-essentially) only because it's in S&I, but doesn't even include the trivial but important special forms DELAY and CONS-STREAM. Suggestion: document them all, each one very briefly, in an appendix, as non-essential features.  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 25 FEB 86 08:22:46 EST Date: 25 Feb 1986 08:23 EST (Tue) Message-ID: From: Bill Rozas To: Gerald Jay Sussman Cc: JINX%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU, rrrs-authors@MC.LCS.MIT.EDU Subject: numeric predicates In-reply-to: Msg of 24 Feb 1986 22:22-EST from Gerald Jay Sussman I feel strongly that we should have >, <, =, because I used them in S&I. I see no reasons why we should not also have other people's favorites. I think in the case of procedures the issue is not as important as in the case of special forms, since any portable code could at the very beginning define all the preferred names in terms of the standard ones. It seems silly, however, to require both sets of names, thus I was wondering whether anybody cares for either set. Given that you care for the non-question mark flavor, could we drop the question mark variety?  Received: from mitre-bedford.ARPA by MC.LCS.MIT.EDU 25 Feb 86 07:51:48 EST Organization: The MITRE Corp., Bedford, MA Received: by linus.MENET (4.12/4.7) id AA03869; Tue, 25 Feb 86 07:52:23 est Date: Tue, 25 Feb 86 07:52:23 est From: John D. Ramsdell Posted-Date: Tue, 25 Feb 86 07:52:23 est Message-Id: <8602251252.AA03869@linus.MENET> To: rrrs-authors@mit-mc.ARPA Subject: named-lambda If we decide to dump NAMED-LAMBDA in favor of REC, why not dump BEGIN and SEQUENCE in favor of LET()? I worry about trying to decide issues already agreed upon, instead of clarifing nebulous issues in RRRS. On one of those issues, EQ? and EQV? semantics, I vote the straight JAR line. John  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 24 FEB 86 22:21:23 EST Date: Mon 24 Feb 86 22:22:37-EST From: "Gerald Jay Sussman" Subject: Re: numeric predicates To: JINX%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU, rrrs-authors@MC.LCS.MIT.EDU In-Reply-To: Message-ID: <12186062869.11.GJS@OZ.AI.MIT.EDU> I feel strongly that we should have >, <, =, because I used them in S&I. I see no reasons why we should not also have other people's favorites. -------  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 24 FEB 86 20:55:46 EST Date: 24 Feb 1986 20:56 EST (Mon) Message-ID: From: Bill Rozas To: rrrs-authors@MC.LCS.MIT.EDU Subject: numeric predicates How come we have 2 versions of every possible predicate (> and >?, etc.)? It seems silly to me. Could we drop one of the two? Does anybody feel strongly about this?  Date: Mon, 24 Feb 86 20:52:56 EST From: Jonathan A Rees Subject: flush object-hash To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].829433.860224.JAR> I think the section on "the object table" should be flushed. It's apologetic and unmotivated. It is a fait accompli unless someone objects [sic] immediately AND rewrites the section.  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 24 FEB 86 14:31:01 EST Date: 24 Feb 1986 13:32 EST (Mon) Message-ID: From: Bill Rozas To: rrrs-authors@MC.LCS.MIT.EDU Subject: named-lambda Maybe somebody can explain why REC is superior to NAMED-LAMBDA. I'm not particularly attached to NAMED-LAMBDA, but I just don't see any difference between both (except that I consider REC considerably uglier). They are special forms anyway, not essential special forms, and as I understand it, the only effect of having them on the report is that the name can mean nothing else (reserved for that purpose if at all). What I object to is having define by default use REC or NAMED-LAMBDA rather than LAMBDA (p 18. MIT AI-Memo 848 version).  Received: from ZERMATT.LCS.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 24 FEB 86 11:43:11 EST Received: from ASPEN.LCS.MIT.EDU by MIT-ZERMATT.ARPA via CHAOS with CHAOS-MAIL id 25392; Mon 24-Feb-86 11:43:40-EST Date: Mon, 24 Feb 86 11:43 EST From: Robert Halstead Subject: EQ?, again To: JAR@MIT-MC.ARPA cc: RRRS-AUTHORS@MIT-MC.ARPA In-Reply-To: <[MC.LCS.MIT.EDU].823678.860219.JAR> Message-ID: <860224114336.2.RHH@ASPEN.LCS.MIT.EDU> Something that has not been mentioned in the debate over EQ? and EQV? as applied to procedures is that one can use procedures as data abstractions, including mutable data abstractions. Thus if it's useful to be able to distinguish two cons cells or vectors using EQ? and/or EQV?, so that one can determine whether the effects of an operation on one will necessarily be visible in the other, then why isn't it equally reasonable to use EQ? and/or EQV? to distinguish items returned by the following procedure? (define (procedural-cons a b) (define (the-cons op . rest) (cond ((eq? op 'car) a) ((eq? op 'cdr) b) ((eq? op 'set-car!) (set! a (car rest)) the-cons) ...)) the-cons) Stating that it is an error or implementation-dependent to use EQ? and EQV? on such procedures will complicate using procedures to do object-oriented programming where you want to be able to check the identity of objects (it doesn't rule it out altogether because you could still wrap the procedure representing the real object in a cons cell whose purpose was to satisfy identity checks!). Are the kinds of optimizations we want to do on procedures really important enough to warrant throwing away this convenience? Alternatively, would specifying a tighter semantics for EQ? and EQV? make the optimizations effectively impossible, or would it just require a little more data flow analysis during compilation to determine that the procedures in question would never find their way to be operands of EQ? Perhaps somebody who has thought about such optimizations and the kinds of programs in which we would like to make them could comment. In the absence of such input, I think my answers to questions 1-4 are therefore F 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" F 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) T 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" T 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) Questions 5 and 6 get to the question of whether certain immutable data types, notably numbers and characters, are subject to splitting when viewed by EQ? or not. I am coming around to the viewpoint expressed by others that the distinction between EQ? and EQV? is a historical artifact that should probably be removed. The major expense associated with that would be that, in most implementations, EQ? could no longer be a simple pointer comparison, but would now have to investigate the type of the objects being compared. Thus I can see the argument for keeping EQ? for those cases where especially efficient execution is desired and the programmer knows that none of the data types that can "split" when viewed by EQ? are involved. But the price of this is the kind of proliferation of other functions exemplified by the MEMQ/MEMV/MEMBER family, which I think is unfortunate. Anyhow, I think my first vote would be to flush EQ? and just keep EQV? (or else rename EQV? to EQ?). But if we keep EQ? as a separate function, then in line with the rationale expressed above, I think it should be an error to use EQ? on things that may appear to split when viewed by EQ?. Therefore: E 5. (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X X)) For (6) the question is whether we want to require all implementations to treat characters in a way that is never subject to splitting. I don't really care much about this, though it seems that keeping characters from splitting probably isn't too hard. Therefore: X 6. (EQ? #\X #\X) where the X really represents either E or T. I realize that my answers regarding EQ? on functions may make it substantially more difficult to do obvious, simple optimizations that avoid consing up a closure when, e.g., a function with no free variables appears inside another function. If this is too distasteful, perhaps we should think about having two flavors of LAMBDA: ordinary lambda-expressions such as (LAMBDA (X) X) would produce values which would be an error to feed to EQ? or EQV?, but (CONSED-LAMBDA (X) X) would produce values whose behavior under EQ? and EQV? would be dependable. I'm not exactly wild about this idea, but it might offer a way out and I thought I'd pass it along. I think I prefer it to the EQ? alternatives that have been more frequently suggested in response to JAR's survey. Regarding NAMED-LAMBDA, I too would be happy to flush it in favor of the REC or LETREC forms. -Bert  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 22 Feb 86 06:26:17 EST Received: from tektronix by csnet-relay.csnet id ak05897; 22 Feb 86 6:07 EST From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: JAR@mc.lcs.mit.edu Cc: RRRS-AUTHORS@mc.lcs.mit.edu Received: from tekchips by tektronix with smtp ; 21 Feb 86 11:11:17 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA19306; Fri, 21 Feb 86 11:12:33 PST Message-Id: <8602211912.AA19306@tekchips> Subject: Re: EQ?, again In-Reply-To: Your message of Wed, 19 Feb 86 21:37:49 EST., <[MC.LCS.MIT.EDU].823678.860219.JAR> Date: 21 Feb 86 11:12:30 PST (Fri) Executive summary: 1-I 2-E 3-I 4-E 5-I 6-X (i.e., the straight Jonathan party ticket) I 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" E 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) I 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" E 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) I 5. (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X X)) X 6. (EQ? #\X #\X) Notes: Re example 2, Kent Pitman writes: I think it important to say that EQV? cannot return false when EQ? has returned true. This would make EQV? implementation-dependent, just like EQ?. If EQV? of two procedures is an error, however, then it would be fair to have an implementation note suggesting that if EQV? does not actually report an error when applied to procedures, it should return whatever EQ? returns. Such a note would not be part of the semantics of EQV?, because there is no way anyone could rely on it to write implementation-independent code. Re example 5, I nearly said I didn't care. The main reason I care is that I think a T vote for example 5 should imply a T vote for (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X (1+ (-1+ X)))) and I suspect that people who vote T on example 5 won't agree. Peace, Will  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 22 Feb 86 06:22:52 EST Received: from tektronix by csnet-relay.csnet id av05897; 22 Feb 86 6:12 EST From: adams%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: JAR@mc.lcs.mit.edu Cc: RRRS-AUTHORS@mc.lcs.mit.edu Received: from tekchips by tektronix with smtp ; 21 Feb 86 16:00:22 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA24786; Fri, 21 Feb 86 16:01:47 PST Message-Id: <8602220001.AA24786@tekchips> Subject: Re: survey - JAR's answers Date: 21 Feb 86 16:01:41 PST (Fri) I 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" E 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) I 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" E 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) U 5. (LET ((X ... exact number ...)) (EQ? X X)) X 6. (EQ? #\X #\X) I agree with your answers and reasons on 1-4. 5. As I understand it, specifying that this expression returns true might preempt certain optimizations in numerical code. (Like not consing intermediate results by using unboxed number representations that are not available to the user.) I would agree with Jonathan (with an 'I'), but I don't feel too strongly about it, so I could be swayed. 6. This seems the same as 5, except that characters are less likely candidates for optimizations. Since it doesn't affect the compiler, and its easy to implement either (any) way, I don't care. -Norman  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 22 Feb 86 02:45:53 EST Received: from ti-csl by csnet-relay.csnet id ah04008; 22 Feb 86 1:40 EST Received: by tilde id AA00860; Thu, 20 Feb 86 10:15:10 cst Date: Thu 20 Feb 86 10:10:56-CST From: David Bartley Subject: EQ? Survey Entry To: RRRS-AUTHORS%MIT-MC%csnet-relay@CSNET-RELAY.ARPA Cc: Bartley%ti-csl.csnet@CSNET-RELAY.ARPA Message-Id: <12184892016.40.BARTLEY@CSL60> 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" ==> I 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) ==> I 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" ==> I 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) ==> I 5. (LET ((X ... any expression evaluating to an exact number ...)) ==> I (EQ? X X)) 6. (EQ? #\X #\X) ==> I My position is quite simple: EQ (EQ?) has historically been implementation-dependent and is the wrong function to be associated with any kind of formal ``identity'' predicate. Let's define EQ? to be a predicate on representations, not abstract types, and let's decide which representational identities we're going to require of a conforming implementation. The reason I want such an EQ? is two-fold: (1) We need an identity predicate for symbols, and (2) we need to know when we have the same *instance* of a mutable object so we know which one we are modifying. Thus, I suggest that EQ? be defined only for symbols, booleans, (), and mutable objects (pairs, vectors, strings, and ports). It should be undefined (implementation-dependent but not an error) for numbers, characters, closures, and continuations. Thus (EQ? X X) is not necessarily #!TRUE unless X is appropriately typed. I believe this position is consistent with Will's. I agree with Jonathan that EQV? should also be defined over an incomplete domain. I'd like to see it extend EQ? to encompass exact numbers and characters but not closures and continuations. EQUAL? should be like EQV?, but also knows how to descend into subcomponents of pairs and vectors. Again, it should not be defined for closures and continuations. Perhaps we should define an EQL? if someone really wants a predicate that somehow tries to do the ``right thing'' for closures and continuations, but I don't have any need for it. Regards, David Bartley -------  Date: Fri, 21 Feb 86 17:02:55 EST From: Jonathan A Rees Subject: proposing changes to RRRS To: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of 20 Feb 86 16:58:57 PST (Thu) from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA Message-ID: <[MC.LCS.MIT.EDU].826238.860221.JAR> Date: 20 Feb 86 16:58:57 PST (Thu) From: willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA Non-controversial changes We seem to have opposing ideas of what's controversial and what's non-controversial. Hate to spoil the harmony with my remarks, but I feel compelled. I don't want to be embarrassed to have this thing appear in SIGPLAN. (Will that remark help convince people that I'm ornery?) ---------------------------------------------------------------- Pages 7, 12, 48, and 54: The context free grammar omits characters and vectors as constants, and is therefore inconsistent with the language describing the quote special form and the notations for characters and vectors. This should be fixed. Note, however, that the # notations for characters and vectors are optional. I think this depends on the outcome of the debate on self-evaluation of characters and vectors, which may overturn what page 12 says. E.g. if vectors aren't going to be self-evaluating, then they probably needn't be mentioned on page 7, since they won't belong to the language but rather to the language , and it is stated that the syntax of is spread throughout the manual. (My perception is that consensus will be for characters to self-evaluate and vectors not.) But regardless of how this comes out, I have a suggestion to make: on page 7, add the following explicit grammar for : ::= | | | #!true | #!false | | | Then mention that the syntax of , , etc. is found elsewhere, and maybe give page numbers. Add a grammar for on page 26 which includes #!null, ', and ` (or `, ,, and ,@, see below). Add the obvious grammar for on page 54, for completeness. I think that the description of READ should mention that what it does is recognize the nonterminal, and return an object which represents the parse tree in a natural way (the same way as QUOTE). Also, if we can get the backquote problem ironed out, a statement should be made somewhere to the effect that the language is a (strict) subset of the language and that this while to the uninitiated this might appear to be a confusing coincidence, actually the uninitiated are right. Controversial changes [flush named-lambda, change do, change eq? and eqv?] I agree with all of these changes. Please note that in debating the EQ? and EQV? changes, the two questions, essentially (a) should eqv? be implementation-dependent and (b) should procedures and numbers have associated locations, can and probably should be treated independently.  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 21 FEB 86 16:13:11 EST Date: Fri, 21 Feb 1986 15:16 EST Message-ID: From: CPH%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA Cc: RRRS-AUTHORS@MC.LCS.MIT.EDU Subject: proposing changes to RRRS In-reply-to: Msg of 20 Feb 1986 19:58-EST from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA Date: Thursday, 20 February 1986 19:58-EST From: willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA Controversial changes ---------------------------------------------------------------- Page 17: I'd still like to see named-lambda removed from the language definition. Any implementation capable of handling (named-lambda (var ...) ...) specially should be capable of handling (rec var (lambda (...) ...)) specially. I agree, despite the fact that I invented this one in MIT Scheme (I wish I hadn't, this special form proliferation is driving me nuts!) I would go one step farther, in proposing to flush REC as well, since such an implementation could also handle the special case of LETREC just as easily (however, I bet there are folks out there, unlike me, who use REC all the time.) ---------------------------------------------------------------- Page 19: In talking about do, RRRS says that "the results of the step expressions are stored in the bindings of the vars". I would rather change the semantics so that the step expressions are evaluated, the vars are bound to fresh locations, and the results of the step expressions are stored into those locations. The proposed semantics is the one that has been used in Scheme up until the RRRS. MacScheme and T3 still use it in violation of RRRS; others may also. Yes! I never even looked at it closely enough to notice that the MIT Scheme implementation also uses the proposed semantics rather than the RRRS semantics.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 21 Feb 86 08:29:30 EST Received: from tektronix by csnet-relay.csnet id aa03527; 21 Feb 86 8:01 EST From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: RRRS-AUTHORS@mc.lcs.mit.edu Received: from tekchips by tektronix with smtp ; 20 Feb 86 16:57:39 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA05866; Thu, 20 Feb 86 16:59:00 PST Message-Id: <8602210059.AA05866@tekchips> Subject: proposing changes to RRRS Date: 20 Feb 86 16:58:57 PST (Thu) Here is a list of changes I (Will Clinger) would like to see made in the RRRS. Only three changes have enough substance to be controversial. Page numbers refer to MIT AI Memo 848, August 1985. Non-controversial changes ---------------------------------------------------------------- Pages 7, 12, 48, and 54: The context free grammar omits characters and vectors as constants, and is therefore inconsistent with the language describing the quote special form and the notations for characters and vectors. This should be fixed. Note, however, that the # notations for characters and vectors are optional. ---------------------------------------------------------------- Page 7: Backquote doesn't fit into the context free grammar. A correct treatment of backquote syntax may be more trouble than it's worth in an informal document, however. ---------------------------------------------------------------- Page 7: Variables are referred to as objects, but they're not. ---------------------------------------------------------------- Page 22: It says that "pairs (and therefore lists)" count as true. It should say "pairs (and therefore nonempty lists)", or else just "pairs". ---------------------------------------------------------------- Page 25: eqv? should work on characters also. ---------------------------------------------------------------- Page 30: People have requested that the last argument to append not be required to be a proper list. ---------------------------------------------------------------- Page 43: The grammar is ambiguous as to whether (for example) #x1e4 represents decimal 10000 or decimal 484. Following Common Lisp, let's say that "interpretation as a digit is preferred to interpretation as" an exponent. ---------------------------------------------------------------- Page 43: There are no rules about the exactness or precision to be used on input if none is specified. I think that's just fine until we get more experience with the RRRS number system, but a sentence to that effect would be helpful. ---------------------------------------------------------------- Page 49: The character set should not have to be infinite, so the order isomorphisms should be between the set of characters and a *subset* of the integers. ---------------------------------------------------------------- Controversial changes ---------------------------------------------------------------- Page 17: I'd still like to see named-lambda removed from the language definition. Any implementation capable of handling (named-lambda (var ...) ...) specially should be capable of handling (rec var (lambda (...) ...)) specially. ---------------------------------------------------------------- Page 19: In talking about do, RRRS says that "the results of the step expressions are stored in the bindings of the vars". I would rather change the semantics so that the step expressions are evaluated, the vars are bound to fresh locations, and the results of the step expressions are stored into those locations. The current semantics is compatible with Common Lisp, but the proposed semantics is better not only because it has no side effects but also because it is likely to run faster when compiled by a sophisticated compiler. I don't think many people will notice the difference, and the few who are sophisticated enough to notice will probably prefer the proposed semantics. The proposed semantics is the one that has been used in Scheme up until the RRRS. MacScheme and T3 still use it in violation of RRRS; others may also. The change was due to my thoughtless acceptance of the Common Lisp semantics. ---------------------------------------------------------------- Pages 24-25: We have agreed that eq? is a low-level notion of object identity. I would like to adopt Jonathan's proposal that eqv? be defined independently of eq?. I believe the main effect of this is that the result of eqv? when given procedures as arguments would not be specified.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 20 Feb 86 20:06:24 EST Received: from tektronix by csnet-relay.csnet id ac16010; 20 Feb 86 18:48 EST From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: RRRS-AUTHORS@mc.lcs.mit.edu Received: from tekchips by tektronix with smtp ; 20 Feb 86 10:34:52 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA08527; Thu, 20 Feb 86 10:36:12 PST Message-Id: <8602201836.AA08527@tekchips> Subject: publication of RRRS in SIGPLAN Notices Date: 20 Feb 86 10:36:07 PST (Thu) Last month, at the ACM POPL conference, representatives of ACM SIGPLAN requested that we submit the RRRS for publication in ACM SIGPLAN Notices. This would make the RRRS available to a much wider audience, give it more influence, and would satisfy editors who dislike citations of technical reports. ACM SIGPLAN has begun a policy of publishing programming language manuals for interesting new languages, and recent issues of SIGPLAN Notices have contained definitions of Poly, Modcap, and B, among others. Jonathan Rees has agreed to fix up a few minor bugs in the RRRS and to reformat it for more compact publication. If no one objects, he will submit the result to SIGPLAN Notices in a few weeks. People haven't been flaming me for errors in RRRS, so I've asked Jonathan not to list me as editor. Is this ok with everybody? Peace, Will Clinger willc%tekchips@tektronix.csnet  Date: Thu, 20 Feb 86 17:50:28 EST From: Jonathan A Rees Subject: further clarification To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].824794.860220.JAR> On talking with Bill Rozas about EQV?, I think I should clarify my position, in order to help people understand it. I haven't been very clear. Suppose that X and Y are expressions which evaluate to procedures or numbers. Then I want the following two rules to hold. (A) If (EQ? X Y) returns true, then the values of X and Y have the same meaning with respect to all operations, including implementation-dependent ones. (Note that the converse is not true. I think we all agree about this rule.) (B) If (EQV? X Y) is defined, then it should return true or false depending on whether or not the values of X and Y have the same meaning with respect to all implementation-independent Scheme operations. If not, then implementations may choose to take any action appropriate to situations that "are errors": return true, false, or some other value, or signal an error, or do something else. (So far everyone has disagreed with me about this.) Thus (EQ? X Y) implies (EQV? X Y) whenever (EQV? X Y) is defined. Also, since the only thing you can do with a procedure is call it, then EQV? on two procedures would have to determine whether two procedures do the same thing, which is in general impossible. So in order for rule (B) to hold, if X and Y are both procedures, then (EQV? X Y) must be undefined. I want to be able to use EQ? to do things like caching (memoization); some implementations may choose to invariably return false when given two procedures, but if they return true sometimes (subject to rule A) then my programs might run faster. If I want to convince myself that my program which uses EQ? is portable, I may have to do some rather sophisticated reasoning, allowing for the fact that results may be indeterminate. On the other hand, I want to know that if my program which uses EQV? never passes two procedures to EQV?, and is otherwise in no way implementation-sensitive, then I can have complete confidence that it will run the same way in any correct Scheme implementation. In particular, if I convince myself that the program works in my implementation (by testing it or using any other technique), then I will have no trouble convincing myself that it will work when run by any other implementation. The set of implementation-independent primitives is already pretty large, and I want to include EQV? in that set, to permit very naive portability criteria (like "EQ? doesn't occur"). I DON'T want programs which use EQV? to gratuitously work just because I happen to be using some particular implementation. I might fool myself into thinking the program will work in others. So I want to allow my implementation to signal an error. [I guess my criterion (B) is pretty strong. I hope that this doesn't precipitate a discussion of the meaning of QUOTE. Please don't even think about mentioning that, before we get this issue straightened out.] Sorry to be long-winded. Please either shoot me down or bear with. Jonathan.  Date: Thu, 20 Feb 86 14:57:27 EST From: Jonathan A Rees Subject: EQV? note (tiny correction) To: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of Thu 20 Feb 86 14:36:55 EST from Jonathan A Rees Message-ID: <[MC.LCS.MIT.EDU].824565.860220.JAR> I meant to say "implementation-independent," not "machine-independent."  Date: Thu, 20 Feb 86 14:48:08 EST From: Jonathan A Rees Subject: survey - JAR's answers To: JAR@MC.LCS.MIT.EDU cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of Wed 19 Feb 86 21:37:49 EST from Jonathan A Rees Message-ID: <[MC.LCS.MIT.EDU].824547.860220.JAR> Here are my own answers. I 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" E 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) I 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" E 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) I 5. (LET ((X ... exact number ...)) (EQ? X X)) X 6. (EQ? #\X #\X) These are the only possible choices given the conjunction of the following constraints, which I find desirable: (a) I want to give implementations a lot of leeway in the optimizations they do on closures. (b) I want closures to not have associated locations (in the sense of denotational semantics), i.e. I want to equate a closure with its behavior when called. (c) I want EQV? to behave in a machine-independent manner, without requiring that it solve the halting problem. Jonathan.  Date: Thu, 20 Feb 86 14:36:55 EST From: Jonathan A Rees Subject: EQV? note To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].824527.860220.JAR> Let me repeat for about the third time my request that EQV? have machine-independent semantics. I believe that this is a logical consequence of the intended "spirit" of EQV? which (I thought) is that it be a *clean* fine-grained equality predicate. I feel a little insulted that people haven't even taken this suggestion seriously enough to argue against it. Judging form the number of "I" answers to the survey's EQV? questions, most people seem to disagree with this, although they won't say why. I'll assume people have just forgotten that I suggested this, but I would like some reaction.  Received: from SCRC-STONY-BROOK.ARPA by MC.LCS.MIT.EDU 20 Feb 86 10:13:36 EST Received: from RIO-DE-JANEIRO.SCRC.Symbolics.COM by SCRC-STONY-BROOK.ARPA via CHAOS with CHAOS-MAIL id 421054; Thu 20-Feb-86 09:58:05-EST Date: Thu, 20 Feb 86 10:00 EST From: Kent M Pitman Subject: EQ?, again To: JAR@MC.LCS.MIT.EDU cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-Reply-To: <[MC.LCS.MIT.EDU].823678.860219.JAR> Message-ID: <860220100001.2.KMP@RIO-DE-JANEIRO.SCRC.Symbolics.COM> Executive Summary: 1-O, 2-O, 3-U, 4-U, 5-T, 6-T 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" O. Mathematically equivalent functions should be allowed to return true, but not required to EXCEPT when the implementation provides any operation which could distinguish the resulting objects using non-functional means. That is, if an operation (DISCLOSE (LAMBDA (X) X)) returns (LAMBDA (X) X) then (EQ? X Y) would not be allowed return true for mathematically equivalent functions unless (EQ? (DISCLOSE X) (DISCLOSE Y)) would also return true, and so on. In the special case where such operations were provided and the compiler could prove that they would never be used, I have no objection to optimizing the two closures to be eq. That is, (DEFINE FOO (LAMBDA (X) X)) (DEFINE BAR (LAMBDA (Y) Y)) (EQ? FOO BAR) should be required to return false in implementations which support debugging operations that could distinguish FOO and BAR, but (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) in the same implementation should be permitted to return true. My point being that EQ? is currently an essential primitive in system design and it would be a shame to let it get turned into something which thwarted debugging. 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) O. I think it important to say that EQV? cannot return false when EQ? has returned true. The combination of this point and my arguments in 1 above lead to the conclusion that this may return true sometimes but may not (for reasons related to halting problems) not be required to do so. 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" U. My intuition says that this should always return true, but Jonathan has convinced me to think harder on the issue before insisting that the intuition is valid. I would like to reserve the right to speak later on this one. 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) U. My intuition is also that this should return true, but I am not currently taking a stand pending a resolution on item 3. 5. (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X X)) T. I believe my views on this are on record already. 6. (EQ? #\X #\X) T. If there is some argument from Will or Jonathan as to why this would clutter the semantics, I don't think I've heard it. Depending on my decision about item 3, I may be willing to change my views on this if I heard a argument for it.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 20 Feb 86 01:08:54 EST Received: from tektronix by csnet-relay.csnet id ab06340; 20 Feb 86 0:47 EST From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: JAR@mc.lcs.mit.edu Cc: willc%tektronix.csnet@CSNET-RELAY.ARPA, RRRS-AUTHORS@mc.lcs.mit.edu Received: from tekchips by tektronix with smtp ; 19 Feb 86 10:05:12 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA24330; Wed, 19 Feb 86 10:06:16 PST Message-Id: <8602191806.AA24330@tekchips> Subject: Re: quotation In-Reply-To: Your message of Mon, 17 Feb 86 22:47:24 EST., <[MC.LCS.MIT.EDU].821043.860217.JAR> Date: 19 Feb 86 10:06:11 PST (Wed) I know I'm stubborn and downright ornery, but I'm surprised to learn that Jonathan is too. He's been fooling me all this time. Exactly right. I'd be happy with the judgment of the RRRS on this matter if it were made internally consistent, and I think most others would also. I think you're asking for an awful lot, certainly not something that has been achieved elsewhere in the language design. All I meant is that the wording in one part of the document should be made consistent with the wording in other parts of the document. Peace, Will  Date: Wed, 19 Feb 86 21:37:49 EST From: Jonathan A Rees Subject: EQ?, again To: RRRS-AUTHORS@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].823678.860219.JAR> I hate to drag this out of the closet, but as I remember, there was no agreement on the question of the semantics of EQ? and EQV?, and in fact we seemed deadlocked. So I would like to take a survey. I think most of us have pretty strong opinions, so I'm not sure how much good more arguing will do right now, but I'm interested in finding out what everyone thinks. I will be happy to collate and summarize the results of the survey. You may send your answers either to RRRS-AUTHORS or just to me. I believe that we all agree on what EQ? and EQV? do when presented with cons cells, vectors, symbols, booleans, null, and strings. I think we also agree on what EQV? does when it's presented with exact numbers and characters. I don't think we agree on what EQ? does with numbers and characters, or what either procedure does with procedures. I won't even ASK about ports and continuations; we can discuss that later. If you can & care to summarize your opinion or motivation in two or three short sentences, feel free, but I'd like to see how the survey goes before we start flaming again. If you've forgotten the debate and would like me to re-send any of the main position messages on this topic (Pitman, Steele, and Clinger are the ones I remember) just ask and I'll do so. Here's the survey. If you object to it, please invent a better one. ----- For each expression, specify what you think its value or meaning should be, one of: T - Always true F - Always false I - Implementation-dependent but NOT AN ERROR E - An error (undefined effect) X - I don't care U - Undecided O - Other (specify) Here are the expressions: 1. (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) ;"Coalescing" 2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) 3. (LET ((X (LAMBDA (Z) Z))) (EQ? X X)) ;"Splitting" 4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X)) 5. (LET ((X ... any expression evaluating to an exact number ...)) (EQ? X X)) 6. (EQ? #\X #\X)  Date: Mon, 17 Feb 86 22:47:24 EST From: Jonathan A Rees Subject: quotation To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of 17 Feb 86 13:45:54 PST (Mon) from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA Message-ID: <[MC.LCS.MIT.EDU].821043.860217.JAR> Date: 17 Feb 86 13:45:54 PST (Mon) From: willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA No, no. Self-evaluation of numbers and strings is also unnecessary, so no intermediate position can be justified by Occam's razor or any other principle. It is important that we acknowledge the lack of a principled justification, because some of us can be stubborn and downright ornery if we think we're arguing about principles rather than personal tastes. OK, I freely admit it: I'm making arguments based not on principle but on taste. And, as everyone knows by now, I am both ornery and stubborn, no matter what I'm talking about. The principled position which I would have liked to take, of making NOTHING self-evaluate (except (let ((x '`(let ((x ',x)) ,x))) `(let ((x ',x)) ,x)) and things of that ilk), is out of the question, since it would be a grossly incompatible change, besides being politically doomed. It is true that distinctions become arbitrary here, but in each case we can apply judgement the same way that we applied judgement in the inclusion of any language feature.... Exactly right. I'd be happy with the judgment of the RRRS on this matter if it were made internally consistent, and I think most others would also. Maybe you misunderstood me; I meant to argue that each different expression type (vector, character, etc.) should be considered on its own merit, and that it didn't make much sense to come up with a general rule to decide all such questions; they don't necessarily have anything in common with each other. (The term "self-evaluation" is really inappropriate to our description method anyhow, which takes pains to distinguish objects from expressions.) But I won't press this point. Apparently I need to find a way to construe my desire that numbers, strings, and characters self-evaluate, and vectors and () not self-evaluate, as being internally consistent. (I never wanted #!true or #!false to self-evaluate, but won't fight a decision that's already been made.) I think you're asking for an awful lot, certainly not something that has been achieved elsewhere in the language design. But before arguing further against your terms of argument (which I'll try next if this attempt fails), let me hazard one more attempt at internal consistency: "Atomic" objects (other than symbols) self-evaluate, and "composite" objects don't. For present purposes "composite" objects comprise vectors and lists, and lists include (). Since we haven't established a meaning for vectors or (), they are not (yet) syntactically valid expressions, much less self-evaluating ones. "Random" objects don't much bear on this discussion because they don't have external syntax, and the language has neither EVAL nor macros. But if there were any way to evaluate or compile an unreadable object, I think we all agree that the appearance of one within a program would be an error.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 17 Feb 86 19:49:44 EST Received: from tektronix by csnet-relay.csnet id bd01120; 17 Feb 86 19:02 EST From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: RRRS-AUTHORS@mc.lcs.mit.edu Cc: willc%tektronix.csnet@CSNET-RELAY.ARPA, JAR@mc.lcs.mit.edu Received: from tekchips by tektronix with smtp ; 17 Feb 86 13:44:55 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA09024; Mon, 17 Feb 86 13:46:02 PST Message-Id: <8602172146.AA09024@tekchips> Subject: Jonathan Rees for editor In-Reply-To: Your message of Thu, 13 Feb 86 15:58:41 EST., <[MC.LCS.MIT.EDU].817505.860213.JAR> Date: 17 Feb 86 13:45:54 PST (Mon) Jonathan Rees once volunteered to do the next edition of the RRRS, fixing a number of minor problems such as the inconsistencies that have been noted concerning constants. There's certainly no one I'd trust more to do it, and so I move that we take him up on his offer. Are you still willing, Jonathan? ---------------------------------------------------------------- By the way, I have a few remarks in response to Jonathan's observations on quote. Jonathan's words are indented. For example, suppose there's a bug in a macro, and it returns a closure instead of a lambda-exparession.... I've been burned by that in MacScheme, where quotes are optional except for symbols and non-empty lists. So I think that an intermediate position between 2 and 3 *can* be justified by the principle of Occam's razor, and we should disallow self-evaluation of vectors, procedures, and other "random" objects because it's unnecessary.... No, no. Self-evaluation of numbers and strings is also unnecessary, so no intermediate position can be justified by Occam's razor or any other principle. It is important that we acknowledge the lack of a principled justification, because some of us can be stubborn and downright ornery if we think we're arguing about principles rather than personal tastes. It is true that distinctions become arbitrary here, but in each case we can apply judgement the same way that we applied judgement in the inclusion of any language feature.... Exactly right. I'd be happy with the judgment of the RRRS on this matter if it were made internally consistent, and I think most others would also. peace, Will  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 15 FEB 86 02:01:20 EST Date: Sat, 15 Feb 1986 01:59 EST Message-ID: From: CPH%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU To: Kent M Pitman Cc: JAR@MC.LCS.MIT.EDU, kend%tekla.uucp@CSNET-RELAY.ARPA, RRRS-AUTHORS@MC.LCS.MIT.EDU, willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA Subject: Which constants must be quoted? In-reply-to: Msg of 14 Feb 1986 19:37-EST from Kent M Pitman I agree with you completely.  Received: from SCRC-STONY-BROOK.ARPA by MC.LCS.MIT.EDU 14 Feb 86 19:39:03 EST Received: from RIO-DE-JANEIRO.SCRC.Symbolics.COM by SCRC-STONY-BROOK.ARPA via CHAOS with CHAOS-MAIL id 417951; Fri 14-Feb-86 19:36:23-EST Date: Fri, 14 Feb 86 19:37 EST From: Kent M Pitman Subject: Which constants must be quoted? To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA cc: KMP@SCRC-STONY-BROOK.ARPA, JAR@MIT-MC.ARPA, RRRS-AUTHORS@MC.LCS.MIT.EDU, kend%tekla.uucp@CSNET-RELAY.ARPA In-Reply-To: <[MC.LCS.MIT.EDU].817505.860213.JAR> Message-ID: <860214193710.4.KMP@RIO-DE-JANEIRO.SCRC.Symbolics.COM> I agree with Jonathan that unevaluable data errors are useful in debugging and we shouldn't define all random objects to self-evaluate or we'll just delay the error-detection process to no good end. For cleanliness reasons, I wish we could make numbers, etc. not self-evaluate. I guess we can't do that without distinguishing numbers from numerals, etc. and I guess none of us are ready to do anything that radical. So as a compromise condition, I think we should limit the set of self-evaluating data to not include things for which there is no re-readable print syntax. The rationale would be that things which you can type in can reasonably be expected to occur in programs and things which you cannot are likely to be errors. QUOTE, of course, could be used to say "I really mean this". So strings, numbers, true, false, etc. would therefore be self-evaluating. The empty list is arguably a special case because its type is already defined to refer to combinations and it is empty, so I think it should be an error and you should have to say '().  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 14 FEB 86 12:35:28 EST Date: 14 Feb 1986 12:30 EST (Fri) Message-ID: From: Bill Rozas To: CPH%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU Cc: Jonathan A Rees , kend%tekla.uucp@CSNET-RELAY.ARPA, RRRS-AUTHORS@MC.LCS.MIT.EDU, willc%tektronix.csnet@CSNET-RELAY.ARPA Subject: Which constants must be quoted? In-reply-to: Msg of 14 Feb 1986 10:03-EST from CPH%OZ.AI.MIT.EDU at XX.LCS.MIT.EDU I don't think vectors should self-evaluate: I think that making back-quote (comma) work with them would be very useful: `#(1 2 3 ,x 4 5 ,y) equivalent to (vector 1 2 3 x 4 5 y) It would be somewhat weird if the above were true but #(1 2 3 4 5) required no quoting.  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 14 FEB 86 10:12:21 EST Date: Fri, 14 Feb 1986 10:03 EST Message-ID: From: CPH%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU To: Jonathan A Rees Cc: kend%tekla.uucp@CSNET-RELAY.ARPA, RRRS-AUTHORS@MC.LCS.MIT.EDU, willc%tektronix.csnet@CSNET-RELAY.ARPA Subject: Which constants must be quoted? In-reply-to: Msg of 13 Feb 1986 15:58-EST from Jonathan A Rees Date: Thursday, 13 February 1986 15:58-EST From: Jonathan A Rees So I think that an intermediate position between 2 and 3 *can* be justified by the principle of Occam's razor, and we should disallow self-evaluation of vectors, procedures, and other "random" objects because it's unnecessary. That is, the class of legitimate programs should be as small as is reasonable. I disagree with Chris's statement that it is "Lisp tradition" for random objects to self-evaluate. I agree with you about this -- I wasn't thinking about objects like procedures and such. In fact, when I wrote that I had in mind only objects for which there is read syntax, specifically: lists, symbols, vectors, strings, characters, numbers, and booleans. I believe that tradition does conform to my statement with this restriction. All of the other things should cause errors, as you seem to suggest. I don't care whether vectors are self-evaluating, but I think that strings and numbers should be. As for characters and booleans: I prefer that they self-evaluate, because I use them often and there seems no very good reason to require that they be quoted.  Date: Thu, 13 Feb 86 15:58:41 EST From: Jonathan A Rees Subject: Which constants must be quoted? To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MC.LCS.MIT.EDU, kend%tekla.uucp@CSNET-RELAY.ARPA, willc%tektronix.csnet@CSNET-RELAY.ARPA In-reply-to: Msg of 10 Feb 86 10:25:17 PST (Mon) from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA Message-ID: <[MC.LCS.MIT.EDU].817505.860213.JAR> Date: 10 Feb 86 10:25:17 PST (Mon) From: willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA ... As for vectors, I think we decided not to decide at Brandeis, and so the draft I sent off to MIT did not mention characters or vectors on page 12. They were added at MIT as part of the final editing performed (I assume) by Gerry Sussman and Hal Abelson.. I agree with the change, but some might question the decision on vectors. There seem to be approximately two principled answers to the question of which values must be quoted (answer 1: everything; answer 2: only symbols and nonempty lists). Any other answer will include arbitrary distinctions, about which no definitive argument can be made. I think that I and other people associated with T are the holdouts here. In a language based on evaluation instead of normalization, answer 1 is the "right thing," but unfortunately, if we're to be Lisp-like, we should permit omitting the ' in the case of numbers and strings. 2 is unacceptable because it makes the class of permissible programs larger than necessary, and therefore reduces opportunities for error checking. For example, suppose there's a bug in a macro, and it returns a closure instead of a lambda-exparession. Or suppose that due to an FTP or editor bug, or even a typo, a stray # was inserted into a source file just before a parenthesis. These situations would be very difficult to debug if closures and vectors were legitimate expressions. So I think that an intermediate position between 2 and 3 *can* be justified by the principle of Occam's razor, and we should disallow self-evaluation of vectors, procedures, and other "random" objects because it's unnecessary. That is, the class of legitimate programs should be as small as is reasonable. Programming language tradition has numbers and strings self-evaluating; everything else is dubious. In the case of composite objects like vectors and procedures, we should not lock out the future possibility that a meaning other than self-evaluation might be assigned in the future. For example, I don't see any a priori reason why vectors should be treated differently from lists; seems to me that #(CAR '(A B)) has as much right as (CAR '(A B)) to evaluate to A. Similarly, perhaps a procedure would want to be invoked (or "sent a message"?) to produce a value. I don't care much about booleans and characters; they could go either way as far as I'm concerned. I guess I only advise updating the manual to state clearly that characters self-evaluate. It is true that distinctions become arbitrary here, but in each case we can apply judgement the same way that we applied judgement in the inclusion of any language feature. I think it's hoping for too much to look for definitive arguments; much of the language would suffer severely under such scrutiny. I disagree with Chris's statement that it is "Lisp tradition" for random objects to self-evaluate. I have seen messages like #73556 UNEVALUABLE DATUM in Maclisp and other dialects more times than I can remember, and I have always been happy to see them, because they have always put me on the road to eliminating yet one more bug. Jonathan.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 13 Feb 86 14:48:43 EST Received: from indiana by csnet-relay.csnet id a014137; 13 Feb 86 14:34 EST Date: Thu, 13 Feb 86 12:30:07 est From: Kent Dybvig To: cph@oz.ai.mit.edu Subject: Re: Which constants must be quoted? Cc: RRRS-AUTHORS@mc.lcs.mit.edu, kend%tekla.uucp@csnet-relay.arpa I agree with all your comments except the rationale for quoting the empty list. There is no such thing as an "empty combination", the RRRS not withstanding. There is a practical reason for allowing unquoted (). If #!null and #!false are just reader syntax for (), then requiring a quote on () is tantamount to requiring a quote on #!null and #!false (this is especially bothersome within the code for a syntactic extension or other program transformation, where two quotes are required to produce a quoted () in the expansion). On the other hand, there sems to be no practical reason to require the quote. I can think of only two valid reasons to require the quote: (1) to simplify the semantics of the language, or (2) to allow the compiler to catch a common programming error. I don't believe either reason applies here. Perhaps () should be recognized as the "empty special form" rather than the "empty combination". Of course, the empty special form always evaluates to (). Kent  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 12 FEB 86 19:43:43 EST Date: Wed, 12 Feb 1986 19:38 EST Message-ID: From: CPH%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA Cc: kend%tekla.uucp@CSNET-RELAY.ARPA, RRRS-AUTHORS@MC.LCS.MIT.EDU Subject: Which constants must be quoted? In-reply-to: Msg of 10 Feb 1986 13:25-EST from willc%tekchips%tektronix.csnet at CSNET-RELAY.ARPA I would like to suggest: quoting should be necessary only for symbols and lists (including the empty list.) My reasoning: 1. Only symbols and lists can represent non-constant s-expressions. Therefore it is not necessary to quote other objects. 2. While there is definitely a split of opinion on whether or not the remaining objects should require quoting (i.e. should we try to be more like 3-lisp or not,) I favor not requiring quoting since this is traditional lisp practice. I don't see that such a decision affects the semantics of the language much, so there seems little reason to break with tradition here (not to mention the fact that I have a considerable investment in this tradition!) 3. I favor quoting the empty list, because I don't think that it should be distinguished from lists in general. As RRRS states, the unquoted empty list represents an empty combination, and should be considered an error.  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 11 Feb 86 07:21:20 EST Received: from tektronix by csnet-relay.csnet id ad18218; 11 Feb 86 6:45 EST From: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA To: kend%tekla.uucp@CSNET-RELAY.ARPA, RRRS-AUTHORS@mc.lcs.mit.edu Cc: willc%tektronix.csnet@CSNET-RELAY.ARPA Received: from tekchips by tektronix with smtp ; 10 Feb 86 10:24:39 PST Comment: Message received over unauthenticated port at tektronix Received: by tekchips (5.31/5.14), id AA04677; Mon, 10 Feb 86 10:25:22 PST Message-Id: <8602101825.AA04677@tekchips> Subject: Which constants must be quoted? In-Reply-To: Your message of 10 Feb 86 09:10:50 PST (Mon)., <8602101710.AA23029@tekla> Date: 10 Feb 86 10:25:17 PST (Mon) I (Will Clinger) received the following from Ken Dickey of Tektronix: ---------------------------------------------------------------- Will, The Revised Revised Report is somewhat ambiguous on what computational objects are self evaluating. For example the grammar on page 7 does not list as a constant, but there is a statement in section II.7 that with the "#\" notation characters are self evaluating. Likewise both character and vector constants are mentioned as self evaluating on page 12 (under quote), but section II.9, on vectors, does not state whether vectors (which use the "#(" notation) are self evaluating. [Note that the T manual I have does not state whether vectors are self evaluating either]. It would seem that any object denoted by the sharp would tend by usage to be self evaluating ("constant"). However, this is nowhere stated and the comments on page 6 (Other notations) seem to indicate that # is or may be simply a read macro. The specific question I started out with is "Are vectors self evaluating?". Are they? Also, are characters always self evaluating, or is this strictly connected with the "#\" notation? Pax & Thanks, -Somewhat Confused (tekla!kend) ---------------------------------------------------------------- I am a little confused also, but here's what I think happened. After Brandeis, TI requested over the net that characters written in the #\ notation should be guaranteed to evaluate to themselves, and no one objected. I forgot to change the grammar on page 7 or to mention characters on page 12. As for vectors, I think we decided not to decide at Brandeis, and so the draft I sent off to MIT did not mention characters or vectors on page 12. They were added at MIT as part of the final editing performed (I assume) by Gerry Sussman and Hal Abelson.. I agree with the change, but some might question the decision on vectors. There seem to be approximately two principled answers to the question of which values must be quoted (answer 1: everything; answer 2: only symbols and nonempty lists). Any other answer will include arbitrary distinctions, about which no definitive argument can be made. I'll send your message and this reply on to RRRS-AUTHORS@MIT-MC. The inconsistencies you pointed out should be fixed. Peace, Will Clinger willc@tekronix.csnet  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 25 Jan 86 14:56:07 EST Received: from indiana by csnet-relay.csnet id aa06542; 25 Jan 86 14:34 EST Date: Sat, 25 Jan 86 13:04:04 est From: Kent Dybvig To: JAR@mc.lcs.mit.edu Subject: Re: quasiquote implementation Cc: rrrs-authors@mc.lcs.mit.edu Your implementation, which is very similar to the T and MIT Scheme quasiquote imeplementations, has the powerful advantage over mine of conciseness, but it has the serious flaw that it makes the expansion of nested quasiquote forms visible to the user, thus making it difficult to write portable code to manipulates such forms. For example, (EQUAL? '`(A ,B) ``(A ,B)) doesn't hold in your implementation. This seems to defeat my original aim in suggesting that we standardize the external syntax of `, which was to allow people to write portable program-manipulating programs. Jonathan Yes, I see what you mean. My goal in making the forms standardized was to to allow portable code such as an interpreter to make sense out of what the function "read" returns, and also to provide for the proper pretty-printing of source programs. I hadn't thought of the double-quasiquote issue. When I read your first article, I thought that you wanted '`(A ,B) = ``(A ,B) => (list (quote a) b) which is relatively easy (but not desirable, since I didn't want to reach inside the quote). Instead, I see you want '`(A ,B) = ``(A ,B) => `(A ,B) which I can certainly sympathize with. This explains the need for the "level" variable in your code. For anyone who cares, here is another (less powerful) version of my code with a level-tracking argument added. I have tested it only briefly. (let ((check (lambda (x) (unless (and (pair? (cdr x)) (null? (cddr x))) (ferror (car x) "invalid form ~s" x)))) (quasicons (lambda (a d) (if (pair? d) (if (eq? (car d) 'quote) (if (and (pair? a) (eq? (car a) 'quote)) `'(,(cadr a) . ,(cadr d)) (if (null? (cadr d)) `(list ,a) `(list* ,a ,d))) (if (memq (car d) '(list list*)) `(,(car d) ,a ,@(cdr d)) `(list* ,a ,d))) `(list* ,a ,d))))) (define-macro! quasiquote (x) (recur f ((x x) (n 0)) (cond ((not (pair? x)) `',x) ((eq? (car x) 'quasiquote) (check x) (quasicons ''quasiquote (f (cdr x) (1+ n)))) ((eq? (car x) 'unquote) (check x) (if (zero? n) (cadr x) (quasicons ''unquote (f (cdr x) (1- n))))) ((eq? (car x) 'unquote-splice) (check x) (if (zero? n) (ferror 'unquote-splice "invalid context for ,@~s" (cadr x)) (quasicons ''unquote-splice (f (cdr x) (1- n))))) ((and (zero? n) (pair? (car x)) (eq? (caar x) 'unquote-splice)) (check (car x)) (let ((d (f (cdr x) n))) (if (equal? d '(quote ())) (cadar x) `(append ,(cadar x) ,d)))) (else (quasicons (f (car x) n) (f (cdr x) n))))))) (define-macro! unquote (x) (ferror 'unquote "unquote form ,~s not valid outside of quasiquote" x)) (define-macro! unquote-splice (x) (ferror 'unquote "unquote-splice form ,@~s not valid outside of quasiquote" x))  Date: Fri, 24 Jan 86 15:53:28 EST From: Jonathan A Rees Subject: quasiquote implementation To: dyb%indiana.csnet@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MC.LCS.MIT.EDU In-reply-to: Msg of Fri 24 Jan 86 01:01:14 est from Kent Dybvig Message-ID: <[MC.LCS.MIT.EDU].794952.860124.JAR> Your implementation, which is very similar to the T and MIT Scheme quasiquote imeplementations, has the powerful advantage over mine of conciseness, but it has the serious flaw that it makes the expansion of nested quasiquote forms visible to the user, thus making it difficult to write portable code to manipulates such forms. For example, (EQUAL? '`(A ,B) ``(A ,B)) doesn't hold in your implementation. This seems to defeat my original aim in suggesting that we standardize the external syntax of `, which was to allow people to write portable program-manipulating programs. Jonathan  Received: from CSNET-RELAY.ARPA by MC.LCS.MIT.EDU 24 Jan 86 10:49:26 EST Received: from indiana by csnet-relay.csnet id a022299; 24 Jan 86 10:24 EST Date: Fri, 24 Jan 86 01:01:14 est From: Kent Dybvig To: rrrs-authors@mc.lcs.mit.edu Subject: quasiquote implementation Back when JAR first suggested making quasiquote standard, I transcribed my quasiquote implementation from the C-coded reader into Scheme-coded syntactic-extensions. I promised to send the code to David Bartley at TI and figured some of the rest of you might be interested as well. I believe that this gives different results from JAR's, because it can actually fold up explicit calls to "list" and "list*" (for better or for worse). It also insists that quasiquote, unquote, and unquote-splice forms be well-formed, rather than ignoring those that aren't. As with JAR's, nested quasiquotes work properly. Because quasiquote and company are expanded at compile time rather than read time, it is reasonable to write code that produces quasiquote forms. "list*" (Common Lisp's name) is the same as JAR's "cons*". The meaning of everything else should be obvious. (let ((check (lambda (x) (unless (and (pair? (cdr x)) (null? (cddr x))) (ferror (car x) "invalid form ~s" x))))) (define-macro! quasiquote (x) (recur f ((x x)) (cond ((not (pair? x)) `',x) ((eq? (car x) 'quasiquote) (check x) (f (f (cadr x)))) ((eq? (car x) 'unquote) (check x) (cadr x)) ((eq? (car x) 'unquote-splice) (ferror 'unquote-splice "invalid context for ~s" x)) ((and (pair? (car x)) (eq? (caar x) 'unquote-splice)) (check (car x)) (let ((d (f (cdr x)))) (if (equal? d '(quote ())) (cadar x) `(append ,(cadar x) ,d)))) (else (let ((a (f (car x))) (d (f (cdr x)))) (if (pair? d) (if (eq? (car d) 'quote) (if (and (pair? a) (eq? (car a) 'quote)) `'(,(cadr a) . ,(cadr d)) (if (null? (cadr d)) `(list ,a) `(list* ,a ,d))) (if (memq (car d) '(list list*)) `(,(car d) ,a ,@(cdr d)) `(list* ,a ,d))) `(list* ,a ,d)))))))) (define-macro! unquote (x) (ferror 'unquote "unquote form ,~s not valid outside of quasiquote" x)) (define-macro! unquote-splice (x) (ferror 'unquote "unquote-splice form ,@~s not valid outside of quasiquote" x))  Date: Sun, 15 Dec 85 22:22:11 EST From: Jonathan A Rees Subject: backquote To: ALAN@MIT-MC.ARPA, RRRS-AUTHORS@MIT-MC.ARPA Message-ID: <[MIT-MC.ARPA].755421.851215.JAR> Date: Sun, 15 Dec 85 06:10:14 EST From: Alan Bawden ... Also a minor point about your quasi-quote proposal: You notate the three magic objects that are used for recognition as #!QUASIQUOTE, #!UNQUOTE and #!UNQUOTE-SPLICE. I claim that in a language specification it would be better to specify the names of three variables whose -values- are the three magic markers. This is because it is dangerous for those objects to appear directly as constants in source code. (In one message you point out this danger yourself, I seem to remember.) Instead of (IF (EQ? (CAR FORM) '#!UNQUOTE-SPLICE) ...) it is better to write (IF (EQ? (CAR FORM) UNQUOTE-SPLICE-MARKER) ...). And instead of (DEFINE-READER-SYNTAX #/` (LAMBDA (STREAM) `(#!QUASIQUOTE ,(READ STREAM)))) (which doesn't work as intended) it is -certainly- better to write (DEFINE-READER-SYNTAX #/` (LAMBDA (STREAM) `(,QUASIQUOTE-MARKER ,(READ STREAM)))). In fact, I cannot imagine any situation in which I would want to type one of the three magic markers directly. Conceivably a language specification should specify how the quasi-quote marker -looks-, so that you can recognize it if you happen to see it, but the user should not be encouraged to risk typing it in himself. (Were I in charge I might consider specifying the names of the three variables, what the objects look like when PRINTed, and that is is an error if that printed representation is ever seen by READ.) I definitely agree with most of this (I have run into exactly this bug, and I suspect other people have too). I only disagree with the parenthetical remark at the end. I think it's important that the markers, or at least forms so marked, be rereadable. One of the things I was worried about was the problem of being able to print a program out from one implementation and read it back in in another. One solution would be for PRINT to generate ` - , - ,@ syntax properly, and have isolated occurences of the markers print nonrereadably. Another solution would be to require user code to always expand the backquote form in some way before printing it, but that's unacceptable since your target implementation might have some spiffy implementation of backquote, and you'd like to be able to take advantage of it without knowing what it is (or writing your own version of PRINT). I tend to agree with Kent Dybvig's suggestion that the asymmetry with QUOTE should be eliminated somehow. I suggested a long time ago, certainly on one of the T mailing lists and probably to the RRRS authors, that 'FOO should read DIFFERENTLY from the list (QUOTE FOO), either as a list with a funny marker in its CAR or as an object of a different datatype, but no one seemed to like that suggestion. Since it's too late to change that, I think the markers should be symbols with long names (no name is ever obscure enough, I know, but what can you do?), since it's a pain both in some implementations and in the manual to introduce new datatypes. We would then have to agree on what the name is, so that users can be sure to avoid using it. Probably having variables whose values are these symbols, or maybe even inventing little S&ICP-style data abstractions (constructor, accessor, and predication functions), is desirable too. But that would be redundant, not bare-bones the way the rest of scheme is...  Date: Fri, 13 Dec 85 21:56:36 EST From: Jonathan A Rees Subject: EQ? again, already To: gls@AQUINAS.THINK.COM cc: RRRS-AUTHORS@MIT-MC.ARPA Message-ID: <[MIT-MC.ARPA].753459.851213.JAR> Typo in previous message (result of electronic editing): flush the two words "More importantly, ".  Date: Fri, 13 Dec 85 21:53:08 EST From: Jonathan A Rees Subject: EQ? again, already To: gls@AQUINAS.THINK.COM cc: RRRS-AUTHORS@MIT-MC.ARPA In-reply-to: Msg of Fri 13 Dec 85 16:31 EST from Guy Steele Message-ID: <[MIT-MC.ARPA].753457.851213.JAR> Date: Fri, 13 Dec 85 16:31 EST From: Guy Steele I feel it is a mild pity that RRRS has made this same mistake. If EQ? were banished to the realm of "implementation-dependent" and satisfying few portable axioms (e.g. it could look at the machine state when given two EQV? numbers), and EQV? were considered the portable equality primitive, what would you have it do with procedures? More importantly, I would like to see evaluation of LAMBDA-expressions not be a side-effect, but not only is a contradictory assumption wired into Abelson & Sussman's book (according to Sussman), the problem is undecidable if by equality you mean what compilers would like equality to mean, that is extensional equality ("f computes the same return value and machine state as g given the same input arguments and machine state"). Which would you prefer: (a) to define EQV? to "be an error" when both arguments are procedures (this implies that implementations would be encouraged to signal errors, of course), (b) to define LAMBDA to be a side-effect, (c) to rule out optimizations which would inhibit "intensional" comparison of procedures, or (d) make EQV? to be implementation-dependent (and thus put it in the same flea-bitten bag as EQ?)? I think those are the only alternatives. Actually, now that I think of it, option (c), intensional procedure comparison, could be made to work, conceivably. E.g. EQV? on two procedures could use EQV? to compare bignums representing some canonical form of the source (or object?) code (alpha- and maybe beta-converted to some sort of normal form, and maybe other things) and the presumably cons-cell-like cells which hold values of variables. Then indeed (EQV? (LAMBDA (X) (LIST X Z)) (LAMBDA (Y) (LIST Y Z))) would be defined to return true, and we could intern object code, do CSE, and so on. I haven't though about this possibility, but I suspect it would be very complicated to implement, and would inhibit some important optimizations. I echo my previous simple request (to which no one has replied - I guess that everyone agrees): permit implementations of EQV? to signal an error when given arguments that are both procedures.  Received: from GODOT.THINK.COM by MIT-MC.ARPA 13 Dec 85 16:34:33 EST Received: from jehosephat by GODOT.THINK.COM via CHAOS; Fri, 13 Dec 85 16:31:46 est Date: Fri, 13 Dec 85 16:31 EST From: Guy Steele Subject: EQ? again, already To: rrrs-authors@MIT-MC.ARPA Cc: gls@THINK-AQUINAS.ARPA Message-Id: <851213163145.5.GLS@THINK-JEHOSEPHAT.ARPA> From a message I received by papermail (for obscure reasons) from Will Clinger, I infer that I may have confused a lot of people out there. When I said that I take EQ? as my axiomatic notion of identity, I neglected to say that I reject the further definition of EQ? in such implementational terms as comparison of pointers. I expect EQ? to behave more like EQV? (or like the Common Lisp EQL), so it might have been more clear if I had used EQV? everywhere instead of EQ? in my last long message. To put it another way, I constrain the meaning of my identity operation by positing not only such requirements as that (EQ? X X) always be true for X any name, but also that (EQ? (+ X Y) (+ X Y)), (EQ? (* X Y) (* X Y)), and so on, so that numbers of the same type and value are always EQ? (or EQV?, if you insist). I argued for changing Common Lisp EQ to have the semantics EQL now has, but there was too much implementation tradition behind EQ and so EQ remained unchanged. I feel it is a mild pity that RRRS has made this same mistake. --Guy  Received: from GODOT.THINK.COM by MIT-MC.ARPA 12 Dec 85 17:24:40 EST Received: from desiderius by GODOT.THINK.COM via CHAOS; Thu, 12 Dec 85 16:45:07 est Date: Thu, 12 Dec 85 16:45 EST From: Guy Steele Subject: [Postmaster@SCRC-STONY-BROOK.ARPA: Unable to deliver letter] To: KMP@SCRC-STONY-BROOK.ARPA Cc: rrrs-authors@MIT-MC.ARPA, gls@THINK-AQUINAS.ARPA In-Reply-To: <851209230127.8.KMP@RIO-DE-JANEIRO.SCRC.Symbolics.COM> Message-Id: <851212164505.4.GLS@THINK-DESIDERIUS.ARPA> [GLS -- This went to RRRS-AUTHORS and was explicitly CC'd to you but didn't succeed in sending directly to you so I'm retrying; you may have seen it already on RRRS-AUTHORS, but in case that failed, too, I'm risking redundancy. -kmp] Hmm, let me pose a few concrete examples to test that you and I agree. You called this splitting: [1] (LET ((X C) (Y C)) (EQ? X Y)) You and I seem to agree that evaluating this expression should always yield true. Further, I suggest that the following are also in the same category and assume that we both agree that these should -always- return true as well: [1a] (LET ((F (LAMBDA () C))) (EQ? (F) (F))) [1b] (LET ((F (LAMBDA () '(ANY EXPRESSION)))) (EQ? (F) (F))) You called this coalescing: [2] (LET ((F (LAMBDA () (LAMBDA () 3)))) (EQ? (F) (F))) I agree with you that it is reasonable for either of {true,false} to be returned from evaluating this. I suggest that the following are special cases of the same situation, and that it may not be possible to predict which of {true,false} will be returned: [2a] (LET ((X '(ANY EXPRESSION)) (Y '(ANY EXPRESSION))) (EQ? X Y)) [2b] (LET ((F (LAMBDA () '(ANY EXPRESSION))) (G (LAMBDA () '(ANY EXPRESSION)))) (EQ? (F) (G))) Among other things, 2b is important because of code constructed by expressions like: `(LET ((F (LAMBDA () ',EXP)) (G (LAMBDA () ',EXP))) (EQ? (F) (G))) We would like to constrain the semantics of the resulting expression to not be perturbed by first writing the expression to an intermediate file and then re-reading it. Put another way -- with the exception of objects like symbols (which are intentionally interned), sharing of memory should not affect the semantics of an expression (though it may affect its perceived behavior). On the other hand, certain identifiable patterns of data flow should be defined to preserve sharing in a well-defined way, in order to ease certain programming tasks and also (for not reasons that are not unrelated) to allow various theorems, transformations, proofs, and optimizations to be conveniently written about those patterns. I agree with you in all these details. --Guy  Date: Tue, 26 Nov 85 15:03:10 EST From: Jonathan A Rees Subject: EQV? and procedures etc To: RRRS-AUTHORS@MIT-MC.ARPA In-reply-to: Msg of 25 Nov 85 16:29:22 PST from Will Clinger Message-ID: <[MIT-MC.ARPA].733803.851126.JAR> While we're stirring up the EQ? muck, I would like to stir up some EQV? muck. I would like it if the description of EQV? did NOT appeal to the definition of EQ?. In fact, I would like it to be deterministic. That is, while the meaning of (EQ? x y) is a function of an implementation, the meaning of (CAR x) is not, and I would like EQV? to have the same deterministic status as CAR. A consequence of this is that unlike EQ?, the domain of EQV? would be incomplete, that is, it might be an error to call it on certain combinations of arguments. For example, if both arguments are procedures, or both arguments are inexact numbers, it should be undefined (and implementations are permitted to signal an error). Examples: (EQV? 'A CAR) => false - different types (EQV? (CONS 1 2) (CONS 1 2)) => false (LET ((Z (FOO))) (EQV? Z Z)) => true this would hold for any expression (FOO) (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y)) => wrong - returning a consistent answer in all implementations would impose horrible constraints on implementations (EQV? CAR CDR) => wrong - similarly (EQV? #i1 #i1) => wrong - similarly Of course, the exact behavior of EQV? in undefined situations is unspecified; implementations may find it more efficient to fail to detect domain errors than to detect them. In other words, I want EQV? to strive to be the true mathematical equality-of-denotation predicate which we would like to have but can't (both for implementation and decidability reasons). It would necessarily then have an incomplete domain. If people seriously object to this I won't bother trying to make it more precise; if they don't, and someone wants it to be made more precise, I'll try to do so.  Received: from THINK-AQUINAS.ARPA by MIT-MC.ARPA 26 Nov 85 11:58:33 EST Received: from FAUSTINUS.THINK.COM by THINK-AQUINAS.ARPA via CHAOS with CHAOS-MAIL id 2959; Tue 26-Nov-85 11:58:57-EST Date: Tue, 26 Nov 85 11:57 EST From: Guy Steele Subject: EQ? and procedures etc To: willc%tekchips%tektronix.csnet@CSNET-RELAY.ARPA, RRRS-AUTHORS@MIT-MC.ARPA cc: gls@THINK-AQUINAS.ARPA In-Reply-To: The message of 25 Nov 85 19:29-EST from Will Clinger Message-ID: <851126115745.2.GLS@FAUSTINUS.THINK.COM> Will has raised some very good points. I need to be more precise about what I mean. Will: ... Guy's informal distinction is probably based on some preconceived semantics for EQ?. That would be ok if we could formalize that preconceived semantics and then sanction particular classes of deviations from it, which is in effect what I propose above. ... (EQ? + +) --> #!FALSE implies no more of a side effect than (EQ? 1000000 1000000) --> #!FALSE or ( #!FALSE or (+ 3 3) --> 6. All we're talking about is the value returned by a procedure. Since procedures can't be side effected, we can't look to the primitive mutators for guidance, and we must fall back on the rule that indistinguishable objects should be EQ?. Indeed, I have a preconceived semantics for EQ?, namely the semantics attributed to "=" in logic: the identity function. A logic may have "=" in it or not, but it has to be included specially. I view EQ? as a primitive notion: it is the very definition of what it means for two things to be the same. I want other parts of the language to satisfy certain relationships with respect to EQ?. I define the notion of side effect in terms of EQ?. If I take two formally identical computations (that is, expressions that are identical in every way), and use them as the two arguments to EQ?, then if EQ? returns #!FALSE I say that a side effect has occurred. (If EQ? returns true then that result says nothing one way or the other about whether a side effect has occurred.) One of the properties I want in my language is that name-reference be free of side effects. That is, if "x" is any name, I expect evaluation of the form (EQ? x x) to produce #!TRUE. Note that I have not yet insisted that (EQ? f f) return #!TRUE for any kind of form f other than a name. In particular, I have not insisted that (EQ? (LAMBDA (X) X) (LAMBDA (X) X)) return #!TRUE. This is very different from saying that (LET ((Z (LAMBDA (X) X))) (EQ? Z Z)) must return #!TRUE, and indeed I do insist on this latter case. Note too that I do not insist that beta-conversion be a valid transformation (and this is why we must be careful to distinguish (EQ? (LAMBDA (X) X) (LAMBDA (X) X)) and (LET ((Z (LAMBDA (X) X))) (EQ? Z Z)), for example: we have not yet established substitutability of expressions for names that represent their values). Rather, one of the requirements for beta-conversion is that the resulting program fragment be externally indistinguishable from the original with respect to side effects (as defined above). However, I do have a rule for LET: if x and y are two names (possibly the same), then (LET ((x y)) (EQ? x y)) returns #!TRUE. (That is, LET-binding does not "split". There is an analogous rule for binding of LAMBDA-parameters. This rule can be generalized for arbitrary side-effect-free expressions y.) Once again, I insist that referring to a name be free of side effects. It is also convenient to insist that referring to a constant (numbers and quoted objects, for example) be free of side effect, but for now we sidestep the question of whether the expression (+ 6 6) contains the "same" constant 6 twice or two different constants representing the mathematical value 6. I insist that the operation of function calling per se be free of side effect; if evaluation of a function-call form has a side effect, then necessarily either the body of the invoked function has a side effect or one of the argument forms has a side effect. (This is why I want references to constant to be side-effect-free: so that I can claim that if (+ 6 6) has a side effect, it must be in the body of + because it doesn't come from the references to the name + or to the constant(s) 6.) Now, what things in the language shall be permitted to have side effects? We could consistently allow a call to + to result in a side effect, with the result that (EQ? (+ 6 6) (+ 6 6)) might return #!FALSE. (Since I so far allow the 6's to be distinct, perhaps it is more to the point to say that (LET ((X 6)) (EQ? (+ X X) (+ X X))) might return #!FALSE.) We could interpret this in familiar implementational terms by saying that + allocates a new location associated with the result, but the results do have the same mathematical values, and EQUAL? will compare these values without regard to the location. But because numerical equality is decidable we might as well say that (EQ? m n) iff (EQUAL? m n) for numbers m and n and let it go at that. Such a definition will maintain the desired properties of naming with respect to EQ?. On the other hand, because equality of functions is not decidable in general, we cannot take such an easy way out. We insist that for two functions f and g, (EQ? f g) must return #!FALSE if f and g are operationally distinguishable (that is, if they have differences of behavior that are detectable within the language). On the other hand, it must be the case that (LET ((X f)) (EQ? x x)) returns #!TRUE for any f, even a function. Consider the dilemma of poor EQ?: confronted with two mathematical functions (for which operational equality is undecidable in general), it could take the safe way out for the first condition by returning #!FALSE if it can't decide, or it could take the safe way out for the second condition by returning #!TRUE if it can't decide; but it cannot do both. One solution, as Will has pointed out, is to package a "hint" with every function object, in the form of a "location". One should perhaps think of this not as indicating a location in *memory*, in which the closure is somehow stored, but rather as an indication of the *(dynamic) location in the code* whose evaluation produced the function object. Two compare two closures, then, one first compares the locations; if they match, then they resulted from the same evaluation, and so EQ? must return #!TRUE to satisfy the naming properties. Otherwise, EQ? looks at that mathematical functions themselves; and it may return #!TRUE if it can prove the functions are operationally indistinguishable, but it is always safe to return #!FALSE in this case. To summarize: I take EQ? as primitive; I define the notion of side effect in terms of EQ?; I require the language to obey certain properties defined in terms of EQ? (and this is why EQ? is special: because structural properties of the language, such as naming, are specified in terms of it); and I show that, because of the undecidability of equality of functions, the use of pure mathematical functions to represent closures results in a semantically inadequate model because two requirements on the language cannot be satisfied simultaneously by the implementation without additional information (such as a location). So the use of locations is not a wart, but critical to the semantics. Locations are not necessary for mathematical objects whose equality is decidable, such as numbers. Thus if EQ? is supposed to express something like total behavioral equivalence, then it does a much better job with pairs than with procedures. Hence the argument for making EQ? work on procedures is weaker than the corresponding argument for pairs. I propose that two things might happen to be behaviorally equivalent without being EQ?. It is precisely because of the undecidability problem that EQ? cannot determine equivalence; I say only that EQ? returning #!TRUE implies behavioral equivalence. (This is why splitting is bad: I might determine (EQ? X Y) => #!TRUE and later find that X and Y behave differently, possibly only with respect to EQ? itsefl!) In case anyone is worried, my proposal that EQ? can say that two procedure values are identical whenever their E* --> K --> C parts are identical is sufficient to ensure that (EQ? (LET ((N 0)) (LAMBDA () (SET! N (1+ N)) N)) (LET ((N 0)) (LAMBDA () (SET! N (1+ N)) N))) is false (ignoring the fact that a compiler could determine that neither procedure is going anywhere). It would not guarantee that (LET ((N 0)) (EQ? (LAMBDA () (SET! N (1+ N)) N) (LAMBDA () (SET! N (1+ N)) N))) is false (with the same caveat). In other words, my new proposal seems to me to do the right thing with respect to object-oriented programming. I think this is exactly right. --Guy  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 26 Nov 85 03:51:04 EST Received: from tektronix by csnet-relay.csnet id aa07476; 26 Nov 85 3:38 EST From: Will Clinger To: RRRS-AUTHORS@mit-mc.ARPA Received: from tekchips by tektronix with smtp ; 25 Nov 85 16:35:13 PST Date: Monday, 25 Nov 85 16:29:22 PST Subject: EQ? and procedures etc Ok, it's pretty clear that people want (EQ? X X) to be true. I bow to the majority, but I want to discuss some of the points that have been raised. This is a long message. ---------------------------------------------------------------- Will: The current wording of RRRS forces a formal semantics of Scheme to associate locations with procedure values and numbers. These locations have no purpose other than to make the semantics of EQ? work as in RRRS, and they make the semantics much uglier. The ugliness of the semantics results in more complex and less effective optimizing compilers. Kent: I have yet to see a convincing example of this claim.... However, for now perhaps you could just enumerate some of the troublesome cases for me to respond to more specifically. Will: I made about four claims in the paragraph Kent cites, all of which are debatable, and I'm not sure which one(s) he doesn't believe so I will run through all of them. First of all, neither the Muchnik-Pleban semantics nor the semantics I wrote for my 1984 Lisp conference paper can handle the semantics of EQ? as in RRRS. The problem is that the domain equation for expressed values is something like E = {#!true, #!false, #!null} + F + ; procedure values R + ; numbers Q + ; symbols LxL + ; pairs L* + ; vectors L* ; strings where L is the domain of locations and + is a disjoint union. The equation for procedure values is F = E* --> K --> C, so the EQ? procedure sees a procedure as a higher order (mathematical) function, not as a pointer to a data structure encoding the higher order function. The obvious fix is to change the equation for procedure values to be F = L x (E* --> K --> C). This fix is necessary in order that an implementation be *allowed* to say for example that (EQ? (lambda (x) x) (lambda (y) y)) is false. If EQ? does indeed say that the two identity procedures are different, then we have an interesting example of Guy's "splitting" (which he claims to dislike): both arguments are the identity procedure, and in all respects they behave exactly the same except that EQ? says they're different. Now, however, if I naively go through the semantics and make the changes implied by the new domain equation for F, I find that implementations are *required* to say that (EQ? (lambda (x) x) (lambda (y) y)) is false -- so splitting is forced upon us. That naive semantics can be fixed, provided that we agree on a precise formulation of the circumstances in which an implementation can say that two procedures are EQ?. My suggestion that EQ? be left unspecified on procedure values was prompted by a disagreement over precisely what those circumstances should be. Since people don't like that, I now propose that it's ok to say that two procedure values are EQ? iff the second components (i.e., elements of E* --> K --> C) are equal. This is of course undecidable in general but implementations should be allowed to take advantage of whatever simple cases they can recognize. (I expect the MIT philosophy will disagree.) Similarly R must have a structure akin to R = L x (mathematical numbers), but this isn't as troublesome because we seem to agree that it's ok for EQ? to return true on numbers whenever the difference of the two numbers is zero, and that EQ? must return false whenever the difference is nonzero. I think the location-ridden semantics is uglier, but that's a matter of taste. I admit that the location-ridden semantics doesn't affect compilers provided my new suggestion is adopted (that EQ? can use the second component of procedure values), because most of us were doing precisely the same thing with the old semantics. If on the other hand optimizations such as those raised by David Bartley are disallowed, then compilers are going to be less effective. Furthermore the ambitious compilers will be more complex, because they will still attempt those optimizations whenever the compiler can prove that the procedures never get passed to the EQ? procedure. (To the objection that the compiler could never prove such a thing because the break key provides access to internals at arbitrary times, I reply that the compiler has the freedom to define critical sections during which such interrupts are deferred.) ---------------------------------------------------------------- Kent: ...I cite as precedent the case of CALL-WITH-CURRENT-CONTINUATION. Everyone seemed to agree that it was easier to compile/optimize code which doesn't allow returning these upward. The argument for pushing it through was not that it made the implementation simpler, since in fact it complicated it. It was instead the fact that it was (a) at least possible to implement and (b) much simpler to some write programs when this feature was available. It seems to me that the situation is similar for EQ? -- I do not want to give up this feature. You (the compiler writer) have to solve this problem only once. I (the programmer) will have to solve it repeatedly if you do not solve it once for me. Will: First class escape procedures greatly simplified the semantics. Their semantics is so beautiful you don't even have to see them. Downward-only escape procedures have a really ugly semantics. The same holds for first class procedures as opposed to downward-only procedures. Semantic simplicity often does not correlate with implementational simplicity. My complaint about the current definition of EQ? is not that it is hard to implement -- it isn't -- but that I think its semantics are ugly. I haven't always thought that -- I too once thought that (EQ? X X) should always be true. (I stand by my comment about semantic ugliness turning into compiler complexity in this particular case, but in truth I said that for the sake of people who think of me as a semantic purist who doesn't care about implementations. If people now think of me as a hacker who doesn't care about semantics, well, that's great, my disguise must be working.) ---------------------------------------------------------------- Guy: ....may two things that you might expect to be different turn out to be the same? (Call this property "coalescence".) ....may one thing turn out to be two different things? (Call this property "splitting".) Will: The problem is that we don't have a precise and independent definition of what we mean by "expect to be different" and "one thing" until we have a semantics for EQ?, so we can't use these notions to define EQ?. (Likewise, when the RRRS says of certain objects that they "are identical to themselves and are different from everything else...", it says nothing at all.) My earlier reference to (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) --> #!FALSE as a splitting was intended in part to demonstrate that Guy's informal distinction is probably based on some preconceived semantics for EQ?. That would be ok if we could formalize that preconceived semantics and then sanction particular classes of deviations from it, which is in effect what I propose above. ---------------------------------------------------------------- Guy: I feel that splitting is a semantically bad thing to have in a LISP-like language, because I think name-reference should be free of side effects! Will: (EQ? + +) --> #!FALSE implies no more of a side effect than (EQ? 1000000 1000000) --> #!FALSE or ( #!FALSE or (+ 3 3) --> 6. All we're talking about is the value returned by a procedure. Since procedures can't be side effected, we can't look to the primitive mutators for guidance, and we must fall back on the rule that indistinguishable objects should be EQ?. ---------------------------------------------------------------- Guy: If two objects cannot be distinguished, then it should be legitimate to merge them. If there are no RPLACA or RPLACD operations on CONS cells, then hash-consing is a legitimate optimization (and, indeed, perhaps EQ should be required to behave as EQUAL on CONS cells in the absence of such side effects!)... Will: Hence (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) --> #!TRUE should be ok, but we can't require it (as we could for hashed pairs) because the problem is undecidable. The challenge lies in writing a semantics that will allow it but not require it. I have stated my belief that the cleanest such semantics leaves EQ? unspecified on procedures, numbers, etc, but I am willing to adopt what I believe is an uglier semantics if that's what people want. ---------------------------------------------------------------- Gerry: I think it is essential that (EQ? x x) be true of procedure values. We should be able to think of compound data structures as LAMBDA calculus entities (as at the end of section 3.3.1 of Abelson and Sussman -- "Mutation is just assignment" -- p 207), and surely we want (EQ? x x) to be true of CONSs (else assignment to components such as SET-CAR! will make no sense). Will: The lambda calculus has no side effects and no primitive or definable (in the language) equality predicate that satisfies the rule that equal objects are behaviorally indistinguishable. My feeling is that if the lambda calculus can get along without EQ?, then Scheme should be able to get along without it also except for those objects that can be stored into. Nonetheless I think Gerry has expressed the best argument for making EQ? do something reliable on procedures. Let me observe, however, that the value of (LET ((FOO FOO)) (LAMBDA ARGS (APPLY FOO ARGS))) is behaviorally identical to the value stored in FOO when that value is a procedure object, but there is no similar way to take a pair and get from it a behaviorally identical but non-EQ? pair without redefining all the primitive operations on pairs. Thus if EQ? is supposed to express something like total behavioral equivalence, then it does a much better job with pairs than with procedures. Hence the argument for making EQ? work on procedures is weaker than the corresponding argument for pairs. In case anyone is worried, my proposal that EQ? can say that two procedure values are identical whenever their E* --> K --> C parts are identical is sufficient to ensure that (EQ? (LET ((N 0)) (LAMBDA () (SET! N (1+ N)) N)) (LET ((N 0)) (LAMBDA () (SET! N (1+ N)) N))) is false (ignoring the fact that a compiler could determine that neither procedure is going anywhere). It would not guarantee that (LET ((N 0)) (EQ? (LAMBDA () (SET! N (1+ N)) N) (LAMBDA () (SET! N (1+ N)) N))) is false (with the same caveat). In other words, my new proposal seems to me to do the right thing with respect to object-oriented programming.  Received: from MIT-OZ by MIT-MC.ARPA via Chaosnet; 25 NOV 85 17:20:22 EST Date: 25 Nov 1985 17:17 EST (Mon) Message-ID: From: Bill Rozas To: Kent M Pitman Cc: gls@AQUINAS.THINK.COM, RRRS-AUTHORS@MIT-MC.ARPA, willc%tekchips%tektronix@CSNET-RELAY.ARPA Subject: EQ? and procedures, numbers, etc In-reply-to: Msg of 25 Nov 1985 14:24-EST from Kent M Pitman I don't know if GLS agrees with you, but I certainly do.  Received: from SCRC-STONY-BROOK.ARPA by MIT-MC.ARPA 25 Nov 85 14:31:44 EST Received: from DUPAGE.SCRC.Symbolics.COM by SCRC-STONY-BROOK.ARPA via CHAOS with CHAOS-MAIL id 362279; Mon 25-Nov-85 14:22:54-EST Date: Mon, 25 Nov 85 14:24 EST From: Kent M Pitman Subject: EQ? and procedures, numbers, etc To: gls@THINK-AQUINAS.ARPA cc: KMP@SCRC-STONY-BROOK.ARPA, willc%tekchips%tektronix@CSNET-RELAY.ARPA, RRRS-AUTHORS@MIT-MC.ARPA In-Reply-To: <851122165853.8.GLS@THINK-YON.ARPA> Message-ID: <851125142400.0.KMP@DUPAGE.SCRC.Symbolics.COM> Hmm, let me pose a few concrete examples to test that you and I agree. You called this splitting: [1] (LET ((X C) (Y C)) (EQ? X Y)) You and I seem to agree that evaluating this expression should always yield true. Further, I suggest that the following are also in the same category and assume that we both agree that these should -always- return true as well: [1a] (LET ((F (LAMBDA () C))) (EQ? (F) (F))) [1b] (LET ((F (LAMBDA () '(ANY EXPRESSION)))) (EQ? (F) (F))) You called this coalescing: [2] (LET ((F (LAMBDA () (LAMBDA () 3)))) (EQ? (F) (F))) I agree with you that it is reasonable for either of {true,false} to be returned from evaluating this. I suggest that the following are special cases of the same situation, and that it may not be possible to predict which of {true,false} will be returned: [2a] (LET ((X '(ANY EXPRESSION)) (Y '(ANY EXPRESSION))) (EQ? X Y)) [2b] (LET ((F (LAMBDA () '(ANY EXPRESSION))) (G (LAMBDA () '(ANY EXPRESSION)))) (EQ? (F) (G))) Among other things, 2b is important because of code constructed by expressions like: `(LET ((F (LAMBDA () ',EXP)) (G (LAMBDA () ',EXP))) (EQ? (F) (G))) We would like to constrain the semantics of the resulting expression to not be perturbed by first writing the expression to an intermediate file and then re-reading it. Put another way -- with the exception of objects like symbols (which are intentionally interned), sharing of memory should not affect the semantics of an expression (though it may affect its perceived behavior). On the other hand, certain identifiable patterns of data flow should be defined to preserve sharing in a well-defined way, in order to ease certain programming tasks and also (for not reasons that are not unrelated) to allow various theorems, transformations, proofs, and optimizations to be conveniently written about those patterns.  Received: from MIT-OZ by MIT-MC.ARPA via Chaosnet; 23 NOV 85 00:03:20 EST Date: 23 Nov 1985 00:01 EST (Sat) Message-ID: From: Bill Rozas To: David Bartley Cc: RRRS-Authors@MIT-MC.ARPA Subject: Multiple LAMBDA evaluations In-reply-to: Msg of 22 Nov 1985 10:28-EST from David Bartley I guess the main difference is how the system is viewed. We view the system as an interpreted system primarily, and compilation is something we put up with only for the sake of efficiency on the kinds of machines we unfortunately have to work with. Ideally compilation preserves the behaviour even as far as things like TRACE or PROCEDURE-ENVIRONMENT are concerned, thus those optimizations would not be allowed in the absence of declarations. In practice, given the limitations of the hardware we deal with, we allow some of these conceptual declarations to be implicit in the compiler (otherwise there would be no use to compiling in the first place). Thus we tolerate potential incompatibilities between compiled and interpreted code because of efficiency constraints. We view compilation as something that happens only after the code has being debugged significantly, so the above procedures are unlikely to be used at this stage.  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 22 Nov 85 23:40:19 EST Received: from indiana by csnet-relay.csnet id a009014; 22 Nov 85 23:28 EST Date: Fri, 22 Nov 85 15:32:26 est From: Kent Dybvig To: Bartley%csl60%ti-csl.csnet@CSNET-RELAY.ARPA Subject: Re: backquote proposal Cc: rrrs-authors@mit-mc.arpa Chez Scheme performs the optimization you refer to, in the case where no free variables are referenced in the lambda expression. I see nothing wrong with it, since closing is not an explicit allocation operation in the same sense as cons or make-vector. Kent  Received: from THINK-AQUINAS.ARPA by MIT-MC.ARPA 22 Nov 85 17:00:06 EST Received: from THINK-YON.ARPA by THINK-AQUINAS.ARPA via CHAOS with CHAOS-MAIL id 2185; Fri 22-Nov-85 16:59:24-EST Date: Fri, 22 Nov 85 16:58 EST From: Guy Steele Subject: EQ? and procedures, numbers, etc To: KMP@SCRC-STONY-BROOK.ARPA, willc%tekchips%tektronix@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MIT-MC.ARPA, gls@THINK-AQUINAS.ARPA In-Reply-To: <851122135902.0.KMP@DUPAGE.SCRC.Symbolics.COM> Message-ID: <851122165853.8.GLS@THINK-YON.ARPA> KMP is right in identifying the question of object identity as a key issue. But there are two separate issues here: (1) Are two evaluations, of the same LAMBDA-expression, that are distinct in time permitted to produce objects that are not distinct as determined by "EQ?"? In simpler terms, may two things that you might expect to be different turn out to be the same? (Call this property "coalescence".) (2) Once a single object has been produced and named, is the implementation allowed to duplicate it at will in such a way as to cause two distinct references to its name to result in values that are distinct to "EQ?"? In simpler terms, may one thing turn out to be two different things? (Call this property "splitting".) Now Common Lisp permits splitting for characters and numbers with respect to "EQ" (but not with respect to "EQL"). It permits coalescing for numbers, characters, and closures of lambda expressions. I feel that splitting is a semantically bad thing to have in a LISP-like language, because I think name-reference should be free of side effects! It was introduced into Common Lisp only for the sake of certain implementational efficiencies. If one removes EQ from the language and leaves only EQL, then there is no splitting. I would have preferred for EQ to have the current semantics of EQL, but it was too ingrained into everyone's mind (including mine) that EQ means "compare the pointers", which is an implementation notion rather than a language notion. On the other hand, I feel that coalescing is a perfectly legitimate optimization. If two objects cannot be distinguished, then it should be legitimate to merge them. If there are no RPLACA or RPLACD operations on CONS cells, then hash-consing is a legitimate optimization (and, indeed, perhaps EQ should be required to behave as EQUAL on CONS cells in the absence of such side effects!). It is important to have some means of generating distinct objects, but it is not clear that lambda-expressions need to be that means. It is not always legitimate to coalesce two closures, but frequently a static analysis can determine that no distinguishing side effects are possible and that therefore two closures would be operationally identical. In short, my feelings are: splitting, no: (EQ? X X) should always be true; coalescing, maybe: it can be a legitimate optimization. --Guy  Received: from SCRC-STONY-BROOK.ARPA by MIT-MC.ARPA 22 Nov 85 14:02:00 EST Received: from DUPAGE.SCRC.Symbolics.COM by SCRC-STONY-BROOK.ARPA via CHAOS with CHAOS-MAIL id 360602; Fri 22-Nov-85 13:58:06-EST Date: Fri, 22 Nov 85 13:59 EST From: Kent M Pitman Subject: EQ? and procedures, numbers, etc To: willc%tekchips@tektronix.CSNET cc: RRRS-AUTHORS@MIT-MC.ARPA In-Reply-To: The message of 21 Nov 85 16:36-EST from Will Clinger Message-ID: <851122135902.0.KMP@DUPAGE.SCRC.Symbolics.COM> Date: Thursday, 21 Nov 85 13:36:39 PST From: Will Clinger RRRS says that (EQ? x x) is always true, but it would be better if the value returned by EQ? were undefined when applied to procedure values and numbers. I strongly disagree. The current wording of RRRS forces a formal semantics of Scheme to associate locations with procedure values and numbers. These locations have no purpose other than to make the semantics of EQ? work as in RRRS, and they make the semantics much uglier. The ugliness of the semantics results in more complex and less effective optimizing compilers. I have yet to see a convincing example of this claim. Note, however, that I consider object identity to be one of the most indispensible things about symbolic processing and would not be likely to be willing to give up the only function which predicates such identity regardless of whether such an example could be constructed. However, for now perhaps you could just enumerate some of the troublesome cases for me to respond to more specifically. I thought that it would be a cleaner semantics if (EQ? x x) always returned true, but I was wrong. It is actually a cleaner semantics if the value of EQ? is unspecified when both its arguments are procedures and when both its arguments are numbers. When strings are immutable (as in the essential subset of Scheme) EQ? should be unspecified when both its arguments are strings. Similarly EQ? should be unspecified when both its arguments are characters (unless we want to insist that either (1) characters are represented uniquely (e.g. immediates) or (2) EQ? does something complicated). I bet it makes things easier to be able to ignore this distinction. I don't know if programming ease and code size are always the same as cleanliness. They are clearly related. It's possible that even O(size) is proportional to O(clean), but I doubt that size is proportional to clean. Eg, consider: (DEFINE SUCCESSOR (X) (+ X 1)) (DEFINE SUCCESSOR (X) (UNLESS (NUMBERP X) (ERROR "Don't know the successor of ~S" X)) (+ X 1)) Certainly the first program is smaller. It is probably simpler to write. It is not be simpler to use and may be harder to use. Whether it is cleaner seems to me to be in a gray area. I cite as precedent the case of CALL-WITH-CURRENT-CONTINUATION. Everyone seemed to agree that it was easier to compile/optimize code which doesn't allow returning these upward. The argument for pushing it through was not that it made the implementation simpler, since in fact it complicated it. It was instead the fact that it was (a) at least possible to implement and (b) much simpler to some write programs when this feature was available. It seems to me that the situation is similar for EQ? -- I do not want to give up this feature. You (the compiler writer) have to solve this problem only once. I (the programmer) will have to solve it repeatedly if you do not solve it once for me.  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 22 Nov 85 13:12:13 EST Received: from ti-csl by csnet-relay.csnet id ad04001; 22 Nov 85 12:55 EST Date: 22 Nov 1985 0928-CST From: David Bartley Subject: Re: Multiple LAMBDA evaluations To: JINX%MIT-OZ@mit-mc.arpa, Bartley%CSL60%ti-csl.csnet@csnet-relay.arpa cc: RRRS-Authors@mit-mc.arpa, Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: Your message of 21-Nov-85 1817-CST Received: from csl60 by ti-csl; Fri, 22 Nov 85 10:34 CST > Your counter examples do not disprove my claim. In all cases no > lambda expression is evaluated twice in the same environment. > [...] I guess that the difference is that I view environments as > objects (probably influneced because of our dialect), and > conceptually a new one is created every time a lambda expression > is applied, even if this lambda expression has no bindings. Sorry, I've tripped up on another assumption that needs to be discussed! In my first example (reproduced below), I assumed that the DO-SOMETHING in the body of the inner lambda was an expression that did not reference MSG. Thus, in a pragmatic sense, it didn't have to be closed over that part of the environment. This is the analysis that permits a compiler to transform (define foo (lambda (msg) (case msg ... ((XXX) (lambda (args) do-something)) ...))) into (define foo (let ((xxx-handler (lambda (args) do-something))) (lambda (msg) (case msg ... ((XXX) xxx-handler) ...)))) since I'm moving the lambda relative to its environment. So, I agree with your point, but I'm shifting the question a bit: Is there a consensus that these kinds of optimizations are permitted? To my knowledge, the only way to know whether a procedure object has closed over the entire ``visible'' environment or just the part of it which it references is to have something like MIT Scheme's PROCEDURE-ENVIRONMENT. This function returns the environment over which a procedure was closed. Our implementation of Scheme supports first-class environments in the MIT sense and has this function, but we consider it a debugging tool only, and do not guarantee that it returns the entire environment unless the compilation took place in ``debug mode.'' So, I consider the environment closed over to be an abstraction with different implementations possible. Thus, if the environments may or may not be EQ?, the closures may or may not be EQ?. Regards, David Bartley -------  Received: from MIT-OZ by MIT-MC.ARPA via Chaosnet; 21 NOV 85 23:01:44 EST Date: 21 Nov 1985 22:58 EST (Thu) Message-ID: From: Bill Rozas To: David Bartley Cc: RRRS-Authors@MIT-MC.ARPA Subject: Multiple LAMBDA evaluations In-reply-to: Msg of 21 Nov 1985 19:17-EST from David Bartley Your counter examples do not disprove my claim. In all cases no lambda expression is evaluated twice in the same environment. The two environments (as with JAR's example) may be isomorphic (same variables and same values), but not identical. In the absence of incremental definition, or if there is no handle to these environments, the distinction is moot, but in the presence of these "features" they are not the same. I guess that the difference is that I view environments as objects (probably influneced because of our dialect), and conceptually a new one is created every time a lambda expression is applied, even if this lambda expression has no bindings. I think that the only cases which force re-evaluation in the same environment (the way I view them) are contorted examples like (define fact (let ((there&then '())) (set! there&then (call-with-current-continuation (lambda (here&now) here&now))) (let ((y-like (lambda (x) (there&then x)))) ;HERE (y-like (lambda (f) (lambda (n) (if (= n 0) 1 (* n ((f f) (-1+ n)))))))))) Note that the 2 lambdas marked by HERE (the explicit one, and the implicit one in the LET expression) are evaluated twice in the environment created by the outermost LET.  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 21 Nov 85 22:26:54 EST Received: from ti-csl by csnet-relay.csnet id a028471; 21 Nov 85 22:17 EST Date: 21 Nov 1985 1817-CST From: David Bartley Subject: Re: Multiple LAMBDA evaluations To: JINX%MIT-OZ@mit-mc.arpa, Bartley%CSL60%ti-csl.csnet@csnet-relay.arpa cc: RRRS-Authors@mit-mc.arpa, Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: Your message of 21-Nov-85 1045-CST Received: from csl60 by ti-csl; Thu, 21 Nov 85 19:14 CST From JINX: > We should certainly not require that both evaluations return EQ? > objects. I'm more concerned about whether I am ALLOWED to return values that are EQ?, not that I might be REQUIRED to. > Note: Is it true that the only way that a lambda expression can be > re-evaluated in the same environment is by using > call-with-current-continuation? No. Consider the following... (define foo (lambda (msg) (case msg ... ((XXX) (lambda (args) do-something)) ...))) Here, FOO returns a closure every time it is called with argument XXX. I expect that most implementations would ``cons up'' a new closure object every time. The following code does the same thing, but the closures returned are always EQ? to each other. (define foo (let ((xxx-handler (lambda (args) do-something))) (lambda (msg) (case msg ... ((XXX) xxx-handler) ...)))) My motivation is to allow an optimizing compiler to map code like the first into code like the second. This is related to the traditional compiler optimization which moves invariant computations out of loops. Here's another example: (define bar (lambda (x) (map (lambda (y) (+ y 13)) x))) I would like to transform this into... (define bar (let ((temp (lambda (y)(+ y 13)))) (lambda (x) (map temp x)))) Regards, David Bartley -------  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 21 Nov 85 20:52:49 EST Received: from tektronix by csnet-relay.csnet id ao26285; 21 Nov 85 18:45 EST From: Will Clinger To: RRRS-AUTHORS@mit-mc.ARPA Received: from tekchips by tektronix with smtp ; 21 Nov 85 14:46:45 PST Date: Thursday, 21 Nov 85 13:36:39 PST Subject: EQ? and procedures, numbers, etc RRRS says that (EQ? x x) is always true, but it would be better if the value returned by EQ? were undefined when applied to procedure values and numbers. The current wording of RRRS forces a formal semantics of Scheme to associate locations with procedure values and numbers. These locations have no purpose other than to make the semantics of EQ? work as in RRRS, and they make the semantics much uglier. The ugliness of the semantics results in more complex and less effective optimizing compilers. I thought that it would be a cleaner semantics if (EQ? x x) always returned true, but I was wrong. It is actually a cleaner semantics if the value of EQ? is unspecified when both its arguments are procedures and when both its arguments are numbers. When strings are immutable (as in the essential subset of Scheme) EQ? should be unspecified when both its arguments are strings. Similarly EQ? should be unspecified when both its arguments are characters (unless we want to insist that either (1) characters are represented uniquely (e.g. immediates) or (2) EQ? does something complicated). Will Clinger  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 21 Nov 85 16:06:05 EST Received: from indiana by csnet-relay.csnet id ae24742; 21 Nov 85 15:25 EST Date: Wed, 20 Nov 85 18:32:57 est From: Kent Dybvig To: rrrs-authors@mit-mc.arpa Subject: re: backquote One potential functionality change, at least to Chez Scheme, is that backquote errors are found at compilation time rather than at read time. This is because it will not be possible (all right, reasonable) for the reader to catch backquoteless commas if the (quasiquote ...) syntax is allowable as a replacement for backquote. In any case, this strikes me as a feature, not a bug, since the compiler can generally produce more informatitve error messages than the reader. Not only that, but we might find some meaning for (unquote x) not inside a quasiquote. How about (eval x)? Kent  Date: Thu, 21 Nov 85 12:22:56 EST From: Jonathan A Rees Subject: Multiple LAMBDA evaluations To: Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MIT-MC.ARPA In-reply-to: Msg of 20 Nov 1985 1825-CST from David Bartley Message-ID: <[MIT-MC.ARPA].726762.851121.JAR> Date: 20 Nov 1985 1825-CST From: David Bartley (1) Are issues like the following best raised in the general Scheme community (e.g., SCHEME@MIT-MC) or here among the RRRS authors? I'm a little confused about the intended difference when it comes to language issues. I figure that sending specific, technical implementation and language design questions only to RRRS-Authors is appropriate since generally most of us don't have the time to educate the 150 SCHEME list members about things which will probably confuse or mislead them, and which they probably don't care much about anyhow. Messages to RRRS-Authors are easier to compose since we can sassume more sophistication on the part of the recipients. Also, the smaller group can come to consensus on complicated questions much faster and much less painfully than the larger group. I don't mean this to sound elitist, only pragmatic. (2) When a given lambda expression is evaluated more than once, is it required/allowed/disallowed that the procedure objects returned be EQ? to each other? ... Does anyone object to adding similar language to the RRRS to clarify our intent? This is a good idea; the wording in the CL manual is pretty good. We definitely want to permit, but not require, things like (LET ((Z ...)) (LET ((F (LAMBDA () (LAMBDA () Z)))) (EQ? (F) (F)))) and (EQ? (LAMBDA (X) X) (LAMBDA (Y) Y)) and even (LET ((F (LAMBDA (X) (LAMBDA () X)))) (EQ? (F 'A) (F 'A))) to return true. The usual rule is that if there's any way to distinguish two objects, EQ? of them must return false, but if there ISN'T, EQ? is permitted to return true, and in certain special cases (like that of symbols) is required to return true. This applies to closures just as it does to numbers. Maybe a 2nd edition of the RRRS is in order, say, for next summer? I'll even volunteer to oversee its production, if no one else steps forward, since the number of changes should be pretty small.  Received: from MIT-OZ by MIT-MC.ARPA via Chaosnet; 21 NOV 85 10:50:22 EST Date: 21 Nov 1985 10:45 EST (Thu) Message-ID: From: Bill Rozas To: David Bartley Cc: RRRS-Authors@MIT-MC.ARPA Subject: Multiple LAMBDA evaluations In-reply-to: Msg of 20 Nov 1985 19:25-EST from David Bartley We should certainly not require that both evaluations return EQ? objects. Note: Is it true that the only way that a lambda expression can be re-evaluated in the same environment is by using call-with-current-continuation?  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 20 Nov 85 20:44:39 EST Received: from ti-csl by csnet-relay.csnet id ae16553; 20 Nov 85 20:13 EST Date: 20 Nov 1985 1825-CST From: David Bartley Subject: Multiple LAMBDA evaluations To: RRRS-Authors@mit-mc.arpa cc: Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA Received: from csl60 by ti-csl; Wed, 20 Nov 85 18:38 CST I have two questions: (1) Are issues like the following best raised in the general Scheme community (e.g., SCHEME@MIT-MC) or here among the RRRS authors? I'm a little confused about the intended difference when it comes to language issues. (2) When a given lambda expression is evaluated more than once, is it required/allowed/disallowed that the procedure objects returned be EQ? to each other? The Common Lisp book addresses this issue on page 88: ``In situations where a closure of a lambda-expression over the same set of bindings may be produced more than once, the various resulting closures may or may not be EQ, at the discretion of the implementation.'' This allows certain useful optimizations and clarifies the semantics of the language a bit. My reading of the RRRS is that this interpretation is implicitly supported for Scheme; however, there may be other opinions. Does anyone object to adding similar language to the RRRS to clarify our intent? Regards, David Bartley -------  Date: Wed, 20 Nov 85 16:34:57 EST From: Jonathan A Rees Subject: Backquote algorithm To: Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MIT-MC.ARPA In-reply-to: Msg of 20 Nov 1985 0950-CST from David Bartley Message-ID: <[MIT-MC.ARPA].725723.851120.JAR> Date: 20 Nov 1985 0950-CST From: David Bartley I'd appreciate it if you'd send me a copy of your quasiquote expander. Actually I think it'll be easier if I send it to everyone. I hope people who don't care about this don't mind if I clutter their mailboxes with it. I won't guarantee that this 100% works, but I've tested it a little. - Jonathan ;;; An expansion of `x or (#!quasiquote x) is obtained by calling ;;; the procedure EXPAND-QUASIQUOTE with argument x. ;;; The expansion involves only QUOTE forms and calls to LIST, APPEND, ;;; and CONS*. CONS* is the only nonstandard procedure, and its ;;; semantics could be given by ;;; (define (cons* x . rest) ;;; (if (null? rest) x (cons x (apply cons* rest)))) ;;; It should be pretty easy to eliminate the use of CONS*, if that's ;;; desirable. (define (expand-quasiquote x) (define quasiquote-marker '#!quasiquote) (define unquote-marker '#!unquote) (define splice-marker '#!unquote-splice) (define (finalize-quasiquote mode arg) (cond ((eq? mode 'quote) `',arg) ((eq? mode 'unquote) arg) ((eq? mode 'splice) (error ",@ in illegal context" arg)) (else `(,mode ,@arg)))) ;; The continuation argument c is passed two values, mode and arg. ;; These are interpreted as follows: ;; mode arg meaning ;; QUOTE x 'x ;; UNQUOTE x x ;; LIST (x1 x2 ...) (LIST x1 x2 ...) ;; CONS* (x1 x2 ...) (CONS* x1 x2 ...) ;; APPEND (x1 x2 ...) (APPEND x1 x2 ...) (define (descend-quasiquote x level c) (cond ((not (pair? x)) (c 'quote x)) ((interesting-to-quasiquote? x quasiquote-marker) (descend-quasiquote-pair x (1+ level) c)) ((interesting-to-quasiquote? x unquote-marker) (cond ((= level 0) (c 'unquote (cadr x))) (else (descend-quasiquote-pair x (- level 1) c)))) ((interesting-to-quasiquote? x splice-marker) (cond ((= level 0) (c 'splice (cadr x))) (else (descend-quasiquote-pair x (- level 1) c)))) (else (descend-quasiquote-pair x level c)))) ;; It would be simple to make this generate only a correct expansion; ;; most of the complexity here is in order to generate an ;; "optimized" expansion. (define (descend-quasiquote-pair x level c) (descend-quasiquote (car x) level (lambda (car-mode car-arg) (descend-quasiquote (cdr x) level (lambda (cdr-mode cdr-arg) (cond ((and (eq? car-mode 'quote) (eq? cdr-mode 'quote)) (c 'quote x)) ((eq? car-mode 'splice) (cond ((and (eq? cdr-mode 'quote) (null? cdr-arg)) (c 'unquote car-arg)) ((eq? cdr-mode 'append) (c 'append (cons car-arg cdr-arg))) (else (c 'append (list car-arg (finalize-quasiquote cdr-mode cdr-arg)))))) ((and (eq? cdr-mode 'quote) (null? cdr-arg)) (c 'list (list (finalize-quasiquote car-mode car-arg)))) ((or (eq? cdr-mode 'list) (eq? cdr-mode 'cons*)) (c cdr-mode (cons (finalize-quasiquote car-mode car-arg) cdr-arg))) (else (c 'cons* (list (finalize-quasiquote car-mode car-arg) (finalize-quasiquote cdr-mode cdr-arg)))))))))) (define (interesting-to-quasiquote? x marker) (and (pair? x) (eq? (car x) marker) (pair? (cdr x)) (null? (cddr x)))) (descend-quasiquote x 0 finalize-quasiquote))  Date: Wed, 20 Nov 85 16:26:09 EST From: Jonathan A Rees Subject: backquote proposal To: dyb%indiana.csnet@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MIT-MC.ARPA In-reply-to: Msg of Wed 20 Nov 85 14:05:35 est from Kent Dybvig Message-ID: <[MIT-MC.ARPA].725715.851120.JAR> Date: Wed, 20 Nov 85 14:05:35 est From: Kent Dybvig I like the proposed backquote change, especially since it will allow source pretty-printers to print the original source more accurately. However, I do not like the use of "special objects" as special form keywords, and would prefer to use ordinary identifier names instead. The reason is that, at least in my implementation, #! objects are not symbols, and special form keywords are. I would very much like to retain this property, for selfish reasons and because I think it helps anyone writing program-manipulating programs to know that special forms always begin with a symbol. Also, it would not make sense that backquote special forms are written with the #! if the quote special form is not. OK, I sympathize with this. What do other people think? Note that I never said that #!QUASIQUOTE was NOT a symbol. Things would work just fine if it was, although it would have to print the same way it reads. Therefore, I propose that we use the special forms quasiquote, unquote, and unquote-splice, dropping the #!. I never suggested making #!UNQUOTE and #!UNQUOTE-SPLICE be special forms, so I don't think this argument applies to them, except through guilt by association (i.e. consistency). I think that flagging the printed representation of these markers with #! is good since it helps indicate that something peculiar is going on. There is always the danger that someone will write something like (memq z '(quote unquote quasiquote)) and that someone else will lift that code and put it into backquote form somewhere, say (let ((form `(cond ((memq z '(quote unquote quasiquote)) ,value)))) (compute form)) and then get the error message "QUASIQUOTE unbound variable" and wonder what on earth happened. This violation of referential transparency is unavoidable with non-local features like backquote, but the #!'s help mitigate it since they make the violation more voluble.  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 20 Nov 85 15:15:56 EST Received: from indiana by csnet-relay.csnet id ab14216; 20 Nov 85 15:03 EST Date: Wed, 20 Nov 85 14:05:35 est From: Kent Dybvig To: JAR@mit-mc.arpa Subject: Re: backquote proposal Cc: RRRS-AUTHORS@mit-mc.arpa I like the proposed backquote change, especially since it will allow source pretty-printers to print the original source more accurately. However, I do not like the use of "special objects" as special form keywords, and would prefer to use ordinary identifier names instead. The reason is that, at least in my implementation, #! objects are not symbols, and special form keywords are. I would very much like to retain this property, for selfish reasons and because I think it helps anyone writing program-manipulating programs to know that special forms always begin with a symbol. Also, it would not make sense that backquote special forms are written with the #! if the quote special form is not. Therefore, I propose that we use the special forms quasiquote, unquote, and unquote-splice, dropping the #!. Kent  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 20 Nov 85 12:02:56 EST Received: from ti-csl by csnet-relay.csnet id ab12530; 20 Nov 85 11:48 EST Date: 20 Nov 1985 0943-CST From: David Bartley Subject: Re: backquote proposal To: JAR%mit-mc.arpa@csnet-relay.arpa, RRRS-AUTHORS%mit-mc.arpa@csnet-relay.arpa cc: Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA In-Reply-To: Your message of 19-Nov-85 1712-CST Received: from csl60 by ti-csl; Wed, 20 Nov 85 10:03 CST From JAR: > I haven't heard any objection to the backquote proposal. Unless > you guys reply soon, it will become part of the next edition of the > RRRS, and I'll complain when my programs don't run in your > implementations, so speak up. > ... > (Don't worry, I'll send out a proposed precise description > before a new edition of the RRRS happens.) Is this tongue-in-cheek or are you suggesting/presuming another round of ``standardization'' for Scheme? David Bartley -------  Date: Tue, 19 Nov 85 16:30:10 EST From: Jonathan A Rees Subject: backquote proposal To: RRRS-AUTHORS@MIT-MC.ARPA Message-ID: <[MIT-MC.ARPA].724256.851119.JAR> I haven't heard any objection to the backquote proposal. Unless you guys reply soon, it will become part of the next edition of the RRRS, and I'll complain when my programs don't run in your implementations, so speak up. Small change: I think the name #!unquote-splice is a little better than #!splice. If anyone would like to suggest better names, you're welcome to. Reminder: the proposal is that `x is identical to (#!quasiquote x) ,x is identical to (#!unquote x) ,@x is identical to (#!unquote-splice x) for the purposes of QUOTE and READ. Also, (#!quasiquote ) is a new special form which means the same thing as ` , and within a one may write (#!unquote ) instead of ,; similarly for #!unquote-splice. (Don't worry, I'll send out a proposed precise description before a new edition of the RRRS happens.) I have an implementation of a post-read-time #!quasiquote macro expander (written in scheme, of course) which I'll send to anyone requesting it. One interesting feature of it is that it does the "right" thing with nested backquotes, that is, e.g. (equal? '`(a ,b) ``(a ,b)) returns a true value, which isn't the case in most backquote implementations that I know of. I don't know whether this behavior is dictated by my backquote proposal, but it seems like desirable behavior to me. Jonathan  Received: from MIT-OZ by MIT-MC.ARPA via Chaosnet; 15 NOV 85 00:41:14 EST Date: Fri, 15 Nov 1985 00:37 EST Message-ID: From: CPH%MIT-OZ@MIT-MC.ARPA To: Jinx%MIT-OZ@MIT-MC.ARPA Cc: JAR@MIT-MC.ARPA, Bartley%CSL60%TI-CSL.CSNet@CSNET-RELAY.ARPA, RRRS-Authors@MIT-MC.ARPA Subject: Syntactic extensions to Scheme I would like to point out that I believe that MIT Scheme has more power in its syntactic extension mechanism than is necessary or desirable. In particular, I have found that there are only two general scenarios for macro use (please correct or expand on this): ---------------------------------------------------------------------- The first case is the local use of a macro to eliminate syntactic repetition. For example, suppose one wanted to say something like: (define (lambda (x) (vector-ref x 1))) (define (lambda (x) (vector-ref x 2))) ... (define (lambda (x) (vector-ref x n))) in this case a local macro could be used profitably, as in (MIT Scheme): (let-syntax ((make-ref (macro (var index) `(DEFINE ,var (LAMBDA (X) (VECTOR-REF X ,index)))))) (make-ref 1) (make-ref 2) ... (make-ref n)) The combination of LET-SYNTAX and MACRO used here gives sufficient power to do local definitions like this, and also to have defined the macro somewhere else, say, and access it later by name, as in (let-syntax ((make-ref the-make-ref-macro)) ...) Note that this makes no explicit use of the concept of syntax table, despite the fact that it is implemented using that mechanism. The issue of what environment in which the binding value of the LET-SYNTAX form is evaluated is important here, and in fact is specified externally by passing an explicit environment to the syntax expander program as described by Jinx earlier. But in practice, local macros such as these require only the normal global environment, and any local procedures can be defined using something like: (let-syntax ((make-ref (let () (define ...) (define ...) (macro ...)))) ...) A particular difference between this mechanism and the DEFINE-SYNTAX mechanisms described by JAR and Jinx is that, psychologically, it is less likely that someone would be confused about what environment and at what time the binding value of the LET-SYNTAX would be evaluated. Such confusion is still possible, but I believe it is easier to think of the binding value as being separate from the rest of the code than in the case where the macro definition appears as just another top level definition. ---------------------------------------------------------------------- The second case is the definition of a whole language of syntactic extensions, which will be used over a large body of code. An example of this usage is my recent implementation of the Edwin editor, in which a number of macros for defining editor commands, variables, and modes were used throughout much of the source code. I believe that this case is the one that normally provides much of the trouble. I have taken the following conservative approach (I suspect that this will draw some good criticism; my only response is that it has proven adequate to date): * I have assumed that each type of syntactic "language" will correspond directly to a particular syntax-table. The single-inheritance mechanism used by syntax tables is sufficient for many needs. * I have assumed that definition of the syntactic extensions is syntactically separate from their use. This implies that the syntax defined by the extensions is NOT available for use at the time that the definitions of the extensions are processed by the syntactic expander. (I hope that this wording is sufficiently general that it does not restrict my comments to MIT Scheme's implementation.) In practice, this means the following thing: that the syntax definitions reside in a separate file(s) from the source code that refers to them, and that the definitions do not become effective until the file(s) is loaded. Here is an example of how one would define some syntax: (define edwin-syntax-table ;; This means the parent of this syntax table is the "normal" one. (make-syntax-table system-global-syntax-table)) (syntax-table-define edwin-syntax-table 'define-command (macro ...)) (syntax-table-define edwin-syntax-table 'define-variable (macro ...)) etceta. Supposing the above expressions to be evaluated in some environment, then the following code could be evaluated (or compiled without evaluation): (using-syntax edwin-syntax-table (define-command ("^R Forward Character" argument) ...) (define-variable "Comment Multi Line" ...) ... ) The major advantage of using this approach is that the scoping problems associated with macros are sidestepped, since the environments in which everthing happens are eval-time environments, rather than syntax-time environments. Also, the confusion caused by mixing syntactic and semantic definitions in the code is eliminated. It can (easily) be argued that this separation is a drawback as well. I think that this is more a problem of presentation than anything else. In particular, given the standard file-oriented methods of maintaining systems, there is only one ordering or definitions, which applies for both editing and evaluation. More sophisticated systems, which I believe many people are now developing, would allow the editing presentations to be separate from the evaluation order. ---------------------------------------------------------------------- In summary, I think that the DEFINE-SYNTAX and DEFINE-MACRO mechanisms in both T and MIT Scheme are not particularly useful, and in fact are confusing. Their convenience I believe to be dubious, although I will understand if there is disagreement on this point. I do not see that they add any real power that is not available in the cases I have outlined above through the other mechanisms I describe. In passing, I would like to note that I DO NOT believe that scoping of syntactic keywords should be related to scoping of environment variables. I think that the only reason that this has ever been considered a reasonable idea is because of the (perhaps regrettable) choice to make the notation used for syntactic forms and combinations identical. However, I would be curious to see if KMP's semantics sheds new light on this issue. The other issue, about the use of SCode, is not so clear. In general, I have tried to assume that syntactic extensions map from one domain to another. Assuming that the two domains are separate is a generality that turns out to have advantages in implementation, as it allows the use of SCode, while not precluding purely source-to-source macroexpansion, nor T's method of "absolute special forms". One of the examples that Jinx presented violated this principle (see the use of WHILE-XFORM in the second example). I am not really sure how to handle this problem, but the separation of the domain and range of the syntax mapping seems like a good principle to start with.  Received: from MIT-OZ by MIT-MC.ARPA via Chaosnet; 14 NOV 85 18:36:00 EST Date: 14 Nov 1985 18:28 EST (Thu) Message-ID: From: Bill Rozas To: Jonathan A Rees Cc: Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA, RRRS-AUTHORS@MIT-MC.ARPA Subject: Syntactic extensions to Scheme (long message) In-reply-to: Msg of 14 Nov 1985 13:19-EST from Jonathan A Rees Just a few comments explaining a little about how syntax tables in MIT Scheme currently work. Scheme programs are translated into SCode (1) before either the evaluator or the compiler touch them, thus macros do not have to be handled by either but by this translator, which we call the syntaxer. The syntaxer has one required argument (the expression to translate to Scode), and two optional arguments. One of them is the syntax table to use in the translation. The other is the syntax environment, which is the environment in which to evaluate any expressions which must be evaluated at syntax time, e.g. lambda expressions which result in macro expanders. If these arguments are not given explicitly, default values are used. These default values can be manipulated with the procedures current-syntax-table, set-current-syntax-table!, with-syntax-table, and the analogous ones for syntax environments. There are some operations defined on syntax tables: (syntax-table-define ) (syntaxt-table-ref
) -> and some less interesting ones. A translator is supposed to translate an s-expression into Scode, but source to source macro facilities are also provided. Some syntax time syntactic sugar is also provided, thus (define-syntax ) Does a syntax-time syntax-table-define on the current syntax table, (let-syntax (( )...) . ) is analogous to LET, creating a new syntax table, (using-syntax . ) is the syntax-time with-syntax-table, and (macro . ) is analogous to LAMBDA but "returns" a source level translator, similar in effect to traditional Lisp Macros, except that they are expanded at syntax time. We do not provide "reader" syntax for obtaining translators or denoting expansions, we instead invoke them explicitely to insure that the correct expander is used. Thus if WHILE is defined by (define-syntax while (macro (pred . exps) `(let ((pred (lambda () ,pred)) (body (lambda () ,@exps))) (letrec ((loop (lambda () (if (pred) (begin (body) (loop)))))) (loop))))) The following 2 definitions have different effects, depending on whether WHILE is redefined (define-syntax for (macro (index low high . exps) `(let ((,index ,low)) (while (<= ,index ,high) ; Dynamic use of WHILE ,@exps (set! ,index (1+ ,index)))))) (define-syntax for (let ((while-xform ; The translator (syntax-table-ref (current-syntax-table) 'WHILE))) (macro (index low high . exps) `(let ((,index ,low)) ,(while-xform ; Static use `(while (<= ,index ,high) ,@exps (set! ,index (1+ ,index)))))))) Note that the traditional Lisp DEFMACRO can be obtained by (define-syntax defmacro (macro (pattern . body) (let ((the-name (car pattern)) (the-translator `(macro ,(cdr pattern) ,@body))) `(begin (syntax-table-define (current-syntax-table) ; Runtime ',the-name ,the-translator) (define-syntax ,the-name ,the-translator))))) ; Syntax (compile) time In the example that Jonathan gives we would also have problems, since the default is that the current syntax environment is the same as the read-eval-print loop environment, thus the code would "work" interpreted but not compiled. Changing current syntax environment is a possibility which we should probably do. We do not object to using Scheme for syntactic extensions. As a matter of fact, I never write syntactic expressions which do not end up being macros (after grabbing the appropriate stuff), only the bottom level (the core special forms) translate to Scode, but there is no reason not to make this an option in our system. I agree with Jonathan in that syntax should not be used for optimizations, and also in that lambda-bound names should take precedence over syntactic items. Thus in his binding-of-IF example, I think that '(1 2 3) should be returned. We probably will change our system to do this, but we have not yet done that. (1) Scode can be thought of as a small, explicit version of Scheme, where every expression is an explicit special form, thus it has VARIABLE and COMBINATION explicit special forms.  Date: Thu, 14 Nov 85 13:19:18 EST From: Jonathan A Rees Subject: Syntactic extensions to Scheme To: Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA cc: RRRS-AUTHORS@MIT-MC.ARPA In-reply-to: Msg of 7 Nov 1985 1112-CST from David Bartley Message-ID: <[MIT-MC.ARPA].717960.851114.JAR> Date: 7 Nov 1985 1112-CST From: David Bartley ... I'm particularly interested in experiments such as T's with extensive support for syntax tables. You ask hard questions. Therefore this message is very long. Before answering the questions, I'll summarize how T syntax tables work: The evaluator and compiler each take a syntax table argument. A syntax table maps symbols to objects called "syntax descriptors". A syntax descriptor is basically either a primitive token, for primitive special forms like QUOTE and LAMBDA, or it's a macro expansion function. Syntax descriptors print as #[Syntax ...], where ... somehow identifies the descriptor (name or whatever). When the compiler processes a form whose car is a symbol, the first thing it does is look up the symbol in the syntax table. If there is an entry in the table, say (syntax-table-entry table 'foo) => #[Syntax BAR], the form is processed as if the form (#[Syntax BAR] ...) had been seen. When a form whose car is a syntax descriptor is processed, the appropriate thing happens; if the descriptor is a macro expander, the form is expanded; if it's primitive, the form is interpreted as a QUOTE or LAMBDA or whatever expression. Users can create syntax tables, fetch and store entries in them, and pass them to the evaluator or compiler. A macro (macro-expander ...) creates a descriptor for a macro expander. If no syntax table argument is supplied to EVAL or LOAD, a syntax table is obtained from the environment argument by accessing the value of a lexical environment variable with a special internal name (let's suppose the name is &&syntax-table&&; in principle it's something untypable). There is a special form DEFINE-SYNTAX which basically does a RUNTIME side-effect of storing a descriptor into &&syntax-table&&. (DEFINE-SYNTAX has no effect at compile time.) In addition, DEFINE-LOCAL-SYNTAX defines a macro local to a file or expression, analogous to MACROLET in Common Lisp. This has no effect at runtime. (DEFINE-MACRO (FOO ...) ...) is a horrible dual-purpose form which is the same as (BEGIN (DEFINE-SYNTAX (FOO ...) ...) (DEFINE-LOCAL-SYNTAX (FOO ...) ...)). End of summary, now for the questions. What are their strengths and weaknesses? Strength: the ability to make "absolute references" to standard syntactic forms - i.e. (#[Syntax LAMBDA] (X) (+ X 5)) - is a boon to embedded incompatible language implementation. E.g. suppose you wanted to define a syntax table where DEFINE was a macro which expanded into something which wanted to use the usual Scheme DEFINE. It wouldn't work to do (define-syntax (define ...) `( ... (define ...) ...)) since this is circular. But you could do (define-syntax (define ...) `( ... (#[Syntax DEFINE] ...) ...)) where #[Syntax DEFINE] denotes the standard syntax descriptor for DEFINE. This is how I was able to write implementations of RRRSS and even a sort of bastard Common Lisp in T, even though there were multiple incompatibilities. And because everything is scoped, one environment's macros aren't seen in different environments, so one can easily debug programs different parts of which are written using completely different syntax. The standard system macros all do the absolute reference trick, and the effect is that a macro can expand a form into a new form which uses a different macro, without having to worry about whether the macro in the expansion was defined in the syntax table being used to process the expansion. Thus descriptors can be moved from one syntax table to another without worrying too much about how they're defined - they act almost like closures. (Note that this problem doesn't come up in Common Lisp because one "closes" over an environment (package) at read time. Macros in Scheme have to be more complicated since Scheme did the right thing with symbols.) Weaknesses: there are the usual kinks. Best to give an illustration. Suppose a Scheme program (file) contains the following forms: (define-macro (foo form) (bar form)) (define (bar form) `(#[syntax lambda] () (baz ,form))) (define (baz obj) (cons (cdr obj) (car obj))) ((foo (cons 'a 'b))) If your model of macro expansion is that it's part of what the evaluator does, then this "works:" the last form evaluates to (b . a). This is the simplest explanation of what macos do, and it's what I think most naive users believe happens. Indeed it is what happens in many Lisp interpreters. However, the code is obviously sensitive to what happens at compile time and what happens at run time. Presumably the compiler does not load the file. So when the FOO form gets expanded by the file compiler, the expander tries to call BAR, which is undefined. T is badly engineered in the sense that this code will run interpreted and not compiled. This is because the environment in which the macro expansion function is closed is the same as the one in which the expansion will run. (Common Lisp has the same problem.) Users should have to work much harder than this to find inconsistencies like this. A small improvement would be if LOAD "macroexpanded" the whole file before running any of it; then FOO would be undefined at expand time. But this isn't a very general solution. What if the forms occur in different files? What if the code is reloaded into the same environment - will it work the second time, using the definition of FOO from the first LOAD? In the best of all possible worlds, macros and the auxiliary functions they call to perform expansions should only have to exist for the purposes of compilation (and EVAL, debugging, etc.); at runtime they can go away or be garbage collected. (Consider using a Scheme compiler to build a small stand-alone system which wants to run in minimal address space. Expansion functions are superfluous at runtime.) This suggests that macros should always inhabit separate modules in their own separate environments. These modules can be loaded for compilation purposes, if a client requests, but needen't exist at runtime. Users should have to go to the extra trouble of announcing that they are defining a compilation-support module to cause DEFINE-SYNTAX itself to become available at all. I haven't even talked about the problem of the scoping of BAZ. As it is, any client of the FOO macro must know that the variable BAZ occurs free in the expanded form, and suffer the consequences. I don't see any way around this if out-of-core compilation is going to work. A listing of what variables are free (in principle one needs at most 1 such) must simply be part of the documentation of FOO. (One can play tricks with system primitives like CAR, but the problem remains when the expansion needs a user-defined function.) Whew... obviously this is much too complicated. I have sympathies for the trstricted Indiana approach to macros, although I'm not sure it gets around all the various problems, and I would resist giving up the ability to write arbitrary expansion procedures. But I am certainly not asking anyone to adopt T's syntax mechanism, since the specification is so bug-ridden. What do they cost in terms of compiler or runtime complexity? In spite of all this wind, the implementation is very straightforward. One just puts hooks into EVAL and the compiler in the right place, and passes the syntax tables around as appropriate. Do users make effective use of them? As long as macros exist, users will abuse them. But I think they have their place, and a small number of people know how to use them well. It has been suggested that a licensing agency be established. As far as syntax tables go, I don't know of anyone who has done anything with them quite as hairy as what I've described, but I think that the people who exploit multiple environments (there aren't many) implicitly use the fact that macros get scoped appropriately; they know there won't be name conflicts, the same way they know that there won't be name conflicts for variables. How flexible are they---would they suffice for major language changes? I think what I said above about embedded languages answers this for the most part - in the affirmative. Clearly they wouldn't be effective e.g. to change scoping or evaluation order rules, but they work well to change the meaning of (LAMBDA ...). Should syntactic extensions be specified entirely in source terms, or should the user have access to internal representations? I have a running argument with the MIT Scheme folks about this; I think that Scheme makes a perfectly good representation for programs. There are some scoping problems to be addressed, e.g. any macro expander which wants to understand anything about a subexpression must know what syntax table the subexpression is to be interpreted relative to, but as long as some convention is established (there are a couple of alternatives) and programmers are made aware of these issues, I don't think we need to go to the trouble of defining new data types for representing programs. That would be very complicated and there'd be little hope of making the interface portable. Should simple source-to-source optimizations be communicated to the compiler using the same mechanism, or is something else more appropriate? Definitely not. I'm not sure that users should ever specify optimizations to a compiler; perhaps it would be acceptable if the compiler had some mechanism by which it could prove the correctness of such transformations before deciding it was okay to do them. Also, the situation with free variables can be very complicated. But in any case, just for pedagogical reasons one shouldn't do anything to lead users to believe that macros and procedures have any similarites at all. It's bad enough that they are syntactically similar. [By the way, even though T and MIT Scheme agreed some years back that (LET ((IF LIST)) (IF 1 2 3)) should evaluate to 2 , and I used to think this was the only acceptable semantics, now I'm not convinced that this is the right thing; I think Kent Pitman has a coherent semantics which would allow it to evaluate to (1 2 3), without sacrificing compilability or purity. But I'll let him explain that. I'm not convinced he's right either.]  Date: Wed, 13 Nov 85 19:13:47 EST From: Jonathan A Rees Subject: a modest backquote proposal To: RRRS-AUTHORS@MIT-MC.ARPA Message-ID: <[MIT-MC.ARPA].716925.851113.JAR> I would like to propose the following: we should agree on a standard internal form for backquote forms. That is, if a portable Scheme program does (READ) and the available input is e.g. "`(FOO ,BAR (A ,@BAZ B))", or if the expression '`(FOO ...) occurs in the program itself, the program should be able to detect the fact that it has its hands on a backquoted form, and take it apart and do something meaningful with it. I would like the situation to be exactly the same as that of the 'FOO syntax, where we know e.g. that (CAR ''FOO) => QUOTE and (CADR ''FOO) => FOO. Just to make this concrete, here is one form that this proposal could take: we could say that '`(FOO ,BAR (A ,@BAZ B)) is semantically identical to '(#!QUASIQUOTE (#!UNQUOTE BAR) (A (#!SPLICE BAZ) B)). We need not specify what the objects #!QUASIQUOTE etc. are; they could be symbols or special constants. All that's needed is e.g. that (EQ? '#!UNQUOTE '#!UNQUOTE) will work, that is, you get an EQ object each time you read that syntax - the same situation as with symbols. Thus we would have: (CAADR '`(FOO ,BAR)) => #!UNQUOTE, and so on. Note that this implies nothing about implementation except that the "expansion" of backquote forms into calls to LIST etc., if indeed that's how it's done, isn't generally done at "read time," but rather at "compile time" or "macro expansion time". (Expansion at read time would of course be correct if the file was being read by LOAD and the backquote form was known to occur outside of a QUOTE special form, etc. All that's important is behavior detectable by user code.) We would just need to add #!QUASIQUOTE as a new special form or macro. I know that this would be a trivial change to MIT Scheme and T, for example, since they already work this way. Now here's the rationale: (1) This removes a wart in the meaning of READ and QUOTE. Without this, there is absolutely nothing in the report which guarantees that reading or quoting backquote forms is even legal, much less that it has a predictable meaning. But the absence of a representation for backquote forms is the only obstacle to this. Given such a representation, we have a very nice general syntactic property: any sequence of terminals which may be derived from the the nonterminal may also be derived from the terminal. That is, READ and QUOTE give standard representations of programs as data. (2) As a consequence, it will be possible to write portable interpreters, compilers, cross-compilers, macroexpanders, cross-reference programs, pretty-printers, readers, and so forth, which will work on any program written in RRRSS, even those that use backquote. It seems to me that the desirability of being able to write meta-programs would far outweigh any arguments about implementation preemption. I guess I might be asking some of you to modify your implementations (e.g. some implementations might be expanding at read time), so there's a tiny bit of work involved in this, but I think it would generally be a very easy change for most implementors to make. (By the way, there is a subtlety involved in expanding nested backquote forms AFTER read time which I'll explain under separate cover if someone asks me to.) The fact that Common Lisp doesn't have this feature has made it difficult to write a portable implementation of RRRSS in Common Lisp. To do it right I'll have to define my own backquote read macro and expander, which is silly of course. The situation is worse in Scheme, which doesn't have read macros - one would actually have to write a reader in order to write any portable meta-program. [Unrelated footnote: LOAD and EVAL are equipotent; either can be implemented in terms of the other. RRRS has one but not the other.]  Date: Tue, 12 Nov 85 22:04:47 EST From: Jonathan A Rees Subject: test message To: RRRS-AUTHORS@MIT-MC.ARPA Message-ID: <[MIT-MC.ARPA].716206.851112.JAR> Please ignore. (I want to make sure that sending to this list actually works.)  Received: from CSNET-RELAY.ARPA by MIT-MC.ARPA 8 Nov 85 12:49:59 EST Received: from ti-csl by csnet-relay.csnet id ad09069; 8 Nov 85 12:44 EST Date: 7 Nov 1985 1112-CST From: David Bartley Subject: Syntactic extensions to Scheme To: RRRS-Authors@mit-mc.arpa cc: Bartley%CSL60%ti-csl.csnet@CSNET-RELAY.ARPA Received: from csl60 by ti-csl; Fri, 8 Nov 85 10:09 CST One of the issues shunted aside a year ago was the question of how to provide users with the ability to make syntactic extensions. I think it would be fruitful to have some calm discussion on this topic in the RRRS-AUTHORS mailing list. I doubt if we're ready to standardize on any particular approach, but it would be helpful if we all knew more about the alternatives and people's experiences with them. I'm particularly interested in experiments such as T's with extensive support for syntax tables. What are their strengths and weaknesses? What do they cost in terms of compiler or runtime complexity? Do users make effective use of them? How flexible are they---would they suffice for major language changes? Should syntactic extensions be specified entirely in source terms, or should the user have access to internal representations? Should simple source-to-source optimizations be communicated to the compiler using the same mechanism, or is something else more appropriate? Regards, David Bartley -------