Date: Sun, 9 Mar 86 13:30:36 EST From: "Devon S. McCullough" Sender: LIZZIT@MC.LCS.MIT.EDU Subject: lousy MIT-TAC modems To: BUG-ARPA@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].844023.860309.LIZZIT> Logged into MC from home using my VA3451 and AAA, lines of UUU's suddenly appeared and there was a lot of beeping. Hanging up and calling back revealed that MC thought I was still logged in! I'm told that it is a common bug in Bell 212 modems to drop into test mode spontaneously, and the only way to fix it without hanging up is to interrupt the carrier for 1/20 of a second, but no more or it will hang up. --Devon  Date: Thu, 20 Mar 86 16:25:22 EST From: "Christopher C. Stacy" Subject: [TLTUNG: Follow-up on SU-SIERRA] To: BUG-TCP@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].857389.860320.CSTACY> Date: Thu, 20 Mar 86 09:10:26 EST From: Thye-Lai Tung To: BUG-HOSTS at MC.LCS.MIT.EDU Re: Follow-up on SU-SIERRA I tried fingering SU-SIERRA from MIT-CAF and MIT-XX and got an average response time of 4 to 6 seconds. However, I always got timed out on MC. - Thye-Lai  Date: Thu, 20 Mar 86 16:34:25 EST From: "Christopher C. Stacy" Subject: Follow-up on SU-SIERRA To: TLTUNG@MC.LCS.MIT.EDU cc: BUG-TCP@MC.LCS.MIT.EDU, BUG-HOSTS@MC.LCS.MIT.EDU In-reply-to: Msg of Thu 20 Mar 86 16:25:22 EST from Christopher C. Stacy Message-ID: <[MC.LCS.MIT.EDU].857404.860320.CSTACY> Date: Thu, 20 Mar 86 09:10:26 EST From: Thye-Lai Tung To: BUG-HOSTS at MC.LCS.MIT.EDU Re: Follow-up on SU-SIERRA I tried fingering SU-SIERRA from MIT-CAF and MIT-XX and got an average response time of 4 to 6 seconds. However, I always got timed out on MC. - Thye-Lai I suspect that SIERRA does not respond on its ARPAnet connection, but that it does on its SU-NET-TEMP address (36.40.0.213), since when I use the latter on MC, I tend to get through. Maybe XX does not try SIERRA's ARPAnet address or something.  
Received: from XX.LCS.MIT.EDU by MC.LCS.MIT.EDU 20 Mar 86 23:49:59 EST Date: Thu, 20 Mar 1986 23:47 EST Message-ID: From: Rob Austein To: "Christopher C. Stacy" Cc: BUG-HOSTS@MC.LCS.MIT.EDU, BUG-TCP@MC.LCS.MIT.EDU, TLTUNG@MC.LCS.MIT.EDU Subject: Follow-up on SU-SIERRA In-reply-to: Msg of 20 Mar 1986 16:34-EST from "Christopher C. Stacy" both sierra's an20 and the su-net gateway are flakey. luck of the draw which one is flaking out at the moment. i believe current software and tables will cause mc to always try su-net address and xx to always try arpanet address. although on xx it depends on whether the program you are using uses hosts3 or the gthst% system call, since they return different "primary" addresses with the current tables.  Received: from AI.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 27 MAR 86 09:50:28 EST Date: Thu, 27 Mar 86 09:49:24 EST From: Richard Mlynarik Subject: tcp send on AI failed. To: BUG-TCP@AI.AI.MIT.EDU Message-ID: <[AI.AI.MIT.EDU].21811.860327.MLY> ":fs khs@tanager" (192.10.41.47) fails IOCERROR: TCP: Host 0, Local 0, Foreign 0 - CHNL IN ILLEGAL MODE ON IOT TCPIN1>>.IOT E,T T/ E  Date: Wed, 2 Apr 86 02:45:27 EST From: "Paul R. Grupp" Subject: MILnet<->ARPAnet lossage To: BUG-TCP@MC.LCS.MIT.EDU cc: GRUPP@MC.LCS.MIT.EDU Message-ID: <[MC.LCS.MIT.EDU].869762.860402.GRUPP> There seems to be a bug in the ITS TCP code which causes the following problem. When connected to MC or AI from the MILnet the echo time is between 30 seconds and 3 minutes, with output pauses of the same time. This by itself is only frustrating and may be due to the heavy ARPA<->MILNET gateway traffic, however these long pauses seem to be causing other problems. The worst is about every 1-10 minutes the TAC reports "Host reset the connection" and cannot recover from this. Once in a while the TAC reports "Host not responding" or "Destination host dead". After any of these messages re-opening the connection will work, but response remains about the same. 
This problem has been very noticeable for the last month or so, but has gotten quite bad in the last 3-4 days, and tonight is all but unusable with most time spent reattaching trees and recovering work.  Received: from XX.LCS.MIT.EDU by MC.LCS.MIT.EDU 2 Apr 86 22:44:02 EST Date: Wed 2 Apr 86 22:46:11-EST From: "J. Noel Chiappa" Subject: Re: MILnet<->ARPAnet lossage To: GRUPP@MC.LCS.MIT.EDU, BUG-TCP@MC.LCS.MIT.EDU cc: JNC@XX.LCS.MIT.EDU In-Reply-To: <[MC.LCS.MIT.EDU].869762.860402.GRUPP> Message-ID: <12195766488.25.JNC@XX.LCS.MIT.EDU> No, it's probably just the usual congestion overload collapse lossage, nothing to do with ITS. I've had the same thing happen from a TAC on the MILNET at ISI. Call up DCA and complain. Noel -------  Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 7 MAY 86 11:10:01 EDT Date: Wed, 7 May 1986 10:53 EDT Message-ID: From: Peter de Jong To: bug-arpanet%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU Subject: [Mailer: Message of 7-May-86 10:21:09] What happened to Xerox on the arpanet? 
Peter Date: Wednesday, 7 May 1986 10:26-EDT From: The Mailer Daemon To: DEJONG at OZ.AI.MIT.EDU Re: Message of 7-May-86 10:21:09 Message failed for the following: allmer.pasa@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "xerox" bagley.pa@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "xerox" dekleer@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "parc" dts.xsis@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "xerox" halvorsen@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "parc" withgott.pa@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "xerox" ------------ Date: Wed, 7 May 1986 10:21 EDT Message-ID: From: Peter de Jong To: Cog-Sci@OZ.AI.MIT.EDU Subject: Cognitive Science Calendar Reply-to: Cog-Sci-Request%OZ.AI.MIT.EDU@AI.AI.MIT.EDU start-date: 5/8/86 12:00pm expiration-date: 5/11/86 8:00pm cog-sci-calendar digest ---------------------------------------------------------------------- Date: Wednesday, 5 May 1986 10:00-EST start-date: 5/8/86 12:00pm expiration-date: 5/8/86 1:00pm Subject: Vision Lunch - - - - Thursday, 8 May 12:00pm Room: E25-401 VISION LUNCH "Shifting Visual Attention" Adam Reeves Northeastern University ------------------------------ Date: Wednesday, 5 May 1986 10:00-EST From: etzi%oz.ai.mit.edu@ai.ai.mit.edu start-date: 5/11/86 7:00pm expiration-date: 5/11/86 8:00pm Subject: Harvard-Radcliffe Cognitive Sciences Society - - - - Sunday, 11 May 7:00 pm Room: Dunster Junior Common Room, Harvard HARVARD-RADCLIFFE COGNITIVE SCIENCES SOCIETY "Cognition and Creativity" Howard Gardner Harvard END OF cog-sci-calendar digest ******************************  Date: Mon, 26 May 86 06:43:42 EDT From: "Mark E. Becker" Subject: Many "Host reset the connection" messages To: BUG-TCP@AI.AI.MIT.EDU Message-ID: <[AI.AI.MIT.EDU].46074.860526.MBECK> Hello - During the course of this weekend, I have several times received the message from my local TAC that MIT-AI has disconnected me. 
This happens anywhere from right after the connect banner to 15 minutes into a session. To put it mildly: "This Is A Pain". On the other hand, maybe something I'm doing is causing it. Doing an @S a y t is guaranteed to do a disconnect. Your suggestions/explanations would be appreciated. Mark Becker  Date: Mon, 26 May 86 06:52:58 EDT From: "Mark E. Becker" Subject: Forgot to mention "Host not responding" To: BUG-TCP@AI.AI.MIT.EDU Message-ID: <[AI.AI.MIT.EDU].46075.860526.MBECK> That one pops up a lot too.. M.  Date: Sat, 31 May 86 04:13:37 EDT From: "Pandora B. Berman" Subject: MC TCP loss To: BUG-ITS@AI.AI.MIT.EDU, BUG-TCP@AI.AI.MIT.EDU Message-ID: <[AI.AI.MIT.EDU].49182.860531.CENT> while on MC and trying to finger out at other hosts, i kept getting an "all sockets in use" msg. alan poked around and said the system was out of TCP buffers and COMSAT was acting strangely, so i should raise switch 0 and take a crash dump. CRASH;TCP LOSS.  Date: Fri, 20 Jun 86 13:34:02 EDT From: David Chapman To: BUG-TCP@AI.AI.MIT.EDU Message-ID: <[AI.AI.MIT.EDU].59523.860620.ZVONA> I've been supduping in from the left coast a lot recently. AI keeps dropping net connections. The failure mode is that it just times out. A lot of the time packets come in just inside the timeout window, instead. This problem is not due to my end, because I've had the same problem coming in from Stanford, SRI, and Xerox, and Agre's got the same problem coming in from MCC. I suppose it could be the Arpanet, but I don't remember having this problem going in the opposite direction (right to left). I was blaming subnet six for a while, but of course that's not relevant. Any ideas? Oh yeah, also, it's not supdup-specific; I've the same problem with FTP.  Received: from MC.LCS.MIT.EDU by AI.AI.MIT.EDU via Chaosnet; 10 SEP 86 00:45:21 EDT Received: from MX.LCS.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 10 SEP 86 00:43:20 EDT Date: Wed, 10 Sep 86 00:40:49 EDT From: "Mark E. Becker" Subject: It's back... 
To: BUG-TCP%MX.LCS.MIT.EDU@MC.LCS.MIT.EDU cc: MBECK%MX.LCS.MIT.EDU@MC.LCS.MIT.EDU Message-ID: <[MX.LCS.MIT.EDU].946133.860910.MBECK> Hello - For the sixth time this evening I have seen the message: "Host reset the connection" appear on my terminal. This is unsettling as I had thought this problem solved some time back.. The usual scenario of this happening goes like: I call up a local TAC, and issue @E R and @D C A commands. I @O to 10.1.0.6 and receive the message "TAC trying... Open" After reading the sign-on banner, I log in successfully. My next keyhit gives me the message "Host reset the connection". Indeed, constructing this message is the longest I have been on this evening so far. The TAC I'm using is DDN-PMO-MIL-TAC. Your help is appreciated. Mark Becker (turist) MBECK@MX  Received: from MC.LCS.MIT.EDU by AI.AI.MIT.EDU via Chaosnet; 11 SEP 86 20:47:21 EDT Received: from MX.LCS.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 11 SEP 86 20:45:24 EDT Received: from MC.LCS.MIT.EDU by MX.LCS.MIT.EDU via Chaosnet; 11 SEP 86 20:42:36 EDT Received: from XX.LCS.MIT.EDU by MC.LCS.MIT.EDU 11 Sep 86 20:45:13 EDT Date: Thu 11 Sep 86 19:11:07-EDT From: "J. Noel Chiappa" Subject: Re: It's back... To: MBECK%MX.LCS.MIT.EDU@MC.LCS.MIT.EDU, BUG-TCP%MX.LCS.MIT.EDU@MC.LCS.MIT.EDU cc: JNC@XX.LCS.MIT.EDU In-Reply-To: <[MX.LCS.MIT.EDU].946133.860910.MBECK> Message-ID: <12238183740.45.JNC@XX.LCS.MIT.EDU> This is caused by network congestion; so many of your packets are being dropped that the host is giving up. Call the NOC and complain; it's nothing to do with anything in ITS or at MIT. -------  Received: from MC.LCS.MIT.EDU by AI.AI.MIT.EDU via Chaosnet; 8 NOV 86 06:54:36 EST Received: from MX.LCS.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 8 NOV 86 06:52:10 EST Date: Sat, 8 Nov 86 06:49:39 EST From: "Mark E. Becker" Subject: Congratulations. 
To: BUG-TCP@MX.LCS.MIT.EDU Message-ID: <[MX.LCS.MIT.EDU].957105.861108.MBECK> As all I usually read in mailings to BUG-xx lists is complaints of poor performance or bug reports, I'd like to take a moment to congratulate whoever did whatever in upgrading the Internet performance from the recent low level. I now have real-time echo and haven't seen a "Host reset the connect" or "Host not responding" message in several days. Being of a curious nature, I ask: Was something done at MIT or was the Internet node eating network capacity taken off the air? Mark Becker  Received: from MC.LCS.MIT.EDU by AI.AI.MIT.EDU via Chaosnet; 14 NOV 86 10:44:57 EST Received: from MX.LCS.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 14 NOV 86 10:42:00 EST Received: from XX.LCS.MIT.EDU by MX.LCS.MIT.EDU via Chaosnet; 14 NOV 86 10:38:55 EST Date: Fri 14 Nov 86 07:03:51-EST From: "J. Noel Chiappa" Subject: Re: Congratulations. To: MBECK@MX.LCS.MIT.EDU, BUG-TCP@MX.LCS.MIT.EDU cc: JNC@XX.LCS.MIT.EDU In-Reply-To: <[MX.LCS.MIT.EDU].957105.861108.MBECK> Message-ID: <12254839486.67.JNC@XX.LCS.MIT.EDU> As far as I know absolutely nothing has changed, either in ITS or at MIT, to cause the network performance problems (which were due to congestion problems in the ARPANet) to go away. BBN has a good idea what the causes of the problem are (there are several, working together), but some of the fixes are long lead time (i.e. involve ordering new cross country trunks, which takes aeons), and I doubt they are in yet. I think you may just be seeing random fluctuations. Noel -------  Date: Tue, 2 Dec 86 20:59:14 EST From: David Vinayak Wallace Subject: Shushi reprise? Can't connect to wiscvm.wisc.edu To: BUG-TCP@AI.AI.MIT.EDU cc: ZVONA@AI.AI.MIT.EDU, JAR@AI.AI.MIT.EDU, BUG-MAIL@AI.AI.MIT.EDU Message-ID: <125575.861202.GUMBY@AI.AI.MIT.EDU> ITS cannot connect to wiscvm.wisc.edu (the bitnet gateway) and has not been able to for a week. XX can get there just fine. 
Phu  Received: from MC.LCS.MIT.EDU (CHAOS 3131) by AI.AI.MIT.EDU 1 Feb 87 05:59:53 EST Received: from ci.sei.cmu.edu (TCP 20000566527) by MC.LCS.MIT.EDU 1 Feb 87 06:00:48 EST Message-Id: <8702011056.AA01643@ci.sei.cmu.edu> Date: 1 Feb 1987 0556-EST (Sunday) From: Patrick.Barron@sei.cmu.edu To: bug-arpanet@mc.lcs.mit.edu Fcc: blind_copy Subject: Problem with AI.MIT.EDU domain server? I'm not sure if this is the right place to send this, but: Right now, I can not get to your subnetwork 128.52.22; on this subnet are PREP.AI.MIT.EDU, HERMES.AI.MIT.EDU, and HEPHAESTUS.AI.MIT.EDU, which are the only registered domain name servers for AI.MIT.EDU. I tried telnetting directly to these machines from AI.AI.MIT.EDU, but I was told that these machines are Chaosnet-only??? Just thought you'd want to know.... --Pat (a/k/a PDB@AI.AI.MIT.EDU)  Date: Sun, 1 Feb 87 16:47:50 EST From: Alan Bawden Subject: Problem with AI.MIT.EDU domain server? To: Patrick.Barron@SEI.CMU.EDU cc: BUG-ARPANET@AI.AI.MIT.EDU, BUG-TELNET@AI.AI.MIT.EDU In-reply-to: Msg of 1 Feb 1987 0556-EST () from Patrick.Barron at sei.cmu.edu Message-ID: <147551.870201.ALAN@AI.AI.MIT.EDU> Date: 1 Feb 1987 0556-EST (Sunday) From: Patrick.Barron at sei.cmu.edu To: bug-arpanet at mc.lcs.mit.edu Re: Problem with AI.MIT.EDU domain server? I'm not sure if this is the right place to send this, but: Probably not. As with any BUG-xxx mailing list, your mail was redirected to Bug-Random-Program, which is how it got to me. Right now, I can not get to your subnetwork 128.52.22; on this subnet are PREP.AI.MIT.EDU, HERMES.AI.MIT.EDU, and HEPHAESTUS.AI.MIT.EDU, which are the only registered domain name servers for AI.MIT.EDU. All I can suggest is that the hardware was probably broken. It seems to be working right now. I tried telnetting directly to these machines from AI.AI.MIT.EDU, but I was told that these machines are Chaosnet-only??? This is a bug in the TELNET program. 
What TELNET is trying to tell you is that this host is -also- on Chaosnet. This should not prevent it from using TCP, but there is a bug somewhere... There happens to be a separate program for Chaosnet TELNET called :CHTN. There is also :SUPDUP which works over either network (as long as the foreign machine supports the Supdup protocol). Just thought you'd want to know.... --Pat (a/k/a PDB@AI.AI.MIT.EDU)  Date: Mon, 2 Feb 87 04:04:11 EST From: "Pandora B. Berman" Subject: MIT access To: USER-ACCOUNTS@AI.AI.MIT.EDU, amit@UMN-CS.ARPA cc: BUG-TCP@AI.AI.MIT.EDU, JAR@AI.AI.MIT.EDU Message-ID: <147780.870202.CENT@AI.AI.MIT.EDU> Date: Mon, 2 Feb 87 00:38:50 CST From: amit@umn-cs (Neta Amit) To: cent@ai.ai.mit.edu Subject: Re: test Thanx for your reply. Just got it. Three questions regarding your network and its software: (1) is there a login-guest account on mit-ai and/or mit-prep? (2) prep has had problems with arpanet during the weekend. It's up, but unreachable from either umn-cs or cornell. Moreover, Jonathan Rees from MIT tried to establish connection *from* prep to umn-cs, and prep responded "network unreachable". Do you know anything about it? Any timeframe for a fix? (3) The reason for all of that stuff is that I'm trying to get T3 from mit over to umn. Rees put this package (5 files, 2-3 MB each) on Zurich. I could access zurich, but couldn't ftp -- either from it to umn, or from umn. Ftp would hang before sending the first packet. Do you have any clue? --Neta Amit (amit@umn-cs.arpa) U of Minnesota CSci i don't know what kind of guest access prep has. AI does not have a general guest account; however, if you connect to it you will be given an opportunity to apply for a personal guest acct, and your reason for wanting one (reaching Rees's software) is good, so you should have no trouble. be warned that AI runs the Incompatible Timesharing System, which resembles un*x very little. we like it. from AI, you should be able to get at files on zermatt. 
i have no idea what prep's problem is in talking with your machine. maybe someone on BUG-TCP will have an idea, or will be able to redirect your complaint to someone better able to answer it.  Received: from MC.LCS.MIT.EDU (CHAOS 3131) by AI.AI.MIT.EDU 29 Apr 87 16:55:39 EDT Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 29 APR 87 16:39:42 EDT Date: Wed, 29 Apr 1987 16:28 EDT Message-ID: From: Peter de Jong To: bug-arpanet%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU Subject: [Mailer: Message of 28-Apr-87 17:50:52] The following addresses have been failing for about a week. Peter de Jong Date: Tuesday, 28 April 1987 17:53-EDT From: The Mailer Daemon To: DEJONG at OZ.AI.MIT.EDU Re: Message of 28-Apr-87 17:50:52 Message failed for the following: alpern@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "ibm-sj.arpa" andya@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbnccp" bgoodman@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" boyle@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "cmu-psy-a" broberts@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" dekleer@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "parc" dstallard@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" halvorsen@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "parc" kcorker@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" levin@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "cmuc" lwelber@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbncc6" madams@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbna" mvilain@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" scacchi@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "isib" ticsl@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbna" ------------ Date: Tue, 28 Apr 1987 17:50 EDT Message-ID: From: Peter de Jong To: Cog-Sci-Distribution@OZ.AI.MIT.EDU Subject: Cognitive Science Calendar Reply-to: Cog-Sci-Request%OZ.AI.MIT.EDU@MC.LCS.MIT.EDU Start-Date: 5/1/87 2:15pm 
Expiration-Date: 5/1/87 3:15pm cog-sci-calendar digest ----------------------------------------------------------------- Date: Tuesday, 28 April 1987 17:55-EDT From: Mary E. Spollen subject: non-monotonic seminar - - - - Friday, 1 May 2:15pm Room: NE43-512A REPORTING THE NON-MONOTONIC NEWS: Keeping the Beat Local by BENJAMIN GROSOF Computer Science Department Stanford University ``Non-monotonic'' reasoning systems are ones in which some conclusions have a default or retractible status. A prime motivation for such systems is to build agents that revise their beliefs in response to news from their environment. Efficient updating is problematic, however, because adding new information in general may require the revision of many, or even all, previous retractible conclusions. An understanding is needed of the ``partial monotonicities'' of updating, i.e. of the irrelevance of updates to parts of the previous retractible conclusions. To define non-monotonic theories, we introduce a formalism based on McCarthy's circumscription that directly expresses, as axioms, both default beliefs and preferences among default beliefs. It has a strong semantics based on first- and second- order logic. We characterize non-monotonic theories as hierarchically decomposable in a manner more analogous to programming languages than to ordinary monotonic logics. We then give a set of results about partial monotonicities of updating. We discover some surprising differences between updates consisting of default axioms and those consisting of non-retractible axioms. These results bear on a wide variety of applications of non-monotonic reasoning. 
Host: Gerald Jay Sussman END OF cog-sci-calendar digest ******************************  Received: from MC.LCS.MIT.EDU (CHAOS 3131) by AI.AI.MIT.EDU 29 Apr 87 19:30:23 EDT Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 29 APR 87 18:33:41 EDT Received: from AI.AI.MIT.EDU by OZ.AI.MIT.EDU with Chaos/SMTP; Wed 29 Apr 87 18:27:00-EDT Date: Wed, 29 Apr 87 18:28:10 EDT From: Alan Bawden Subject: [Mailer: Message of 28-Apr-87 17:50:52] To: DEJONG@OZ.AI.MIT.EDU cc: bug-arpanet@OZ.AI.MIT.EDU In-reply-to: Msg of Wed 29 Apr 1987 16:28 EDT from Peter de Jong Message-ID: <192998.870429.ALAN@AI.AI.MIT.EDU> Date: Wed, 29 Apr 1987 16:28 EDT From: Peter de Jong To: bug-arpanet at OZ ... The following addresses have been failing for about a week. ... alpern@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "ibm-sj.arpa" andya@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbnccp" bgoodman@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" boyle@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "cmu-psy-a" broberts@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" dekleer@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "parc" dstallard@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" halvorsen@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "parc" kcorker@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" levin@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "cmuc" lwelber@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbncc6" madams@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbna" mvilain@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbng" scacchi@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "isib" ticsl@OZ.AI.MIT.EDU.#Chaos: Can't forward - unknown host "bbna" ... 
Replace these obsolete host names as follows: bbna => BBN.COM bbncc6 => CC6.BBN.COM bbnccp => CCP.BBN.COM bbng => G.BBN.COM cmu-psy-a => A.PSY.CMU.EDU cmuc => C.CS.CMU.EDU ibm-sj.arpa => IBM.COM isib => C.ISI.EDU parc => XEROX.COM  Date: Wed, 15 Jul 87 01:55:10 EDT From: Alan Bawden Subject: IP routing table To: BUG-TCP@AI.AI.MIT.EDU, BUG-ITS@MC.LCS.MIT.EDU, JNC@MC.LCS.MIT.EDU In-reply-to: Msg of Tue 14 Jul 87 13:40:44 EDT from Alan Bawden Message-ID: <227628.870715.ALAN@AI.AI.MIT.EDU> Date: Tue, 14 Jul 87 13:40:44 EDT From: Alan Bawden Date: Tue, 14 Jul 87 02:44:52 EDT From: J. Noel Chiappa You know, I wish there were some tools to manipulate this table...., some command that could delete routes, or modify the entry, could be a real win. This sounds like an easy one night hack for someone. I'd do it if I was certain just what kinds of commands were desirable. Stop by my office some day, and spend 5 minutes helping me design it. I was in the mood for a little hack, so I wrote one. (Source on SYSNET;REDRCT.) Try out: :ALAN;REDRCT (If anyone can think of a good directory for this to live on other than ALAN, let me know.) With no JCL it will remind you of its usage as follows: Usage is: :REDRCT Gateways can be given as hostnames, or in decimal octet form (as in "10.0.0.77"). If the new gateway is omitted, REDRCT will simply pick a likely-looking gateway from ITS's list of main gateways. REDRCT will ask for confirmation once after it interprets its command, to be sure it properly understood what you asked it to do. Then for each routing table entry it finds that uses the old gateway, it will ask for confirmation before clobbering it with the new gateway.  Date: Sun, 2 Aug 87 20:34:10 EDT From: Alan Bawden Subject: IP routing table To: BUG-ITS@AI.AI.MIT.EDU, BUG-TCP@AI.AI.MIT.EDU, JNC@AI.AI.MIT.EDU In-reply-to: Msg of Wed 15 Jul 87 01:55 EDT from Alan Bawden Message-ID: <236463.870802.ALAN@AI.AI.MIT.EDU> Date: Wed, 15 Jul 87 01:55 EDT From: Alan Bawden ... 
Try out: :ALAN;REDRCT (If anyone can think of a good directory for this to live on other than ALAN, let me know.)... It finally occurred to me that the right directory for something like this is the one named ".". (Remember ":.;BOOT11" on the KL?) So make that: :.;REDRCT to fool with IP routing.  Date: Wed, 21 Oct 87 19:14:51 EDT From: Alan Bawden Subject: [hqm: Some interesting problems] To: BUG-TCP@AI.AI.MIT.EDU cc: HQM@AI.AI.MIT.EDU Message-ID: <272532.871021.ALAN@AI.AI.MIT.EDU> Perhaps someone else can say something about the issues Henry raises here and save me the trouble of having to learn this stuff myself. Please? Date: Tue, 20 Oct 87 21:25:57 EDT From: Henry Minsky To: ALAN at AI.AI.MIT.EDU, CJL at AI.AI.MIT.EDU Re: AI TCP lossage I seem to be having trouble when I telnet to AI from my radio IP site. What seems to be happening is that the delay path for a packet from the ninth-floor radio, through the green bldg, to me is about 2 or 3 seconds. That seems to be too long for AI to wait for an acknowledge, so it retransmits the packet. I see the same packet repeated every second being sent from AI. Is AI really that impatient? I thought TCP/IP was supposed to have binary exponential backoff for failed packets. (or is that only for detected ethernet collision?) Maybe I have diagnosed the problem wrong, but that's what it looks like to me. Is AI's TCP/IP code wired to not wait longer than a few seconds for a packet to be acknowledged? Henry Date: Wed, 21 Oct 87 13:54 EDT From: Henry Minsky To: alan at MIT-AI.ARPA, cjl at MIT-AI.ARPA, hqm at VALLECITO.SCRC.Symbolics.COM Re: Some interesting problems I was talking to the guy who implemented the TCP/IP radio network package, about the trouble I was having linking to the internet. Date: Wed, 21 Oct 87 13:26:44 edt From: karn@faline.bellcore.com (Phil R. 
Karn) Message-Id: <8710211726.AA25509@faline.bellcore.com> To: hqm@VALLECITO.SCRC.Symbolics.COM Subject: Re: Some interesting problems Cc: karn@faline.bellcore.com Yes, there are MANY implementations of TCP out there that retransmit on a hair trigger. I believe they are responsible for much of the congestion we see on the ARPANET. Lean heavily on the system administrators to get them to clean up their TCPs. They should do the following things: 1. Increase the initial estimated round trip time to 5 seconds or more. (4.2BSD initially had it as 0.5 seconds!) 2. Make sure the round trip timing algorithm in the host doesn't suffer from collapse due to retransmission ambiguity. See the paper by Craig Partridge and myself in last summer's ACM SIGCOMM proceedings. 3. Make sure the host implements the "Nagle algorithm" for tinygram avoidance (see RFC 896). You can test this by running a program on the host that deliberately does single-character writes to the network, and seeing if they all come out as a flood of uncontrolled 1-byte tinygrams. Not only will these fixes help you over radio, but they'll help greatly anyone using SLIP links. It'll also help alleviate ARPANET congestion. Phil  Received: from MC.LCS.MIT.EDU (CHAOS 3131) by AI.AI.MIT.EDU 23 Oct 87 16:57:52 EDT Received: from OZ.AI.MIT.EDU by MC.LCS.MIT.EDU via Chaosnet; 23 OCT 87 16:55:35 EDT Received: from REAGAN.AI.MIT.EDU by OZ.AI.MIT.EDU with Chaos/SMTP; Fri 23 Oct 87 16:54:20-EDT Received: from puffed-wheat.ai.mit.edu (COCOA-PUFFS.AI.MIT.EDU) by REAGAN.AI.MIT.EDU via INTERNET with SMTP id 66049; 23 Oct 87 16:53:59 EDT Received: by cocoa-puffs.ai.mit.edu; Fri, 23 Oct 87 15:51:58 EDT Date: Fri, 23 Oct 87 15:51:58 EDT From: dilip@wheaties.ai.mit.edu (Dilip A. 
Soni) Message-Id: <8710231951.AA01356@cocoa-puffs.ai.mit.edu> To: BUG-ARPANET@oz.ai.mit.edu Subject: Connect to a machine in Princeton Hi I would like to connect/login to a Vax named siemens.com (magic number 192.5.31 ) and to a SUN named cadillac.siemens.com (magic number 192.5.31.112). Neither FTP nor TELNET work as the host name/number are not known. The gateway site is princeton.edu and any mail to user%siemens@princeton.edu does get delivered. Siemens does have a "restricted" commercial arpanet license. So the question is: What do I need to do to achieve my goal? Any pointers in the above matter would be greatly appreciated. Thanks a bunch. - Dilip ----- End Forwarded Message -----  Received: from MC.LCS.MIT.EDU (CHAOS 3131) by AI.AI.MIT.EDU 6 Nov 87 17:11:29 EST Received: from SRI-NIC.ARPA (TCP 1200000063) by MC.LCS.MIT.EDU 6 Nov 87 17:10:23 EST Date: Fri, 6 Nov 87 13:54:46 PST From: Ken Harrenstien Subject: [hedrick@aramis.rutgers.edu (Charles Hedrick): new Arpanet end to end protocol] To: bug-tcp@MC.LCS.MIT.EDU cc: bug-its@MC.LCS.MIT.EDU Message-ID: <12348532467.31.KLH@SRI-NIC.ARPA> If this message is true, ITS systems will have problems, since the IMP code counts RFNMs. I guess new code would need to be added which (depending on a runtime flag setting) handles the new scheme. But someone needs to find out exactly what the new scheme is first... --------------- Received: from aramis.rutgers.edu by SRI-NIC.ARPA with TCP; Fri 6 Nov 87 12:06:51-PST Received: by athos.rutgers.edu (5.54/1.14) id AA24392; Fri, 6 Nov 87 02:20:54 EST Date: Fri, 6 Nov 87 02:20:54 EST From: hedrick@aramis.rutgers.edu (Charles Hedrick) Message-Id: <8711060720.AA24392@athos.rutgers.edu> To: tcp-ip@sri-nic.arpa Subject: new Arpanet end to end protocol I have just heard from a reliable source a fairly interesting fact about the new end to end protocol implemented in PSN 7.0. (Note that my terminology is probably slightly off in this message. 
I don't know anything about the imp to host protocol, so I am almost certainly introducing some distortion in passing on this information.) Apparently one of the efficiency improvements in the new end to end protocol is that the IMP's will no longer attempt to return a RFNM for each packet. You will be expected to look at the ID number included in the RFNM's. Any outstanding RFNM's with ID numbers lower than the current one are also to be considered as acknowledged. Many implementations apparently simply count RFNM's. They assume that one acknowledgement is received per packet. This will no longer be true with the new end to end protocol, and so these implementations will break. I have some reason to think that most existing implementations fall into this category. Tests of the new end to end protocol are scheduled for Nov 7, 14-15, and 18. Implementors should be alert to misbehaviors during these test periods. -------  Date: Mon, 9 Nov 87 13:37:05 EST From: Alan Bawden Subject: MC:CRASH;PI FAULT To: BUG-ITS@AI.AI.MIT.EDU cc: MAP@AI.AI.MIT.EDU, BUG-TCP@AI.AI.MIT.EDU In-reply-to: Msg of Mon 9 Nov 87 12:36:25 EST from Michael A. Patton Message-ID: <282278.871109.ALAN@AI.AI.MIT.EDU> Date: Mon, 9 Nov 87 12:36:25 EST From: Michael A. Patton I just reloaded MC. It had gotten a "PAGE FAULT WITH PI IN PROGRESS". Dump is in MC:CRASH;PI FAULT. Seems to have restarted all right. This one is a good joke that has happened once before. The fault was taken by the TCP checksumming code. Presumably what happens is that when a packet arrives with a large enough bogus length, the checksumming code applies the checksumming algorithm to a huge block of memory that starts with the packet and extends up to some nonexistent page.  
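The failure Alan describes, a checksum routine trusting a bogus length field and walking off the end of the packet, can be illustrated with a minimal sketch. It is written in Python rather than the PDP-10 assembly ITS actually uses, the names `internet_checksum` and `checksum_packet` are hypothetical, and it checksums only the bytes given (a real TCP checksum also covers a pseudo-header). It is an assumption about how such a guard might look, not a description of the ITS fix.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def checksum_packet(buffer: bytes, claimed_length: int) -> int:
    """Checksum only the bytes actually received.

    `claimed_length` stands in for the length field carried in the
    packet header (hypothetical parameter). A value larger than the
    buffer is rejected instead of being used as a loop bound.
    """
    if claimed_length < 0 or claimed_length > len(buffer):
        raise ValueError("claimed length exceeds received data")
    return internet_checksum(buffer[:claimed_length])
```

The design point is simply that the length is validated against what was received before the summing loop runs, so a corrupt header yields a dropped packet rather than a reference to a nonexistent page.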
Received: from MC.LCS.MIT.EDU (CHAOS 3131) by AI.AI.MIT.EDU 10 Nov 87 14:12:11 EST Received: from SRI-NIC.ARPA (TCP 1200000063) by MC.LCS.MIT.EDU 10 Nov 87 13:45:09 EST Date: Tue, 10 Nov 87 10:24:43 PST From: Ken Harrenstien Subject: [Andy Malis : Re: new Arpanet end to end protocol ] To: bug-tcp@MC.LCS.MIT.EDU Message-ID: <12349542805.44.KLH@SRI-NIC.ARPA> Evidently previous message was a false alarm, after all. --------------- Received: from CC5.BBN.COM by SRI-NIC.ARPA with TCP; Tue 10 Nov 87 08:28:15-PST To: Charles Hedrick cc: tcp-ip@SRI-NIC.ARPA, malis@CC5.BBN.COM Subject: Re: new Arpanet end to end protocol In-reply-to: Your message of Fri, 06 Nov 87 02:20:54 -0500. <8711060720.AA24392@athos.rutgers.edu> Date: Tue, 10 Nov 87 09:49:17 -0500 From: Andy Malis Charles, Your message is quite wrong (I know - I designed the new End-to-End). I would be interested in knowing (in private) who your "reliable source" is, so that such rumors can be source quenched. After the recent messages on the tcp-ip list, I'm sure we all realize how important source quenching is. The truth of the matter is: PSN 7.0 has two different End-to-End protocols (old EE and new EE). Either one or the other runs at any particular time, and the two cannot interoperate. The ARPANET is currently using PSN 7.0 with the old EE. It is the new EE that will be tested on Nov. 7, 14-15, and 18. The old EE protocol explicitly returned, across the PSN subnet, a separate RFNM packet for each message delivered to a destination host. This RFNM packet was then converted, in the source PSN, into the 1822 RFNM for that message and delivered to the source host. This had the result that, depending on traffic mixes, roughly about 45% to 50% of the packets going through the subnet were RFNMs. Since the PSN does so much per-packet processing, even for these RFNMs, the network was passing much less host traffic than otherwise might be possible. 
We fixed this in the new EE by making it an explicitly windowed protocol IN THE SUBNET. The destination PSNs have the ability to aggregate ACKs (the new EE internal equivalent to RFNMs) and send multiple ACKs for the same connection in windowed ACK (by using an INTERNAL message sequence number). In addition, these ACKs can now be piggybacked on data traffic, and many ACKs for different EE connections can be shipped together in the same subnet packet to a source PSN. The important thing to note is that when the destination PSN receives an ACK for a connection, it generates, and sends to the source host, a separate 1822 RFNM for EACH and EVERY message submitted by the host and being acknowledged by the ACK. There are no host-visible sequence numbers; the 1822 protocol stays the same as before. What may have confused you is the fact that we at BBN are, concurrent with the PSN 7.0 testing, trying to track down which ARPANET hosts might be affected by a known BSD 4.2 network software problem that may cause RFNMs to be lost in the host itself (I believe it is related to the size of the message received PREVIOUS to the RFNM). This bug has been fixed in BSD 4.3, and I have been told that Lars Poulsen of ACC (lars@acc.arpa) has a patch for BSD 4.2-derived host software. By the way, we have measured in our internal BBNNET (which has been running PSN 7.0 with the new EE only for over five months now) that only about 14% of the packets through the network only contain ACKs - the rest of the ACKs are being piggybacked on the data traffic. We are very pleased with this result. Also, most of our BBNNET hosts (around 125 C/70s, VAXEN, TOPS-20s, TACs, and others) use 1822, and they have had no problems with the new EE. Regards, Andy -------  Date: Wed, 23 Dec 87 01:54:04 EST From: David Vinayak Wallace Subject: MPV'd TCP job eating core? To: BUG-TCP@AI.AI.MIT.EDU Message-ID: <303201.871223.GUMBY@AI.AI.MIT.EDU> Is there some logical explanation for this? 
I Y'd a copy to crash;tcpmpv Y in case it got killed or the system crashed before it could be looked at. I inadvertently renamed the job GUMBY TCP in the process, in case the system doesn't crash etc... AI ITS 1615 Peek 629 12/23/87 01:04:55 Up time = 11:04:53:58 Memory: Free=63 Runnable Total=87 Out=71 Users: High=20 Runnable=1 Index Uname Jname Sname Status TTY Core Out %Time Time PIs 10 007M10 TCP HPLABS _10!0 ? DSN 255 192 0% 7:49 REALTM MPV Fair Share 78% Totals: 255 0% 7:49 Logout time = 1:20:25:43 Lost 2% Idle 68% Null time = 7:08:49:53 TCP conns: Ix Usr Uname Jname State RWnd Ibf SWnd ReTxQ Lclprt Fgnprt Fgnhst 15 10 007M10 TCP OPEN 0 5 10000 0 0 31 6216 HPLABS.HP.COM AI ITS 1615 Peek 629 12/23/87 01:04:40 Up time = 11:04:53:43 Uname Jname Idx Memory Histogram : One mark = 6 pages 30+ 60+ 90+ 120+ 150+ 180+ 210+ 240+ ALAN ITS * 4 ++++++++=======--------- COMSAT IV 3 ++++++++++++++++-------------------- ___015 HACTRN 15 +++==========------------- FREE @@@@@@@@@@@ COMSAT JOB.07 2 ==========-- 007M10 TCP * 10 ==========-------------------------------- SYS SYS 0 ++++++++++ ALAN E * 11 =======-------------------------- GUMBY HACTRN 12 +++- EXDDT OOO DSKUDR O GUMBY P 13 + DSKTUT O 050C17 FILE 17 = 070C05 FILE 5 = PFTHMG DRAGON 6 + NETPKT MMP CHAOS DSKMFD DSKBUF 16TLNT TELSER 14 # 15TLNT TELSER 7 # CORE JOB - Available user memory= 334/512 AI ITS 1615 Peek 629 12/23/87 01:05:22 Up time = 11:04:54:25 IMP is up. TCP/IP is available. 
Ix Usr Uname Jname State RWnd Ibf SWnd ReTxQ Lclprt Fgnprt Fgnhst 23 14 16TLNT TELSER OPEN 11324 0 6161 1 0 27 110525 MULTICS.MIT.EDU 15 10 007M10 TCP OPEN 0 5 10000 0 0 31 6216 HPLABS.HP.COM 16 buffers (8 free) Status of non-free packet buffers: 15 TCP-IN(15) QF2 @425000 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack 30 TCPRTR(23) Out-done QF2 @420400 IP hdrlen=5 totlen=199 ptcl=TCP src=AI.AI.MIT.EDU dst=MULTICS.MIT.EDU ports=23->37205 Ack Push 31 TCP-IN(15) QF2 @430000 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack 34 CUROUT(23) @424000 IP hdrlen=0 totlen=9008 ptcl=TCP src=MULTICS.MIT.EDU dst=AI.AI.MIT.EDU ports=37205->23 Ack 41 TCP-IN(15) QF2 @421400 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack 62 NO-QUE Out-done QF2 @424400 IP hdrlen=5 totlen=556 ptcl=TCP src=AI.AI.MIT.EDU dst=0 ports=0->0 Ack Push 66 TCP-IN(15) QF2 @430400 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack 101 TCP-IN(15) QF2 @420000 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack Status of non-free packet buffers: 15 TCP-IN(15) QF2 @425000 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack IPGIPT Net input alloc IPRDGM Input from net 31 TCP-IN(15) QF2 @430000 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack IPGIPT Net input alloc IPRDGM Input from net 34 CUROUT(23) @424000 IP hdrlen=0 totlen=9008 ptcl=TCP src=MULTICS.MIT.EDU dst=AI.AI.MIT.EDU ports=37205->23 Ack TCPOSB Alloc for STYNET output data 41 TCP-IN(15) QF2 @421400 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack IPGIPT Net input alloc IPRDGM Input from net 62 NO-QUE Out-done QF2 @424400 IP hdrlen=5 totlen=556 ptcl=TCP src=AI.AI.MIT.EDU dst=0 ports=0->0 Ack Push TCPOBW Alloc for IOT output buffer TCPOB6 IOT Send TSOSND Pkt w/seq 
space added to retransmit queue IPKSND output call IPKSNQ output call IPIODN Packet output complete 66 TCP-IN(15) QF2 @430400 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack IPGIPT Net input alloc IPRDGM Input from net 101 TCP-IN(15) QF2 @420000 IP hdrlen=5 totlen=552 ptcl=TCP src=HPLABS.HP.COM dst=AI.AI.MIT.EDU ports=3214->25 Ack IPGIPT Net input alloc IPRDGM Input from net  Received: from REAGAN.AI.MIT.EDU (CHAOS 13065) by AI.AI.MIT.EDU 23 Dec 87 13:20:03 EST Received: from PIGPEN.AI.MIT.EDU by REAGAN.AI.MIT.EDU via CHAOS with CHAOS-MAIL id 82499; Wed 23-Dec-87 13:20:55 EST Date: Wed, 23 Dec 87 13:22 EST From: Alan Bawden Subject: MPV'd TCP job eating core? To: BUG-TCP@AI.AI.MIT.EDU cc: GUMBY@AI.AI.MIT.EDU In-Reply-To: <303201.871223.GUMBY@AI.AI.MIT.EDU> Message-ID: <871223132235.2.ALAN@PIGPEN.AI.MIT.EDU> Date: Wed, 23 Dec 87 01:54:04 EST From: David Vinayak Wallace Is there some logical explanation for this? ... I replied to this message to Bug-FTP where it should have gone originally. But I happened to notice: AI ITS 1615 Peek 629 12/23/87 01:05:22 Up time = 11:04:54:25 IMP is up. TCP/IP is available. Ix Usr Uname Jname State RWnd Ibf SWnd ReTxQ Lclprt Fgnprt Fgnhst 23 14 16TLNT TELSER OPEN 11324 0 6161 1 0 27 110525 MULTICS.MIT.EDU 15 10 007M10 TCP OPEN 0 5 10000 0 0 31 6216 HPLABS.HP.COM 16 buffers (8 free) Status of non-free packet buffers: ... 62 NO-QUE Out-done QF2 @424400 IP hdrlen=5 totlen=556 ptcl=TCP src=AI.AI.MIT.EDU dst=0 ports=0->0 Ack Push TCPOBW Alloc for IOT output buffer TCPOB6 IOT Send TSOSND Pkt w/seq space added to retransmit queue IPKSND output call IPKSNQ output call IPIODN Packet output complete ... This is apparently a lost TCP buffer. (Sorry John...)  Received: from SPEECH.MIT.EDU by AI.AI.MIT.EDU via Chaosnet; 6 JAN 88 16:44:12 EST Date: Wed 6 Jan 88 16:48:27-EST From: "John Wroclawski" Subject: Re: MPV'd TCP job eating core? 
To: Alan@AI.AI.MIT.EDU, BUG-TCP@AI.AI.MIT.EDU cc: GUMBY@AI.AI.MIT.EDU In-Reply-To: <871223132235.2.ALAN@PIGPEN.AI.MIT.EDU> Message-ID: <12364522099.28.JTW@MIT-SPEECH> Date: Wed, 23 Dec 87 13:22 EST From: Alan Bawden Subject: MPV'd TCP job eating core? Status of non-free packet buffers: ... 62 NO-QUE Out-done QF2 @424400 IP hdrlen=5 totlen=556 ptcl=TCP src=AI.AI.MIT.EDU dst=0 ports=0->0 Ack Push TCPOBW Alloc for IOT output buffer TCPOB6 IOT Send TSOSND Pkt w/seq space added to retransmit queue IPKSND output call IPKSNQ output call IPIODN Packet output complete ... This is apparently a lost TCP buffer. (Sorry John...) Actually, I'd guess not. Note that it really -is- on a queue, indicated by the QF2 flag being on in the packet. The NOQUE is likely either due to transient condition randomness or more probably because it's on a queue that PEEK doesn't know about (compiled in list of queue names, and so on). But since the machine's been booted since then I have no idea if it did get freed right. I don't suppose anyone knows..? -------  Date: Wed, 6 Jan 88 17:17:31 EST From: Alan Bawden Subject: MPV'd TCP job eating core? To: JTW@SPEECH.MIT.EDU cc: BUG-TCP@AI.AI.MIT.EDU, GUMBY@AI.AI.MIT.EDU In-reply-to: Msg of Wed 6 Jan 88 16:48:27-EST from John Wroclawski Message-ID: <307654.880106.ALAN@AI.AI.MIT.EDU> Date: Wed 6 Jan 88 16:48:27-EST From: John Wroclawski ... 62 NO-QUE Out-done QF2 @424400 IP hdrlen=5 totlen=556 ptcl=TCP src=AI.AI.MIT.EDU dst=0 ports=0->0 Ack Push TCPOBW Alloc for IOT output buffer TCPOB6 IOT Send TSOSND Pkt w/seq space added to retransmit queue IPKSND output call IPKSNQ output call IPIODN Packet output complete ... This is apparently a lost TCP buffer. (Sorry John...) Actually, I'd guess not. Note that it really -is- on a queue, indicated by the QF2 flag being on in the packet. 
The NOQUE is likely either due to transient condition randomness or more probably because it's on a queue that PEEK doesn't know about (compiled in list of queue names, and so on). But since the machine's been booted since then I have no idea if it did get freed right. I don't suppose anyone knows..? Well, it was still there 10 hours or so after GUMBY captured the original output from PEEK in his message. Also, the fact that the destination and ports are set to zero seems somewhat odd to me as well... BTW, I've been doing some primitive, crude metering of COMSAT performance under your TCP changes. We should talk about this in person sometime.  Date: Wed, 24 Feb 88 06:25:47 EST From: David Vinayak Wallace Subject: weird MC TCP lossage To: BUG-ITS@AI.AI.MIT.EDU, BUG-TCP@AI.AI.MIT.EDU Message-ID: <331156.880224.GUMBY@AI.AI.MIT.EDU> I was unable to get any sort of tcp connection to MC this morning. Chaosnet worked fine. I logged in; the machine was idle! Early morning is often a busy time for ol'MC. I spied on COMSAT, which was idle most of the time. The stats file was very interesting. COMSAT would wake up, connect to some host, deliver all the mail for that host, and then choke when it got to the next host for which it had mail. The way in which it choked was interesting: After the HELO it would read the terminating command from the previous connection. For example COMSAT connected to DECWRL.DEC.COM and apparently delivered lots of mail successfully (one wonders what it was actually sending, but the remote host seemed happy). Anyway, when done with decwrl.dec.com, it tried to talk to some other host, say, bbn.com. The stats file would say something like "ICP BBN.COM: Bad reply 221 decwrl.dec.com signing off." Well, this was certainly weird. I called someone on the ninth floor to ask him to take a crash dump, but mc appears not to have come back up. I don't know what the story is, but it seemed to be happy running in this weird state. 
Perhaps our local TCP frobber can shed some light on this.  Received: from REAGAN.AI.MIT.EDU (CHAOS 13065) by AI.AI.MIT.EDU 24 Feb 88 16:42:34 EST Received: from PIGPEN.AI.MIT.EDU by REAGAN.AI.MIT.EDU via CHAOS with CHAOS-MAIL id 94062; Wed 24-Feb-88 16:39:52 EST Date: Wed, 24 Feb 88 16:39 EST From: Alan Bawden Subject: weird MC TCP lossage To: GUMBY@AI.AI.MIT.EDU cc: BUG-ITS@AI.AI.MIT.EDU, BUG-TCP@AI.AI.MIT.EDU In-Reply-To: <331156.880224.GUMBY@AI.AI.MIT.EDU> Message-ID: <880224163949.6.ALAN@PIGPEN.AI.MIT.EDU> Date: Wed, 24 Feb 88 06:25:47 EST From: David Vinayak Wallace ... The stats file was very interesting. COMSAT would wake up, connect to some host, deliver all the mail for that host, and then choke when it got to the next host for which it had mail. The way in which it choked was interesting: After the HELO it would read the terminating command from the previous connection. This is the usual situation when some TCP resource runs out. Opening a new pair of TCP channels fails, probably with some kind of device full error, which COMSAT apparently fumbles to produce this behavior. At least, that's my theory. SRA claims that the code in COMSAT couldn't possibly have this bug, and it must be ITS's fault, but I find this hard to believe given that no other program that uses TCP shows any kind of analogous behavior. Unfortunately, it is hard to reproduce this situation so that we can see what COMSAT is -really- doing. ... I called someone on the ninth floor to ask him to take a crash dump, but mc appears not to have come back up.... It looks like it did come back up, but whoever you talked to was typing total jokes on the console. Like giving nonexistent commands, and typing DDT commands to the 8080 front-end, etc.  Received: from MC.LCS.MIT.EDU (CHAOS 3131) by AI.AI.MIT.EDU 23 May 88 04:26:54 EDT Date: Mon, 23 May 88 04:23:59 EDT From: David Vinayak Wallace Sender: GUMBY0@MC.LCS.MIT.EDU Subject: What's with all these TIMWTs? 
To: BUG-TCP@MC.LCS.MIT.EDU
Message-ID: <424674.880523.GUMBY0@MC.LCS.MIT.EDU>

Right now there are 26 connections to unix.sri.com in state TIMWT, plus
one in FINWT1 to ub.cc.umich.edu.  I also had a telser on this machine,
although I had dropped the connection hours ago.  I could not get MC to
respond to a tcp connection.

Now I've killed that telser, we have a connection to lll-crg.llnl.gov in
that state.

It really looks like there are only two lost packets (one to host 0!).

Hmm, the net command doesn't seem to help any, so I wonder if you'll
ever see this message, john.

It appears that unix.sri.com is a "DIMWIT" host.

Received: from REAGAN.AI.MIT.EDU (CHAOS 13065) by AI.AI.MIT.EDU 23 May 88 14:20:00 EDT
Received: from PIGPEN.AI.MIT.EDU by REAGAN.AI.MIT.EDU via CHAOS with CHAOS-MAIL id 113962; Mon 23-May-88 14:14:19 EDT
Date: Mon, 23 May 88 14:14 EDT
From: Alan Bawden
Subject: What's with all these TIMWTs?
To: GUMBY@AI.AI.MIT.EDU
cc: BUG-TCP@AI.AI.MIT.EDU, BUG-ITS@AI.AI.MIT.EDU
In-Reply-To: <424674.880523.GUMBY0@MC.LCS.MIT.EDU>
Message-ID: <19880523181422.3.ALAN@PIGPEN.AI.MIT.EDU>

    Date: Mon, 23 May 88 04:23:59 EDT
    From: David Vinayak Wallace
    Right now there are 26 connections to unix.sri.com in state TIMWT,
    plus one in FINWT1 to ub.cc.umich.edu.  ...
    It really looks like there are only two lost packets (one to
    host 0!).  ...

It's not a question of lost packets, it's a question of TCBs, the
per-connection data structure ITS has to maintain throughout the
connection's lifetime.  Unfortunately a connection can live on after a
user process is done with it while the operating systems do some final
handshaking to close things down cleanly.  It appears that some new
version of Unix is making the rounds that does something that causes
this handshaking to take virtually forever.

AI and MC each have 30 TCBs.  They used to have 20, but I increased that
when this problem first started happening.  I just had to reload AI for
the same reason.
There are crash dumps for the interested in AI:CRASH;CRASH TCB and
MC:CRASH;TCP BITIT.

Date: Tue, 24 May 88 01:47:53 EDT
From: Alan Bawden
Subject: TCB's all in use.
To: BUG-TCP@AI.AI.MIT.EDU, BUG-ITS@AI.AI.MIT.EDU
cc: GUMBY@AI.AI.MIT.EDU, ZVONA@AI.AI.MIT.EDU
Message-ID: <384099.880524.ALAN@AI.AI.MIT.EDU>

Here is my diagnosis of the lossage.  Consider the following code:

    ; TSIATW - Received ACK while in TIME-WAIT state.  This should be
    ;       a re-transmit of the remote FIN.  ACK it, and restart
    ;       2-MSL timeout.

    TSIATW: METER("TCP: ACK in .XSTMW")
            MOVSI T,(TC%ACK)
            TRCPKT R,"TSIATW ACK send in TIME-WAIT"
            CALL TSOSNR             ; Send simple ACK in response.
            JRST TSITM2             ; and restart 2-MSL timeout.

Well, if the guy on the other end keeps sending you ACKs, the timeout
keeps getting reset and the TCB never gets freed.  I have verified that
this is in fact the path that causes the problem by patching that
JRST TSITM2 to be a POPJ P, and watching the stuck TCBs all vanish.

I actually don't understand the logic here; it would seem to me that you
should only be sending an ACK for an actual FIN, not just an ACK.  I
didn't look to see if the other guy was sending both ACK and FIN or just
ACK.  Do you suppose it is likely that the other machines all have this
bug as well and the two are just spinning their wheels bouncing ACKs
back and forth?

There does seem to be other code that handles ACKing of FINs elsewhere
in the TCP code, but I don't understand enough to know if it is active
when you are in the TIME-WAIT state or not.  Conceivably the POPJ P, I
patched in might be the solution to the problem?

Suggestions?

Received: from XX.LCS.MIT.EDU (CHAOS 2420) by AI.AI.MIT.EDU 24 May 88 02:32:57 EDT
Date: Tue, 24 May 1988 02:33 EDT
Message-ID:
From: JTW@XX.LCS.MIT.EDU
To: Alan Bawden
Cc: BUG-TCP@AI.AI.MIT.EDU, GUMBY@AI.AI.MIT.EDU, ZVONA@AI.AI.MIT.EDU
Subject: TCB's all in use.
In-reply-to: Msg of 24 May 1988 01:47-EDT from Alan Bawden

    From: Alan Bawden
    Re: TCB's all in use.
    Here is my diagnosis of the lossage.  Consider the following code:
    .....
    I actually don't understand the logic here; it would seem to me
    that you should only be sending an ACK for an actual FIN, not just
    an ACK.  I didn't look to see if the other guy was sending both ACK
    and FIN or just ACK.  Do you suppose it is likely that the other
    machines all have this bug as well and the two are just spinning
    their wheels bouncing ACKs back and forth?

The fragment of code you showed is following the (original 1981) spec.
Previous steps in the processing are supposed to ensure that the comment
(that the incoming thing is a retransmission of a remote FIN) is true.

It's possible that there's a small bug in the spec here, or that our
code is broken somewhere before this point, but what I really suspect is
that the unix on the other end does something unpleasant if the
application closes the connection and then goes away before TCP has
finished the handshake sequence.

The change you made introduces a small but existent possibility that TCP
will lose data by prematurely giving up on a connection, so I'd like to
think things through a little more carefully before we make it
permanent.

Received: from XX.LCS.MIT.EDU (CHAOS 2420) by AI.AI.MIT.EDU 24 May 88 03:30:15 EDT
Date: Tue 24 May 88 03:30:19-EDT
From: Rob Austein
Subject: Re: TCB's all in use.
To: ALAN@AI.AI.MIT.EDU
cc: BUG-TCP@AI.AI.MIT.EDU, BUG-ITS@AI.AI.MIT.EDU, GUMBY@AI.AI.MIT.EDU, ZVONA@AI.AI.MIT.EDU
In-Reply-To: <384099.880524.ALAN@AI.AI.MIT.EDU>
Message-ID: <12400803897.66.SRA@XX.LCS.MIT.EDU>

The code itself is per spec, although it may not be sufficiently
paranoid.  See RFC 793, pages 38-39.  It includes a diagram which is
somewhat easier to follow than the text:

    Figure 13 "Normal Close Sequence":

          TCP A                        TCP B

      1.  ESTABLISHED                  ESTABLISHED

      2.  (Close)
          FIN-WAIT-1  -->         -->  CLOSE-WAIT

      3.  FIN-WAIT-2  <--         <--  CLOSE-WAIT

      4.                               (Close)
          TIME-WAIT   <--         <--  LAST-ACK

      5.  TIME-WAIT   -->         -->  CLOSED

      6.  (2 MSL)
          CLOSED

ITS is party "A" in this case.
COMSAT tells ITS "close this connection", ITS sends off a FIN.  Party
"B" ACKs the packet but doesn't ACK the FIN until it feels like it
(closing is half-duplex).  When party "B" decides to close too, it sends
a FIN to ITS (note the odd sequence numbers here).  ITS is supposed to
ACK this FIN so that party "B" knows the connection has indeed been
closed in everybody's opinion (i.e., FIN is considered data to the
extent that it must be ACKed).  So ITS sends the ACK and goes into the
TIME-WAIT state.  If ITS hears nothing for a certain period of time
("2 MSL") ITS assumes everything's cool and punts the TCB.  If, however,
ITS gets another FIN from party "B", ITS must assume that the ACK ITS
just sent got lost, so ITS sends another ACK and resets the timer.

Whew.  No wonder so many implementers get confused by this!

It is easy to see how a misbehaving TCP on party "B" could keep us
wedged here forever.  The RFC says that the only thing you can get when
in a TIME-WAIT state is a retransmission of the other party's FIN.
Perhaps the ITS code takes that for a law of nature rather than a
description of two working TCPs having a conversation.

It would be interesting to know if the packet that caused us to get to
TSIATW is really the {FIN,ACK} packet we're assuming.  If not, I'd think
that's immediate grounds for dropping the connection on the floor, since
it demonstrates that at least one party is seriously confused about the
current state.

If it really is a FIN that we keep getting over and over and over, it
might be reasonable to keep track of how many times we've gone through
this routine and just punt when it gets ridiculous.  I think this is
even legitimate: either the foreign machine is broken or the intervening
path is consistently losing our ACKs, and in either case it won't do any
good to send more ACKs so we might as well not bother.

Of course this is the first time I've ever tried to follow all those
silly state diagrams in TCP, so I might be completely confused.
--Rob
-------

Received: from XX.LCS.MIT.EDU (CHAOS 2420) by AI.AI.MIT.EDU 25 May 88 18:17:27 EDT
Date: Wed 25 May 88 18:17:03-EDT
From: "J. Noel Chiappa"
Subject: Re: TCB's all in use.
To: SRA@XX.LCS.MIT.EDU, ALAN@AI.AI.MIT.EDU
cc: BUG-TCP@AI.AI.MIT.EDU, BUG-ITS@AI.AI.MIT.EDU, GUMBY@AI.AI.MIT.EDU, ZVONA@AI.AI.MIT.EDU, JNC@XX.LCS.MIT.EDU
In-Reply-To: <12400803897.66.SRA@XX.LCS.MIT.EDU>
Message-ID: <12401227468.31.JNC@XX.LCS.MIT.EDU>

Rob, your analysis is 100% on the money, and your selected fix also
appears to be the Right Thing.  I suggest we do that.

Noel
-------
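
[Editorial sketch, not part of the original thread.] The fix endorsed above, counting repeated FINs in TIME-WAIT and giving up when the count gets ridiculous, can be sketched as follows. This is a minimal Python illustration, not the ITS code. `TimeWaitTCB`, `on_segment_in_time_wait`, `on_timer`, and the values of `TWO_MSL` and `MAX_FIN_RETRANSMITS` are all assumptions made for the example. A bare ACK is simply ignored without restarting the timer, which is roughly the effect of the POPJ P, patch. Dropping the connection outright in that case, as also suggested in the thread, would be an alternative.

```python
from dataclasses import dataclass

TWO_MSL = 60.0             # assumed 2-MSL interval in seconds (illustrative)
MAX_FIN_RETRANSMITS = 8    # assumed cap on re-ACKed FINs (illustrative)


@dataclass
class TimeWaitTCB:
    """Hypothetical per-connection state while in TIME-WAIT."""
    expires_at: float
    fin_retransmits: int = 0
    freed: bool = False


def on_segment_in_time_wait(tcb: TimeWaitTCB, now: float, has_fin: bool) -> str:
    """Handle a segment arriving in TIME-WAIT; return the action taken."""
    if not has_fin:
        # Not a retransmitted FIN: do not ACK and do not restart the timer,
        # so the running 2-MSL timeout still frees the TCB.
        return "ignore"
    tcb.fin_retransmits += 1
    if tcb.fin_retransmits > MAX_FIN_RETRANSMITS:
        # The peer is broken or our ACKs keep getting lost; stop trying.
        tcb.freed = True
        return "free"
    tcb.expires_at = now + TWO_MSL   # re-ACK the FIN and restart 2-MSL
    return "ack"


def on_timer(tcb: TimeWaitTCB, now: float) -> bool:
    """Free the TCB once the 2-MSL timeout has expired; return freed state."""
    if not tcb.freed and now >= tcb.expires_at:
        tcb.freed = True
    return tcb.freed
```

Under these assumed values a peer that keeps retransmitting FINs holds a TCB for at most `MAX_FIN_RETRANSMITS` timer restarts, and a peer that sends only bare ACKs cannot extend the timeout at all.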