PDP-10.its/doc/sysdoc/chaord.57

;comment Copyright (c) 1999 Massachusetts Institute of Technology
;comment See the COPYING file at the top-level directory of this project.
;comment ------------------------------
;skip 1
;list
;lftmar 350
;kset 20fg

				CHAOS	ORDER

			       **** DRAFT ****
NOTES:
Work more on dynamic flow control, see end of that section.
Data grams seem to have a bug that the two parties can never
 agree on whether they both agree that it really happened.  Am I losing?
Flush cruft at end?
Add QES, which is the same as RFC but implies you expect ANS?  Any point to it?
For ITS, a flavor of listening which queues RFCs coming in while index is busy,
 or otherwise avoids that timing problem.
-------
This table of contents has not been kept up to date *******

	Goals
	Non-goals
	Hardware Assumptions and Constraints

User appearance

	Connections
	Contact Names
	ITS implementation
	Lisp Machine implementation

Network Control Program

	Packets
	Hosts attached to more than one subnet
	Subnet and Host numbers
	Packet numbers
	Indices
	Routing
	Operations
	Table of packet field use vs opcode
	Flow and Error Control
	Dynamic Window-size Adjustment
	Uniquization
	Media Handlers
	Buffers
	ITS System Calls

Comparison with LCSNET
	Principal differences
	High priority data packets, interrupts, and flushing
	Data grams
	Multiple messages per packet
	Checksums in the packet
	Host-independent user-level protocols
	Very Small Hosts

Transmission Media

	Ethernet
	TEN11 Interface
	DL10 & DTE20
	Asynchronous line

Higher-Level Protocols

	Telnet
	File Access
	Mail
	Locate-named-service

>Goals

High speed communication between processes running in various local machines.
By "high speed", I mean much faster than the Arpanet.
At least comparable to TU20 magtape (30000 characters/second),
about 10 times the speed of the Arpanet.
(30-50 kcps seems to be the measured performance, with ITS, 10/2/78)

No undetected errors in data transmission.

Not to depend on a particular medium.  (However, we are compromising
by picking a fixed packet size.  The simplicity and efficiency are worth it.)

Simple enough to put in small pdp11's.  Also, simple to the user.

As much power as the Arpanet but, hopefully, a lot less hair.

Work well for both "telnet" and "file transfer."

The initial implementation in ITS should have the "in-system" part as
small and simple as possible.  This includes no per-host tables
in the NCP, which allows large nets and easy reconfiguration.
There is, of course, a host-name to address translation table
used by "user" programs.  The NCP has to have a per-subnet table
which remembers where the bridge(s) to that subnet from
the local subnet are.

The acknowledgement protocol must be designed not to limit
performance.

Statistical flow control ((see below.))

Avoid bottlenecks such as the Arpanet "control link".  Be immune
to transmission medium failures which cause deleted, duplicated,
or out-of-order packets.


>Non-goals

Byte sizes other than 8 bits.  (pdp10 binary transmission should
be part of a user-level file-transfer/ML-device protocol.)

Compatibility with the Arpanet.

Substituting for TEN11 interface functions such as running
the AI TV11 and XGP.

Automatic routing will be deferred.  Initially the routing
tables will be assembled into the programs.  A host needs
one routing table for each subnet it is connected to
(or one for each network interface it possesses.)


>Hardware Assumptions and Constraints

Transmission is physically in "packets" which have headers, rather
than in, e.g., continuous streams.

The chaos net (ether) interface limits the physical length
of a packet to 4097 bits including overhead bits.  The net result
is the maximum number of data bytes (excluding the header
defined by this protocol) in any packet is 488.  This limitation
will be extended to the whole network (to keep things simple).
However some provision will be made for a possible network-wide
packet-size expansion, which has already been done once.

All transmission media will be assumed to be highly-reliable
but not perfect; "perfect" reliability will be assured by having
the two ends of a connection use an acknowledgement protocol
which detects lost messages.  Transmission media are required
to lose any messages that they don't deliver intact.  (I.e. there
must be hardware checksums.)

>User appearance

The network allows user processes in various machines to
communicate with each other in various ways, for instance,
in imitation of a terminal, or in imitation of a disk file
system.  These facilities are built on top of the basic
capability to send "packets" (a header plus some data in the
form of 8-bit bytes) through the network.  The network undertakes
never to lose or garble any packets, except when the connection
is cut off entirely.

This document defines the low-level, "in-system" part of the
protocol.  On top of this, special programs (running in user-mode)
will implement the higher-level protocol that the general user
program sees.  These protocols and programs won't be discussed
further in this document, but remember that the strange packet
formats and so forth are not seen by most user programs.

>>Connections

When two processes wish to communicate, they establish a
connection between them.  This connection allows two streams
of packets to flow, one in each direction.  [Explain why
connections should be bi-directional rather than uni-directional.
Basically that's what you always want, and it makes things simpler.]

Connections are essentially the only facility provided by the network.
However, when first establishing the connection it is necessary
for the two processes to contact each other, and make each
other known to their respective operating systems.  In addition,
it is often the case (in the usual user-server situation) that
one of the processes does not exist beforehand, but is to be created
and made to run a specified program.

>>Contact Names

The way we choose to implement contacting is to say that one process
is always a "user" and one process is always a "server".  The server
has some "contact name" to which it "listens".  The user requests its
operating system to connect it to a specified contact name at a
specified host.  If a process at that host is listening to that
contact name, the two are connected.  If no one is listening to that
contact name, the operating system must create a server process
which will load itself with the appropriate program and connect up.

Discovering which host to connect to to obtain a given service
is an issue for higher-level protocols.  It will not be dealt
with at all initially (that is, there will be a table of host
names and numbers and the user will have to enter the name.)

Once the connection has been established, there is no more need for
the contact name, and it is discarded.  Indeed, often the contact name
is simply the name of a network protocol (such as "telnet") and several
users may want to have connections to that service at the same time,
so contact names must be "reusable."  (In the other common case, the
contact name will be a "gensym".)

As far as the operating systems involved are concerned, contact names
are simply arbitrary ascii strings defined by user programs.  It is
expected that the various higher-level protocols will define standard
contact names; for instance, to get the telnet protocol one would
connect to "telnet"; to get the file transfer protocol one would
connect to "file-transfer".  If a machine receives a request to connect
to a contact name which no one is currently listening to, a server
process must be created and made to execute a program which decides,
from the contact name, what server program to load and execute, or else
to refuse the request for connection.

Contact names have no relation to file names; they are simply
a device for introducing two processes to each other.  If one was
using the network to transfer a file, one would first contact
the file transfer server at the appropriate host, then send a
packet containing the name of the file to be accessed.


>>ITS system calls

Ordinary user programs will not access the network directly; they will
go indirectly through a job-device or sty-type program which will
use a higher-level protocol to make the network look like what the
user wants, the traditional things being a terminal and a disk
file system.

Since these intermediate user-mode programs for using the network will
exist, there is no reason for the interface to the low level network
provided by the system to look at all like a standard device. Instead,
it will be designed solely for simplicity and ease of implementation,
and for a certain degree of efficiency.  This interface will be
described after the interface between Network Control Programs in
different machines (the low-level protocol) is described.

At some future time the intermediate programs might get moved into the
system for reasons of efficiency, but that should not be allowed to
complicate the initial implementation.

As of October 1978, the opening and closing of connections is completely
device-dependent, as are "status"-type operations; however, byte-string
I/O is supported in a fashion which is device-independent except when
errors occur, which is handy in several programs.  Packet-level,
device-dependent I/O is also supported.

The .INSRT-able file of routines NETWRK will be augmented to handle
both the Chaos net and the Arpa net.


>>Lisp Machine implementation

In the case of the Lisp Machine, the only distinction between user
programs and system programs is who maintains and documents them,
and how carefully.

(More?)

>Network Control Program

This is the part of the operating system(s) that implements the network
(obviously).

>>Packets

The NCP's operate by exchanging packets.  A packet consists of a
header containing control information, and zero or more 8-bit bytes of
data. Hardware restrictions of the Chaos net interface
restrict the maximum length of a packet to 253 16-bit words.  In fact,
we will limit it to 252 words (to make packet buffers in pdp10's be 128
words including two overhead words).  Again for the convenience of
pdp10's, the header should be an even number of 16-bit words.
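
The size figures above can be checked with a little arithmetic.  The
following is only an illustrative sketch in Python; the names are
invented here, and the 8-word header size is the one given later in
this section.

```python
WORD_BITS = 16
MAX_PACKET_WORDS = 252   # protocol limit; the hardware would allow 253
HEADER_WORDS = 8         # header size stated in this section

# Inside a pdp10, two 16-bit words are stored in each pdp10 word.
pdp10_data_words = MAX_PACKET_WORDS // 2
pdp10_buffer_words = pdp10_data_words + 2      # plus two overhead words

# Data bytes left over once the header is subtracted.
max_data_bytes = (MAX_PACKET_WORDS - HEADER_WORDS) * WORD_BITS // 8

# Total bits in a maximum-size packet, header included.
packet_bits = MAX_PACKET_WORDS * WORD_BITS

print(pdp10_buffer_words, max_data_bytes, packet_bits)
```

This gives a 128-word pdp10 buffer and agrees with the 488-byte data
limit quoted for nbytes.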

In this section the packets will be described as they look to a pdp11.
They look the same inside a Lisp Machine, since the byte structure is the
same.  Inside a pdp10, packets are stored with two 16-bit words
left-adjusted in each pdp10 word. Additionally, the bytes in the data
portion of the packet are swapped so as to put them in pdp10 standard
order.  pdp11's that act as network interfaces for pdp10's will be required
to do this byte swapping since they're likely to have more time available
than the 10 to do it in, and can also do it faster, having a special
instruction for it.  pdp10's that communicate directly to the network will
have hardware assistance for byte reshuffling in their interfaces.  See the
transmission media section for how packets are encapsulated during
transmission through the various media.

The header is 8 16-bit words and contains the following fields:

	-----------------------
	|opcode(8) | unused(8)|
	-----------------------
	|fc(4) |  nbytes(12)  |
	-----------------------
	|  destination host # |
	-----------------------
	|  destination index  |
	-----------------------
	|    source host #    |
	-----------------------
	|    source index     |
	-----------------------
	|      packet #       |
	-----------------------
	|    ack packet #     |
	-----------------------

	opcode - tells the receiver of the packet how to interpret
		it.  See the Operations section below.
		This is 8 bits long.  The 128 opcodes with high
		order bit =0 are for NCP use.  The 128 with high
		order bit =1 are for user-program use.

	unused	8 bits reserved for future use.

	fc - forwarding count. 4 bits which count the number of times this
		packet has been forwarded by bridges.  Initially this field
		is always generated as zero.  Each bridge increments it;
		if it overflows, there is assumed to be a loop
		and the packet is discarded.

	nbytes - the number of 8-bit bytes of data in the data part.
		The maximum value of nbytes is 488.  The minimum is 0.
		This is 12 bits long to allow for 4K-bit packets.
		(Actually 12 bits is enough for up to 32K-bit packets.)

	destination host #
		This is divided into two 8-bit fields.  The high
		byte specifies which subnet.  The low byte specifies
		which host on that subnet, and (on ethernet subnets)
		is identical to the hardware host number.  Neither
		field may be zero in a valid host number.

	destination index - index for this connection assigned by the
		destination host's NCP.

	source host # - see destination host #.

	source index - index for this connection assigned by the
		source host's NCP.

	packet # - an ascending reference number used in error and
		flow control (see below).

	ack packet # - used in error and flow control (see below.)
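
As a rough illustration of the layout above, here is a sketch in Python
that packs and unpacks the header as a list of eight 16-bit words.  It
assumes that the field drawn on the left of each word in the diagram
occupies the high-order bits; the diagram does not say so explicitly.
The names Header, pack_header, unpack_header and split_host are
hypothetical and appear nowhere in the real NCP.

```python
from dataclasses import dataclass

MAX_NBYTES = 488

@dataclass
class Header:
    opcode: int
    fc: int
    nbytes: int
    dest_host: int
    dest_index: int
    src_host: int
    src_index: int
    packet_no: int
    ack_no: int

def pack_header(h):
    """Return the header as a list of eight 16-bit words."""
    if not 0 <= h.nbytes <= MAX_NBYTES:
        raise ValueError("nbytes out of range")
    return [
        (h.opcode & 0o377) << 8,            # opcode; unused byte left zero
        ((h.fc & 0o17) << 12) | h.nbytes,   # fc(4), nbytes(12)
        h.dest_host & 0xFFFF,
        h.dest_index & 0xFFFF,
        h.src_host & 0xFFFF,
        h.src_index & 0xFFFF,
        h.packet_no & 0xFFFF,
        h.ack_no & 0xFFFF,
    ]

def unpack_header(words):
    """Inverse of pack_header."""
    if len(words) != 8:
        raise ValueError("header is 8 words")
    return Header(
        opcode=words[0] >> 8,
        fc=words[1] >> 12,
        nbytes=words[1] & 0o7777,
        dest_host=words[2], dest_index=words[3],
        src_host=words[4], src_index=words[5],
        packet_no=words[6], ack_no=words[7],
    )

def split_host(host):
    """Split a 16-bit host number into (subnet, host-on-subnet)."""
    subnet, number = host >> 8, host & 0o377
    if subnet == 0 or number == 0:
        raise ValueError("neither field may be zero in a valid host number")
    return subnet, number
```

Working in whole 16-bit words sidesteps the question of byte order
within a word, which differs between pdp11's and pdp10's.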

>>Hosts attached to more than one subnet

(This also applies to hosts with more than one interface to
the same subnet, if there ever are any.)

Such hosts ought to act as bridges.  That is, if a packet
is received which is not addressed to this host, it should
be sent back out, using this host's routing tables.  The
forwarding count should be used to prevent infinite loops
in the event of inconsistent forwarding tables in two bridges.
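
The bridge rule can be sketched as follows (Python, for illustration
only).  The function name and the returned tags are invented here, and
the sketch assumes that "overflow" means an increment past the largest
4-bit value.

```python
FC_LIMIT = 0o17    # largest value that fits in the 4-bit forwarding count

def bridge_action(dest_host, my_host, fc):
    """Decide what a bridge does with a received packet.

    Returns ("receive", fc), ("forward", new_fc) or ("discard", fc).
    """
    if dest_host == my_host:
        return ("receive", fc)
    if fc >= FC_LIMIT:
        # Incrementing would overflow the field: assume a routing loop.
        return ("discard", fc)
    return ("forward", fc + 1)
```

Under that assumption a packet generated with a count of zero can be
forwarded at most 15 times before it is discarded.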

It is undesirable for a host to have more than one number.
So a host connected to multiple subnets should choose one
subnet as its "home", which is the address which is advertised
as that host.  The host's other network connections are
un-named bridges.  In some cases it may be preferable
not to pick a "home" subnet; instead, one invents a new
private subnet which only has that one host on it,
and all the host's network connections act as bridges
to that subnet (also to each other).

The routing should be set up so that packets destined for such
a host from a subnet on which it has an interface choose that
interface as the bridge to that host, so that in fact data flows
the same way as if the host had more than one number and the
host-name to host-number lookup magically chose the right number.


>>Subnet and host numbers.

Subnet numbers are arbitrary.  Host numbers are assigned according to
position on the cable, as explained (elsewhere).

These numbers may be found in the file SYSENG;HOSTS >

The physical cable address of a host should include both its subnet number
and its host number (prior to October 1978 the subnet field of a physical
address was always 0).  This allows the physical address to be used as
a unique machine identifier and makes it possible for a host to discover
its full 16-bit host address without prior knowledge.

>>Packet numbers

Each time the sending user puts another packet into the network, this
number is increased by one.  (These numbers are independent for the
two directions of a connection.)  The receiver uses these numbers to
get the packets into order and ensure that there are no duplications
nor omissions.  The packet numbers are 16 bits and wrap around to zero
when they overflow.  When the connection is first opened, an initial
value for the packet# is established.  If it was 0, then the packet#
of the first data packet would be 1.

Packet #'s should be compared modulo 2**16.  On pdp11's, use

	CMP A,B
	BMI <A is less>		(BMI rather than BLT or BLO)

On pdp10's, use

	SUB A,B
	TRNE A,100000		(rather than CAMGE A,B)
	 JRST <A is less>

On Lisp machines, use

	(AND (BITTEST 100000 (- A B))
	     <A is less>)	[rather than (AND (< A B) ...)]

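The same test in a language with unbounded integers might look like
the sketch below (Python, for illustration; the function name is
invented here).  It tests bit 15 of the difference, exactly as the
three fragments above do.

```python
def packet_less(a, b):
    """True if packet number a precedes b, comparing modulo 2**16."""
    return ((a - b) & 0o100000) != 0

# 177777 (octal) precedes 0 once the numbers have wrapped around,
# which an ordinary "<" comparison gets wrong.
print(packet_less(0o177777, 0), 0o177777 < 0)
```

The comparison is only meaningful when the two numbers are less than
2**15 apart.
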
>>Indices

Each connection has two indices assigned to it, one at each end.  Each
index is an arbitrary 16-bit number assigned by the NCP at its end; usually
it is an index into that NCP's tables.  Indices are required to be
non-zero.  For maximum simplicity, all packets include both indices.  The
receiver of a packet uses the destination index to find out who to give the
packet to.  Generally the source index is used only for error checking, but
when a connection is first opened the source index has to be saved and used
as the destination index in future packets in the reverse direction.

To prevent packets somehow left over from old connections from
interfering with new connections, we require that a certain time
elapse between successive connections between the same pair of hosts
with the same index numbers at each end, this time to be longer than
the maximum time a packet is reasonably expected to sit around in
the network somewhere (the maximum transit time through all bridges, etc.)
This requirement is implemented by making part of the index number be
a "uniquizer"; when a host reuses a slot in its tables, it increments
the uniquizer, so that the index number for that slot is not the same
as it was previously.  Then if the uniquizer field is sufficiently wide,
and the rate of creation of connections (actually the rate of allocation
of indices) is sufficiently low, the requirement will be satisfied.
For the kind of network we're talking about, the time is a few tens of
seconds, and the uniquizer need only be a few bits wide.  It is up
to each host how wide it makes the uniquizer, depending on how big
it wants its tables to be.  It is best if table slots are also
allocated circularly, so that slots are reused at the minimum possible rate.

The uniquizer also serves to "more or less" uniquely identify connections
so that duplicate copies of a Request For Connection (RFC) packet
can be identified and discarded.
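
One possible way to build such indices is sketched below in Python.
This is an assumption-laden illustration, not the ITS or Lisp Machine
code: the class name, the split of the 16 bits into a slot number (low
bits) and a uniquizer (high bits), and the default widths are all
hypothetical.

```python
class IndexTable:
    def __init__(self, slot_bits=4):
        self.slot_bits = slot_bits
        self.nslots = 1 << slot_bits
        self.uniq_mask = (1 << (16 - slot_bits)) - 1
        self.uniquizer = [0] * self.nslots
        self.in_use = [False] * self.nslots
        self.next_slot = 0          # slots are allocated circularly

    def allocate(self):
        """Return a fresh non-zero 16-bit index, or None if the table is full."""
        for i in range(self.nslots):
            slot = (self.next_slot + i) % self.nslots
            if not self.in_use[slot]:
                self.in_use[slot] = True
                self.next_slot = (slot + 1) % self.nslots
                u = (self.uniquizer[slot] + 1) & self.uniq_mask
                if u == 0:
                    u = 1           # never zero, so the index is never zero
                self.uniquizer[slot] = u
                return (u << self.slot_bits) | slot
        return None

    def free(self, index):
        self.in_use[index & (self.nslots - 1)] = False
```

Because the search starts just past the last slot handed out, a freed
slot is not reused until the allocator has gone all the way around the
table.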

A user process's "capability" or "channel" to a connection, used by it
to ask the NCP to operate on that connection, simply contains the
appropriate index.

Associated with each index the NCP has a "state", the host # and index
# of the other end of the connection, some read buffers and associated
variables, including a current packet #, and some write buffers and
associated variables, again including a current packet #.

The "state" can be Closed (no connection or other activity currently
associated with this index), Open (this index has a connection to
another index at another machine), RFC-sent (requested another machine
for a connection, but no answer yet), Listen (listening for a request
for connection to a certain contact name), Broken (connection closed
abnormally by network or machine lossage), and RFC-received (waiting
for a server process to get going and pick up a request for connection
that came in).


>>Routing

This section is a place-holder.  Initially routing will be kludged with
a fixed table.  Once the network is set up automatic routing will
be put in.

The routing decision consists of looking at the destination subnet
field of a packet and deciding which interface (on a multi-interface
host) to transmit it on, i.e. what is the best route to that subnet.
In addition, if the destination is not on a subnet to which there
is a direct interface, one must determine what is the host number of
the bridge it should be sent to.

It also involves recognizing packets which are destined to the
current host and receiving them.


The following is not yet implemented.  It is an initial plan for routing.

Gateways will broadcast RUT packets periodically, perhaps once a minute.
The destination field is zero and the source field is the address of the
gateway on the net on which the packet is being broadcast.  The data field
contains a bunch of 16-bit words, divided into fields in the same way as
a host address.  The subnet field of each word contains the number of a
subnet which this gateway is able to reach.  The host field of each word
contains the hop count, which is 0 if the gateway is physically connected
to the specified (non-ether) subnet, 1 if the gateway is connected to
the specified (ether) subnet, or 1+ the hop count of the closer gateway
if this gateway indirects through another.  The hop counts allow the gateways
to avoid loops when there are multiple paths; they may not need to be
looked at by non-gateway hosts.

Each host maintains a routing table, with entries keyed by subnet number.
For subnets to which the host is physically connected, the entry points
to the host's physical interface.  For subnets which the host knows how to
get to because it has been informed of a gateway via a RUT packet, the entry
contains the gateway's host address, the hop count, and the time that this
route was last confirmed by a RUT packet.  If the NCP remembers only one
route to a given subnet, it wants to remember the one with the smallest
hop count unless that one is more than (say) 5 minutes old.  If the NCP
remembers all the routes, it should stop using ones which are more than
(say) 5 minutes old.  It is not clear that it should ever use one with
a hop count larger than the minimum.  The reason for the time-out is to
avoid forever losing packets by sending to a gateway which has gone down.
If there are no gateways up and known about by the NCP which reach a given
host, it will look like that host is down.
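
Since this plan is not implemented, the following Python sketch is
necessarily speculative.  It shows one reading of the
"remember only one route" variant: keep the route with the smallest hop
count unless it has not been confirmed for 5 minutes.  The class and
method names are invented, times are plain seconds supplied by the
caller, and directly connected subnets are left out.

```python
ROUTE_TIMEOUT = 5 * 60    # seconds; the "(say) 5 minutes" above

class RoutingTable:
    def __init__(self):
        self.routes = {}  # subnet -> (gateway, hops, time_confirmed)

    def note_rut(self, gateway, words, now):
        """Process the 16-bit data words of a RUT packet from gateway."""
        for w in words:
            subnet, hops = w >> 8, w & 0o377
            old = self.routes.get(subnet)
            if (old is None
                    or old[0] == gateway             # same route: refresh it
                    or hops < old[1]                 # strictly better route
                    or now - old[2] > ROUTE_TIMEOUT):  # old route has expired
                self.routes[subnet] = (gateway, hops, now)

    def gateway_for(self, subnet, now):
        """Return the gateway to use, or None if no live route is known."""
        route = self.routes.get(subnet)
        if route is None or now - route[2] > ROUTE_TIMEOUT:
            return None
        return route[0]
```

A None result corresponds to the case where the destination simply
looks as though it is down.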

>>Operations

This section tells what the values of the opcode field in a packet are, and
how an NCP responds to each one.

1    RFC - Request for Connection

This message is sent from user to server in order to open a
connection.  The data contains the contact name.  Actually, if
the data contains a space, the characters before the space are
the contact name and the characters after the space are "arguments".
This is done so that simple transactions can work (see below).
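
A receiving NCP might separate the two parts roughly as follows.  This
is a Python sketch with an invented function name; it assumes the split
is made at the first space.

```python
def parse_rfc_data(data):
    """Split RFC data bytes into (contact_name, arguments).

    arguments is None when the data contains no space.
    """
    name, space, args = data.partition(b" ")
    return name, (args if space else None)
```

For example, data of b"telnet" yields the contact name alone, with no
arguments.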

The destination index is zero, because it is not known yet.  The
responses are:

	OPN, if a server process is found or created that wishes
	to accept the request and open up a connection.

	CLS, if the connection cannot be opened.  The data field
	contains an ascii explanation of why not.

	ANS, if a simple transaction is occurring.  The data contains
	the answer, and no connection is ever created.

	FWD, if there is no server process at this host, but there
	might be a suitable one some place else.

There may also be no response, if the RFC was lost in the network, or
the destination host is down, or the reply was lost in the network.

To increase the reliability of establishment of connections, RFC's and
OPN's are retransmitted, just as data packets are, until a response is
elicited.  CLS, ANS, and FWD cannot be retransmitted because we take
the position that retransmission is implemented by something associated
with a connection, and these packets are not sent by connections (and are
not acknowledged).  OPN is sent by a connection, and RFC is sent by
a sort of embryonic connection which continues to exist while it awaits
an OPN or other reply.

Since RFC is retransmitted, it is the responsibility of the NCP to detect
and discard duplicates.  When an RFC is received, all existing connections
in the OPEN or RFC-RECEIVED state, and all "pending" RFC's which are awaiting
servers to connect to them (if the NCP has such), should be checked to see
if they have the same source host and index number as the received RFC.
If so, the RFC is a duplicate and should be ignored.  Note that connections
in the LOST, CLOSED, or BROKEN states should not be checked, since these
are not really connections as far as the foreign host is concerned, but simply
ghosts of connections left around to remember error status for their controlling
user program.
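
The duplicate check might be sketched like this (Python, for
illustration).  The tuple representation of connections, the state
spellings, and the function name are assumptions made for the example.

```python
CHECKED_STATES = ("OPEN", "RFC-RECEIVED")

def is_duplicate_rfc(src_host, src_index, connections, pending=()):
    """Return True if an incoming RFC repeats one already seen.

    connections: iterable of (state, foreign_host, foreign_index).
    pending: iterable of (foreign_host, foreign_index) for RFC's that
    are still waiting for a server, if the NCP keeps such a list.
    """
    for state, host, index in connections:
        if state in CHECKED_STATES and (host, index) == (src_host, src_index):
            return True
    return (src_host, src_index) in list(pending)
```

Connections in other states are deliberately skipped, so a ghost left
in the BROKEN state does not cause a fresh RFC to be thrown away.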

Since the response to RFC is not guaranteed, processes issuing RFC's
must have timeouts.  In most implementations the normal host-down timeout
will suffice.

[We should discuss why the special kludgery for control packets, rather
than using the regular connection mechanism to assure reliable transmission
of control packets, as the Arpanet does.  Also a discussion of which control
operations inherently need reliability and which inherently don't, and
why that should be so.]

The packet # field contains the first packet # that will be assigned
to data transmitted from the user process, minus one modulo 2**16.  In
the simplest case, this can be zero, and the first packet sent will be
packet # 1.  One might also imagine uniquizing the packet numbers
as an extra error check, but this should not be necessary, because
the indices are uniquized, and connections must be explicitly agreed
to by each end before data can flow.


2    OPN - Connection Open

This is the positive acknowledgement to RFC.  The source index field
conveys the acknowledger's connection index to the requester.  The packet #
field contains the first packet # that will be assigned to data transmitted
from the server process, minus one modulo 2**16.  The data portion of this
packet is the same as a STS packet (see below), and mainly serves to convey
the server's window-size to the user.  The ack packet # field must contain
the usual value, i.e. the number that was sent in the packet # field of the RFC.
The receipt and ack in the OPN serve primarily to terminate retransmission of the RFC.

When an OPN is received, a STS is sent in response, telling the server
the user's window-size.  The exchange of an OPN and a STS also serves
as acknowledgement to each side that the other believes the connection
is open.  No data packets may be committed to the network until after
this has occurred.  If this rule was not followed, and packets happened
to get out of order, a data packet could arrive before the connection
was open and cause a LOS.

To improve the reliability of establishment of connections, OPN's are
retransmitted, just as data packets are, until receipted (and
acknowledged) by STS.  Because of this retransmission, the NCP must
detect and discard duplicate OPN's.  If an OPN is received for a connection
which exists and is not in the RFC-SENT state, the OPN should be ignored
and no LOS should be sent.

OPN's contain 16-bit data and are not byte-swapped.  See below under opcode 300.

3    CLS - Connection Closed

CLS is the negative response to RFC.  It indicates that no server was
listening to the contact name, and one couldn't be created, or for
some reason the server didn't feel like accepting this request for a
connection, or the destination NCP was unable to complete the
connection (e.g. connection table full.)  The destination index will
be the source index of the RFC.  The source index will be zero because
the NCP did not put this connection into its tables.  The data bytes,
if there are any, contain an ascii explanation.

CLS is also used to close a connection after it has been open for a while.
In the Arpanet, the NCP undertakes not to close the connection when the
user requests it, but waits until all data transfer has completed.  This is
a source of extra complexity, since data transfer may be hung up, there
have to be timeouts, there have to be connections waiting to be closed
which aren't owned by any user, etc.  It seems simpler to make CLS take
effect immediately, and let the user processes assure that data transfer
has been completed.  Note that telnet-like applications don't need this
assurance, and ftp-like applications have to provide it separately from
closing anyway.

Since there is no error recovery or retransmission mechanism for CLS,
the use of CLS is necessarily optional.  However, it is desirable to
send a CLS when possible to decrease user confusion.

4    FWD - forward a request for connection

This is a response to RFC which indicates that the desired service
is not available at the process contacted, but may be available at
a possibly-different contact name at a possibly-different host.  The
data field contains the new contact name and the ack packet # field
contains the new host number.  The issuer of the RFC should issue
another RFC to that address.

5    ANS - answer to a simple transaction

Simple transactions are transactions which consist of a single question
(request) and a single answer (response).  They have no side effects,
so it is not necessary for either party to know whether the other party
thinks the transaction was completed or not.  Simple transactions need
no flow control and no error control, other than a timeout by the user
side to detect lost messages.  They are a simple, efficient way for
doing simple-minded things such as extracting information (such as the
time or a host name table) from a central server.

A simple transaction is initiated by sending an RFC to a server which
happens to use simple transactions rather than full connections.  The
data field of the RFC consists of the contact name of the server,
optionally followed by a space and arguments to the request.  The
server responds by sending an ANS packet, whose data field is the
answer.  The destination address of the ANS comes from the source
address of the RFC.  The source address, packet #, and ack packet #
fields of the ANS are not used and should be zero.

The difference between simple transactions (2 packets) and
full datagrams (4 packets) is that in the simple transaction the two
parties don't have to agree about whether or not the transaction
in fact took place, while in the full datagram they do, making
acknowledgement necessary.

The server of a simple transaction should be prepared to process
the transaction multiple times without error.  Simple transactions
should not have side effects which would be dangerous if repeated.

200-377 DATA - Transmits Data

The data portion of the packet is data being sent through the
connection.  The packet # is a number that increments by one for each
data packet sent in this direction on this connection.  This is used to
detect lost packets (which includes packets garbled in transmission and
packets lost in the statistical flow control scheme) and duplicated
packets (caused by lost or delayed acknowledges).  The NCP undertakes to
deliver the packets to the destination process in the same order that
they came from the source process, with no duplications and no
omissions.  Note that any opcode with the sign bit on is a data packet
as far as the NCP is concerned; if they wish, higher-level protocols
may use the opcode field to define various different kinds of data
packets.  Thus, what is herein called a data packet may be a "control"
packet to a higher-level protocol.  Normally, opcodes 200 and 300
should be used.  Opcodes 201-277 and 301-377 are to be used for special
purposes, as defined by higher-level protocols.

Opcodes 300-377 are defined to be "16-bit data".  Note that this does not
affect the byte count in the packet header; it is still a count of 8-bit
bytes.  The sole effect of 16-bit data is to prevent byte-swapping when
communicating with pdp10's; pdp10's store the 2 8-bit bytes in a 16-bit
"word" in the reverse of the network standard order (defined by pdp11's
and Lisp machines).  ((For purposes of 16-bit data, Interdata machines
are considered pdp10's.))
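
The opcode ranges and the swapping rule can be summarized in a short
Python sketch.  The function names are invented here.  The treatment of
a trailing odd byte (left where it is) is an assumption, as is applying
the swap to every opcode not marked as 16-bit.

```python
OPN, STS, RUT = 0o2, 0o7, 0o10   # NCP opcodes that also carry 16-bit data

def is_data(opcode):
    """Any opcode with the sign bit on is a data packet."""
    return (opcode & 0o200) != 0

def is_16bit(opcode):
    return 0o300 <= opcode <= 0o377 or opcode in (OPN, STS, RUT)

def swap_byte_pairs(data):
    """Exchange the two bytes of each 16-bit word."""
    out = bytearray(data)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return bytes(out)

def data_for_pdp10(opcode, data):
    """Data portion as it would be handed to a pdp10."""
    return data if is_16bit(opcode) else swap_byte_pairs(data)
```

The byte count in the header is the same in either case; only the
order of the bytes within each word changes.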

6    SNS - sense status

This packet is a request for a STS packet to be returned.  It is used for
"probing", see the section on flow and error control.

Note that, to avoid a timing error, a SNS received by a connection in the
RFC-sent state should be ignored.  This can happen if an OPN is transmitted
followed by a SNS and the packets get out of order.

SNS should not be transmitted on a connection that is not in the Open state.


7    STS - report status

STS is used for a variety of purposes.  It is the vehicle to carry an
acknowledgement, when no data packet is being sent on which the
acknowledge could be piggy-backed.  STS is used to set or change the
window size, to acknowledge opening of a connection, and to carry
receipts (see the flow control section.)  In the future STS will (may)
be used to carry information used in automatic window-size adjustment.
Like most packets, the ack packet# field of STS carries an
acknowledgement, the number of the last packet given to the user
process.  The first two bytes of the data field (low-order byte first)
carry a receipt, the number of the last packet guaranteed to be given
to the user process eventually.  The next 2 data bytes carry the window
size (low-order byte first).  Additional data bytes will be defined in
the future.

STS's contain 16-bit data and are not byte-swapped.  See above under opcode 300.
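
The layout of the STS data field described above can be sketched as
follows.  This is only an illustration; the function names are
hypothetical, and any data bytes past the first four are simply
ignored since they are not yet defined.

```python
import struct


def pack_sts_data(receipt, window_size):
    """Build the first four data bytes of a STS: receipt, then window
    size, each 16 bits with the low-order byte first."""
    return struct.pack("<HH", receipt & 0xFFFF, window_size & 0xFFFF)


def unpack_sts_data(data):
    """Return (receipt, window_size) from a STS data field."""
    receipt, window_size = struct.unpack_from("<HH", data, 0)
    return receipt, window_size
```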

10   RUT - routing information

This packet type is reserved for the future, when automatic routing exists.

RUT's contain 16-bit data and are not byte-swapped.  See above under opcode 300.

11   LOS - you are losing

If a host receives a packet for a connection that does not exist (other
than RFC which isn't associated with a particular connection, LOS, and
CLS which is safe to ignore), it should return a LOS packet to the
source of the offending packet.  The source of the LOS should be the
destination of the offending packet, and the packet# and ack packet#
fields should be copied.  The data portion of a LOS contains an ascii
explanation of the problem.  A host receiving a LOS should break the
connection specified by the destination index and inform the associated
process that something has gone wrong.  It should make the LOS packet
available to that process so the explanation of the problem can be
read.
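
A minimal sketch of how a LOS might be built from the offending packet
is given here.  The Packet record and its field names are invented for
the example; only the swapping of source and destination and the
copying of the two packet number fields come from the text above.

```python
from dataclasses import dataclass

LOS = 0o11   # opcode 11 (octal), as listed above


@dataclass
class Packet:
    # Hypothetical in-memory form of a packet header plus data.
    opcode: int
    dst_host: int
    dst_index: int
    src_host: int
    src_index: int
    packet_no: int
    ack_no: int
    data: bytes = b""


def make_los(offender, reason):
    """Build the LOS to return for an offending packet."""
    return Packet(
        opcode=LOS,
        dst_host=offender.src_host,     # goes back to the source
        dst_index=offender.src_index,
        src_host=offender.dst_host,     # appears to come from the
        src_index=offender.dst_index,   # offending packet's destination
        packet_no=offender.packet_no,   # both number fields are copied
        ack_no=offender.ack_no,
        data=reason.encode("ascii"),    # ascii explanation
    )
```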

The LOS packet isn't actually necessary, since if the supposed other
end of a connection refuses to cooperate (i.e. never sends any
packets), after a while the NCP will give up, close the connection, and
inform the user that the foreign host appears to be dead.

For debugging, an echo feature is implemented as follows.  If you send
a packet with a data opcode and source and destination indices of 0,
it will be sent back as a LOS packet.  The data field will be destroyed
by the error explanation, but the packet # and ack packet # fields can
be used to remember any sequencing or checking information.

12   LSN - listen (never transmitted through the net, see below)

13   MNT - maintenance

Normal NCPs will discard MNT packets, without generating a LOS.  This packet
type is reserved for use by maintenance programs.

>>Table of packet field use vs opcode

The unused field is never used and must be zero, the forwarding count
field is always used in the same way, and the nbytes field is always
the length of the data.  Fields marked "0" below are don't care
rather than must be zero, but zero is certainly recommended.  The packet#
field of CLS, SNS, and STS would best be set the same as the packet# of
the next data packet to be sent (just as in RFC and OPN).

	 Destination	   Source
Opcode	Host	Index	Host	Index	 Packet#     Ack pk#	Data
------	----	-----	----	-----	 -------     -------	----
 RFC	usual	  0	usual	usual	first - 1	0	contact name

 OPN	usual	usual	usual	usual	first - 1     usual	0, window size

 CLS	usual	usual	usual	  0	    0	        0	reason

 ANS	usual	usual	usual	  0	    0	        0	answer

 FWD	usual	usual	usual	  0	    0	     new host	contact name

 SNS	usual	usual	usual	usual	    0	      usual	   0

 STS	usual	usual	usual	usual	    0	      usual  receipt#, window size

 LOS	src h	src i	dst h	dst i	   pk#	    ack pk #	reason

 LSN	  0	  0	  0	  0	    0		0	contact name

Data	usual	usual	usual	usual	  usual	      usual	data

 MNT	completely nonstandard

 RUT	completely nonstandard

>>Flow and Error Control

The NCPs conspire to ensure that data packets are sent from user to
user with no duplications, omissions, or changes of order.
Secondarily, the NCPs attempt to achieve a maximum rate of flow of
data, and a minimum of overhead and retransmission.

The transmission medium is required to lose all damaged packets.  Therefore
error control reduces to retransmission of lost packets, plus immunity to
duplicated and out-of-sequence packets.

The following concepts must be explained:  the window, acknowledgement,
receipt, retransmission, and probing.

The window is the set of data packets "in the network" between the
sending process and the receiving process.  Conceptually the window
slides along as transmission proceeds.  When a packet is acknowledged,
that means that the window is to the right of that packet, since the
receiving process has gobbled that packet.  The window has a fixed size
to limit the amount of buffer space that is used up if the sender sends
faster than the receiver receives.  If the sending user process tries
to transmit more packets than the window size allows, it should be made
to wait until some packets have been acknowledged.  The window is made
sufficiently large to regulate how often acknowledgements must be
returned.  Note that the window includes only data packets, not
control packets.  Control packets are to be processed immediately
when they are received, and do not take up buffer space.  Separate
mechanisms are provided to deal with control packets being lost in
the network.

No sender is actually required to pay any attention to the window size.
No receiver is actually required to set the window size to something
reasonable.  However, those hosts that want to maximize performance
should do something about the window size.  The size is initially set
during the RFC/OPN/STS dialogue, presumably according to the type of
protocol being used.  An NCP may, if it chooses, dynamically adjust the
window size according to observed network behavior.  (See dynamic
window-size adjustment section below.)

An acknowledgement is a signal from a receiver to a sender that all
packets through packet number "n" have been given to the receiving
process, therefore the window should go from n+1 through n+window_size.
Note that acknowledgements can get out of order, so one should use
the maximum (mod 2^16) of all the acknowledgements ever received
as the start of the window.  Since acknowledgements are so common,
there is a field (ack packet #) in the data packet which allows
an acknowledgement to be "piggy-backed" on another packet.
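
Taking the maximum modulo 2^16 can be sketched as follows.  The
comparison rule used here (a is later than b if it is less than 2^15
ahead of it) is one common reading of "maximum (mod 2^16)" and is an
assumption; the names are hypothetical.

```python
MOD = 1 << 16


def seq_later(a, b):
    """True if packet number a is later than b, modulo 2^16."""
    return a != b and ((a - b) % MOD) < MOD // 2


def newest_ack(current, incoming):
    """Keep the maximum (mod 2^16) of the acknowledgements seen so
    far, so that an old ack arriving late cannot move the window
    backwards."""
    return incoming if seq_later(incoming, current) else current


def window_range(ack, window_size):
    """First and last packet numbers of the window, n+1 through
    n+window_size, wrapped modulo 2^16."""
    return (ack + 1) % MOD, (ack + window_size) % MOD
```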

A receipt is a signal from a receiver to a sender that all packets
through packet number "n" have been received successfully by the NCP
and are guaranteed to be delivered to the user process, therefore they
need not be retransmitted.  Note that acknowledgement implies receipt.
The separate receipt mechanism is supplied so that useless
retransmissions can be limited, when the data have been received but
cannot be acknowledged because the receiving process is being slow
about reading them.  The STS packet is used to send a
receipt.  Receipts are optional.

Retransmission is the process of sending all unreceipted data packets
in the sender's queue through the network to the receiver, except those
that were last sent very recently (within 1/30'th of a second in ITS.)
Retransmission occurs every 1/2 second, and when a STS packet is
received.  The idea of retransmission is to keep sending a packet until
it has been proven to have successfully reached its destination (by
receipt or acknowledgement.)  The reason retransmission occurs in
response to STS is so that a receiver may cause a faster retransmission
rate than twice a second, if it so desires.  Since STS carries a
receipt, and very-recently-transmitted packets are not retransmitted,
this should never cause useless retransmission.

A probe is the sending of a SNS packet, in the hope of eliciting
either a STS or a LOS, depending on whether the other side believes
in the connection.  Probing is used periodically as a way of testing
that the connection is still open, and also serves as a way to get STS
packets retransmitted as a hedge against the loss of an acknowledgement,
which could otherwise stymie the connection.

We probe every five seconds, on connections which have unacknowledged
packets outstanding (a non-empty window), and on connections which have
not received any packets for one minute.  If a connection receives no
packets for 1 1/2 minutes, this means that at least 5 probes have been
ignored, and the connection is declared to be broken.
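
The probing rule might be sketched like this.  The intervals are the
ones given above; the function, its arguments, and the idea of calling
it periodically from a clock routine are assumptions made for the
example.

```python
PROBE_INTERVAL = 5.0       # seconds between probes
IDLE_BEFORE_PROBE = 60.0   # probe an idle connection after one minute
DEAD_AFTER = 90.0          # 1 1/2 minutes of silence: broken


def probe_decision(now, last_packet_time, last_probe_time, unacked):
    """Decide what to do with one connection.

    now, last_packet_time, last_probe_time are times in seconds;
    unacked is the number of unacknowledged packets outstanding.
    Returns "broken", "probe" (send a SNS), or "wait".
    """
    idle = now - last_packet_time
    if idle >= DEAD_AFTER:
        return "broken"
    wants_probe = unacked > 0 or idle >= IDLE_BEFORE_PROBE
    if wants_probe and now - last_probe_time >= PROBE_INTERVAL:
        return "probe"
    return "wait"
```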

The receiver generates STS packets under the following circumstances:
When a SNS is received (thus a response to a probe).  When a duplicate
packet is received, so that further useless retransmission will be
prevented by the receipt.  When the number of received but not
acknowledged packets is more than 1/3 the window size; evidently the
normal piggy-backed acknowledge mechanism is not working, so we
generate a STS to carry the acknowledge that will empty the window back
out, hopefully in time before transmission becomes clogged.  When the
window size changes (to tell the other end of the change).
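
Put as a single test, the four circumstances look roughly like this
(a sketch only; the argument names are invented):

```python
def should_send_sts(got_sns, got_duplicate, unacked_received,
                    window_size, window_changed):
    """True if the receiver should generate a STS now.

    unacked_received is the number of packets received but not yet
    acknowledged.  "More than 1/3 the window size" is tested without
    division, as 3 * unacked_received > window_size.
    """
    return bool(got_sns
                or got_duplicate
                or 3 * unacked_received > window_size
                or window_changed)
```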

When it is time to send a STS, we attempt to send one immediately.
If this fails (for instance, there might be no buffers available),
we keep trying to send one every half-second.

The receiver can also generate "spontaneous" STS's, to stimulate
retransmission or to carry an acknowledge, to keep things moving on
fast devices with insufficient buffering, such as the Gould printer.
This provides a way for the receiver to speed up the retransmission
timeout in the sender, and to make sure that acknowledges are happening
often enough.  For example, one might use a timer to generate a STS
every 1/10th of a second.

Note that spontaneous STS's should not be generated until the connection
is fully open.  This means that the server should not send STS until it
has gotten a STS back from its OPN.  STS's other than spontaneous ones
have no such problem.

>>Host Status

All physical Chaosnet hosts, even gateways, are required to answer an
RFC with contact name STATUS and byte count 5 (no "arguments" allowed
in the data field of the packet) by returning an ANS packet whose data
field contains:  the name of the host in ascii, an octal 200 (any byte
with the sign bit on terminates the name), and additional status and
metering information to be defined later, perhaps in a site-dependent
way.  This makes it possible to write a program which determines the
status of all nodes in the network; the program could either be driven
by a host table or could try all possible host addresses; the NCP should
respond promptly to the STATUS RFC rather than starting up a program
to handle it, if starting up a program would take more than a second
or two.
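
A program polling hosts this way has to pick the name out of the ANS
data field.  A minimal sketch, assuming nothing about the status bytes
that follow the name (they are not yet defined); the host name in the
usage note is only an example:

```python
def build_status_answer(host_name, extra=b""):
    """Data field of the ANS: ascii name, octal 200, then whatever
    additional status bytes there are."""
    return host_name.encode("ascii") + bytes([0o200]) + extra


def parse_status_answer(data):
    """Return (host_name, remaining_bytes).

    Any byte with the sign bit on terminates the name.  If no such
    byte is present, the whole field is taken as the name (assumption).
    """
    for i, b in enumerate(data):
        if b & 0o200:
            return data[:i].decode("ascii"), data[i + 1:]
    return data.decode("ascii"), b""
```

For instance, parse_status_answer(build_status_answer("MIT-AI"))
gives back the name "MIT-AI" and an empty remainder.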

>>>Here is some narrative description of the NCP.

Each receiver (each end of each connection is a receiver, and also a
sender; think of receivers and senders as little actors inside the NCP)
has a list of buffers containing packets which have been successfully
received and are waiting to be read by the user process, and two
packet# variables.  One is the number of the last packet read by the
user process.  The other is the number of the last packet which has
been acknowledged.  If these two are not equal, the receiver needs to
send an acknowledgement "soon."

The received-packet list needs to be kept sorted by packet number, and
the NCP has to make sure that the user process does not see duplicates
or omissions.  If packets arrive out of order, the NCP has to sort them.
This means that the user process may not be able to read a packet even
though the receive list is non-empty, because the first packet in the
receive list is not the successor of the last packet read by the user
process.

A good way to do this is to have two lists of received packets, each
of which is sorted.  The first list contains those packets which
the user process may read; the first in the list is the successor
to the last packet read, and there are no gaps in the list.
The second list is the remaining packets, which the user may not
read until some other packet arrives to fill in the gap.  Each
list needs a pointer to head and tail, and the packet number of
the next packet to be appended to the tail.
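
The two-list arrangement can be sketched as follows.  This is an
illustration, not the ITS code: a dictionary keyed by packet number
stands in for the sorted second list, and the class and method names
are invented.

```python
MOD = 1 << 16


def seq_later(a, b):
    """True if packet number a is later than b, modulo 2^16."""
    return a != b and ((a - b) % MOD) < MOD // 2


class Receiver:
    def __init__(self, last_read):
        self.last_read = last_read     # last packet read by the user
        self.readable = []             # (number, data), in order, no gaps
        self.out_of_order = {}         # number -> data, waiting on a gap
        self.next_expected = (last_read + 1) % MOD  # next for readable

    def receive(self, number, data):
        """Accept a data packet.  Returns False if it is a duplicate."""
        if not seq_later(number, (self.next_expected - 1) % MOD):
            return False               # already read or already readable
        if number in self.out_of_order:
            return False               # duplicate of a waiting packet
        self.out_of_order[number] = data
        # Move over every packet that is now contiguous.
        while self.next_expected in self.out_of_order:
            n = self.next_expected
            self.readable.append((n, self.out_of_order.pop(n)))
            self.next_expected = (n + 1) % MOD
        return True

    def read(self):
        """Give the next packet to the user, or None if the user must
        wait (even though out-of-order packets may be present)."""
        if not self.readable:
            return None
        self.last_read, data = self.readable.pop(0)
        return data
```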

It is not actually a requirement that an NCP support out-of-order
packets, rather than simply discarding them and hoping that they
will be retransmitted in order, but if it's not very hard to do
one might as well do it.

Acknowledgements are sent by putting the last-received variable into
the "ack packet #" field of an outgoing packet on the opposite
direction of the appropriate connection, and copying the last-received
variable into the last-acknowledged variable.  Where does the outgoing
packet come from?  First of all, all outgoing data, SNS, and STS
packets automatically carry acknowledgement for the reverse direction
of their connection.  So if an outgoing packet happens to be sent at a
time when an acknowledgement is necessary, that takes care of it.

Secondly, if the number of outstanding unacknowledged packets is more
than 1/3 the window size, a STS should be generated and sent immediately
to acknowledge those packets before the sender fills up the window.

Thirdly, the "soon" mentioned earlier is implemented by a timeout
in the NCP.  If an acknowledgement remains required for a certain amount
of time, a STS should be generated and sent to carry it.  The appropriate
time interval is 1 second, I would guess.  This timeout does not have
to be too exact, however.  One could also not bother with this and let
the other end's probe timeout trigger the STS via a SNS.  However, it
is desirable to send a receipt fairly soon after receiving a packet
to avoid useless retransmission.  This could be done either by a timeout
or by sending a receipt when the packet is received for the second time.
[No known NCP has such a timeout.  10/2/78]

The reason for having a timeout here, rather than just sending an
acknowledgement right away, is two-fold.  It allows "batching" of
acknowledgements, where a single packet can be used to acknowledge
many packets, which halves the network traffic caused by bulk
data transfer.  It also allows "piggy-backing" of acknowledgements
on data packets, which (for instance) decreases the network traffic
caused by remote-echoing telnet connections.

When a receiver receives a data packet, it compares the packet # of
that packet with the last-received variable.  If it is less or equal
(modulo 2^16), it is a duplicate of a packet already given to the user,
and should be discarded (or it is at least 30000 packets ahead of the
user, which is unlikely.)

If it is greater, it is sorted into the received-packet list at the
appropriate position (if it has the same packet# as a packet already in
that list, it is a duplicate and is discarded.)  It is NOT acknowledged
at this time; no packet is ever acknowledged until it has been given to
the user process ("end to end acknowledgement").  Since a packet on the
received packet list has not yet been acknowledged, it may be safely
discarded at any time if the operating system runs out of buffer space,
PROVIDED that it has not yet been receipted.
Also, if the receiving user process is not listening to the net, the
NCP cannot be swamped with arbitrary numbers of packets, since the
sending user process is not supposed to send more than window-size
packets past the last one the receiving user process read.
However, if receipts are being used, once a receipt has been sent for
a packet that packet may not be discarded.  It is up to the individual
NCP which strategy it prefers to use.

Note that the most likely cause of packet duplication is that an
acknowledge or a receipt was lost, so whenever a duplicate packet is
discarded, a STS packet should be sent back containing the current
receipt and acknowledge packet numbers.

The sender has a list of packets which have been entrusted to it by the
user for transmission and one packet # variable, the number of the last
packet sent by the user.  When the user next sends a packet, the sender
will increment this variable and set the packet# of the sent packet to
the result.  The sender also sets the source and destination host
numbers and indices of the packet, sets the "ack packet #" to the
last-received variable of its corresponding receiver, sets the
receiver's last-acknowledged variable to that (clearing the receiver's
need-an-acknowledge flag), chooses a transmission medium by checking
the routing tables, and gives the packet to the transmission medium for
"immediate" transmission (perhaps it has to wait its turn in a queue.)
It also saves the packet on a list, in case retransmission is required.

With each buffered packet the sender holds in trust, it remembers the time
that packet was last transmitted.  From time to time "retransmission"
occurs.  The sender gives one or more packets from its list to the
transmission medium.  It always starts with the oldest, so as to keep
things in order, and sends the rest in order until it gets to one that was
transmitted too recently to do again.  Retransmission is used to recover
from lost or damaged packets, lost or damaged acknowledgements, and packets
discarded by the receiver due to lack of buffering capacity.
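
One retransmission pass over a send list might look like this.  The
1/30 second figure is the ITS one mentioned earlier; the list
representation and the function name are assumptions made for the
sketch.

```python
def packets_to_retransmit(send_list, now, min_interval=1.0 / 30):
    """Pick the packets to give to the transmission medium.

    send_list is a list of [packet_number, last_sent_time] entries,
    oldest first.  Starting with the oldest, packets are taken in
    order until one is found that was transmitted too recently; the
    pass stops there.  The last-sent time of each chosen packet is
    updated to now.
    """
    chosen = []
    for entry in send_list:
        number, last_sent = entry
        if now - last_sent < min_interval:
            break
        entry[1] = now
        chosen.append(number)
    return chosen
```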

Each time a receiver receives a packet, it gives the "ack packet #"
from that packet to its corresponding sender.  The sender discards any
packets with numbers less than or equal to that, since their successful
receipt has just been acknowledged, and advances the window.  If a STS
packet is received, its receipt field is processed by discarding
packets, but the window is not advanced.
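
The difference between the two can be sketched as follows (an
illustration only, with invented names; the queue holds bare packet
numbers to keep the example short):

```python
MOD = 1 << 16


def seq_later(a, b):
    """True if packet number a is later than b, modulo 2^16."""
    return a != b and ((a - b) % MOD) < MOD // 2


class Sender:
    def __init__(self, window_size, last_acked):
        self.window_size = window_size
        self.window_start = (last_acked + 1) % MOD
        self.queue = []   # packet numbers held for retransmission

    def _discard_through(self, n):
        # Drop every held packet numbered less than or equal to n.
        self.queue = [p for p in self.queue if seq_later(p, n)]

    def handle_ack(self, n):
        """Acknowledgement: discard packets and advance the window.
        A stale ack never moves the window backwards."""
        self._discard_through(n)
        if seq_later((n + 1) % MOD, self.window_start):
            self.window_start = (n + 1) % MOD

    def handle_receipt(self, n):
        """Receipt: discard packets, but leave the window alone."""
        self._discard_through(n)
```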

>>Dynamic Window-size Adjustment

This section has not been updated for receipts.  Also, it is a bunch
of junk.  Probably we can do without this stuff.

Permit me to stress that this stuff is optional for small NCPs.

The goals of flow control are:
	1. Error recovery.
	2. If the receiver is faster than the sender, avoid unnecessary
	   delays in transmission due to having to wait for an
	   acknowledge or having to wait for the sender process to wake up.
	3. If the sender is faster than the receiver, minimize
	   retransmissions due to receive buffer overflow.
	4. Minimize the number of otherwise-useless packets generated
	   to carry an acknowledgement or a window-size negotiation,
	   and minimize useless retransmissions.

Consequences of the goals:
	1. All packets will be retransmitted until acknowledged.
	2. The sending NCP must buffer several packets, and packets
	   must be acknowledged in groups, not one-by-one.
	3. If the receiver is slow, something must force the sender
	   not to send packets too fast.
	4. The interval between retransmissions should not be too small.
	   It may be desirable for it to increase if the receiving
	   process is not listening for some reason.

The window size is the maximum number of packets which may be in the
network at one time (for one direction of one connection).  "In the
network" means output by the sending process and not yet input by
the receiving process.  (These processes are the entities which
determine the rate, unless they are so fast that the network slows
them down.)

The window size is not the number of packets acknowledged at a time;
for best operation the latter must be 1/2 to 1/3 of the former.
See below.

If the sending process is slow (and determines the rate), things
are relatively simple.  We just have to have a big enough window
size and frequent enough acknowledgement to cover for sending
process wakeup delays.

If things are not limited by the sender, then
		      Window size
	Flow rate = ---------------
		    Round trip time

	Round trip time = time to wake up sender process (multiplied
				by the fraction of the time this
				is necessary)
			+ time packet is sitting in sender buffers
				before it is transmitted
			+ transit time through the net
			+ time packet is sitting in receiver buffers
				before it is read; this is the maximum
				of time to process previous packets
				and time to wakeup sender process
			+ time until acknowledge is generated and sent
			+ transit time through the net

The round trip time is the time between when packet N is output by the
sending process and when it is acknowledged, permitting the sending
process to output packet N+WS.

The main variable components of the round trip time are the delay
before acknowledgement and the delay waiting in the receiver buffer for
packets to be processed.  If these were zero, the round trip time would
consist of two process wakeups and two network transit times
(determined by the delay waiting for the cable and waiting for previous
packets from this host to be transmitted, the time needed to load and
unload the interface in the buffer, and the actual transmission time,
multiplied by the number of bridges in the path.)

This ideal round trip time is probably on the order of 2 seconds.
The timeout for retransmission should be 2 to 3 times the round trip
time.  The timeout for acknowledgement should be 1/3 to 1/2 the
round trip time.  One could either measure the actual round trip time,
or use an estimate of say 3 seconds, a little higher than the ideal.
It would be a good idea to measure the round trip time in any case,
which is done by getting the elapsed time since transmission when
a packet is discarded due to its being acknowledged, and averaging
that.

The receiver process should initially set the window size to the
maximum flow rate it wants to handle times the desirable round trip
time.
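
The arithmetic of the last few paragraphs can be sketched as follows.
The names are hypothetical; the 3-second default is the estimate
suggested above, and the factors default to the low ends of the ranges
given (2 times for retransmission, 1/3 for acknowledgement).  A plain
running mean is used for "averaging".

```python
import math


class RoundTripMeter:
    def __init__(self, initial_estimate=3.0):
        self.initial = initial_estimate   # used until something is measured
        self.total = 0.0
        self.count = 0

    def packet_acknowledged(self, sent_time, now):
        """Call when a packet is discarded because it was acknowledged."""
        self.total += now - sent_time
        self.count += 1

    def estimate(self):
        return self.total / self.count if self.count else self.initial

    def retransmit_timeout(self, factor=2.0):
        return factor * self.estimate()

    def ack_timeout(self, fraction=1.0 / 3):
        return fraction * self.estimate()


def flow_rate(window_size, round_trip_seconds):
    """Packets per second when the sender is not the limit."""
    return window_size / round_trip_seconds


def initial_window_size(max_packets_per_second, round_trip_seconds):
    """Maximum flow rate wanted times the desirable round trip time,
    rounded up, and never less than one packet."""
    return max(1, math.ceil(max_packets_per_second * round_trip_seconds))
```

For example, a receiver willing to take 5 packets per second with a
2-second round trip would ask for a window of 10 packets.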

Symptoms of improper window size:

If the window-size is too large, the round trip time becomes
long due to packet processing delay in the receiver buffer.
(There will be many packets in the receiver buffer, and the
receiver will be processing them slowly.)  The long round-trip
time will cause unnecessary retransmissions.  Retransmissions
could also be caused by the NCP's discarding received packets
due to insufficient core to buffer them.

If the window-size is too small, excessive process blocking
and waking up occurs.  The receiver process often empties its
buffer and has to block until more packets arrive.  The sender
process often fills up its buffer and has to block until
some of the buffered packets are acknowledged.  A small window
size also causes acknowledgements to have to be sent more
frequently than necessary.  Note that from the receiver's
point of view it is impossible to distinguish between the
window size being too small and the sending process being
too slow.

Here is a scheme for dynamic adjustment of the window size:

Note that window-size adjustments cannot take effect
(in the sense of fixing the symptoms) immediately, so it
is necessary to limit the rate at which the window size
is adjusted.

When the receiver receives (and discards) a duplicate of a
packet it already has in its buffer, this indicates either
that an acknowledgement was lost or that the window size
is too large.  Since packets are assumed not to be lost very
often, we may as well assume the window size is too large
and send a WIN packet to decrease it.  Another possibility
would be to have the sender detect the long round-trip
time and complain to the receiver, who could adjust the
window size.  The receiver must not decrease the window
size again until all packets currently buffered have
been read and acknowledged, indicating that the sender
has had a chance to decrease the number of packets
buffered at its end.  A reasonable amount to decrease
the window size by is 1/3 of its current value.

When the sending process wants to output a packet, but the number of
packets already buffered is greater than or equal to the window size,
it should send a WTS, indicating that the problem is too small a window
size or too slow a receiver rather than too slow a sender.  When the
receiving process wants to input a packet, but the buffer is empty, and
a flag is set indicating that a WTS has been received, it should send a
WIN packet adjusting the window size upward by 1/2 of its current value
(and clear the WTS-received flag).  This is rate-limited by preventing
the sender from sending a second WTS until all the packets buffered at
the time the first WTS was sent have been acknowledged, indicating that
the receiver has had time to act on the first WTS.

The variables required are these:  for both the sending and receiving sides, a
packet number which has to be acknowledged before WTS or WIN can be
sent again, and a flag saying whether this number is operative.  Also,
a WTS-received flag in the receiver.
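
A rough sketch of the adjustment steps and of the rate-limiting
variables follows.  Integer rounding and the floor of one packet are
assumptions, as are all the names; the 1/3 and 1/2 steps are the ones
suggested above.

```python
MOD = 1 << 16


def seq_later(a, b):
    """True if packet number a is later than b, modulo 2^16."""
    return a != b and ((a - b) % MOD) < MOD // 2


def decrease_window(window_size):
    """Decrease by 1/3 of the current value, never below 1."""
    return max(1, window_size - window_size // 3)


def increase_window(window_size):
    """Increase by 1/2 of the current value, by at least 1."""
    return window_size + max(1, window_size // 2)


class AdjustGate:
    """The packet number that must be acknowledged before another WTS
    or WIN may be sent, and the flag saying whether it is operative."""

    def __init__(self):
        self.barrier = 0
        self.operative = False

    def may_adjust(self):
        return not self.operative

    def adjusted(self, highest_buffered):
        # Remember the newest packet buffered when WTS or WIN was sent.
        self.barrier = highest_buffered
        self.operative = True

    def acknowledged(self, n):
        # Once the barrier packet is acknowledged, adjustment is allowed.
        if self.operative and not seq_later(self.barrier, n):
            self.operative = False
```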

It is important to meter the performance of this mechanism and find out
whether it does anything and whether what it does is right.

Consider the possibilities of changing this into a more symmetric and
negotiation-based scheme, where the sender always initiates window size
changing and the receiver either agrees or ignores the request.
Consider using elapsed time as an additional rate-limiter (have to use
the other thing, too, so idle connections don't keep changing window
size; this may be deleteable if it is always sender-initiated.)

More notes on the subject of window-size too small.
This is identical to receiver too slow.  The net flow rate
out of the sender is trying to be higher than that into
the receiver, so packets pile up in buffers at each end.
The round-trip time becomes arbitrarily high to preserve the
equation and divide window size down enough to get the
flow rate.

The situation where the window-size is too small and we want to do
something about it has to be distinguished from two other situations.
One, the receiver is accepting packets slowly but the sender is also
sending them slowly.  We don't want to change the window-size, because
it doesn't matter since packets aren't piling up, and at any time they
might both decide to go fast.  Two, the receiver's net flow rate is
high, but its response time is long (it's taking packets in bursts).
Here the round-trip time is still long, but making the window size
smaller would make things worse.

The symptoms that distinguish the case where we want to make the
window-size smaller are:  the round-trip time is long, the sender
buffer is full, and the number of packets acknowledged at a time is
small compared to the window size.  Actually, the last two are sufficient,
since if the acknowledgement batch size is small, and we know it's
not the sender's fault, may as well decrease the window size
to save buffer space and decrease the round-trip time.

>>Uniquization

To keep packets left over from old connections from causing
problems with new connections, we do two things.  First of
all, packets are not accepted as input unless the source and
destination hosts and indices correspond to a known, existent
connection.  By itself, this should be adequate, provided that
retransmission is only done by the originating host, not by intervening
gateways and bridges in the network.  This is because we can safely
assume that when a host agrees to open a connection with a certain
index number at its end, it will give up on any previous connection
with the same index, therefore it won't retransmit any old packets
with that index once it has sent out a new RFC or OPN.  The indications
are that our network will be "local" enough that indeed retransmission
will only be done by the original host.

Problems could still occur if packets get out of order, so that an OPN
establishing a new connection gets ahead of a data packet for an old
connection with the same index.  To protect against this, it is
necessary to assure that at least a few seconds elapse before an index
number is reused.  This could be done either by remembering when an
index is last used, or by reserving part of the 16-bit index number as
a uniquization field, which is incremented each time an
otherwise-the-same index is reused.  This field needs to be big enough to
cover for the maximum delay of an old data packet with the same index,
and depends on the rate of creation of connections.  Which method is
chosen is at the discretion of each local NCP.  Another necessary
assumption is that when a system crashes and is reloaded (thus
forgetting any remembered information about which indices were in use
when and so forth) that the time to reload it is more than a few
seconds.
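
The second method, a uniquization field inside the index, might be
sketched like this.  The split of the 16 bits into 8 and 8 is purely
an assumption for the example; as said above, the right size depends
on packet delays and on the rate of creation of connections.

```python
SLOT_BITS = 8                  # assumed: low bits select the table slot
UNIQ_BITS = 16 - SLOT_BITS     # assumed: high bits are the uniquizer
SLOT_MASK = (1 << SLOT_BITS) - 1


def make_index(slot, uniq):
    """Combine a table slot and a uniquization count into an index.
    (Slot 0 with count 0 would give index 0, which is used as a
    placeholder in some packets, so a real NCP would presumably avoid
    handing it out.)"""
    return ((uniq % (1 << UNIQ_BITS)) << SLOT_BITS) | (slot & SLOT_MASK)


def reuse_index(old_index):
    """Index to use when the same slot is reused: same slot, with the
    uniquization field incremented (wrapping around)."""
    slot = old_index & SLOT_MASK
    uniq = old_index >> SLOT_BITS
    return make_index(slot, uniq + 1)
```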

Problems could occur not only with left over data packets, but also
with left over control packets.  This isn't too much of a problem since
control packets are not retransmitted, but it could still happen that a
host gets faked out into thinking that it has a connection to another
host that the other host doesn't know about.  In this case, it should
just look like the connection was opened and then either the other host
went down or the connection was broken by a LOS packet, since the other
host won't generate any data packets and won't accept any.

>>Media Handlers

A host may be connected to more than one transmission medium.  It has
service programs for each.

When a packet is received that is not addressed to this host, the
forwarding count should be incremented.  If it doesn't overflow, the
packet should be sent back out according to the routing tables,
otherwise it should be discarded.  Normally it would not be useful to
send a packet back out on the same subnet it came in on, but we may as
well let the forwarding count catch this along with other loops.
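
The forwarding rule can be sketched as below.  The 4-bit width of the
forwarding count, the use of a dictionary as the routing table, and
discarding when no route is known are all assumptions made for the
example.

```python
MAX_FORWARD = 15   # assumed: largest value of a 4-bit forwarding count


def handle_transit(dst_host, forwarding_count, my_host, routing_table):
    """Decide what to do with a received packet.

    Returns ("deliver", count) if the packet is for this host,
    (medium, new_count) if it should be sent back out, or
    ("discard", None) if the count overflowed or no route is known.
    """
    if dst_host == my_host:
        return ("deliver", forwarding_count)
    forwarding_count += 1
    if forwarding_count > MAX_FORWARD:
        return ("discard", None)      # overflow: probably a loop
    medium = routing_table.get(dst_host)
    if medium is None:
        return ("discard", None)      # no route (assumption)
    return (medium, forwarding_count)
```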

When a packet is received, if the opcode is RFC, it is handled
specially.  The contact name is compared against those of all the
indices which are in the Listening state.  If a match is found, that
index is put into the RFC-received state, its LSN packet is discarded,
and the RFC packet is put into its input list so that the server
process can see it.  If no server is listening to that contact name,
the RFC packet is placed on the pending-RFC list, and (in the case of
ITS) a server process is created which will load itself with a suitable
program to open an index in "server" mode, gobble an RFC packet, look
at the contact name, and either reload itself with the appropriate
server program or send a CLS reply.
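
The handling of an incoming RFC might be sketched as follows.  The
table of listeners is shown as a dictionary from index to contact
name, which stands in for the LSN packets; the names are invented, the
contact names in the usage note are only examples, and starting the
server process is left out.

```python
def dispatch_rfc(contact_name, listeners, pending_rfcs):
    """Hand an RFC to a listening index, or queue it.

    listeners maps each index in the Listening state to the contact
    name it is listening for.  Returns the index that received the
    RFC (it leaves the Listening state), or None if the RFC was put
    on the pending-RFC list for a server process to pick up.
    """
    for index, name in listeners.items():
        if name == contact_name:
            del listeners[index]      # its LSN packet is discarded
            return index              # now in the RFC-received state
    pending_rfcs.append(contact_name)
    return None
```

With listeners on TELNET and FILE, an RFC for FILE goes straight to
the FILE listener, while one for MAIL lands on the pending-RFC list.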

When a non-RFC packet is received, the system must look for a receiver
index to handle it.  If none is found, or the state is wrong, or the
source host and index don't match, a LOS should be sent unless the
received packet was a LOS.  Otherwise, if the received packet is WIN,
WTS, or NOP, it is processed and discarded.  Other packets are given to
the user; OPN, CLS, and LOS cause a state change but are also given to
the user as input.

The transmitting side of a transmission medium handler has a queue of
packets to be transmitted.  It should send them out, in order, as fast
as possible, except that if a receiving host has no buffer space (which
can be detected because its chaosnet interface will cause
"interference" on the ether), it should look down the list for another
host to send to.  [No known NCP bothers to look for another
host to send to.  10/2/78]  As long as packets to the same host are sent in the
order they are queued, everything will be all right.  (Actually, this
normally shouldn't matter much.) In addition, when the packets are put
into the transmit queue, the destination host number has to be looked
up in a table to determine which transmission medium to use to get to
it and (in the case of ether) which physical host number to put in the
packet trailer for the hardware.

>>Buffers

In ITS, the buffering scheme will be as follows.  There will be a pool of
128-word packet buffers available.  When it runs out, more can be made.  When
there are many free some can be flushed.  128-word buffers are made out of
1024-word pages (adding a new page type), rather than using the existing
128-word buffer mechanism, because there is a limited number of 128-word
buffers, and that mechanism is a little painful to use.  There are likely
to be several hundred (?) packet buffers (say 12K of core) in use when
high-speed (e.g. mag-tape speed) file transfer is going on.

Each packet buffer has a two-word header, and 126 words which can hold a
packet.  Packet buffers can be on one (or sometimes two) of six lists:
The free list.  The receive list, of which there are two for each
index, one for packets which the user may safely read and one for
out-of-order packets which are awaiting the arrival of an earlier
packet before the user may read them.  The send list, of which there is
one for each index.  The transmission list.  The pending-RFC list.  The
pending-LSN list.

The free list contains packet buffers which are free.  They are threaded
together through addresses in the first word.  Zero ends the list.

The transmission list contains packets which are to be transmitted out
on the network "immediately".  At interrupt level packets are pulled
off of this list and sent.  (There might be more than one transmission
list if a machine is connected to more than one physical medium.)
The transmission list is threaded through addresses in the left half
of the first word.  Zero ends the list.  After transmission -1 is stored
to indicate that the packet is not on the transmission list any more.
If the right half of the first word is -1, indicating that the packet
is not also on a send list, it is returned to free.
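
A rough sketch of the half-word bookkeeping just described, in Python.
It models only the first word of a buffer as a 36-bit integer; the
function names are hypothetical.

```python
HALF_MASK = 0o777777        # one 18-bit half of a 36-bit PDP-10 word
NOT_ON_LIST = HALF_MASK     # -1 stored in an 18-bit half


def left_half(word):
    return (word >> 18) & HALF_MASK


def right_half(word):
    return word & HALF_MASK


def make_word(lh, rh):
    return ((lh & HALF_MASK) << 18) | (rh & HALF_MASK)


def after_transmission(first_word):
    """Mark a buffer as no longer on the transmission list.

    Returns (new_first_word, should_free).  should_free is true when
    the right half is -1, i.e. the buffer is not also on a send list.
    """
    new_word = make_word(NOT_ON_LIST, right_half(first_word))
    return new_word, right_half(first_word) == NOT_ON_LIST
```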

Each send list contains packets for a particular connection which have
been entrusted to the system by the user to be sent, but have not yet
been acknowledged.  They are threaded together through the right half
of the first word.  The second word contains the time that the packet
was last transmitted (actually, the time that it was last put on the
transmission list.)

Each receive list contains packets which have been received on a particular
connection and not yet read by the user.  They are threaded together
by addresses in the first word, and the list ends with zero.  The receive
lists must be kept sorted by packet number.
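
Keeping a receive list sorted might look like the sketch below.  It
assumes packet numbers are 16-bit quantities compared modulo 2^16, and
that a duplicate of an already-queued packet is simply dropped; both are
assumptions of this example, and the list is modelled as a Python list
of packet numbers rather than threaded buffers.

```python
def pkt_less(a, b):
    """True if 16-bit packet number a precedes b, allowing for wraparound."""
    return ((a - b) & 0xFFFF) >= 0x8000


def insert_sorted(receive_list, pkt_num):
    """Insert pkt_num into receive_list, keeping it sorted by packet number."""
    for i, existing in enumerate(receive_list):
        if existing == pkt_num:
            return                          # duplicate, already queued
        if pkt_less(pkt_num, existing):
            receive_list.insert(i, pkt_num)
            return
    receive_list.append(pkt_num)            # later than everything queued
```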

The pending-RFC list contains request-for-connection packets which have
not yet been either accepted or rejected.  They are threaded together
through the first word.  When a server process finishes getting created
and loaded, it will take an RFC off the pending-RFC list and put it
on its own receive list.  The second word of these packets contains
the time received so that the system can know when something has gone
wrong and they should be thrown away.

The pending-LSN list contains LSN packets for all the listening users.
These packets are just used as a handy place to save the contact name
being listened to.  They are threaded together through the first word.
The source-index field in the packet header can, of course, be used
to find which user this packet belongs to.

>>ITS System Calls

(Other systems would have similar calls, with appropriate
changes for their own ways of doing things.)

OPEN

	Not allowed.  (I said this wasn't a "standard" device!)
	Instead use:

CHAOSO

	arg 1 - receive channel number
	arg 2 - transmit channel number
	arg 3 - receive window size

	First, the two specified channels are closed.  Then an index
	is assigned to the user and the two channels are set up to
	point to it.  Two channels are used since in general ITS
	channels are unidirectional, and to allow the user to
	handle receive and transmit interrupts differently.

	The created index is placed in the Closed state.  To set up
	a connection, IOT an RFC or LSN packet down the transmit
	channel.


PKTIOT
	arg 1 - channel number
	arg 2 - address of a 126.-word block.

	Always transfers exactly one packet.
	The format of the 126.-word block is:
			 16		  16	      4
		-----------------------------------------
		| opcode | unused | fc |   nbytes   | 0 |
		-----------------------------------------
		|destination host |destination index| 0 |
		-----------------------------------------
		|   source host   |   source index  | 0 |
		-----------------------------------------
		|    packet #     |  ack packet #   | 0 |
		-----------------------------------------
		| data1  |  data2  ...

		                            ... data488 |
		-----------------------------------------
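
	As an illustration, the first header word could be packed
	and unpacked as in the Python sketch below.  The diagram gives
	the two 16-bit halves and the 4 unused low bits; the split
	within each half (opcode 8 bits, unused 8 bits, fc 4 bits,
	nbytes 12 bits) is an assumption of this example, as are the
	function names.

```python
def pack_halves(hi16, lo16):
    """Pack two 16-bit quantities into a 36-bit word, low 4 bits zero."""
    return ((hi16 & 0xFFFF) << 20) | ((lo16 & 0xFFFF) << 4)


def unpack_halves(word):
    return (word >> 20) & 0xFFFF, (word >> 4) & 0xFFFF


def pack_first_word(opcode, fc, nbytes):
    # Assumed sub-field widths: opcode 8, unused 8, fc 4, nbytes 12.
    return pack_halves((opcode & 0xFF) << 8,
                       ((fc & 0xF) << 12) | (nbytes & 0xFFF))


def unpack_first_word(word):
    hi, lo = unpack_halves(word)
    return hi >> 8, lo >> 12, lo & 0xFFF     # opcode, fc, nbytes


# 126 words less 4 header words, at 4 data bytes per word.
MAX_DATA_BYTES = (126 - 4) * 4
```

	With this layout an opcode of 200 (octal) or higher sets the
	top bit of the 36-bit word.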

	In the descriptions below, if an error is said to
	occur that means IOC error 10. (channel in illegal mode) [3?]
	is signalled.  (Probably change this to an error return?) *******

	In the case of an output PKTIOT, the user sets only
	the opcode, nbytes, and data-n fields.  When the
	NCP copies the packet into a buffer in the system
	it sets the other fields of the header to the
	appropriate values.

	This is not completely true.  When outputting an RFC,
	the user sets the destination host field, and sets the
	ack packet # to the receive window size desired.  The user
	also sets the window size when outputting an OPN.

	The NCP checks for the following special values
	in the opcode field of a packet output by the user:

	RFC - error if the index is not in the Closed state.
	      The packet is transmitted (but not queued for
	      possible retransmission) and the index enters
	      the RFC-sent state.  The user should do an input
	      PKTIOT which will wait for the OPN, CLS, FWD, or ANS reply
	      packet to arrive.  The NCP also copies and saves
	      the user-specified host number and window size.

	LSN - error if the index is not in the Closed state.
	      It is put into the Listen state.  The packet
	      is not transmitted, but it is saved so that
	      when an RFC comes in the system can compare
	      the contact names.  (Note- LSN is a special
	      opcode which is never actually transmitted
	      through the net.)  The pending-RFC list is searched
	      to see if an RFC with the same contact name has
	      been received.  If so, it is given to this index
	      as if it was received just after the LSN was
	      sent out.

	OPN - error if the connection is not in the RFC-received
	      state.  It is put into the Open state.  The
	      packet is transmitted (but not queued for
	      retransmission, since until it is received
	      the other end does not know what index to
	      send acknowledgements to.)  The system also
	      copies and remembers the window size.

	CLS - error if the connection is not in the Open
	      or the RFC-received state.  It is put into
	      the Closed state and the packet is transmitted
	      (but not queued for retransmission).  This packet
	      may optionally contain data bytes which are
	      an ascii excuse for the close.

	FWD - error if the connection is not in the RFC-received
	      state.  The packet is transmitted, but not queued
	      for retransmission, and the connection is put into
	      the Closed state.

	ANS - error if the connection is not in the RFC-received
	      state.  The packet is transmitted, but not queued
	      for retransmission, and the connection is put into
	      the Closed state.

	200 or higher - This is a data packet.  Error if the
	      connection is not in the Open state.  A packet#
	      is assigned, the destination and source fields
	      are filled in, and the packet is transmitted and
	      queued for retransmission.

	Any other opcode causes an error.
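
	The checks above amount to a small state table.  The Python
	sketch below transcribes them; it models only the state
	change, not transmission or queueing, and the names
	TRANSITIONS, IOCError and output_packet are hypothetical.

```python
# Legal starting states and resulting state for each control opcode,
# transcribed from the descriptions above.
TRANSITIONS = {
    "RFC": ({"Closed"}, "RFC-sent"),
    "LSN": ({"Closed"}, "Listen"),
    "OPN": ({"RFC-received"}, "Open"),
    "CLS": ({"Open", "RFC-received"}, "Closed"),
    "FWD": ({"RFC-received"}, "Closed"),
    "ANS": ({"RFC-received"}, "Closed"),
}


class IOCError(Exception):
    """Stands in for the IOC error signalled by the system."""


def output_packet(state, opcode):
    """Return the connection state after the user outputs a packet.

    opcode is a control-opcode name, or an integer for anything else.
    """
    if isinstance(opcode, int):
        if opcode >= 0o200 and state == "Open":
            return state                  # data packet: state unchanged
        raise IOCError("bad opcode or state")
    allowed, new_state = TRANSITIONS.get(opcode, (set(), None))
    if state not in allowed:
        raise IOCError("bad opcode or state")
    return new_state
```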

	In the case of an input PKTIOT, the user will get an error
	if the connection is in the Closed or Broken state,
	except if it is in the Closed state and there are data
	packets queued.  This is so that the user can read the
	CLS packet.  Otherwise, it will hang until a packet
	arrives, then return the packet into the user's
	126.-word block.

	The user should check the sign bit of the first word,
	which will be set if this is a data packet.  The
	non-data packets which can get given to the user are
	RFC, OPN, FWD, ANS, LOS, and CLS.  (You shouldn't be
	surprised if you get something else, though!)


IOT, SIOT
	These can be used to do unit-mode 8-bit-byte transfers.
	Control bit 1.4 means don't-hang, and applies to both input
	and output.  Only data packets with opcode 200 will be
	transferred.  Anything else on input causes the transfer
	to stop, like an end-of-file.  Use PKTIOT to find out what
	the story is.  (The correct way is to verify that there are
	some packets in the input buffer, then do a SIOT, and if it
	transfers 0 bytes then the first packet in the input buffer
	must not be a data packet, so PKTIOT it in.)
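
	The SIOT-then-PKTIOT procedure can be modelled roughly as
	below.  This Python sketch stands in for the input buffer with
	a queue of (opcode, data) pairs; siot_read and read_next are
	hypothetical names, and control opcodes are shown by name
	rather than by number.

```python
from collections import deque

DAT = 0o200   # the only opcode IOT and SIOT will transfer


def siot_read(queue, count):
    """Read up to count bytes from leading opcode-200 packets in queue."""
    out = bytearray()
    while queue and queue[0][0] == DAT and len(out) < count:
        opcode, data = queue.popleft()
        take = count - len(out)
        out += data[:take]
        if data[take:]:
            # Leave the unread remainder as a partially-read packet.
            queue.appendleft((opcode, data[take:]))
    return bytes(out)


def read_next(queue):
    """Return ('data', bytes), ('packet', (opcode, data)), or None if empty."""
    if not queue:
        return None                        # nothing queued; caller should wait
    data = siot_read(queue, 488)
    if data:
        return ("data", data)
    return ("packet", queue.popleft())     # head is not opcode 200: PKTIOT it
```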

	There can be input available to SIOT even when the state is
	not %CSOPN (e.g. if the input buffer contains data and
	a CLS packet.)  In this case, you should first SIOT (if you
	care to pick up the data) then PKTIOT.

CLOSE

	Immediately closes the connection.  All buffers and other
	information associated with the index are discarded.  Normally
	the user should first IOT a CLS
	packet containing an ascii explanation for why it is
	closing.  Note that any data previously written on the
	connection but not yet received by the other end will be
	lost.  User programs should exchange "bye" commands of some
	sort before closing if they care about losing data.  It is
	done this way to keep the NCP simple.


RESET

	Does nothing.


FORCE

	If there is a partially-filled output packet (created by IOT
	or SIOT), it is transmitted.

FLUSH

	On an output channel, does FORCE and then waits until
	there are no queued output buffers.  I.e., waits for
	all output to be received and acknowledged by the foreign
	host.  This in fact waits for acknowledge, not just receipt.


RCHST

	val 1 	SIXBIT/CHAOS/
	val 2	0
	val 3	0
	val 4	0
	val 5	-1


RFNAME

	val 1 	SIXBIT/CHAOS/
	val 2	0
	val 3	0
	val 4	0
	val 5	0 or 1  (i.e. .UAI or .UAO)


WHYINT

	val 1 - %WYCHA
	val 2 - state
	val 3 - number of packets queued (receive,,transmit)
	val 4 - window size (receive,,transmit)
	val 5 - input channel#,,output channel# (-1 if not open or I/O-pushed)

	LH(val 3) is the number of packets available to input IOT.
		  This is different from the number of received packets
		  if some are out of order.  This is increased by 1 if
		  there is a partially-read buffer available to SIOT;
		  this packet is not available to PKTIOT.  This is zero
		  if the connection is direct-connected to a STY.

	RH(val 3) is the number of packets which have been transmitted
		  by output IOT but which have not yet been received and
		  acknowledged by the foreign host.

	The state codes are:

		%CSCLS	Closed
		%CSLSN	Listen
		%CSRFC	RFC-received
		%CSRFS	RFC-sent
		%CSOPN	Open
		%CSLOS	Broken by receipt of "LOS" packet.
		%CSINC	Broken by incomplete transmission (no acknowledge
			for a long time)
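
	A user program might pick apart val 3 and val 4 as in this
	Python sketch.  The names are hypothetical, and the "output
	slots" figure (transmit window less unacknowledged packets)
	is this example's own derived quantity, not a value returned
	by the call.

```python
HALF = 0o777777   # mask for one 18-bit half of a 36-bit word


def halves(word):
    """Split a 36-bit word into (left half, right half)."""
    return (word >> 18) & HALF, word & HALF


def decode_whyint(val3, val4):
    rx_queued, tx_queued = halves(val3)    # packets queued: receive,,transmit
    rx_window, tx_window = halves(val4)    # window size: receive,,transmit
    return {
        "input_available": rx_queued > 0,
        "output_slots": max(tx_window - tx_queued, 0),
    }
```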


NETBLK

	Similar to Arpanet NETBLK.


STYNET

	arg 1 - STY channel
	arg 2 - Chaos channel to connect to, or
		-1 to disconnect
	arg 3 - Other Chaos channel (not actually used)
	arg 4 - Output-reset character sequence, up to 3 8-bit
		characters left-justified.

	This works the same as on the Arpanet.  The specified STY
	is connected to or disconnected from a Chaos net channel.
	Data is transferred in and out by the system without the
	need for intervention by the user program.  If an unusual
	condition occurs, the STY is disconnected from the Chaos
	channel and the user is interrupted.  It is illegal to do
	I/O on any of the involved channels while they are connected.

	This call is provided for the benefit of the "Telnet" server.


CHAOSQ

	arg 1 - address of a 126.-word block (packet buffer)

	This is a special system call for use by the ATSIGN CHAOS
	program, which is a daemon program that gets run when
	an RFC is received that does not match up against an
	existing LSN.

	The first packet on the pending-RFC queue is copied
	into the packet buffer, then moved to the end of the
	queue (so that the right thing happens when several
	RFC's are pending at the same time.)

	The call fails if the pending-RFC queue is empty.

	The program should use the contact name in this
	packet to choose a server program to execute.  This
	server program will then LSN to (presumably) the same
	contact name, thus picking up the RFC.
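
	The copy-and-rotate behaviour of the queue can be sketched in
	Python as follows.  The queue is modelled as a deque of
	dictionaries, which is purely an assumption of the example;
	the real call fails rather than returning None.

```python
from collections import deque


def chaosq(pending_rfcs):
    """Copy out the first pending RFC and move it to the end of the queue.

    Returns None when the queue is empty.
    """
    if not pending_rfcs:
        return None
    packet = pending_rfcs[0]
    pending_rfcs.rotate(-1)        # first element goes to the end
    return dict(packet)            # the caller gets a copy
```

	Successive calls therefore walk through all the pending RFC's
	in turn instead of returning the same one each time.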

Interrupts

	IOC error interrupts occur if an attempt is made to IOT
	when the connection is in an improper state, or to IOT
	a packet with an illegal opcode.

	An I/O channel interrupt is signalled on the input channel
	when the number of queued buffers changes from zero to
	non-zero.

	An I/O channel interrupt is signalled on the output channel
	when the number of queued buffers changes from greater than or
	equal to the window size, to less than the window size.

	An I/O channel interrupt is signalled on the input channel
	when the connection state changes.

	Interrupts can be used for

		(1) detecting when input arrives.
		(2) detecting when the system is willing to accept
		    output.
		(3) detecting when the other end does a CLOSE.
		(4) detecting when a requested connection
		    is accepted or rejected.
		(5) detecting when a request for connection
		    comes into a listening server.
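
	The two buffer-count conditions are edge-triggered: they fire
	on the change, not on the level.  A minimal Python sketch,
	with hypothetical function names:

```python
def input_interrupt(old_count, new_count):
    """True when the input queue goes from empty to non-empty."""
    return old_count == 0 and new_count > 0


def output_interrupt(old_count, new_count, window):
    """True when queued output drops from >= window to < window."""
    return old_count >= window and new_count < window
```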

>ITS packages

Here document CHSDEF and NNETWK someday.  Also the host table.

>An NCP in English

This section contains the salient routines and variables of the ITS
Chaos Network Control Program, in English.

The following variables exist per-connection:

CHSUSR	the user who owns the connection, and his channel number for
	it.  This is used for reporting interrupts, and to keep track
	of who owns what.

CHSSTA	the state.  The following states exist:
	Closed - this is the initial state
	Listening - awaiting an RFC that matches the user's LSN.
	RFC-received - entered from the Listening state when a matching
		RFC arrives.
	RFC-sent - entered from the closed state when an RFC is transmitted.
	Open - entered from RFC-sent when an OPN is received, from RFC-received
		when an OPN is sent.  This is the "active" state.
	Lost - entered when a LOS is received
	Incomplete transmission - entered when probing produces no response.

	Some flags also exist:
	Send STS as soon as possible
	Connection is direct-connected to a STY.

CHSIBF	List of received packets, in order, which the user may read.

CHSPBF	List of out-of-order received packets.  The user may not read these.
	When the missing packets arrive, these are transferred to CHSIBF.

CHSOBF	List of transmitted packets which have not yet been receipted.  This
	list retains the packets for retransmission.  Data packets go on this
	list; of the control packets, only RFC's and OPN's do.

CHSNOS	Number of output slots.  This is the number of packets which the user
	may output before the window fills and the user must wait.  It is equal
	to the window size minus the number of unacknowledged packets.

CHSPKN	The number of the last packet given to the user, and the number of
	the last packet sent by the user.  Used in assigning packet numbers,
	checking order, and sending acknowledgements.

CHSACK	The numbers of the last packets acknowledged in each direction
	of the connection.

CHSNBF	The number of packets in CHSIBF and the number of packets in CHSPBF.
	Redundant information which is handy now and then.

CHSITM	The time (in 30ths of a second) that a packet was last received
	from the network for this connection.  Used in probing/incomplete-transmission.

CHSWIN	The two window sizes for the two directions of the connection.

CHSLCL	The local host and index numbers.

CHSFRN	The foreign host and index numbers.

---
When the user reads a packet, this algorithm is executed:
If the CHSIBF list is empty, then if the connection state is
not Open, give an error, otherwise await the arrival of a packet
into CHSIBF.  Remove the first packet in CHSIBF and give it to
the user.  If it is a data packet (opcode >= 200), "virtually
acknowledge" it as follows.  Put its packet number into CHSPKN
as the last packet given to the user.  Get the last packet number
acknowledged from CHSACK.  If the difference of these is more than
1/3 the window size, send a STS.
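
The STS decision in the read algorithm can be sketched in Python as
below.  The function name is hypothetical, and the 16-bit wraparound in
the subtraction is an assumption of this example.

```python
def sts_needed(pkt_num, last_acked, window):
    """Decide whether giving packet pkt_num to the user should trigger an STS.

    True when the distance from the last acknowledged packet number
    exceeds one third of the window size.
    """
    unacked = (pkt_num - last_acked) & 0xFFFF   # assumed 16-bit wraparound
    return 3 * unacked > window
```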

When the user outputs a packet, this algorithm is executed:
(what algorithm?)

---------- Material after this point may be inoperative ----------

>Comparison with LCSnet
		, and other blathering.

>>Principal differences

The LCSnet proposed protocol is called DSP.  The protocol described in
this document will just be called Chaosnet in this section.

(1) DSP specifies things in terms of bytes where Chaosnet specifies
them in terms of packets.  We choose packets to increase the simplicity
and efficiency of the scheme.  DSP has to work in terms of bytes because
it allows packets to be reformatted en route, hence

(2) DSP assumes that gateways can exist between networks with the same
protocols but different packet sizes.  Therefore, the protocol has to
allow for the fact that packets may be reformatted en route.  I happen
to believe that this situation is extremely unlikely to exist, and in
fact gateways between "different" networks will have to do much more
than just change the packet size.  Therefore, it makes sense to make
the gateway worry about gateway issues, rather than have them permeate
the whole protocol.  I believe that gateways will look more like
regular network ports than like transmission media; to get from a host
on the local net to a host on the arpa net, one will connect to the
arpa net gateway and ask it to open a connection from itself to the
host on the arpa net, then tie those connections together.  A gateway
will perform not only packet reformatting, but protocol translation,
flow control on both sides, and maybe even character set translation.
There can also be entities called "bridges", which connect two networks
(or two separate segments of one network) with the same protocol.  A bridge
simply forwards any packets it receives, but never alters the packets,
and never looks inside them except to find out where to forward them to.

(3) A related difference is that DSP includes the arpa net, and TCP,
and by extension all the networks in the universe, in its port number
address space.  Chaosnet would have you connect to a gateway, then
send the gateway the port number to connect to in the foreign
address space separately.  Chaosnet does have to include all networks
reachable by bridges in its address space.

(4) Chaosnet has an "opcode" field in the packet header, where DSP
does not.  DSP achieves the same effect with various bits here and
there.  It makes little difference unless user-level programs decide
to exploit the opcode field.

(5) DSP and Chaosnet have quite different mechanisms for creating
connections.  In DSP, one never creates a connection, exactly;
one simply starts sending to a port address.  Local network note
#3 mumbles about how one discovers which port address to send to,
but I have not seen any specifics.  In Chaosnet, the mechanism
for finding out where to send to and the mechanism for creating
a connection are intertwined; the excuse is that often the process
being connected to is created at the same time as the connection.

(6) DSP uses unique, never-reused port IDs.  Chaosnet does not.
The problem with unique, never-reused IDs is that I know of no
system that can implement them.  Multics comes close, with the
aid of a special hardware clock.  The clock is set from the
operator's watch when the system is powered on, and the mechanism
depends on the fact that the error in the operator's watch is
less than the time required to bring up the system after a power
failure.  Small systems that cannot afford special hardware just
for this, and don't have permanent disk storage, would find it
very hard to generate unique IDs.

Chaosnet prefers to rely on a scheme that doesn't require special
hardware, but nearly always works.  By requiring a connection
to be opened before data packets can be sent through it, and by
some assumptions about the structure of the network, the problem
is eliminated.  See the Flow and Error Control section for
further discussion.

(7) DSP closes the two directions of a connection separately.  Why?

>>High priority data packets, interrupts, and flushing.

The basic idea is to note that if you want to send a high priority
message, this means you want it out of order with respect to previously-
sent data on some connection.  Therefore, high priority data should
be sent over an auxiliary connection.  The per-connection overhead
is not prohibitively high, and this eliminates vast quantities of
hair from the innermost portion of the system.

One advantage that DSP gains by having "high priority messages"
built into the system is that it also incorporates a standardized
way to "mark" a particular point in a data stream.  However, this
is comparatively unimportant, particularly since I think high-priority
messages will probably never get used.  The only place I've heard
them proposed to be used is with Telnet, but ordinary terminals
get along quite well without "out of band" signals when used with
reasonable operating system software.

Interrupts and flushing of input are random crocks associated
with high priority messages.  I don't propose to implement them either.

>>Datagrams.  (connections only used to pass a single packet.)

These are easy.  The guy who wishes to send a datagram does
OPEN, IOTs an RFC to the service to which the gram is to be
sent, and NETBLKs waiting for the connection to open up.  He
then IOTs the data packet, FLUSHes waiting for it to get there,
then CLOSEs.

The server OPENs and IOTs an OPN in response to the RFC.  She
then IOTs in the datagram packet, CLOSEs, and goes off processing
the message.

Four packets are transmitted, two in each direction.  (An RFC, an OPN,
a DATA, and an ACKing NOP.)  No need to send any CLS messages, since
each user process knows to do a CLOSE system call after one data
packet has been transmitted.  It has been claimed that this is
the theoretical minimum if acknowledgement is required.  The reason
is that the data packet must contain some unique id generated by
the RECEIVER to avoid duplicates, and it must be acknowledged,
so that's two packets in each direction, with no combining possible.

Note that as [someone at] PARC has pointed out, for the important
case of side-effect-free transactions, a timeout can substitute
for acknowledgement, and only two packets are necessary.  See ANS.


>>Why not multiple messages per packet?

[1] Not needed for data.  The division of the data stream into
packets is invisible to the real user, anyway.  It's only used by
the "user-ring" portion of the network system software.

[2] Extra complexity.  Consider the hair involved with packed
control messages in the Arpanet.  Because of the control link being
shared between multiple connections between the same pair of hosts,
this could save a little.  I don't know of any NCP that does this;
furthermore, having that shared facility is a bad idea.  The only
case in the Arpanet where packing two control messages into one
packet is useful is when opening a connection the receiver wants
to send STR and ALL both.  In this protocol we just put the window
size in as part of the RFC and OPN messages.

[3] There is an argument that having message boundaries separate
from packet boundaries is useful because gateways between different
networks may need to split up packets because the two networks
may have different maximum packet sizes.  My feeling about this
is that the gateway is likely to have to do a good deal more than
that.  It seems like too much to wish for that the two networks
should use exactly the same packet format, protocols, or even character
set; so the gateway rather than being a packet-reformatting device
is much more likely to look like a user program with two connections,
one on one network and one on the other, which passes data between
the connections with appropriate conversion.  In particular, flow
control is likely to be host1 to gateway and host2 to gateway,
rather than host1 to host2.


>>Why not have a checksum in the packet?

This network is likely to have a very diverse collection of machines
on it, which means it will be impossible to define a checksum which
can be computed efficiently in software on all machines.  Now all
hardware links in the system ought to have whatever amount of
hardware checking is appropriate to them, but due to the efficiency
costs of a user-level end to end checksum, this should not be
a built-in requirement of the basic low-level protocol.  Instead,
checksumming should be an optional feature which some higher-level
protocols (those that need it because the data being passed through
them is so vitally important that every possible effort must be made
to ensure its correctness) may implement.  Checksumming should
be implemented at the user level in exactly the same way and for exactly
the same reasons as encryption should be implemented at the user level.


>>How about host-independent user-level protocols, where one just
connects to a service and doesn't have to know what host it's at today?

Yeah, how about it?  As far as I know, this protocol provides an
adequate base for constructing such a thing.  Also I haven't
seen anything published on the subject.


>>Very small hosts.

E.g. we'd like to put the Chess machine on the net.  It has very little
memory, but not totally impotent microcode.  A small host need only
support one connection, may ignore WIN, LOS, and CLS, may only have one packet
in transmission at a time, and may process receive packets one at a time
(ignoring any that come in until the first one has been fully processed).
It IS necessary to check that received DATA packets come in in the right order.

RFC may be handled by remembering the other guy's host number and index,
and sending back a completely canned OPN.  The contact name is ignored.
If a second user tries to connect while a first user is connected,
the first user gets bumped.  Let them fight it out on some larger
machine (or the floor) for who will get to use the small machine.
Never originate any packet type other than DATA and that one OPN.

Attaching ordinary terminals "directly" to the network is obviously
undesirable.

>Transmission Media

This section describes how packets are encapsulated for transmission
through various media, and what auxiliary hair is needed by each
medium.


>>Ethernet

The messages transmitted through the ether (or CAIOS) net consist of
a packet followed by a three-word trailer:

	+----------------+
	|     header     |  8 words
	+----------------+
	|      data      |  0-52 words
	+----------------+
	| immediate dest |  1 word
	+----------------+
	| immediate src  |  1 word
	+----------------+
	| CRC check word |  1 word
	+----------------+

The three trailer words are looked at by the hardware; the last two
of them are supplied by the hardware.  The reason this stuff is in
a trailer rather than a leader is that the caiosnet hardware actually
transmits the packet backwards.  However, this is transparent to
the software.

Bytes are sent two per word.  The low-order byte is first (pdp11 standard).
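
The byte order can be illustrated with a short Python sketch.  The
function names are hypothetical, and padding an odd final byte with zero
is an assumption of the example.

```python
def bytes_to_words(data):
    """Pack bytes two per 16-bit word, first byte in the low-order half."""
    if len(data) % 2:
        data = data + b"\x00"          # assumed zero padding for an odd byte
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data), 2)]


def words_to_bytes(words, nbytes):
    """Unpack 16-bit words back into nbytes bytes, low-order byte first."""
    out = bytearray()
    for w in words:
        out += bytes((w & 0xFF, w >> 8))
    return bytes(out[:nbytes])
```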


>>TEN11 Interface

[The following total hair has not been checked recently.]

Since interrupts can't be sent through the TEN11 interface, the pdp10 can
only service the network at relatively-infrequent intervals, for instance
every 1/60'th of a second.  Therefore it is necessary to have queues of
packet buffers in each direction.  This provides high speed by allowing
several packets to be processed at a time.

The speed and reliability of the TEN11 interface eliminates any need for
error checking.  (ha ha) [ho ho] <he he>   To decrease the load on the AI
pdp10, it is assumed that the pdp11's will be responsible for swapping the
bytes in the data portions of packets so that they will be in pdp10
standard order.

Even though the contents of packets will not be error-checked, the
pdp10 must check addresses to avoid being screwed by a losing pdp11.

The form of a packet encapsulated for the TEN11 interface will then be

	|-----------------|-----------------|----|
	|  queue thread   | 0=empty, 1=full |  0 |
	|-----------------|-----------------|----|
	| #bytes | opcode |     unused      |  0 |
	|-----------------|-----------------|----|
	|destination host |destination index|  0 |
	|-----------------|-----------------|----|
	|   source host   |   source index  |  0 |
	|-----------------|-----------------|----|
	|    packet #     |   ack packet #  |  0 |
	|-----------------|-----------------|----|
	| data 0 | data 1 | data 2  . . .   |  0 |
	|				    |  0 |
	|-----------------|-----------------|----|

for a total of 31 36-bit words, or 62 pdp11 words.

The queue thread is the pdp11 address of the next packet-buffer in
a queue, or zero if this is the last.  The empty/full indicator
says whether this buffer currently contains a packet, or not.

The following is an attempt to express the algorithms of the
pdp10 and pdp11 in concise form.  Hopefully they are self-
explanatory.

Several queues of buffers need to exist in the pdp11.  Only
two of these are known to the pdp10.

TO10QF - first buffer queued for transmission to the 10.

TO10QL - last buffer queued for transmission to the 10.
	 Exists so that buffers can be appended to the
	 list more quickly.

TO10AC - first buffer in list of buffers actively being
	 gobbled by the 10.  Set by 11, cleared by 10.

TO10FR - copy of TO10AC.  Used to free the buffers after
	 the 10 gobbles them.

(come-from pdp11-packet-receivers
      when (eq (destination-host ?packet) pdp10)
       ;re-arrange 8-bit bytes for 10
	(swap-bytes (data-part packet))
       ;Add to to-10 queue
	(set (thread packet) nil)	;nil=0
	(cond ((null TO10QF)
	       (setq TO10QF packet TO10QL packet))
	      (t (set (thread TO10QL) packet)
		 (setq TO10QL packet)))
       ;Try to activate to-10 queue
	(cond ((null TO10FR)
	       (setq TO10FR TO10QF
		     TO10AC TO10QF
		     TO10QF nil
		     TO10QL nil)))
)

(come-from pdp11-polling-loop
      when (and (not (null TO10FR))  ;buffers were sent to 10
		(null TO10AC))	     ;and 10 has finished gobbling
	(mapcar 'make-buffer-free TO10FR)	;mapcar follows queue words
	(setq TO10FR nil)	     ;OK to activate more buffers now
	(cond ((not (null TO10QF))   ; more stuff waiting, activate it now
	       (setq TO10FR TO10QF
		     TO10AC TO10QF
		     TO10QF nil
		     TO10QL nil)))
)

(come-from pdp10-clock-level
      when (and (not (null TO10AC))  ;11 is sending buffers
		(not web-buffers-locked))
       ;copy to user, process control message, or reject if buffers full
	(mapcar 'process-input
		TO10AC)
       ;signal pdp11 that all packets have been gobbled
	(setq TO10AC nil))


FRBFLS - list of free buffers.  cons-buffer gets from here,
	 make-buffer-free puts here.

FM10AC - list of buffers into which pdp10 may place packets
	 set by 11 / cleared by 10.

FM10GB - copy of FM10AC, used by 11 to process buffers after
	10 has placed packets into them.

(come-from pdp11-polling-loop
      when (and (null FM10GB)	     ;10 needs some buffers &
		(not (null FRBFLS))) ; free buffers available
       ;give the 10 a list of a suitable number of empty buffers
	(repeat max-at-a-whack times
	  (and (null FRBFLS) (exitloop))
          (setq buffer (cons-buffer)) ;pull off free list
	  (set (full-indicator buffer) nil) ;contains no packet
	  (set (thread buffer) FM10GB) ;cons onto list
	  (setq FM10GB buffer))
	(setq FM10AC FM10GB)	     ;give buffer list to 10
)

(come-from pdp11-polling-loop
      when (and (not (null FM10GB))  ;gave 10 some buffers
		(null FM10AC))	     ;which it has used
       ;process packets sent from 10.
	(mapcar
	 '(lambda (buffer)
	    (cond ((null (full-indicator buffer))
		   (make-buffer-free buffer)) ;didn't get used
		  (t (swap-bytes buffer)
		     (send-to-destination buffer))))
	 FM10GB)
	(setq FM10GB nil))	     ;no buffers active in 10 now

(come-from pdp10-clock-level
      when (and (not (null FM10AC))  ;buffer space avail
		(not web-buffers-locked)) ;no M.P. interference
       ;send as many packets as possible
	(mapcar
	 '(lambda (buffer)
	    (cond ((needs-to-be-sent ?packet)	;find a sendable packet somewhere
		   (copy-into buffer packet)
		   (set (full-indicator buffer) t))))
	 FM10AC)
       ;signal pdp11 to gobble the packets
	(setq FM10AC nil))


To avoid the need for a gross number of exec pages in the pdp10,
the FM10AC and TO10AC words, and all the packet buffers, should
lie in a single 4K page.  The best location for this page varies
from machine to machine.  On dedicated 11's such as the AI TV11,
the MC console 11, etc. it should probably just be the first 4K
of memory.  On the logo machine, it would probably be best to put
this page up in high memory where RUG won't mess with it.  In the
case of the mini-robot system, I'm not sure.

It would be best not to try to use this protocol with "general
purpose" machines, because of problems with finding the list
headers and packet buffers, problems with telling whether the
machine is running the right program, etc.  It should be used
just as a way for the AI machine to get a path to the net.

>>DL10 & DTE20

[Outline only]

[Just use the pdp11 as a substitute for a direct chaosnet interface.]

[Basically, the 11 says ready to transfer (in either direction), the 10
sets up the pointers and says to transfer, and the 11 transfers the cruft.
To eliminate an extra level of buffering, on input transfers the 11 makes a
copy of the first 4 16-bit words of the header available to the 10 when it first
says "ready to transfer."  The 10 uses these to decide where to copy the
packet into.  It helps if you don't try to use a DL10 on a machine with a
cache.]

>>Asynchronous line

Packets are encapsulated by preceding them with start of text (202),
and following them with a 1-byte additive checksum and an end of text (203).
The 16-bit words are transmitted low order byte first.  If the checksum
is wrong the receiver ignores the packet.  The start and end characters are
just there to help in ignoring random noise on the line.  If they don't
appear, the packet is ignored.  The full 60-word packet is not transmitted;
the #bytes count is used to determine how many of the data bytes to
transmit; the receiver fills the remainder with zero (or garbage, at its
convenience.)
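
A minimal Python sketch of this framing follows.  It assumes the
checksum is the sum of the packet bytes modulo 256, which the text does
not spell out, and it treats the packet as an already-truncated byte
string; the function names are hypothetical.

```python
STX = 0o202   # start of text
ETX = 0o203   # end of text


def checksum(payload):
    return sum(payload) & 0xFF     # assumed: sum of packet bytes, modulo 256


def encapsulate(payload):
    """Frame a packet for the asynchronous line."""
    return bytes([STX]) + payload + bytes([checksum(payload), ETX])


def decapsulate(frame):
    """Return the payload, or None if the frame should be ignored."""
    if len(frame) < 3 or frame[0] != STX or frame[-1] != ETX:
        return None
    payload, check = frame[1:-2], frame[-2]
    return payload if checksum(payload) == check else None
```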

This protocol is intended mainly for communication between the plasma
physics pdp11 in bldg. 38 and a pdp11 in 545, until the caiosnet
gets extended that far (or a longer-distance, lower-speed caiosnet
is extended to various machines off in that direction.)