mirror of
https://github.com/PDP-10/its.git
synced 2026-01-11 23:53:12 +00:00
Digestifying an ITS Mailing List

Why Digestify?
--------------

First, what is digestifying and why do it? A mailing list is used by a
mailer program (such as ITS's COMSAT) to distribute messages to more than
one address, translating the single list address given into the addresses
of the intended recipients. Normally this process occurs without any
built-in delays; the mailer receives a message, checks the set of
addressees for mailing lists to expand, performs such expansions, and
immediately delivers the message to the desired recipients.

However, some mailing lists have a consistently high volume of mail
travelling to them. Any message sent through an ITS machine ties up its
COMSAT in proportion to the number of recipients and the number of messages
sent, rather than in proportion to the total number of characters sent --
for example, delivering any message to AI-List (one very large mailing
list) takes two and a half hours! Mail to hosts that are neither local to
MIT nor directly connected to the central nodes of the Arpanet -- "weird"
hosts -- is particularly expensive. Thus, any mailing list which gets more
than a few messages per day and which goes to more than about ten weird
hosts imposes a large load on COMSAT.

Also, the overhead to list members of reading mail sent to the list is a
function of the number of messages received, as well as the number of
characters. For these two reasons, large mailing lists should be
-digested-; that is, arrangements should be made to collect all the
messages sent to such a list during each day, bundle them up as one single,
long message, and send that message, the digest, to the list members. The
digest has a special format which can be undigested or burst -- broken up
into individual messages -- by many mail-reading programs.

This file explains how to get the digestifier daemon to digest an ITS
mailing list automatically, using as an example a hypothetical mailing list
called FOOBLATZ.


How To Digestify
----------------

So now that you've decided your mailing list should be digested, how do
you arrange that?

First, all automatically digestified ITS mailing lists currently must
reside on MC.LCS.MIT.EDU; if yours lives elsewhere, you must move it to MC.
You can make forwarding pointers from other machines to MC; on another ITS
machine, such a pointer would be a line like

    (FOOBLATZ (EQV-LIST FOOBLATZ@MC))

in the other machine's .MAIL.;NAMES file.

Second, you must alter the mailing list entry in MC:.MAIL.;NAMES > . Mail
sent to FOOBLATZ needs to be collected into a file, the inbox, for later
digestification, rather than immediately sent out to members of the mailing
list. So the mailing list entry for FOOBLATZ should look like

    (FOOBLATZ (EQV-LIST ([COMAIL;FOO INBOX] (R-OPTION FAST-APPEND))))

Note that the FAST-APPEND option makes COMSAT append new mail to the end of
the inbox, which will cause the FOOBLATZ Digest to include the messages
sent to FOOBLATZ in the chronological order in which they arrived at MC.
Omitting this option will cause the digest to include the messages in
the reverse order of arrival, which will confuse many list members.
Further, as explained below in the digestifier algorithm section, when
discussion grows very brisk, the inbox may contain more than one digest's
worth of messages; in this case the digestifier will create a digest
starting at the beginning of the file and going until it reaches its size
limit, so the FAST-APPEND option will ensure that the older part of the
conversation is sent out first, and that very old messages don't accumulate
unsent.

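As a rough illustration of why append order matters, the following Python
sketch takes messages from the front of an inbox until a size limit is
reached. It is not the digestifier's actual code: the function name, the
representation of the inbox as a list of strings, and the rule that an
oversized first message is still taken on its own are all assumptions made
for illustration. The 48000-character figure is the approximate limit
mentioned in the algorithm section.

```python
from typing import List, Tuple

DIGEST_LIMIT = 48000  # approximate limit in characters; treated as exact here


def take_one_digest(inbox: List[str],
                    limit: int = DIGEST_LIMIT) -> Tuple[List[str], List[str]]:
    """Split the inbox into (messages for this digest, messages left behind)."""
    taken: List[str] = []
    size = 0
    for message in inbox:
        # Assumption: always take at least one message, so that a single
        # oversized message cannot block the queue forever.
        if taken and size + len(message) > limit:
            break
        taken.append(message)
        size += len(message)
    return taken, inbox[len(taken):]
```

With an inbox kept in arrival order, repeated calls drain the oldest
messages first, which is the behavior FAST-APPEND is meant to guarantee.
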
Third, you must create an entry in the file MC:DIGEST;DEFS > for your
digest. Everything in that file up to the first ^_ (ascii 037) is a
comment, and after that comes a series of digest definitions, separated by
^_'s; to make life easy, put yours at the end.

Each digest definition has a format that looks like an RFC822 mail header
-- that is, it consists of a series of named fields of the form:

    Name: FooBlatz
    Inbox: COMAIL;FOO INBOX
    Administrivia: COMAIL;FOO ADMIN
    Record: COMAIL;FOO RECORD
    First-Issue-Number: 259
    AUTHOR: FOOBLATZ-REQUEST
    RCPT: (@FILE [COMAIL;FOO LIST])
    From: FooBlatz Daily Blast
            <FooBlatz-Request%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU>
    Reply-To: FOOBLATZ%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU
    To: FOOBLATZ%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU

Continuation lines, such as the second line of the From: field in the
example above, -must- start with a space or a tab. Blank lines between
fields are ignored, so you can insert them in your digest entry if you
like. The order of the fields in the definition does not matter.
Capitalization of field names does not matter, but capitalization of field
values does -- if you want the From: field to look like "FooBlatz-Request"
in the digests sent out, capitalize it that way.

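The format rules above can be sketched in Python as follows. This is only
an illustration of the rules as stated in this file, not the digestifier's
own parser; the function names and the choice to keep a continuation line
joined to its field with a newline are assumptions.

```python
from typing import Dict, List, Optional

SEPARATOR = "\x1f"  # ^_ (ASCII 037 octal)


def parse_definition(text: str) -> Dict[str, str]:
    """Parse one digest definition into {lowercased field name: value}."""
    fields: Dict[str, str] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines between fields are ignored
        if line[0] in " \t":
            # A continuation line must start with a space or a tab.
            if current is None:
                raise ValueError("continuation line before any field")
            fields[current] += "\n" + line  # assumption: keep it as written
            continue
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"not a field line: {line!r}")
        current = name.strip().lower()   # field names are case-insensitive
        fields[current] = value.strip()  # field values keep their case
    return fields


def parse_defs(contents: str) -> List[Dict[str, str]]:
    """Skip the leading comment, then parse each ^_-separated definition."""
    chunks = contents.split(SEPARATOR)[1:]  # text before the first ^_ is a comment
    return [parse_definition(chunk) for chunk in chunks if chunk.strip()]
```

Because a dictionary is used, the order of the fields in a definition has
no effect on the result, matching the rule stated above.
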
Here is a catalog of all of the currently accepted fields, their
meanings, and whether they are required for the digestifier to work:

Name: (required)

    The name of the digest, such as "FooBlatz". This name is usually used
    before the word "Digest", as in "FooBlatz Digest #259".

Inbox: (required)

    The name of the file that COMSAT delivers mail to for this digest. The
    device is defaulted to "DSK", the directory is defaulted to "COMAIL",
    and the second filename is defaulted to "INBOX". You -must- supply the
    first filename. Thus you can say just "Inbox: FOO" if your inbox is
    DSK:COMAIL;FOO INBOX.

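The defaulting rule for the Inbox: field can be sketched in Python as
below. This is a simplified illustration, not the digestifier's filename
parser: it assumes the value is written in the order DEV:DIR;FN1 FN2 with
no quoting, and the names ItsFile and default_inbox are hypothetical.

```python
from typing import NamedTuple


class ItsFile(NamedTuple):
    device: str
    directory: str
    fn1: str
    fn2: str


def default_inbox(spec: str) -> ItsFile:
    """Fill in the documented defaults for an Inbox: field value."""
    rest = spec.strip()
    device, directory = "DSK", "COMAIL"  # defaults stated in this file
    if ":" in rest:
        device, rest = rest.split(":", 1)
    if ";" in rest:
        directory, rest = rest.split(";", 1)
    names = rest.split()
    if not names:
        raise ValueError("the first filename must be supplied")
    fn1 = names[0]
    fn2 = names[1] if len(names) > 1 else "INBOX"  # default second filename
    return ItsFile(device.strip(), directory.strip(), fn1, fn2)
```

Under these assumptions, "FOO", "COMAIL;FOO INBOX", and
"DSK:COMAIL;FOO INBOX" all name the same file.
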
Administrivia: (optional)

    The name of the file that the digestifier should check for
    administrative messages that should be inserted at the front of the
    next digest. By default this file has the same name as the inbox file,
    but with the second filename of "ADMIN". If you don't specify an
    Administrivia field, then the digestifier will not look for an
    administrivia file at all -- if you want to use the default file name,
    you can simply give Administrivia: a blank field.

    If this field exists, the digestifier will look for a file of the
    specified name; if the file exists, the digestifier will include its
    contents in the digest between the list of message topics and the first
    message, and delete the file. Note that the administrivia file is not
    a mailbox -- its contents will be included in the digest exactly,
    including all the headers and other (for this purpose) extraneous
    nonsense of anything sent to the file as mail. Spare your list members
    by avoiding this action; log in and write the file directly if that's
    at all possible. When you write the file, you don't need to explicitly
    create white space around your text; the digestifier will automatically
    provide blank lines before and after it.

Record: (optional)

    The name of the file that the digestifier uses to keep track of the
    state of the digest. This contains vital data like the current issue
    number and the time that the most recent digest was mailed. By default
    this file has the same name as the inbox file, but with the second
    filename of "RECORD".

    Do not try to create this file yourself! Doing so will only confuse
    the situation. The digestifier will create this file the first time it
    processes your digest; if you don't specify a Record field, the
    digestifier will use the default name for this file.

First-Issue-Number: (optional, usually)

    This field is -only- used by the digestifier when it creates a new
    record file the first time it processes your digest. This is used to
    initialize the issue number stored in the record file so that the next
    digest created will have the given number. It should consist of a
    string of digits (only digits!) representing a decimal number, like
    "259".

    If this field is not present and the digestifier can't find a record
    file, then the digest definition is broken and no digest will be
    produced. For safety, you can remove the First-Issue-Number from your
    digest definition after your record file is created; that way, if
    someone accidentally deletes your record file, the digestifier won't
    automatically recreate it and start duplicating issue numbers.

    When converting an existing digest to use this digestifier, the initial
    contents of this field should be set to the contents of the existing
    "NUMBER" file associated with that digest, to preserve continuity of
    issue numbers.

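The interaction between the record file and First-Issue-Number can be
summarized by the Python sketch below. The function name and the
"recorded" argument (standing for the next issue number derived from an
existing record file, or None when there is no record file) are
hypothetical; the real record file format is not described in this file.

```python
from typing import Dict, Optional


def starting_issue_number(fields: Dict[str, str],
                          recorded: Optional[int]) -> int:
    """Return the number for the next issue, or raise if the definition is broken."""
    if recorded is not None:
        # A record file exists, so First-Issue-Number is not consulted.
        return recorded
    first = fields.get("first-issue-number")
    if first is None:
        raise ValueError("no record file and no First-Issue-Number")
    # str.isdigit() accepts some non-ASCII digits, so check ASCII as well.
    if not (first.isascii() and first.isdigit()):
        raise ValueError("First-Issue-Number must be a string of digits")
    return int(first)
```

This also shows why removing First-Issue-Number after the first run is
safe: once a record file exists, the field is ignored, and if the record
file is later lost the definition fails loudly instead of silently
reusing old issue numbers.
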
RCPT: (required)

    This is the digest recipient, the address to which the digest will
    actually be mailed (independent of what you may list in the To: and
    From: fields!). This has exactly the format of a recipient from
    .MAIL.;NAMES >, except it cannot be continued onto a continuation line.
    Typically you will keep the actual mailing list in a file, say
    COMAIL;FOO LIST, and have a RCPT field like

        RCPT: (@FILE [COMAIL;FOO LIST])

    Defining the list in an indirect file is a good idea for large mailing
    lists that change frequently, since it allows you to avoid recompiling
    NAMES > and messing with the NAMED ERR file.

    When converting an existing digest to use this digestifier, note that a
    separate entry in NAMES > is no longer called for to hold the mailing
    list which includes the actual list members. You can use such a list,
    but it's probably easier to simply specify the indirect file containing
    the list members explicitly in this field.

AUTHOR: (required)

    This is where delivery error messages should be directed.

    Unlike the RCPT: field, this is not a NAMES > style recipient; nor is
    it an RFC822 style recipient (something of the form User@Host). It can
    only be a simple string -- "FOOBLATZ-REQUEST" in our example. This
    means that unless you put a single person's name here, you will have to
    create a mailing list to receive the errors.

    If you want to try to keep human-generated requests apart from
    mailer-generated errors, you can create a mailing list separate from
    your administrative list -- called, say, FooBlatz-Errors -- and put it
    here in the AUTHOR: field.

From: (required)
Reply-To: (optional)
To: (optional)

    The values of these three fields are copied verbatim into the header of
    all generated digests. If the optional fields are not given, the
    generated digests will not have these fields -- no default values are
    generated for them. Please be careful to specify only RFC822-legal
    values for these fields. Currently most digests use an address of the
    form

        From: FooBlatz Daily Blast
                <FooBlatz-Request%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU>
    or
        To: FOOBLATZ%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU

    (By the way, there is no reason why MC's name has to appear here. Your
    subscribers don't need to know that MC is involved in producing the
    digest as long as you give them -some- address that reaches your
    inbox.)

    Generally the From: field will contain the name of the mailing list's
    auxiliary administration list -- FooBlatz-Request in our example. This
    is the address where people will generally send their administrative
    requests. It need -not- be the same as the address that appears in the
    AUTHOR: field, although typically it is.

    The To: and Reply-To: fields should contain the address of the mailing
    list itself -- that is, the address where people send mail they want
    included in the digest. Mail sent to this address should eventually
    reach your digest's inbox file.

    Actually, many other well-known RFC822 header fields can be given as
    fields in the digest definition, but most digests will want to use
    exactly these three. (See the source code if you want to know what
    others will work. Note that Date, Subject and Message-ID headers are
    automatically generated for each issue of each digest by the
    digestifier.)

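As a summary of the field catalog, the Python sketch below checks a parsed
definition for the required fields and works out the default names of the
administrivia and record files. It is an illustration only: the function
names are hypothetical, a required field with an empty value is assumed to
count as missing, and the inbox name is assumed to be written with no
space before its first filename (as in "COMAIL;FOO INBOX").

```python
from typing import Dict, List, Optional

# Fields marked "(required)" in the catalog, as lowercase dictionary keys.
REQUIRED = ("name", "inbox", "rcpt", "author", "from")


def missing_required(fields: Dict[str, str]) -> List[str]:
    """List the required fields that are absent or empty."""
    return [name for name in REQUIRED if not fields.get(name, "").strip()]


def with_second_name(inbox: str, fn2: str) -> str:
    """Same file name as the inbox, but with a different second filename."""
    return f"{inbox.split()[0]} {fn2}"


def administrivia_file(fields: Dict[str, str], inbox: str) -> Optional[str]:
    """Return the administrivia file to check, or None if none is checked."""
    if "administrivia" not in fields:
        return None  # no field at all: the digestifier does not look
    # A blank field selects the default name.
    return fields["administrivia"].strip() or with_second_name(inbox, "ADMIN")


def record_file(fields: Dict[str, str], inbox: str) -> str:
    """Return the record file name, defaulting when no Record field is given."""
    return fields.get("record", "").strip() or with_second_name(inbox, "RECORD")
```

Note the asymmetry the sketch makes visible: omitting Record: still yields
a record file under the default name, while omitting Administrivia:
disables the administrivia check entirely.
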

Digestifier Algorithms
----------------------

The digestifier is run automatically once every hour. It reads through the
file DIGEST;DEFS > and considers each digest in turn, keeping a log of its
actions in the file DIGEST;LOG >.

For each digest, the digestifier considers a number of factors to determine
whether or not it is going to produce a digest this time:

1. The current time of day. The hours between 2AM and 7AM are "Prime
   Time" and the digestifier prefers to create a digest then.

2. The current size of the digest's inbox. The digestifier never produces
   a digest larger than a certain size (around 48000 characters). If the
   inbox looks like it contains more than 1.5 digests worth of material,
   then the inbox is "bloated" and the digestifier tries to create a
   digest soon.

3. How long ago the previous digest was mailed. The digestifier tries not
   to produce digests so frequently that people and mailers are
   overwhelmed with them, nor so infrequently that a message can sit in
   the inbox for an unreasonably long time.

The precise test is:

    (AND <the previous digest was created more than 90 minutes ago>
         (OR <the inbox is bloated>         ; more than 1.5 digests pending
             (AND <it is prime time>        ; between 2AM and 7AM
                  <the previous digest was created more than 18 hours ago>)))

In English, this translates as:
    If the last issue of this digest was sent out less than an hour and a
    half ago, wait. If the last issue went out longer ago than that and the
    inbox is bloated, create a digest. But if the inbox isn't bloated, check
    whether it's prime time; if it is, and the last issue went out more than
    18 hours ago, then create and send today's issue.

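The test above can be written out as a small Python sketch. The function
and constant names are hypothetical, the inbox size is assumed to be
measured in characters, and prime time is assumed to run from 2:00 up to
but not including 7:00; the digestifier's real boundary conditions may
differ, and the sketch does not cover what happens when the inbox is
empty.

```python
from datetime import datetime, timedelta

DIGEST_LIMIT = 48000   # approximate maximum digest size, in characters
BLOAT_FACTOR = 1.5     # "more than 1.5 digests worth of material"
MIN_GAP = timedelta(minutes=90)
DAILY_GAP = timedelta(hours=18)


def is_prime_time(now: datetime) -> bool:
    """True between 2AM and 7AM (upper bound assumed exclusive)."""
    return 2 <= now.hour < 7


def is_bloated(inbox_chars: int) -> bool:
    """True when the inbox holds more than 1.5 digests of material."""
    return inbox_chars > BLOAT_FACTOR * DIGEST_LIMIT


def should_make_digest(now: datetime, last_digest: datetime,
                       inbox_chars: int) -> bool:
    """Direct transcription of the (AND ... (OR ... (AND ...))) test."""
    age = now - last_digest
    return age > MIN_GAP and (
        is_bloated(inbox_chars)
        or (is_prime_time(now) and age > DAILY_GAP)
    )
```

For example, a quiet list whose last issue went out at 3AM yesterday gets
a new issue at 3AM today, while a bloated inbox triggers an issue at any
hour once 90 minutes have passed.
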
The various numbers and times are all subject to future adjustment, of
course.

This digestifier should be fairly robust in the face of system crashes,
being gunned down in the middle of processing, etc. The worst that can
happen is that a duplicate issue can be produced, and that can only occur
if the digestifier is zapped during an extremely small window. I'll be
surprised if it ever happens.

It is perfectly safe to run two digestifiers at once, since both the
digestifier and COMSAT use the LOCK device to coordinate access to inboxes.

In fact, if you edit the DEFS file, it is probably a good idea to run the
digestifier once yourself and check the LOG file to see if you made any
errors. Even if your inbox is empty, this procedure will catch many
possible problems with your digest definition. You should be able to run
the digestifier by typing:

    :DIGEST;DIGEST

to DDT. This might take a couple of minutes to finish (the digestifier
might decide to produce some digests!) so be patient. Then you should look
at what was appended to the end of the current DIGEST;LOG > file.

This digestifier tries to be fairly civic-minded about cleaning up after
itself. If it encounters any errors during the processing of a digest it
logs the error, then carefully deletes any partially-written output and
either proceeds to the next digest, or kills itself (depending on the
nature of the error). Only a few amazingly unlikely errors should ever
leave a dead disowned job as a corpse.

Mail is always delivered through the bulk COMSAT (through the .BULK.
directory).

This digestifier is careful to check the messages included in the digests
it builds for lines of "-"s that could confuse digest bursting or
undigestifying tools, and modifies the first "-" in any suspect line to be
a space.

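The Python sketch below illustrates that rewriting step. This file does
not say exactly which lines the digestifier treats as suspect, so the
criterion used here (a line consisting only of "-" characters, at least
min_dashes long) and the function name are assumptions.

```python
def defuse_dash_lines(message: str, min_dashes: int = 3) -> str:
    """Replace the first "-" with a space on every all-dash line."""
    out = []
    for line in message.split("\n"):
        body = line.rstrip()
        # Assumed criterion: the line is nothing but hyphens.
        if len(body) >= min_dashes and set(body) == {"-"}:
            line = " " + line[1:]
        out.append(line)
    return "\n".join(out)
```

The line keeps its length and still looks like a rule to a human reader,
but it no longer starts with a hyphen, so under this assumption a
bursting tool should not mistake it for a message separator.
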
Send bug reports to Bug-DIGEST.


This digestifier was written by Alan Bawden and supersedes previous
digestifiers written by Rob Austein, David Wallace, and Chris Maeda. A lot
of the documentation was adapted from documentation written by David
Chapman for the GUMBY digestifier.