mirror of
https://github.com/PDP-10/its.git
synced 2026-02-04 23:54:37 +00:00
DIGEST - digestify a mailing list.
21
doc/digest/-read-.-this-
Executable file
@@ -0,0 +1,21 @@
This directory is for the ITS mail digesting tool.

DEFS > contains the digest definitions.

DIGEST ORDER contains documentation on the definitions and other
information on how to use the digestifier.

LOG * files contain digestifier log output.

TS DIGEST is the installed binary of the digestifier.
(DRAGON;HOURLY DIGEST should be a link to DIGEST;TS DIGEST.)

DIGEST > is the source for TS DIGEST.

TS MBXLOC is a utility program for locking inboxes so that COMSAT and
the digestifier won't touch them (so that you can edit them
yourself, should that become necessary).

MBXLOC > is the source for TS MBXLOC.

DIGEST BUGS is the archive for BUG-DIGEST mail.
2025
doc/digest/digest.bugs
Executable file
File diff suppressed because it is too large
334
doc/digest/digest.order
Executable file
@@ -0,0 +1,334 @@
Digestifying an ITS Mailing List

Why Digestify?
--------------

First, what is digestifying and why do it? A mailing list is used by a
mailer program (such as ITS's COMSAT) to distribute messages to more than
one address, translating the single list address given into the addresses
of the intended recipients. Normally this process occurs without any
built-in delays; the mailer receives a message, checks the set of
addressees for mailing lists to expand, performs such expansions, and
immediately delivers the message to the desired recipients.

However, some mailing lists have a consistently high volume of mail
travelling to them. Any message sent through an ITS machine ties up its
COMSAT in proportion to the number of recipients and the number of messages
sent, rather than in proportion to the total number of characters sent --
for example, delivering any message to AI-List (one very large mailing
list) takes two and a half hours! Mail to hosts that are neither local to
MIT nor directly connected to the central nodes of the Arpanet -- "weird"
hosts -- is particularly expensive. Thus, any mailing list which gets more
than a few messages per day and which goes to more than about ten weird
hosts imposes a large load on COMSAT.

Also, the overhead to list members of reading mail sent to the list is a
function of the number of messages received, as well as the number of
characters. For these two reasons, large mailing lists should be
-digested-; that is, arrangements should be made to collect all the
messages sent to such a list during each day, bundle them up as one single,
long message, and send that message, the digest, to the list members. The
digest has a special format which can be undigested or burst -- broken up
into individual messages -- by many mail-reading programs.

This file explains how to get the digestifier daemon to digest an ITS
mailing list automatically, using as an example a hypothetical mailing list
called FOOBLATZ.


How To Digestify
----------------

So now that you've decided your mailing list should be digested, how do
you arrange that?

First, all automatically digestified ITS mailing lists currently must
reside on MC.LCS.MIT.EDU; if yours lives elsewhere, you must move it to MC.
You can make forwarding pointers from other machines to MC; on another ITS
machine, such a pointer would be a line like

(FOOBLATZ (EQV-LIST FOOBLATZ@MC))

in the other machine's .MAIL.;NAMES file.

Second, you must alter the mailing list entry in MC:.MAIL.;NAMES > . Mail
sent to FOOBLATZ needs to be collected into a file, the inbox, for later
digestification, rather than immediately sent out to members of the mailing
list. So the mailing list entry for FOOBLATZ should look like

(FOOBLATZ (EQV-LIST ([COMAIL;FOO INBOX] (R-OPTION FAST-APPEND))))

Note that the FAST-APPEND option makes COMSAT append new mail to the end of
the inbox, which will cause the FOOBLATZ Digest to include the messages
sent to FOOBLATZ in the chronological order in which they arrived at MC.
Not including this option will cause the digest to include the messages in
the reverse order of arrival, which will confuse many list members.
Further, as explained below in the digestifier algorithm section, when
discussion grows very brisk, the inbox may contain more than one digest's
worth of messages; in this case the digestifier will create a digest
starting at the beginning of the file and going until it reaches its size
limit, so the FAST-APPEND option will ensure that the older part of the
conversation is sent out first, and that very old messages don't accumulate
unsent.
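
The oldest-first behaviour described above can be sketched in Python.
This is only an illustration, not the digestifier's own code (which is
the ITS program in DIGEST >). The function name take_oldest is
hypothetical, the 48000-character limit is the approximate figure given
in the algorithm section, and the rule that an oversized first message
still goes out alone is an assumption; this document does not say how
that case is handled.

```python
def take_oldest(messages, limit=48000):
    """Split an inbox into (messages for this digest, messages left over).

    `messages` is a list of message strings in inbox order, which is
    oldest first when FAST-APPEND is in effect.  `limit` is the assumed
    maximum digest size in characters.
    """
    digest = []
    used = 0
    for index, message in enumerate(messages):
        # Assumption: always take at least one message, even if it is
        # larger than the limit, so the inbox can never get stuck.
        if digest and used + len(message) > limit:
            return digest, messages[index:]
        digest.append(message)
        used += len(message)
    return digest, []
```

With an inbox that was written newest-first instead, the same loop would
send the newest messages and leave the oldest ones behind, which is the
accumulation problem that FAST-APPEND avoids.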

Third, you must create an entry in the file MC:DIGEST;DEFS > for your
digest. Everything in that file up to the first ^_ (ASCII 037 octal) is a
comment, and after that comes a series of digest definitions, separated by
^_'s; to make life easy, put yours at the end.

Each digest definition has a format that looks like an RFC822 mail header
-- that is, it consists of a series of named fields of the form:

Name: FooBlatz
Inbox: COMAIL;FOO INBOX
Administrivia: COMAIL;FOO ADMIN
Record: COMAIL;FOO RECORD
First-Issue-Number: 259
AUTHOR: FOOBLATZ-REQUEST
RCPT: (@FILE [COMAIL;FOO LIST])
From: FooBlatz Daily Blast
    <FooBlatz-Request%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU>
Reply-To: FOOBLATZ%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU
To: FOOBLATZ%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU

Continuation lines, such as the second line of the From: field in the
example above, -must- start with a space or a tab. Blank lines between
fields are ignored, so you can insert them in your digest entry if you
like. The order of the fields in the definition does not matter.
Capitalization of field names does not matter, but capitalization of field
values does -- if you want the From: field to look like "FooBlatz-Request"
in the digests sent out, capitalize it that way.
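
The format rules above (leading comment, ^_ separators, continuation
lines, ignored blank lines, case-insensitive field names) can be
illustrated with a short Python sketch. It is not the digestifier's
parser. The name parse_defs is hypothetical, and joining a continuation
line to its field with a single space is an assumption, since this
document does not say how the real program joins them.

```python
SEPARATOR = "\x1f"  # ^_ (ASCII 037 octal)


def parse_defs(text):
    """Return one dict per digest definition, keyed by lowercased field name."""
    # Everything before the first ^_ is a comment, so drop the first chunk.
    chunks = text.split(SEPARATOR)[1:]
    definitions = []
    for chunk in chunks:
        fields = {}
        current = None
        for line in chunk.splitlines():
            if not line.strip():
                continue  # blank lines between fields are ignored
            if line[0] in " \t":
                # Continuation lines must start with a space or a tab.
                if current is None:
                    raise ValueError("continuation line before any field")
                # Assumption: join with a single space.
                fields[current] += " " + line.strip()
                continue
            name, colon, value = line.partition(":")
            if not colon:
                raise ValueError(f"not a field line: {line!r}")
            current = name.strip().lower()   # field-name case does not matter
            fields[current] = value.strip()  # value case is kept as written
        if fields:
            definitions.append(fields)
    return definitions
```

Because only the first colon on a line ends the field name, a value such
as "DSK:COMAIL;FOO INBOX" survives intact.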

Here is a catalog of all of the currently accepted fields, their
meanings, and whether they are required for the digestifier to work:

Name: (required)

The name of the digest, such as "FooBlatz". This name is usually used
before the word "Digest", as in "FooBlatz Digest #259".

Inbox: (required)

The name of the file that COMSAT delivers mail to for this digest. The
device is defaulted to "DSK", the directory is defaulted to "COMAIL",
and the second filename is defaulted to "INBOX". You -must- supply the
first filename. Thus you can say just "Inbox: FOO" if your inbox is
DSK:COMAIL;FOO INBOX.
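
As a rough illustration of the defaulting just described, here is a
Python sketch. It assumes a simplified DEV:DIR;FN1 FN2 syntax with the
components in that fixed order and no quoting, which is narrower than
real ITS filename parsing; the function name default_inbox is
hypothetical.

```python
def default_inbox(spec):
    """Fill in the DSK / COMAIL / INBOX defaults for an Inbox: value."""
    device, directory, second_name = "DSK", "COMAIL", "INBOX"
    rest = spec.strip()
    if ":" in rest:
        device, rest = rest.split(":", 1)
    if ";" in rest:
        directory, rest = rest.split(";", 1)
    names = rest.split()
    if not names:
        # The first filename has no default and must be supplied.
        raise ValueError("the first filename must be supplied")
    if len(names) > 2:
        raise ValueError(f"too many filenames in {spec!r}")
    first_name = names[0]
    if len(names) == 2:
        second_name = names[1]
    return f"{device.strip()}:{directory.strip()};{first_name} {second_name}"
```

Under these assumptions, default_inbox("FOO") and
default_inbox("COMAIL;FOO INBOX") both yield "DSK:COMAIL;FOO INBOX".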

Administrivia: (optional)

The name of the file that the digestifier should check for
administrative messages that should be inserted at the front of the
next digest. By default this file has the same name as the inbox file,
but with the second filename of "ADMIN". If you don't specify an
Administrivia field, then the digestifier will not look for an
administrivia file at all -- if you want to use the default file name,
you can simply give Administrivia: a blank field.

If this field exists, the digestifier will look for a file of the
specified name; if the file exists, the digestifier will include its
contents in the digest between the list of message topics and the first
message, and delete the file. Note that the administrivia file is not
a mailbox -- its contents will be included in the digest exactly,
including all the headers and other (for this purpose) extraneous
nonsense of anything sent to the file as mail. Spare your list members
by avoiding this action; log in and write the file directly if that's
at all possible. When you write the file, you don't need to explicitly
create white space around your text; the digestifier will automatically
provide blank lines before and after it.

Record: (optional)

The name of the file that the digestifier uses to keep track of the
state of the digest. This contains vital data like the current issue
number and the time that the most recent digest was mailed. By default
this file has the same name as the inbox file, but with the second
filename of "RECORD".

Do not try to create this file yourself! Doing so will only confuse
the situation. The digestifier will create this file the first time it
processes your digest; if you don't specify a Record field, the
digestifier will use the default name for this file.

First-Issue-Number: (optional, usually)

This field is -only- used by the digestifier when it creates a new
record file the first time it processes your digest. This is used to
initialize the issue number stored in the record file so that the next
digest created will have the given number. It should consist of a
string of digits (only digits!) representing a decimal number, like
"259".

If this field is not present and the digestifier can't find a record
file, then the digest definition is broken and no digest will be
produced. For safety, you can remove the First-Issue-Number from your
digest definition after your record file is created; that way, if
someone accidentally deletes your record file, the digestifier won't
automatically recreate it and start duplicating issue numbers.
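
The interaction between the record file and First-Issue-Number can be
summarized in a few lines of Python. This is a sketch of the rules
stated above, not the actual implementation; the function name and
arguments are hypothetical, and it assumes the record file stores the
number that the next digest will carry.

```python
def next_issue_number(recorded_next, first_issue_field):
    """Decide the number of the next digest.

    recorded_next:     int read from the record file, or None if there is
                       no record file (assumed to be the next issue number).
    first_issue_field: raw First-Issue-Number value, or None if the field
                       is absent from the digest definition.
    """
    if recorded_next is not None:
        # Once a record file exists, First-Issue-Number is never consulted.
        return recorded_next
    if first_issue_field is None:
        raise ValueError(
            "no record file and no First-Issue-Number: definition is broken"
        )
    text = first_issue_field.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError("First-Issue-Number must be decimal digits only")
    return int(text)
```

The error branch is what makes removing First-Issue-Number a safety
measure: with the field gone, a deleted record file stops the digest
instead of silently restarting the numbering.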

When converting an existing digest to use this digestifier, the initial
contents of this field should be set to the contents of the existing
"NUMBER" file associated with that digest, to preserve continuity of
issue numbers.

RCPT: (required)

This is the digest recipient, the address to which the digest will
actually be mailed (independent of what you may list in the To: and
From: fields!). This has exactly the format of a recipient from
.MAIL.;NAMES >, except it cannot be continued onto a continuation line.
Typically you will keep the actual mailing list in a file, say
COMAIL;FOO LIST, and have a RCPT field like

RCPT: (@FILE [COMAIL;FOO LIST])

Defining the list in an indirect file is a good idea for large mailing
lists that change frequently, since it allows you to avoid recompiling
NAMES > and messing with the NAMED ERR file.

When converting an existing digest to use this digestifier, note that a
separate entry in NAMES > is no longer called for to hold the mailing
list which includes the actual list members. You can use such a list,
but it's probably easier to simply specify the indirect file containing
the list members explicitly in this field.

AUTHOR: (required)

This is where delivery error messages should be directed.

Unlike the RCPT: field, this is not a NAMES > style recipient; nor is
it an RFC822 style recipient (something of the form User@Host). It can
only be a simple string -- "FOOBLATZ-REQUEST" in our example. This
means that unless you put a single person's name here, you will have to
create a mailing list to receive the errors.

If you want to try to keep human-generated requests apart from
mailer-generated errors, you can create a mailing list separate from
your administrative list -- called, say, FooBlatz-Errors -- and put it
here in the AUTHOR: field.

From: (required)
Reply-To: (optional)
To: (optional)

The values of these three fields are copied verbatim into the header of
all generated digests. If the optional fields are not given, the
generated digests will not have these fields -- no default values are
generated for them. Please be careful to specify only RFC822-legal
values for these fields. Currently most digests use an address of the
form

From: FooBlatz Daily Blast
    <FooBlatz-Request%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU>
or
To: FOOBLATZ%MC.LCS.MIT.EDU@Mintaka.LCS.MIT.EDU

(By the way, there is no reason why MC's name has to appear here. Your
subscribers don't need to know that MC is involved in producing the
digest as long as you give them -some- address that reaches your
inbox.)

Generally the From: field will contain the name of the mailing list's
auxiliary administration list -- FooBlatz-Request in our example. This
is the address where people will generally send their administrative
requests. It need -not- be the same as the address that appears in the
AUTHOR: field, although typically it is.

The To: and Reply-To: fields should contain the address of the mailing
list itself -- that is, the address where people send mail they want
included in the digest. Mail sent to this address should eventually
reach your digest's inbox file.

Actually, many other well-known RFC822 header fields can be given as
fields in the digest definition, but most digests will want to use
exactly these three. (See the source code if you want to know what
others will work. Note that Date, Subject and Message-ID headers are
automatically generated for each issue of each digest by the
digestifier.)


Digestifier Algorithms
----------------------

The digestifier is run automatically once every hour. It reads through the
file DIGEST;DEFS > and considers each digest in turn, keeping a log of its
actions in the file DIGEST;LOG >.

For each digest, the digestifier considers a number of factors to determine
whether or not it is going to produce a digest this time:

1. The current time of day. The hours between 2AM and 7AM are "Prime
   Time" and the digestifier prefers to create a digest then.

2. The current size of the digest's inbox. The digestifier never produces
   a digest larger than a certain size (around 48000 characters). If the
   inbox looks like it contains more than 1.5 digests' worth of material,
   then the inbox is "bloated" and the digestifier tries to create a
   digest soon.

3. How long ago the previous digest was mailed. The digestifier tries not
   to produce digests so frequently that people and mailers are
   overwhelmed with them, nor so infrequently that a message can sit in
   the inbox for an unreasonably long time.

The precise test is:

(AND <the previous digest was created more than 90 minutes ago>
     (OR <the inbox is bloated>        ; more than 1.5 digests pending
         (AND <it is prime time>       ; between 2AM and 7AM
              <the previous digest was created more than 18 hours ago>)))

In English, this translates as:
If the last issue of this digest was sent out less than an hour and a
half ago, wait. If the last issue went out longer ago than that and the
inbox is bloated, create a digest. But if the inbox isn't bloated, check
whether it's prime time; if it is, and the last issue went out more than
18 hours ago, then create and send today's issue.
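
The test above can be transcribed into Python for experimentation. This
is a sketch and not the digestifier's code: the names are hypothetical,
prime time is assumed to mean 2:00 up to but not including 7:00 local
time, and the inbox size is assumed to be measured in characters against
the approximate 48000-character digest limit. What happens when the
inbox is empty is not covered by the test as stated, so it is not
modelled here.

```python
from datetime import datetime, timedelta

DIGEST_LIMIT = 48000  # characters; the text says "around 48000"


def is_prime_time(now):
    # Assumption: 2AM inclusive to 7AM exclusive.
    return 2 <= now.hour < 7


def is_bloated(inbox_chars):
    # More than 1.5 digests' worth of material is pending.
    return inbox_chars > 1.5 * DIGEST_LIMIT


def should_digest(now, last_digest, inbox_chars):
    """Mirror of the (AND ... (OR ... (AND ...))) test."""
    age = now - last_digest
    return age > timedelta(minutes=90) and (
        is_bloated(inbox_chars)
        or (is_prime_time(now) and age > timedelta(hours=18))
    )
```

For example, at 3AM with the last issue 20 hours old, a small inbox
still produces a digest; at 2PM the same inbox waits until prime time
unless it has grown past the bloat threshold.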

The various numbers and times are all subject to future adjustment, of
course.

This digestifier should be fairly robust in the face of system crashes,
being gunned down in the middle of processing, etc. The worst that can
happen is that a duplicate issue can be produced, and that can only occur
if the digestifier is zapped during an extremely small window. I'll be
surprised if it ever happens.

It is perfectly safe to run two digestifiers at once, since both the
digestifier and COMSAT use the LOCK device to coordinate access to inboxes.

In fact, if you edit the DEFS file, it is probably a good idea to run the
digestifier once yourself and check the LOG file to see if you made any
errors. Even if your inbox is empty, this procedure will catch many
possible problems with your digest definition. You should be able to run
the digestifier by typing:

:DIGEST;DIGEST

to DDT. This might take a couple of minutes to finish (the digestifier
might decide to produce some digests!) so be patient. Then you should look
at what was appended to the end of the current DIGEST;LOG > file.

This digestifier tries to be fairly civic-minded about cleaning up after
itself. If it encounters any errors during the processing of a digest it
logs the error, then carefully deletes any partially-written output and
either proceeds to the next digest, or kills itself (depending on the
nature of the error). Only a few amazingly unlikely errors should ever
leave a dead disowned job as a corpse.

Mail is always delivered through the bulk COMSAT (through the .BULK.
directory).

This digestifier is careful to check the messages included in the digests
it builds for lines of "-"s that could confuse digest bursting or
undigestifying tools, and modifies the first "-" in any suspect line to be
a space.
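
The dash handling just described can be sketched as follows. This
document does not say how many leading dashes make a line "suspect", so
the threshold MIN_DASHES below is a made-up value for illustration, and
the function name is hypothetical as well.

```python
MIN_DASHES = 10  # hypothetical threshold; the real rule is not documented here


def defuse_dash_lines(body, min_dashes=MIN_DASHES):
    """Replace the first "-" of each suspect line with a space."""
    fixed = []
    for line in body.split("\n"):
        if line.startswith("-" * min_dashes):
            # Only the first dash changes, so the line keeps its length.
            line = " " + line[1:]
        fixed.append(line)
    return "\n".join(fixed)
```

Short runs such as a "-- " signature marker are left alone under this
assumed threshold; only long separator-like lines are altered.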

Send bug reports to Bug-DIGEST.


This digestifier was written by Alan Bawden and supersedes previous
digestifiers written by Rob Austein, David Wallace, and Chris Maeda. A lot
of the documentation was adapted from documentation written by David
Chapman for the GUMBY digestifier.
32
doc/digest/new.admin
Executable file
@@ -0,0 +1,32 @@
Congratulations! We're the lucky winners in the lottery to choose the
lucky victim, er, test case for the new improved automatic digestifier,
which has just been finished by the small but dedicated cadre of ITS
hackers. This change should, if anything, improve the situation for
list members. Specific differences you may notice:

All messages included in the digest will now include all the To: and CC:
fields they arrived here with. So if a message was sent to some particular
list member but cc'd to the whole list, or sent to several lists, that will
now be obvious. For those of us who use undigestifying or digest-bursting
tools, this change will have the beneficial effect of causing each message
burst from the digest to automatically have a legal mail header.

There are now rules governing the maximum size of digests sent out. If
there is too much accumulated mail, the digestifier will send out the
oldest section as the digest and save the newest section for later.
Conversely, the digestifier also has the ability to produce more than one
digest a day. The effect of this change is that list members whose sites
have difficulty swallowing very large mail should no longer run into that
problem; on the other hand, if the discussion grows extensive, it will be
put through in efficient chunks.

The digestifier will now check each included message for lines that begin
with lots of dashes. Such lines are created by the digestifier to separate
messages from each other, so similar lines inside any message have the
potential to confuse digest-bursting tools. To guard against this problem,
the digestifier will now change the initial dash of any such lines inside
messages to a space.

If this change causes any problems, please report them to us.

SCA-REQUEST@MC.LCS.MIT.EDU
@@ -82,6 +82,7 @@
- DDTDOC, interactive DDT documentation.
- DECUUO, TOPS-10 and WAITS emulator.
- DFTP, Datacomputer file transfer.
- DIGEST, digestify a mailing list.
- DIRCPY, copy directory.
- DIRDEV, list directories, sorted or subsetted.
- DIRED, directory editor (independent from EMACS DIRED).