Building the BALGOL Compiler
The Burroughs Algebraic Compiler for the 220 (BAC-220, or more popularly, "BALGOL") was distributed by Burroughs as a magnetic tape plus a set of small "callout" card decks. Using this tape, a version of the compiler could be generated for a number of system configurations and peripheral device types. This page describes how to build that distribution tape, known as the "Generator tape."
Background
We do not have a copy of the Burroughs Generator tape, so for the retro-220 emulator project the contents and organization of that tape had to be reverse-engineered from the source code and documentation. The layout of that tape is described at the end of this wiki page.
The compiler consists of four parts:
- The Main compiler program, which does a one-pass compilation of a BALGOL program.
- The Overlay program, which amends the compiler output with routines from the standard library, external machine-language routines supplied by the user, and if specified in the BALGOL program source, support for overlays and diagnostic facilities.
- The standard Library routines (SIN, COS, SQRT, I/O routines, etc.)
- The Generator program, which customizes the compiler for a specific environment and produces a magnetic tape from which the compiler can be loaded and run.
All four parts of the compiler source code were transcribed from a scanned listing donated by Professor Donald Knuth to the Computer History Museum in Mountain View, California.
The listings were transcribed in their entirety. The listings for the Main, Overlay, and Library portions show address assignments but not the words of assembled code. The listing for the Generator shows both addresses and assembled code.
The idea of transcribing the entire listing is to be able to compare the transcription to a listing generated by our assembler. In the case of the Generator, a match between the transcription and assembly listing is good evidence that the transcription is correct, since the address assignments and words of assembled code agree between the two. For the other parts, the evidence provided by the match is far less certain, as there is no assembled code to compare, so only the address assignments could be verified. These transcriptions were therefore carefully proofread.
The Main, Overlay, and Library parts of the compiler were written in one assembler notation, while the Generator was written using an entirely different one. We had neither documentation nor source code for either of those assemblers at the time, so the definition of their notations had to be reverse-engineered from the compiler listings, and then cross-assemblers written to process the transcribed compiler source code and generate 220 object code.
These cross-assemblers are written in HTML and Javascript, and run in a standard web browser. The assembler for the Main, Overlay, and Library modules is referred to as BAC-Assembler. The assembler for the Generator is referred to as GEN-Assembler. The scripts for these are located in the software/tools/ directory of the project repository.
In October 2019, after the compiler recovery project had been completed, a document was discovered at the Charles Babbage Institute in Minneapolis, Minnesota, that appears to be a manual for the assembler used with the compiler Main, Overlay, and Library modules. See "Burroughs 220 Assembler-Compiler," Bulletin 5024, Burroughs Corporation, Data-Processing Systems Group, April 1960. Burroughs Corporation Records, Product Literature (CBI 90), Charles Babbage Institute, University of Minnesota, Minneapolis, Series 74, box 5, folder 17.
References
The primary documentation for generating the compiler can be found in Appendices A and F of Burroughs Algebraic Compiler, Revised Edition, March 1963, Bulletin 220-21017.
The following sections describe in detail the steps to assemble the various pieces and create a Generator tape. Significant portions of the material have been adapted from a file of notes in the project repository, BALGOL-Build-Notes.txt.
Preparing the Source Code
Files for each of the four parts of the compiler reside in separate sub-directories within the software/BALGOL/ directory of the project repository:
The transcription files for the Main, Overlay, Library listings have a .baca extension. The transcription file for the Generator has a .bacg extension.
All of these files must be converted to card-image format prior to being assembled. This has already been done, yielding corresponding files in the sub-directories above having a .card extension. This conversion should not need to be done again unless the transcription files are subsequently changed. There are Windows Script Host (VBSript) programs to do this in the software/tools/ directory of the project repository. Comments in the script files describe how to use them.
- BAC-Xscript-Reformatter.wsf is the converter for
.bacafiles. - GEN-Xscript-Reformatter.wsf is the converter for
.bacgfiles. - BAC-DeckGen.cmd is a Windows command-line script to do the conversion for all of the
.bacafiles as a batch.
Building the BAC-220 Generator Tape
This section describes how to build a Generator tape, starting with the card decks for the source code described in the prior section.
Preparing for the Build
A fully usable Generator tape image, BAC-220-Generator.tape is available from the project repository. The procedures below need to be repeated only if changes are made to the Generator or compiler modules. Also, the intermediate files generated below (object tapes, object decks) are also available from the repository, so if only one element needs to be changed, the steps below to generate the others can be skipped, and the appropriate files from the repository for the non-modified elements used instead.
In the following discussion, all references to card columns, card numbers within decks, and block numbers within tape images are 1-relative. All references to word offsets within a block are 0-relative.
Note that there are multiple ways the individual source programs could be assembled and prepared to create the initial BAC-220 Generator Tape. The process described below assembles the Generator, Main, and Overlay programs to tape, but assembles the library routines to cards, as that is what the Generator program requires in order to build the library on tape.
The following procedure assumes a retro-220 emulator configuration with the following:
- At least 5000 words of memory.
- At least two tape drives. Unit designations are specified below.
- A Cardatron Reader designated as unit 1. Format band selection must be by column 1 on a card.
- A Cardatron Punch designated as unit 1.
- A Cardatron Printer designated as unit 2.
Build Procedure
You may wish to download the entire directory of BALGOL files from the repository to your workstation so that the files cited in the following steps can be accessed more conveniently.
Step 1. Assemble the BALGOL-Generator program using GEN-Assembler with an Output Mode of "Object Tape." Pre-load the BALGOL-Generator/BALGOL-Generator-PoolSet.js pool file before loading the BALGOL-Generator/BALGOL-Generator.card file into the assembler. Save the resulting tape image file as BALGOL-Generator/BALGOL-Generator-Object.tape. Copy and save the assembly listing if you wish.
Step 2. Assemble the BALGOL-Main program using BAC-Assembler with an Output Mode of "Object Tape." Pre-load the BALGOL-Main/BALGOL-Main-PoolSet.js pool file before loading the BALGOL-Main/BALGOL-Main.card file into the assembler. Save the resulting tape image file as BALGOL-Main/BALGOL-Main-Object.tape. Copy and save the assembly listing if you wish.
Step 3. Assemble the BALGOL-Overlay program using BAC-Assembler with an Output Mode of "Object Tape." Pre-load the BALGOL-Overlay/BALGOL-Overlay-PoolSet.js pool file before loading the BALGOL-Overlay.card file into the assembler. Save the resulting tape image file as BALGOL-Overlay/BALGOL-Overlay-Object.tape. Copy and save the assembly listing if you wish.
Step 4. Assemble each of the standard library routines in the BALGOL-Library/ directory using BAC-Assembler with an Output Mode of "BALGOL ML Deck." Save the punched card output for each routine as XXXXX-Object.card, where XXXXX is the routine name.
Step 5. Open BAC-220-Generator-Bootstrap.card in a text editor. In the following, "format-6 card" refers to a card image with a "6" in column 1. For each library routine just assembled:
- Insert the format-6 cards for its object deck after the corresponding name and equivalence cards in this bootstrap deck, deleting any existing format-6 cards in the deck for that routine.
- Locate the FINISH pseudo-op word (4 0000 99 0000) near the end of the code just inserted. Leave that word in place, but delete (blank out) any other words that follow it on that card. Adjust downward as necessary the word count in column 3 of that card to account for any words deleted.
- Delete any following cards of object code for that routine.
The code deleted from each library routine should have consisted of a series of sign-2 words followed by a word having a value of 9 0000 00 0000. This appears to have been configuration data a program could have used to build the Generator tape, or at least the initial library. We do not have that program. The sign-2 words represent text similar to the Generator's name and equivalence cards. Presumably the extra words would have been deleted by the builder program, because the Generator will not tolerate them after the FINISH pseudo-op word.
Note that the object deck for the ERROR routine (ERROR-Object.card) requires an additional manual adjustment after it has been assembled and inserted into the Bootstrap deck:
-
Locate the instruction word 6 0037 30 0100, which should be at relative address 0035. This is a branch to the RITE routine.
-
The /44 field of this instruction is the address of the buffer to be written, but at compile time it needs to be relocated relative to the starting physical address of the ERROR routine. Normally this would be done by setting the sign digit of the word to 7, but this word already has a sign digit of 6 to cause relocation of its /04 (address) field. Thus, an alternate method of relocating the /44 field must be used.
-
Insert a new-line in the card image in front of this instruction. It should be the last word on the sixth line of the deck. Adjust the digit in column 3 of the original line to reflect the number of words now contained on that line (should be from 6 to 5).
-
On the new line that has been created, insert the following in front of the existing text:
60x0000yy0zzzz40000040000where x is one plus the number of words originally on the new line (should be 2), yy is the sequence number of the line, and must be one plus the number in the corresponding columns above it (should be 07), and zzzz is the relative address of the first word on that line (should be 0035). The 40000040000 is a "pseudo-op" word that instructs the compiler to relocate the /44 field of the following word. The compiler will drop this pseudo-op word from the compiled code.
-
Adjust the sequence numbers in columns 8-9 on all following cards for the routine so that they increase by one from the number on the preceding card.
-
When finished, the first 47 columns of the card images for the standard ERROR routine object deck should look like this. The first five card images have not been altered; the remaining card images have been modified as discussed above. Vertical bars have been inserted below between words for readability, but must not be present in the card images.
606|00000100000|80000440033|80000300029|25945626453| 606|00000200006|24955000000|80000440033|80000300029| 606|00000300012|24400465659|20000000000|80000440033| 606|00000400018|22044454649|25545440046|25659000000| 606|00000500024|00000000000|20041594963|24854456349| 605|00000600030|00003450000|80412400036|80000420053| 602|00000700035|40000040000|60037300100| | 606|00000800036|00000300036|00000000000|00000000000| 606|00000900042|00000000000|00000000000|00000000000| 606|00001000048|05000000000|00000000000|00000000000| 601|00001100054|40000990000| | |
A similar correction must be made to the object deck for the MONIT routine (MONIT-Object.card) after it has been inserted into the Bootstrap deck.
-
Locate the instruction word 6 0050 30 0200, which should be at relative address 0029. This is another branch to the RITE routine.
-
The /44 field of this instruction needs to be relocated relative to the starting physical address of the MONIT routine, as for the ERROR routine above.
-
Insert a new-line in the card image in front of this instruction. It should be the last word on the fifth line of the deck. Adjust the digit in column 3 of that line to reflect the number of words now contained on that line (should be from 6 to 5).
-
On the new line that has been created, insert the following in front of the text:
60x0000yy0zzzz40000040000where x is one plus the number of words originally on the new line (should be 2), yy is the sequence number of the line, and must be one plus the number in the corresponding columns above it (should be 06), and zzzz is the relative address of the first word on that line (should be 0029).
-
Adjust the sequence numbers in columns 8-9 on all following cards for the routine so that they increase by one from the number on the preceding card.
You can arrange the library routines in any order, but it is probably best to use the same order in which the routines appear in the scanned listing of the compiler and library on the Computer History Museum's web site.
When finished inserting the object code decks, save the updated Generator bootstrap deck. Do not modify the generation control statements at the beginning of the deck, as the intent for this deck is to generate a standard compiler with the default configuration and library. If you wish to generate a compiler with a different configuration or library, create a different generation deck and use that after the Generator and compiler tapes described in this procedure have been created and saved.
Step 6. Start the retro-220 emulator.
Step 7. Designate a tape drive as unit 1 and load the BALGOL-Generator/BALGOL-Generator-Object.tape image file into that drive. Make the drive NOT-WRITE and ready. Designate another tape drive as unit 10 and load a blank tape formatted with 100-word blocks in both lanes. Make the tape write-enabled and ready.
Step 8. Load the BALGOL-Generator/BALGOL-Generator-Object-Store.card deck into card reader unit 1 and make the reader ready. Clear the Processor, enter a CRD instruction (1000 60 0000) into the C register, set the EXECUTE lamp, and click START. The system will read the object deck, load 49 blocks from tape unit 1 to memory, then branch to the Generator's store routine at address 0001. The store routine will write the Generator program from memory in 5 checksummed groups of 10 blocks each to lane 1 of tape unit 10, rewind it, and halt with 1248 00 8421 ("/\") in the C register.
Step 9. Unload the tape image from unit 1. Leave the tape on unit 10 mounted and write-enabled, but change the unit designation on that drive from 10 to 2.
Step 10. Load the BALGOL-Main/BALGOL-Main-Object.tape image file to tape unit 1 and make it ready. Load the BALGOL-Main/BALGOL-Main-Object-Store.card deck into card reader unit 1 and make it ready. Clear the Processor, enter a CRD instruction (1000 60 0000) into the C register, set the EXECUTE lamp, and click START. The system will read the object deck, load 49 blocks from tape unit 1 to memory, then branch to the Main program's store routine at address 0001. The store routine will write the compiler Main program from memory in 5 checksummed groups of 10 blocks each to lane 0 of tape unit 2, rewind it, and halt with 0000 00 2222 in the C register.
Step 11. Unload the tape image from unit 1. Leave the tape on unit 2 mounted and write-enabled.
Step 12. Load the BALGOL-Overlay/BALGOL-Overlay-Object.tape image file to tape unit 1 and make it ready. Load the BALGOL-Overlay/BALGOL-Overlay-Object-Store.card deck into card reader unit 1 and make it ready. Clear the Processor, enter a CRD instruction (1000 60 0000) into the C register, set the EXECUTE lamp, and click START. The system will read the object deck, load 49 blocks from tape unit 1 to memory, then branch to the Overlay program's store routine at address 4000. The store routine will position tape unit 2 forward over the Main program and append 32 non-checksummed blocks of the Overlay program to lane 0 of tape unit 2, rewind it, and halt with 2222 00 2222 in the C register.
Step 13. Unload the tape image from unit 1. Leave the tape on unit 2 mounted and write-enabled.
Step 14. Load the BALGOL-Generator/BALGOL-Generator-Fixup-1-Load.card deck into card reader unit 1 and make it ready. Clear the Processor, enter a CRD instruction (1000 60 0000) into the C register, set the EXECUTE lamp, and click START. The system will read the object deck and do the following:
- Position the tape on unit 2 forward 40 blocks and read blocks 41-43 into memory.
- Position the tape forward 9 blocks and overwrite blocks 53-55 with the data just read from blocks 41-43. This is a copy of the compiler's initial symbol table, as assembled, before the library routines have been inserted into it. The table at blocks 41-43 will be updated by the library build step below. The copy of the initial table is stored at blocks 53-55 for use by the Generator if the library is replaced later.
- Position the tape forward 32 blocks and overwrite block 88 with an EOF sentinel containing 99999999999 (11 nines) in word 0. This marks the end of the library, which is initially empty.
- Rewind tape unit 2 and halt with 9999 00 9999 in the C register.
Step 15. Leave the tape on unit 2 mounted and write-enabled, but change the unit designation from 2 to 10.
Step 16. Designate another tape unit as number 2 and load a blank tape to it formatted as 100-word blocks. Make the unit ready and write-enabled.
Step 17. Load the BAC-220-Generator-Bootstrap.card file prepared above into card reader unit 1. Clear the Processor, enter a CRD (1000 60 0000) instruction into C, set the EXECUTE lamp, and click START. The system will read the Generator callout bootstrap, load the program from unit 10, and generate a version of the compiler according to the generation statements in the card deck. The generated compiler and library will be written to tape unit 2.
Step 18. Since the Generator tape does not yet have a library, one will be built by the Generator from the object decks that were inserted into the bootstrap deck and will be stored on the compiler tape on unit 2. The routine names will be listed on the SPO as they are processed. At the end, the program will rewind both tapes.
Step 19. If a PUNCH LIBRARY statement is present in the bootstrap deck, the Generator will punch a new library deck to Cardatron punch 1, leaving tape unit 2 up-tape at the end. Save the output deck as BALGOL-Library/PUNCH-LIBRARY.card for possible use in modifying the library later.
Step 20. The program will finally halt with 0757 00 7250 ("OK") in the C register if no library deck was punched, or 0725 00 7570 ("KO") if a deck was punched. Leave both tapes mounted and write-enabled.
Step 21. Load the BALGOL-Generator/BALGOL-Generator-Fixup-2-Load.card deck into card reader unit 1 and make it ready. Clear the Processor, enter a CRD instruction (1000 60 0000) into the C register, set the EXECUTE lamp, and click START. The system will read the object deck and do the following:
- Position both tapes forward 41 blocks and copy blocks 42-45 of the compiler tape on unit 2 to the Generator tape on unit 10. This contains the symbol table updated with information for the library routines.
- Position the compiler tape forward to block 85 and the Generator tape forward to block 88, then copy blocks from the compiler tape to the Generator tape until and including a block that has the EOF sentinel of 99999999999 (11 nines) in word 0.
- Rewind and deactivate both tape units (RWL lamps will be lit) and halt with 9999 00 9999 in the C register.
Step 22. Unload tape unit 10 and save the image file as BAC-220-Generator.tape. This is the final version of the Generator tape that can be used to create additional customized compiler tapes as described in Appendix A of the compiler manual.
Step 23. Unload tape unit 2 and save the image file as BAC-220-Compiler.tape. This is a usable version of the BALGOL compiler and library, but has been generated primarily to harvest the library routines and data structures for insertion into the Generator tape. It has the default configuration settings described in Appendix A of the compiler manual, and is the same as would be generated using the deck BAC-220-Generator-Callout.card.
Step 24. A Generator deck to build a compiler for a system with 10,000 words of memory is available from the repository as BAC-220-Generator-Callout-10K.card. The tape image generated from that deck is also available from the repository as BAC-220-Compiler-10K.tape.
Layout of the BAC-220 Generator Tape:
The layout of the final Generator tape produced by the procedure above is:
-
Lane 0 (pre-formatted as 100-word blocks):
- 50 blocks (1..50) of the compiler Main program. These are written in groups of 10 blocks containing 999 words of object code followed by a one-word checksum. It is literally a sum, stored as the negative of the sum (as the 220 would do it) of the 999 other words, discarding overflow. Therefore, summing all 1000 words in the group should yield a zero result. Note that only the low-order bit of the sign digit participates the the sum, and if one, indicates that the other 10 digits represent a negative value. The high-order three bits of the sign digit are ignored by 220 integer addition.
- 2 blocks (51..52) reserved for compiler patches. More importantly, this is where the INPUTMEDIA and OUTPUTMEDIA routines for the compiler are stored, which is why those routines are limited to 200 words in the aggregate. These blocks are prepared by the Generator, then read into high memory addresses by the compiler Main's loader routine located at its addresses 0000-0059. These blocks are checksummed in the same fashion as above, but the checksum word is stored at the beginning of the first block, not the end of the second.
- 3 blocks (53..55) containing a copy of words 3996-4295 from Main, not checksummed. This range of addresses is originally from blocks 41-43 of the assembled compiler tape, and contains in part the compiler's initial symbol table, before entries for the standard library routines have been inserted. It is used by the Generator when the library is being replaced to form a new symbol table for the compiler, updated with entries for the new library, and is then written to blocks 41-43 of the generated compiler tape. The use of these blocks was the most obtuse part of the tape to figure out.
- 3 blocks (56..58) containing addresses 0000-0299 of the Overlay program, not checksummed.
- 29 blocks (59..87) containing addresses 0700-3599 of the Overlay program, not checksummed. The 32 blocks of the Overlay program will be checksummed as one 3200-word unit by the Generator before being written to the generated compiler tape, with the checksum stored in word 99 of block 84 of the compiler tape.
- 3 blocks (88..90) for the Library Table, a directory to the library routines. Not checksummed.
- 1 block (91) having zero as the first word and the rest of the block apparently unused. This appears to be a low-keyword stopper block for the MTS search instruction, which was used by the Overlay module to position the tape rapidly to the routines selected for merging into the executable code.
- n blocks (92..(n+91)) for library routines. Each routine starts at a new block and may span multiple blocks. The first word of each block is the search keyword, with format 0 FFFF 00 SSSS, where FFFF is the library routine number (starting at 0001) and SSSS is the block sequence within that routine (starting at 0000). Each block appears to be individually checksummed with the checksum stored in word 99. Therefore, with the search keyword in word 0 and the checksum in word 99, each block holds up to 98 words of object code.
- 1 block (n+92) having 9 9999 99 9999 as the first word and the rest of the block unused. This serves as an end-of-file sentinel to the Generator and also as a high-keyword stopper block for the MTS search instruction.
-
Lane 1 (also pre-formatted with 100-word blocks):
- 50 blocks of the Generator program, checksummed as described for the compiler Main program above.
The Compiler and Generator programs have a nice utility feature. If you load their object code as assembled to memory -- manually from cards or tape if necessary -- and branch to a certain location, they will write themselves to tape in the form and position required by the Generator program, including the required checksumming. For the Generator and compiler Main, you branch to location 0001. For the compiler Overlay, you branch to location 4000.
Layout of the BAC-220 Compiler Tape:
The layout of a compiler tape produced by the Generator is similar to that of the Generator tape described above, with these exceptions:
-
Lane 0 (formatted with 100-word blocks) contains the compiler and library.
-
The compiler tape omits the skeleton symbol table at blocks 53-55. Therefore, the overlay module starts at block #53 and will have a checksum word as described above. The library table and library start at block 85 instead of block 88.
-
Following the all-9s EOF sentinel block at the end of the library routines, the Generator will write four blocks containing the code for the Object Dump and Program Card Loader routines that are described in Appendix B of the compiler manual. These routines are assembled as part of the generator program.
-
-
Lane 1 (formatted with 100-word blocks) is used only by the compiler's Symbolic Dump diagnostic facility.
Note that the Generator has the ability to modify the compiler for loading from a different lane and to start at a higher block number on the tape. The lane and block numbers cited above are for the default configuration, and should be adjusted based on the COPY COMPILER... and POSITION GENERATED TAPE... statements in the Generator card deck used to configure the compiler.