.if n .po 10 .lg 0 .ta 6m, +6m, +6m, +6m, +6m, +6m .ta 6m, +6m, +6m, +6m, +6m, +6m .if n .po 10 . Paragraph .de PA .sp .ti +5 .ne 2 .. . New page .de np 'fi 'in 0 'sp 2 'tl ''- % -'' 'sp 2 'tl '------'------'------' 'sp 2 'tl '\\*(sn \\*(st''\\fBQed\\fP Tutorial' 'sp 2 'if \\n(be .nf 'in .. . Section start .de SE .ne 5 .sp .ul \\$1 \\$2 .ds sn "\\$1 .ds st "\\$2 .PA .. . Begin example .de BE .sp .nf .in 5 .nr be 1 .. . End example .de EE .sp .fi .in 0 .nr be 0 .. . Start of text proper .sp 10 .ce 25 .ps 16 Programming in \fBQed\fP: A Tutorial .ps 10 .sp 5 Robert Pike .ce 0 .wh -5 np .sp 5 .SE I Introduction \fBQed\fP is a programmable text editor, intended for use primarily by programmers. For the average user, \fBQed\fP's power will likely be unnecessary, even troublesome, and its use is discouraged. For a ``knowledgeable'' user who is willing to learn how to use it properly, however, \fBQed\fP is a powerful tool, both as an editor and as a (rather idiosyncratic and low-level) programming language. .PA This document will depend quite heavily on the University of Toronto \fBUNIX\fP manual section for \fBEd\fP, ed(I). \fBEd\fP at U. of T. is a heavily modified, although functionally almost identical, version of the original version 6 \fBUNIX\fP editor. The features of U. of T. \fBEd\fP not found in regular version 6 \fBEd\fP include: the `*' address separator, two character error messages (a query followed by a character which indicates the error, e.g. `?s' for a failed substitution), a simple undo `u' command, and a join `j' command of somewhat greater generality than the \fBPWB\fP version. There is an ``add-on'' document for U. of T. \fBEd\fP which describes the enhancements and modifications, at user level, made at U. of T. This document assumes the reader to be familiar with all these features, and to be quite familiar with \fBEd\fP's capabilities in general and particular. .PA Before beginning the description of \fBQed\fP, some warning should be given. 
\fBUNIX\fP \fBEd\fP is closely based on a version of \fBQed\fP, running under the \fBGCOS\fP operating system, which was written by Dennis Ritchie and Ken Thompson. When Dennis Ritchie wrote \fBEd\fP, he removed many of the features, including most of the programming capabilities, but left in most of the text editing power. Although the \fBQed\fP described here is significantly more complex and powerful than \fBEd\fP (and quite unrelated to the \fBGCOS\fP version), its increase in power is not proportionate to its increase in complexity. In short, \fBEd\fP is a very powerful editor, and for general editing jobs quite sufficient. \fBQed\fP simplifies some more complicated tasks, and its multifile capability and programmability make many things possible which cannot be done in \fBEd\fP, but to use it well requires a fairly thorough understanding of its operation, which is fairly intricate. .PA \fBQed\fP has several drawbacks that should be admitted early. \fBQed\fP programs can be difficult to read, even if written carefully, since \fBQed\fP operates at about the level of a rather cryptic assembler. It is also fairly easy to damage things like files using \fBQed\fP incorrectly, but it is much harder (we think) to accidentally cause trouble with the U. of T. \fBQed\fP than earlier ones, as the implementers worked very hard at safeguarding. The safeguards are strong enough, though, that occasional users who desire a particular \fBQed\fP feature for one editing job need not feel in danger of making a serious mistake! \fBQed\fP's power can lead the user astray; it has far more power than is needed for most editing jobs. As an illustrative but rather artificial example, consider the problem of reversing the lines of a file, so that the last line appears at the top, and the first at the bottom, with the contents of the lines unchanged. 
At first, this may seem a problem for \fBQed\fP if it is only to be done once or twice (if it is to be done often, we would certainly write a C program!), but it is very easy to do in \fBEd\fP: .BE g/^/m0 .EE (It will be the convention in this tutorial to show user input in Roman font, and editor output in Italic.) A second, slightly more complicated example is the problem of placing two columnar files, say files generated by `ls', alongside each other in a buffer. \fBEd\fP can again do the job quite well (assume there are 15 lines in each listing): .BE e file1 \fI142\fP r file2 \fI153\fP 1ka 15,30 g/^/m'a\e .ti +3 +ka g/^/ .,+j/ / .EE .ti +5 This preamble may sound discouraging, but it is only promoting realism. When used well and carefully, \fBQed\fP can be a great time saver, fun to work with, and sometimes even elegant. This document was written using \fBQed\fP, storing sequences like \efBQed\efP in registers to save typing, etc. Well, if you've read this far, you must be determined, so you're ready to learn about buffers. .SE II Buffers \fBEd\fP has one buffer \(em one scratch area in which to keep text. \fBQed\fP has 56, labeled by lower case alphabetics `a' to `z', upper case alphabetics `A' to `Z', and for reasons worth ignoring at this point, the characters `{', `|', `}' and `~'. Each buffer has its own associated dot, dollar and filename. The easiest way to see how they work is to chdir to a directory with about ten C source files and type: .BE qed *.c .EE (we've already accomplished something impossible in \fBEd\fP!) \fBQed\fP will print out the character count for each file, and then wait for a command. Type .BE n .EE and look at the output. 
If you were in the \fBQed\fP source directory, you would see something like .BE .ul a \fB.\fP90 address.c .ul b 90 blkio.c .ul c 618 com.c .ul d 369 getchar.c .ul e 200 getfile.c .ul f 113 glob.c .ul g 695 main.c .ul h 157 misc.c .ul i 134 move.c .ul j 358 pattern.c .ul k 154 putchar.c .ul l 36 setaddr.c .ul m 106 string.c .ul n 407 subs.c .EE The first column is the buffer name, the dot marks the current buffer, the number is the value of dollar in the buffer, that is, the number of lines, and the last column is the file name local to that buffer. The `n' (for ``names'') command is \fBQed\fP's equivalent of `ls -l'. Now do an `f' command. You'll see .BE .ul a \fB.\fP90 address.c .EE In \fBQed\fP, the `f' command tells you more than just the file name. Now change something in the file, say substitute out a tab or delete an empty line, and do another `f': .BE .ul a\(aa\fB.\fP90 address.c .EE The prime tells you that the contents of the buffer are known to differ from the named file. Now try .BE bb f .ul b \fB.\fP90 blkio.c .EE The ``bb'' and ``f'' can be placed on the same line, as \fBQed\fP does not require a newline after most commands. The `bb' says ``change to buffer b''. Buffer `b' is now the current buffer, as indicated by the dot. If you browse around the buffer for a while, you will see that it is really a world unto itself, but changing back to buffer `a' by a ``ba'' command will reset you back to the original file, with dot still at whatever line it was when you ``bb''d. .PA Why have multiple buffers? For one thing, we can copy or move text between buffers. Go back to buffer `a' and isolate a subroutine, marking its beginning with ``ka'' and its last line with ``kb''. Then type .BE \&'a,'b tz0 .EE This is a regular copy command, but the `z' after the `t' tells \fBQed\fP that the text is to be copied to buffer `z'. The `0' is the usual address, but is interpreted in buffer `z' rather than the current buffer. 
Of course, if the character after the `t' is not a valid buffer name, \fBQed\fP performs the usual copy command. Do another `n', you will see that you are currently in buffer `z', and dot is set to the last line copied. The move `m' command behaves similarly. .SE III "Special Characters (I)" Change to buffer `z' and clear it: .BE bz 0,$d .EE Note that line `0' is a valid address for deletion. ``*d'' also would work here, and both these addressing modes will not generate an error if the buffer is empty. As well, you could have typed .BE bz Z .EE for ``zero''; the `Z' command unequivocally clears the buffer, even its remembered file name. Now do the following: .BE ap g/^[a-zA-Z_].*(/p .ul g/^[a-zA-Z_].*(/p .EE ``ap'' appends the single line of text, printing it afterwards, so buffer `z' now contains a possibly useful command for \fBQed\fP (make sure you know what it does!) which we can call up when desired. Now read some C source into buffer `a' if there isn't already some there, and try out the buffer like this: .BE ba \ebz .ul int *address(deflt) .ul adderr(c) .EE The sequence `\ebz' means ``insert the contents of buffer `z' in my input stream here.'' The final newline is effectively removed from the buffer, so that if you decided later that you wanted to know the line numbers as well, you could tag a ``.='' command on the end: .BE ba \ebz .= .ul int *address(deflt) .ul 2 .ul adderr(c) .ul 89 .EE Although most \fBQed\fP commands can be arbitrarily grouped on a line, the global `g' command, as in \fBEd\fP, still reads the full line for its command list, which in this case is ``p .=''. .PA The above example is very important, as it uses a mixture of buffer input and terminal input to run a command, an all-pervading concept in \fBQed\fP programming. .PA `\ebz' is called a ``special character'', although in some sense it isn't really a character at all, as it gets completely replaced with the contents of buffer `z'. 
The `\ebz' is interpreted .ul whenever input is expected, not just when commands are being read. Try the following examples: .BE by a \ebz \&. p .ul g/^[a-zA-Z_].*(/p ap \ebz .ul g/^[a-zA-Z_].*(/p !echo "\ebz" .ul g/^[a-zA-Z_].*(/p .ul ! .EE The buffer could contain multiple lines, which would be handled as usual. We could, for example, save in a buffer our example from the introduction, which merged two columnar files alongside each other, and invoke it when desired just as we invoked the global search above. But care must be exercised here, as the newlines in the buffer, except for the last, are also placed in the input stream. If we were to type, with the multiline buffer in `z', the command .BE s/x/\ebz/p .EE mistakenly expecting that buffer `z' had just a single line of text, say a frequently typed word, we would really be saying: .BE s/x/1ka 15,30 g/^/m'a\e .ti +3 +ka g/^/ .,+j/ //p .EE This would, of course, cause an immediate error, and since \fBQed\fP always returns to teletype input when an error occurs, no damage would be done. Sometimes, though, such mistakes can cause strange results! .PA If you did try the above command, the error message would be .BE ?bz2.0 ?x .EE \fBQed\fP gives a traceback on errors. The elements of the traceback are of the form .BE ?bXM.N .EE where X is the buffer name, M the line number, and N the character number of the character at which the error was recognized. In the above example, the substitute found a syntax error (``?x'') when it read the newline, so the error occurred at the beginning of line 2 of buffer `z'. If input is nested, the deepest-called buffer is printed first. .PA It is a good idea to pause here and look carefully over what has been covered so far, as the concept of using a buffer to store regular files or command input interchangeably is really the heart of \fBQed\fP. 
Before reading on, use \fBQed\fP for a while to familiarize yourself with the system of buffers, and try out a few simple buffers for repetitive editing tasks. .PA \fBQed\fP has a fair number of special characters for various purposes. In the rest of this section we will look briefly at some of the simpler ones to give you some insight into how they behave. First, enter buffer `z' again and append: .BE a \eFa \eFb \&. .EE and then look at what \fBQed\fP has appended to the buffer. The special character `\eFa' means ``the file name for buffer `a','' and, like all special characters, is interpreted whenever input is expected. The special character `\ef' is a shorthand for ``the saved file name in the current buffer.'' Try .BE f junk .ul z\(aa\fB.\fP2 junk w .ul 15 !ls \ef .ul junk .ul ! .EE Idioms such as .BE !cc \ef .EE are very common. If your file name is long, `\ef' can save much typing. If the file name is changed, through an `f' or `e' command, the name actually associated with `\ef' is only changed when the new name is completely read in. Thus, you can type .BE e \ef .EE to reinitialize a buffer, or .BE e /usr/source/s2/\ef .EE to edit the system version of a program. There is another special character like `\ef', but it is more useful for programming. \eB means ``the current buffer name.'' Try .BE !echo \eB .ul z .ul ! .EE .SE IV "Special Characters (II)" The easiest way to gain familiarity with the more abstruse characters is to use them in messages, which are a special case of comments. A comment starts with a double quote `"' and continues until the first following double quote, or the end of the line, whichever is first. The line is ignored by \fBQed\fP, except that dot is set to the addressed line, if there is one: .BE 4 " This comment sets dot to line 4 .EE Messages are just like comments, except that the first character after the double quote is another double quote. 
If the message ends with a double quote rather than a newline, no newline is printed: .BE " hi "" hi .ul hi "" hi there " \fI hi there \fP [cursor is left on this line] ""Current buffer: b\eB Current buffer: bx .EE This last example is mildly interesting. Can we save the command in, say, buffer `x' and call it back, from any buffer, when desired? .BE bA \ebx Current buffer: bA .EE In principle, it can be done, since the current buffer is the one we are working on, not the one being read for input. But, to put the characters `\eB' in a buffer, we must delay their interpretation so that they are not replaced with the buffer name until read back as command input. In most systems on \fBUNIX\fP, this is done by typing an extra backslash, but things are more civilized in \fBQed\fP. In \fBQed\fP, special characters are .ul delayed, not quoted. Perhaps it's simplest just to state the rules: .sp .in +5 .ti -2 -\ \eX, a special character if X is one of `b', `B', `c', `f', `F', `l', `N', `p', `r', `z' or `"', sometimes (as with `\eb') followed by a buffer name, is interpreted .ul immediately. (We will see what all these special characters are in due course.) .ti -2 -\ \eZ, where Z is not one of the above, undergoes no interpretation at all. In particular, the backslash is not stripped away. .ti -2 -\ \ec is reduced, on scanning, to \e, but not re-scanned. .ti -2 -\ \e\(aaX is equivalent to \eX, but special characters embedded in \eX are not interpreted. .sp .in Things are a little different in regular expressions, but let's ignore them for the moment. These four rules, simple though they are, define the interpretation of backslashes in \fBQed\fP. Note that `\e\eZ', where Z is again not one of the above characters, remains `\e\eZ', but if Z .ul is special, say `f' when the saved file name is ``junk.c'', `\e\ef' becomes `\ejunk.c'. .PA Now we know how to install a `\eB' in our buffer: we delay its interpretation by putting a `c' between the backslash and the `B'. 
(The `c' is for ``character'', or (it is rumoured) for Mr. E. S. Cape, inventor of the backslash.)\ The `\ecB' will reduce to `\eB' when typed in: .BE bz ap ""Current buffer: b\ecB .ul Current buffer: b\eB bA \ebz .ul Current buffer: bA .EE Since `\ecc' will reduce to `\ec', the number of `c's present is always just the number of times the interpretation is to be delayed. .PA To decide how many delays are necessary, here is the list of input forms that cause characters to be interpreted: .sp .in +5 .ti -2 -\ teletype input .ti -2 -\ commands or text saved in buffers invoked using a special character .ti -2 -\ command lines for the `g', `v', `G', `V' or `h' commands (`g' and `v' are the same as in \fBEd\fP; we'll see the others a little later) .in .sp Note that characters are .ul not interpreted when buffers are read from or written to files, or moved or copied with the `m' or `t' commands. Experience is a great help here, so let's look at some examples: .BE bx s/$/\eB/ appends `x' to current line s/$/\ecB/ appends `\eB' to current line s/$/\eccB/ appends `\ecB' to current line .EE but: .BE g/xxxx/ s/$/\eccB/ .EE appends `\eB' to all lines with ``xxxx''; the extra `c' is because the command is in a global command string. Let's say we want to change all the `\en's to be `\en\et'. There are two ways: .BE *s/\en/\en\et/ " equivalent to *s/\ecn/\ecn\ect/ " or g/\en/ s/\en/\en\et/ .EE No delays are necessary because `\en' and `\et' are not special characters, but delaying them once makes no difference. (Warning: `\en' has special meaning in the replacement text of a substitution in the U. of T. \fBEd\fP.) .PA While we're dealing with globals, it is a good time to introduce the `\eN' special character. It means, simply, a newline, and is useful primarily because we can delay it in the usual way. Commands, such as `r', which deal with filenames must often be followed by a newline, but can be dealt with using `\eN' in globals. 
The \fBEd\fP sequence .BE g/xxxx/ r\e \&.= .EE can be put all on one line in \fBQed\fP: .BE g/xxxx/ r\ecN .= .EE The newline is delayed. In original version 6 \fBEd\fP, it is impossible to globally substitute a newline into lines, but it's straightforward (by \fBQed\fP standards!) in \fBQed\fP: .BE g/xxxx/ s//\e\ecN/p .EE The `\e\ecN' is a backslash followed by a delayed newline. The `\ecN' becomes `\eN' when scanned by the global, and a newline when read during the substitution. In \fBQed\fP (and U. of T. \fBEd\fP) we could also do this by the functionally slightly different .BE g/xxxx/ s//\e\e /p .EE [Do you see the difference?]. .PA Backslashes in general are handled more reasonably in \fBQed\fP than in other \fBUNIX\fP programs; because special characters are ``delayed'' rather than ``quoted'', the number of characters required to insert a special character, with interpretation delayed .ul n times, is just \fIn\fP+2 or \fIn\fP+3, rather than exponential in \fIn\fP. A \fBtroff\fP line with 31 backslashes, a not-unheard-of occurrence, would in \fBQed\fP have a single backslash followed by 5 `c's. (And would be much easier to understand, text edit, and debug!) .PA In particular, \fBQed\fP handles backslashes differently from \fBEd\fP. As mentioned earlier, the \fBEd\fP command .BE s2/"/\e\en"/p .EE is simply .BE s2/"/\en"/p .EE in \fBQed\fP, because `\en' is not a special character. There are, however, characters which are not ``special'' in the sense we are using here, but are ``magic'' in that they have non-literal meaning. The most obvious are characters such as `.' and `$' in regular expressions, which must be ``quoted'' with a backslash to remove their special meaning and make them literal. 
(It becomes clear after using \fBQed\fP, or even \fBEd\fP, for a while that .ul all the magic characters in regular expressions and the like should require a backslash to become .ul magic, rather than literal, but the current choice is too ``wired in'' to the minds of most \fBEd\fP users to be changed now.)\ Because they are not special characters, their interpretation need not be delayed \(em they only mean something to the substitute command. None of the magic characters in the substitution .BE s/\e(\e.*\e)xxx$/\e1/ .EE require delaying when typed in or run from a global command: .BE g/xyz/ s/\e(\e.*\e)xxx$/\e1/ .EE [Exercise: Is the following command the same as the above substitution? .BE s/\ec(\ec.*\ec)xxx$/\ec1/ .EE Why or why not? Is this the same as the global substitution? .BE g/xyz/ s/\ec(\ec.*\ec)xxx$/\ec1/ .EE Try it to test your answers.] .PA Because of these magic characters, two backslashes in a row `\e\e' mean a single backslash `\e' in regular expressions; otherwise it would be impossible to substitute in a real backslash before a magic character: .BE a abc xyz def s/xyz/\e\e&/p .ul abc \exyz def up .ul abc xyz def s/xyz/\e\e\e&/p .ul abc \e& def .EE What about sequences like `\e\eB'? Well, `\eB' is not a character at all, but a special character (sorry for the terminology) as it is .ul immediately, at the lowest level of input, replaced by the current buffer name. Since `\e\e' is not a special character, and has non-literal meaning only when found between regular expression delimiters, the substitute itself never sees the second backslash. All interpretation of special characters is done before the substitute sees them. If the current buffer is buffer `a', .BE s/\e\eB/x/ .EE does exactly the same thing as .BE s/\ea/x/ .EE Also, because \fBQed\fP converts `\e\e' to `\e' in regular expressions, .BE s2/"/\e\en"/p .EE is the same as .BE s2/"/\en"/p .EE since `\en' is not a special character. 
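.PA
As a rough check on the delay rules given earlier, the short Python sketch below models a single scanning pass.
It is an illustration under stated assumptions, not the editor's actual code:
the names \fIscan\fP and \fIdelayed\fP are invented here,
only `\eB', `\ef', `\eN' and `\ec' are modelled,
the buffer and file names are fixed defaults,
and the different treatment of regular expressions is ignored.
The backslash is spelled \fIchr(92)\fP so that none appears literally in the listing.
.BE
```python
BS = chr(92)  # a backslash, spelled this way to keep the listing free of them


def scan(text, buffer_name="x", file_name="junk.c"):
    """One scanning pass over text (a simplified, hypothetical model)."""
    # Assumed subset of the special characters: B, f and N only.
    simple = {"B": buffer_name, "f": file_name, "N": chr(10)}
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == BS and nxt == "c":
            out.append(BS)           # one delay removed; result is not re-scanned
            i += 2
        elif ch == BS and nxt in simple:
            out.append(simple[nxt])  # interpreted immediately
            i += 2
        else:
            out.append(ch)           # anything else passes through untouched
            i += 1
    return "".join(out)


def delayed(letter, n):
    """Spell a special character whose interpretation is delayed n times."""
    return BS + "c" * n + letter


once = scan(delayed("B", 2))   # two delays become one
twice = scan(once)             # one delay becomes none
thrice = scan(twice)           # now the buffer name is substituted
```
.EE
Under these assumptions, three successive passes take `\eccB' to `\ecB', then to `\eB', and finally to the buffer name,
and a character such as `\eB' delayed \fIn\fP times is spelled with \fIn\fP+2 characters.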
.PA \fBQed\fP saves the last used regular expression and replacement text used in an `s' or `j' command, so that they can be called back using `\ep' (for ``pattern'') and `\er'. `\ep' is handy when you want to change the saved pattern. If, for example, you start searching for ``proc()'' and want the declaration, but find there are very many usages of ``proc()'', it is simple to find an occurrence of ``proc()'' at the beginning of a line: .BE /proc()/ .ul x=proc(); // .ul x=proc()*2; /^\ep/ .ul proc(){ .EE `\ep' is of somewhat limited usefulness, as the null regular expression is essentially the same as ``/\ep/'', but `\er' provides a new convenience. Browsing through text doing repetitive substitution is simplified considerably by using `\er': .BE s/apples/mangos and pears/p .ul I ain't got no mangos and pears // .ul your mother's apples smelled like they were s//\er/p .ul your mother's mangos and pears smelled like they were .EE There is a danger with `\ep' and `\er': if they contain delayed special characters, each usage of `\ep' or `\er' removes one delay. If the current file name is ``wylbur,'' it may be difficult to deal with ``troff'' font changes: .BE p .ul editors such as Wylbur are so s/Wylbur/\ecfBWylbur\ecfP/p .ul editors such as \efBWylbur\efP are so // .ul Wylbur is also no good for s//\er/p .ul wylburBWylburwylburP is also no good for " Oops .EE This is the sort of trouble which the \e\(aa special character can circumvent. `\e\(aar' means the usual `\er', but with special characters inside uninterpreted. 
Let's fix up our second substitution above: .BE up .ul Wylbur is also no good for s//\e\(aar/p .ul \efBWylbur\efP is also no good for " Much better .EE `\er' is also handy for fixing a certain class of mistakes: .BE p .ul textp=get(a->text.fdes); s/text/tbuf/p .ul tbufp=get(a->text.fdes); " Oops again us2//\e\(aar/p .ul textp=get(a->tbuf.fdes); .EE ``us2//\e\(aar/p'' is a \fBQed\fP idiom which undoes a substitute and does it again on the second match in the line. .PA Now, as an exercise, use \fBQed\fP for a while until you feel comfortable with the use of backslashes. If you find them confusing, work with \fBQed\fP, doing fancy things if you feel up to it, until the confusion disappears \(em what follows will be much stranger... .SE V "Special Characters (III)" Now that we've established the ground rules, we can begin to use some of the fancier stuff in \fBQed\fP. .PA The special character `\el' returns a line of text from standard input, usually the user at the terminal (i.e. if input is in a buffer, it is temporarily redirected to the terminal). The terminating newline is stripped away. Since it is interpreted immediately, `\el' is rarely of value except when delayed, but let's look at how it behaves in ``immediate mode:'' .BE ""\elMessage\eN .ul Message ""\elMessage .ul Message ""\elMessage s of words .ul Messages of words .EE The extra newline, whether provided by the `\eN' or by a second carriage return, is necessary because the `\el' strips its terminating newline away, but the comment is looking for a newline itself in order to terminate. [Some questions to consider: If `\ebx' is used instead of `\el', the second newline is not required. Why? In the last example above, which characters are returned by `\el'? What is the origin of the others, if any? What would the above examples do if the comments were terminated with a double quote?] .PA Well, `\el' is clearly of little use if not delayed, but it is important to understand how it behaves. 
.PA An early version of U. of T. \fBQed\fP had only lower case buffer names, and when the names `{' through `~' were added it was necessary to go through the manual changing some of the "`z'"s into "`~'"s, but not all of them. The following single line made the job very simple: .BE g/`z'/ p ""replacement:" s//`\ecl'/p .EE Each line with a `z' is printed, the user is prompted for the replacement, and the response (either a `z' or a `~' in our case) inserted. The single delay ensures that `g' places a literal `\el' in the substitution string, which is then interpreted when each call to the substitute builds its replacement (``right-hand side'') text. This sort of operation can also be performed using an `x' command driven by a global, but \fBQed\fP can be programmed to do most of the work. .PA Here's another example: .BE bz *d ap ""Comment:\ecl" s|$| /* \ecl */|p .ul ""Comment:\el" s|$| /* \el */|p .EE The first `\el' in the comment ``eats'' the input remaining on the line after the `\ebz' which invokes the command: .BE ba a c.code; \ebz \fIComment:\fPstylish .ul c.code; /* stylish */ .EE Of course, if the comment contains the character `|', problems will occur. If we intend typing the comment on the same line as the invocation of the buffer, we want neither the ``Comment:'' message nor the extra `\el' which clears the input line. .BE up .ul c.code; bz s/".*" //p .ul s|$| /* \el */|p ba \ebzstylish .ul c.code; /* stylish */ .EE This latter form is likely more useful, as it can be called from a global (the previous version could, but required the user to type extra newlines). For example, to comment all occurrences of a variable: .BE g/\e{var\e}/ p \ecbz .EE Each line is printed, and the user's response is appended as a comment. No extra `\el' is needed at the end, to ``clear'' the input line, as the `g' reads the line up to and including the terminal newline, so the first `\el' returns the next line typed in. 
Note that the `\ebz' is delayed so that it is interpreted for each line with ``var''. We could have set up our buffer so that the `\el' was delayed, by inserting a `\ecl' instead. Then, the buffer would be invoked as `\ebz', without a delay. In effect, then, the `c' in the buffer call delays the `\el'. If the buffer had only literal text, no delay would be necessary. Our choice of where to put the delay was made by having the buffer be invocable directly from the keyboard. Just for the record, note that we can achieve the effect of `\ecb' above by typing `\e\(aab', although the manner in which it works is quite different. .PA These examples are somewhat ``low-key'', but begin to show how the parts of \fBQed\fP fit together. Later, we will see how the `\el' can be used to control execution of commands. .SE VI Registers \fBQed\fP has 56 registers, with the same names as buffers: `a' to `z', `A' to `Z', `{', `|', `}' and `~'. Buffers and registers are otherwise unrelated. The registers are used to store simple text and short command sequences. In fact, most of the command buffers we have created so far would be better suited to storage in registers; buffers are generally used for storage of file text proper and multiline command sequences. The two main advantages of using registers to store text are: they can be set and manipulated without leaving the current buffer, and they do not appear in the output from `n' commands, which is significant because a user may typically have twenty or more defined registers. .PA Registers are manipulated with the `z' (for ``zdring''!) command. The character after the `z' is the name of the register being operated on, and the next character is an operation code. The most straightforward operations are assignment and printing: .BE za:procrastination zap .ul procrastination .EE The string being assigned to the register is terminated by a newline. 
If a newline is to be embedded in the register, `\eN' provides the cleanest mechanism: .BE za:line1\ecNline2 zap .ul line1\e\eNline2 a \eza \&. -,.p .ul line1 .ul line2 .EE Registers are invoked in the obvious way: `\eza' inserts the contents of register `a' into the input stream. Note in the above example that the append could not be done on one line, as the embedded newline in the register would cause the first line (``line1'') of the register to be appended, and the second (``line2'') to be interpreted as command input. This is another example of embedded newlines causing trouble: be careful! .PA There are many operation characters for registers; they are listed in full in the manual section. We can add text at the end or beginning of the register with ``za$'' and ``za^''; increment and decrement the ASCII value of the characters in the register with ``za+N'' and ``za-N'', where N is a number; and do subzdring (!) operations with the ``take'' and ``drop'' functions ``za)N'' and ``za(N''. One particularly handy form is .BE za/regular expression/ .EE which saves in register `a' the string in the current line which matches the regular expression. There are several other register operations we will introduce when required. .PA These operations are quite straightforward; we will see them all used when we start to program \fBQed\fP. .PA Registers can also be manipulated numerically. When so used, assignments stop at the first non-numeric character, comparisons are arithmetic rather than lexical, and so on. 
The command syntax is similar, except that a number sign `#' is placed between the register name and the operation character: .BE za#:4 za#*-5 zap .ul -20 .EE If desired, a series of operations can be strung together into a single command, with some increase in execution efficiency: .BE za#:4#*-5#p .ul -20 .EE The main difference between ``zap'' and ``za#p'', if register `a' is entirely numeric text, is that `#p' can be appended at the end of a sequence of numeric operations, as above. An error occurs if numeric operations are performed on a register which does not contain only a number. .PA Perhaps the most important use of register numeric operations is in addressing. The operation character `a' causes the register to receive the line number of the address of the command: .BE $za#a .EE assigns register `a' to be the number of lines in the current buffer, and .BE /xxxx/za#a .EE saves in ``za'' the address of the first forward occurrence of ``xxxx''. The `r' operation character (for ``range'') stores the first given address in the named register, and the second address in the register whose name is lexically one greater: .BE 1,$ za#r "or" *za#r .EE puts `1' in register `a' and the value of `$' in register `b'. Neither `a' nor `r' changes the value of dot. These operations are usually used to pass addresses to an execution buffer; if the first line of a buffer is .BE za#r .EE then if the buffer is invoked as .BE -5,.\ebz .EE registers `a' and `b' contain the lines to be operated on by the buffer. .PA Numerical operations are frequently useful in text editing, such as when generating defined constants for a table: .BE a read write open close creat \&. 
" capitalize ?read?,.s/.*/^/p .ul CREAT za#:0 ?READ?,.g/^/s/.*/#define & \ecza/p za#+1 .ul #define READ 0 .ul #define WRITE 1 .ul #define OPEN 2 .ul #define CLOSE 3 .ul #define CREAT 4 .EE The `^' (caret) character in the right hand side of a substitute behaves like `&', but flips the case of alphabetics in the matched string. .SE VII "Control Structures" The most commonly used control structure in \fBQed\fP is certainly the global command, `g', which is remarkably powerful and versatile, as the previous example demonstrates. The ability to place several commands on a line, and the simplicity of `\eN', make globals even easier to use in \fBQed\fP than in \fBEd\fP. .PA Along with the concept of a line-by-line execution goes that of buffer-by-buffer execution, which is provided in \fBQed\fP by the `globuf' commands `G' and `V'. They are quite simple to use: their format is identical to regular globals, but the regular expression is used to match the output which would be produced by an `f' command in each buffer. Only buffers which contain text or have a remembered file name are tested for a match. If a buffer matches the regular expression, the command list is executed in that buffer. For example, .BE G/.\(aa.* ./w .EE writes out all buffers which have been modified since last written. The white space in the above example is a tab, which is the actual delimiter used by the `f' and `n' commands between the number of lines in the buffer and the file name. Here's a fancier example: .BE G/./ g/thing/ ""\ecB \ecf: " p .EE It scans through all non-null buffers for occurences of ``thing'', and prints the buffer, file name and line for each occurrence. .PA \fBQed\fP also has a loop control structure, the `h' command (for ``\fIh\fPuntil''). `h', like `g', takes a line of commands and executes it repeatedly. 
It has four forms: .in +10 .ti -5 hN \h'|1.5i'executes the line N times .ti -5 ht \h'|1.5i'executes the line until the truth flag is `true' .ti -5 hf \h'|1.5i'executes the line until the truth flag is `false' .ti -5 ha \h'|1.5i'(`always') executes the line forever, or until an error .in 0 Although the loop is an ``until'', .BE h0 p .EE is guaranteed to execute zero times. .PA The truth flag is set by substitutions and comparisons in registers. When a register is compared to some value, the truth flag is set according to the success of the comparison. When a substitution is attempted, the truth flag is set according to whether a substitution was actually performed. As a simple example, say you have prepared a letter to be sent to someone, using \fBQed\fP, only to find that the erase character is a backspace, not `#' as you had been using. To fix the problem, .BE g/^/ hf s/.#// .EE [Why does ``s/.#//g'' not work?] Note that ``huntil''s can be run inside globals, and, in fact, can be nested arbitrarily deep. Globals can also be run from huntils; the only restriction is that globals cannot be called from globals, as \fBQed\fP can only mark a line for a global once. Similarly, globufs cannot be called from globufs. .PA As in globals, huntils stop the scan of the command sequence at the first newline. To build an alphabet in register A: .BE za:a zA: h26 zA$\ecza\ecNza+1 .EE Note that \fBQed\fP code is not always easy to read! If you happen to know that the character below `a' in ASCII is a back quote, you could build the alphabet a little more simply: .BE za:` zA: h26 za+1 zA$\ecza .EE Register `a' could also be used in auto-increment mode to simplify things even further: .BE za:` zA: h26 zA$\ecz+a .EE The `+' between the `z' and `a' in the register call causes the register to be incremented .ul before being placed in the input stream. Auto-decrements are also possible (`z-a'), as are numerical increments and decrements (`z#+a' and `z#-a'). Only increments and decrements of one unit are possible. 
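.PA
The two huntil idioms of this section can be modelled outside \fBQed\fP. The Python sketch below is only an illustration under stated assumptions: the function names ``build_alphabet'', ``apply_erases'' and ``apply_erases_one_pass'' are invented here, and Python's regular expressions stand in for \fBQed\fP's. The first function mimics the auto-increment loop that builds the alphabet; the other two contrast the ``hf'' loop with a single ``s/.#//g'' pass, which may serve as a hint for the question posed earlier.
.BE
```python
import re


def build_alphabet(start="`", count=26):
    # Register a starts one character below the first letter wanted.
    a = start
    big_a = ""
    for _ in range(count):       # h26: run the line 26 times
        a = chr(ord(a) + 1)      # auto-increment: bump the register first
        big_a += a               # then append its contents to register A
    return big_a


def apply_erases(line, erase="#"):
    # Model of the hf loop: delete one character-plus-erase pair per
    # pass, and stop at the first pass in which no substitution is made.
    pair = re.compile("." + re.escape(erase))
    while True:
        line, made = pair.subn("", line, count=1)
        if made == 0:
            return line


def apply_erases_one_pass(line, erase="#"):
    # Model of a single global substitute: one left-to-right scan, so a
    # pair exposed by an earlier deletion is never revisited.
    return re.sub("." + re.escape(erase), "", line)


print(build_alphabet())
print(apply_erases("abc##d"))
print(apply_erases_one_pass("abc##d"))
```
.EE
On the sample line ``abc##d'' the looped version yields ``ad'', while the single pass leaves ``ab#d'': the second erase character only acquires a victim after the first pair has been removed.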
.PA As a less frivolous example (one that was used in writing this tutorial), a huntil makes it simple to convert, say, the ``troff'' command ``.ul 5'' to five ``.ul''s, one after each affected line: .BE g/^\e.ul [0-9]+/ zn/[0-9]+/ zn#-1 s/ [0-9]+// h\eczn +a .ul .EE It looks horrible, but it works, and can save much trouble if there are (as in the tutorial) twenty or more places where the fix needs to be made. (The `+' character in regular expressions is like `*', but guarantees at least one match.) Of course, until familiarity with \fBQed\fP is developed, the mental effort required to write a line like this and have it work is probably considerably greater than the physical effort required to type in the changes individually. Even for beginning users, though, saving the complicated patterns and commands such as ``ap\ .ul'' in registers would make the job much more pleasant. .PA Again, care must be taken when invoking registers or buffers in huntils; .BE h20 \ebz " or " h20 \ecbz .EE will likely not do what is expected if buffer `z' contains more than one line. .PA The other major new control structure in \fBQed\fP is the `y' command (for ``\fIy\fPump''; think of ``jump'' pronounced with a Swedish accent). The syntax is: .BE y[tf][N o \(aalabel \(galabel] .EE which translates as follows: If the `t' or `f' is present, jump only if the appropriate condition is satisfied; otherwise jump always. The `N', a number, is interpreted as a line number in the current executing buffer which is to be the next line read for commands. The `o' (for ``out'') causes the current input source, such as a global command string or buffer, to be terminated. If the input source is a buffer, the effect is to return from the buffer; if a global, the execution of the global (or huntil) is stopped. For example, .BE za#:1 h50 za#+1 za#>20 yto .EE executes 20 times, leaving register `a' set to ``21'': the twentieth pass raises the register from 20 to 21, the comparison succeeds, and the ``yto'' stops the huntil. 
The forms .BE y[tf]\(aalabel .EE and .BE y[tf]\(galabel .EE are similar to ``y[tf]N'', but the line to which control is transferred is the first line found, searching forward in the buffer (or backward, if the back quote is the operation character) which begins with the comment .BE "label .EE Initial blanks and tabs are ignored, and the scan of the label stops at the first blank, tab, newline or double quote. If no matching label is found, execution resumes at the first character past the label in the yump command. Note that the label must be matched exactly; it is not interpreted as a regular expression. .PA There are few non-trivial small examples which illustrate the use of yumps, but they will be used extensively later on in the tutorial. For the moment, a remark on style. Clearly, with only a ``goto'', flow of control in \fBQed\fP can become messy if care is not taken. It is recommended that yumps only be used in easily identifiable forms such as .BE yf\(aaelse ... y\(aafi "else ... "fi .EE and .BE "do ... yf\(gado "od .EE or .BE "{ yf\(aa} ... y\(ga{ "} .EE One particularly useful form of labeled yumps is a switch statement based on a line of input from the user. This mechanism makes command interpretation very simple; it is essentially a fancy switch statement: .BE y\(aaX\el "default: ... yo "Xcase1 ... yo "Xcase2 ... yo etc. .EE One other form of yump exists; it is intended primarily to skip the rest of a global or huntil command sequence, without stopping the execution completely. Its form is simply ``yt'' or ``yf''. When invoked, it jumps over the current input source up to and including the next newline. It can also be used as a shorthand in buffers, but such usage is discouraged. .SE VIII "Calling the Shell" \fBQed\fP has two methods of calling the Shell aside from the `!' (``bang'') command: ``crunch'' (`<') and ``zap'' (`>'). 
Crunch takes the standard output from the Shell command and reads it into the current buffer, as if the Shell were run into a temporary file which was then read in with an `r' command. Like the `r' command, `<' takes an optional address which specifies the line (defaulting to `$') at which the text is read in. .BE < ls .ul 162 .ul ! .EE appends a list of the files in the current directory. One very common usage of the crunch command is to find out what needs to be done, by a command such as .BE bz < grep "\e{var\e}" *.c .ul 434 .ul ! .EE and using buffer `z' as a sort of checklist for making modifications to source files and the like: .BE bz < cc -c *.c | tee /dev/tty .ul ... diagnostic messages ... .ul 282 .ul ! .EE saves the listing of the compile errors so you can let ``cc'' run through everything before fixing typing mistakes, etc. .PA Zap is to crunch what `w' is to `r': it writes the contents of the addressed lines, defaulting to the entire current buffer, out as standard input to the Shell command. It is frequently used to send mail. The letter can be prepared in a buffer, edited as desired, and then sent easily by .BE > mail joe .ul ! .EE or even .BE 0a .pl 1 > nroff | mail joe .EE Zap and crunch work nicely together. We can perform a ``dsw''-like function using crunch to read the files in, modifying the list as appropriate, and sending it out to ``args'': .BE < ls .ul 162 .ul ! ... editing commands ... > args rm .ul ! .EE (Args takes each line on its standard input and makes it an argument to the command, which is then exec'd in the normal manner.) The following commands can initiate the construction of a dependency-list file for ``make'': .BE sh -e .ul % command1 .ul % command2 .ul % command3 .ul % ! .EE In short, the crunch and zap commands are used very frequently. .SE IX "Programming (I)" Now that we've seen all the primitives, we can begin using buffers and registers to build more sophisticated commands. 
The first step is to assemble a few useful command sequences in registers. Harking back to our function-declaration-finding buffer in section III, define register `f' (for ``function''): .BE zf:-/^[a-zA-Z_].*(/ zfp .ul -/^[a-zA-Z_].*(/ .EE As a global search, this regular expression found all function declarations, provided, of course, that the usual paragraphing style is used. .PA [Exercise: Write another definition to perform this function which uses the ``beginning of identifier'' (``\e{'') metacharacter.] .PA Now, enclosed in slashes, with a leading minus sign (in U. of T. \fBEd\fP, ``\(em/regexp/'' is the same as ``?regexp?''), register `f' finds the first .ul previous function declaration. This seems like an odd concept at first, but works well. For example, to see which function's source is being browsed: .BE \ezf .ul function(x) .EE Or to find the declaration of a local variable: .BE p .ul variable=0; \ezf/ variable/ .ul register variable; .EE (No semicolon is needed between search strings in U. of T. \fBEd\fP.) To print out the ``current function'' on the line printer: .BE \ezf, /^}/ w /dev/lp .EE or .BE \ezf, /^}/ > opr .EE There are fancier things, too. If we want to know which subroutines call ``proc()'', we can use ``\ezf'': .BE g/proc()/\ezf .ul func1(x) .ul func2(y) .ul func3() .EE After using macros like ``\ezf'' for a while, they become familiar to the point that they become idiomatic, a part of the \fBQed\fP language. To help the user develop a personal working environment, \fBQed\fP provides a simple mechanism for initializing. Typing (to the Shell) .BE qed -x qfile file1 file2 .EE causes \fBQed\fP to load the named ``qfile'' into buffer `~' (`tilde') and execute it before reading in the files to be edited and beginning the normal editing session. 
Typically, the startup file is used to initialize options and registers; it might contain something like .BE ""Qed zc:s@$@ /* \ecl */@p zf:-/^[a-zA-Z_].*(/ b~Z " destroy buffer after execution .EE which prints a message, defines a couple of handy registers, and obliterates itself. If no ``-x'' option is given, \fBQed\fP looks up, in /etc/qedfile, the name of a file containing the default initialization buffer for each user, and executes that. The default file is settable through the ``qedfile'' program, documented in the \fBUNIX\fP Programmer's Manual. .PA Browsing through the startup buffers of a few experienced \fBQed\fP hacks brings a few interesting things to light. One simple but rather pretty option is .BE ob""\e032"+p .EE ASCII 032 is a reverse line-feed on most of the U. of T. terminals; the line above appears as it would if it were displayed with an `l' command. The `b' (for ``browse'') option defines a special register which is executed, if defined, when a simple newline is typed at the terminal, rather than doing the usual ``+p''. Printing a reverse line-feed before the ``+p'' means that no empty lines appear on the screen when browsing through text. It is sometimes useful to set the browse register to something like ``+b'' for easy paging through text, or to `P' or `L', which cause the line to be displayed in the format of `p' or `l', but with line numbers at the beginning of the line: .BE 22i Line 22 p .ul Line 22 l .ul Line\et22 P .ul 22 Line 22 L .ul 22 Line\et22 .EE These other display formats are sometimes handy in global searches, etc.: .BE g/proc()/ \ezf P .ul 104 func1(x) .ul 118 func2(y) .ul 221 func3() .EE .ti +5 Another nice register to have tucked away (as it is above) is the commenting command from section V: .BE zc:s@$@ /* \ecl */@ p .EE We can call it up when desired: .BE p .ul bizarre(); \ezc(A Kludge) .ul bizarre(); /* (A Kludge) */ g/xxxxx/ p \eczc .ul yyy xxxxx yyy needles .ul yyy xxxxx yyy /* needles */ etc. 
.EE The following register definition allows the user to specify a buffer by its file name: .BE zb:G/ \ecl/ f\ecN \ezbfile .ul g\(aa\fB.\fP34 file.c .EE We don't even need to type the terminal `.c'! .PA Here is a rather complicated, but conceptually simple, register, ``\ezs'' (for ``search''), which globally searches for a pattern in all the buffers from `a' through `z', and leaves dot at the last occurrence found. For readability, the newlines in the string shown here have been converted from `\eN's to real newlines. Unlike the examples above, the output here is the contents of register `s', so the special characters do not have to be delayed: .BE zB:\eB zP:\el zI:` h26 zI+1 b\eczI $zD#a#=0 yf g/\ezP/ ""\ecB:" PzB:\ecB b\ezB .EE .ft R What does this mean (!)? One step at a time: The first two lines set register `B' to be the current buffer, and register `P' to be the pattern we are searching for. (If there were special characters in the pattern, we would probably have to delay them once more than usual to achieve the desired result.) Register `I', a counter, is set to a back quote, the character below `a' in ASCII. The next line does all the work, and reads something like: .nf .sp .in +5 for 26 times do .in +5 increment zI change to buffer `\ezI' set zD to be the value of `$' if zD != 0 .in +5 globally look for the pattern; .in +2 on every line matched, .in +3 print the buffer name print the line & line number set zB to the current buffer .in 0 .fi .sp After execution, `zB' contains the last buffer name in which a match was found, and \fBQed\fP automatically keeps track of the line number on which the match was found. The last line of `zs' therefore changes back to buffer `\ezB', which leaves dot at the last line printed, similar to ``g/xxx/p''. .PA Got that? .PA Make sure you understand how the `s' register operates, as it utilizes many of the standard \fBQed\fP programming techniques, such as nesting a global inside a huntil. 
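.PA
For readers who find the register hard to follow, the same control flow can be sketched in Python. This is only a model of the pseudo-code above, not of \fBQed\fP itself: the dictionary ``buffers'', the function name ``search_buffers'' and the returned values are all inventions for the illustration, and Python's regular expressions stand in for \fBQed\fP's.
.BE
```python
import re


def search_buffers(buffers, pattern):
    # buffers is a hypothetical stand-in for the editor's buffers:
    # a dict from one-letter buffer name to a list of text lines.
    hits = []
    last = None
    for code in range(ord("a"), ord("z") + 1):    # for 26 times do
        name = chr(code)                          # increment the counter
        lines = buffers.get(name, [])             # change to that buffer
        if not lines:                             # only if it is non-empty
            continue
        for number, text in enumerate(lines, start=1):
            if re.search(pattern, text):          # on every line matched
                hits.append((name, number, text))
                last = (name, number)             # remember the last match
    return hits, last


demo = {"a": ["int x;", "func()"], "b": ["func() {", "}"]}
found, where = search_buffers(demo, "func")
for name, number, text in found:
    print(name + ":" + str(number), text)
print("last match:", where)
```
.EE
The value returned as ``last'' plays the part of register `B' together with the remembered value of dot: it names the buffer and line of the final match.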
To load the command into a register, of course, you would have to delay the special characters one more time. .PA Well, that was instructive, but rather revolting. If you understood how the search register works, you're doing very well, but it's not a good example of how to program \fBQed\fP, just a pedagogical one. Here's how to really do it: .BE G/^[a-zA-Z]/ g/\el/ ""\ecB:"P .EE You'll find as you gain experience that huntils are rarely used, but they do have their moments. .PA Using the register is quite easy; just type `\ezs' followed by the pattern being searched for: .BE \ezs^func() .ul a:86 func() .ul b:102 func() { f .ul b .209 junk.c .EE .ti +5 [Exercise: Set up your startup buffer to include the original definition of ``zs'' using delayed `\eN's where necessary. Is a delayed newline necessary at the end of the register? Why or why not? (Hint: where does the newline at the end of the invocation line end up?) Define a second register like `s', but which executes a definable register, say `e' for ``execute'', rather than just printing the line. You can use our intelligent version here. What useful things might be put in register `e'?] .PA Registers can also be used to call the Shell. Register `d', defined below, calls ``pwd'' to get the current directory, saving the result in register `e', so that the user can quickly return after changing working directory. .BE zd:ovr zB:\ecB\ecN bX 35 yf zW:35 zD: | zD)\ezW \ezL,\ezM s/^/\ezD/ " Turn spaces into periods zD+14 \ezL,\ezM s/^ *\e(\ezD\e)$/\e1/ zD-14 \ezL,\ezM s/^\ezD// zL:\eN zM:\eN zC:\eN zD:\eN zW: .EE (Another new command (sorry): `zC#l' sets register C to the length of the current line.)\ This buffer illustrates how command buffers use the (zL,zM) address pair. Clearing the registers afterwards is a good practice for program buffers to follow. [Exercise: Why is there no `\eN' on the end of the last line?] 
To invoke this program on a suitable buffer full of, say, words, one to a line, we save it away in ``/usr/rob/q/right.q'' and type: .BE ba " where the data is *p .ul excle .ul ficatings .ul criminter .ul con .ul explasence .ul des .ul ofh .ul fultesibe .ul shispensitment .ul dedgearing .ul expers " yes, they're random words \ezrright *p .cs I 24 .ul excle .ul ficatings .ul criminter .ul con .ul explasence .ul des .ul ofh .ul fultesibe .ul shispensitment .ul dedgearing .ul expers .cs I .EE As the Ronco man would say, ``Isn't that amazing!'' .PA Can we do anything useful with all this power? Well, we can write a buffer ``un'' (for ``run'' or ``unix'') which pipes the addressed lines out to a shell command line, and replaces them in the buffer with the output of the command: .BE " un.q -- replace addressed lines of current buffer by result " of passing them through pipeline " Looks in z| for pipeline; if empty, prompts & reads from terminal " Called as addr1, addr2 \e zrun; defaults to (1,$). z|= yf'fi ""<> " z|:\el "fi zL#=\ezM yf 1,$zL#r ovr \ezL,\ezM > \ez| > /tmp/qed zT#t " zT gets return status \ezMr /tmp/qed !rm /tmp/qed ovs zT#=0 yt'else ""Invalid status return - lines not deleted y'fi "else \ezL,\ezMd "fi zL:\eNzM:\eNzT: ""!\eN .EE The prompt is reminiscent of ``crunch-zap.'' The ``yt\(aaelse'' tests the status return of the command, and decides not to delete the original lines if the status was bad. Using the ``\ezrun'' combination, we can process the data in a buffer through any arbitrary pipeline, such as .BE *p .ul excle .ul ficatings .ul criminter .ul con .ul explasence .ul des .ul ofh .ul fultesibe .ul shispensitment .ul dedgearing .ul expers \ezrun sort .ul ! 
*p .ul con .ul criminter .ul dedgearing .ul des .ul excle .ul expers .ul explasence .ul ficatings .ul fultesibe .ul ofh .ul shispensitment .EE To send out only a portion of the buffer to the pipeline, the usual convention is used: .BE \&.,/xyz/ \ezrun sort .EE .PA Well, if you've made it this far, you're certainly ready to become a \fBQed\fP hack. Have fun!