* Cleanup of character IO interface
  Committing this branch for further testing. I know at least that the TTY output stream somehow is defaulting to :XCCS, which is wrong, but I haven't yet found the interface for that.
* Clean out \NSIN etc
  No top-level calls to the NS specific functions, just to the generic \OUTCHAR etc. Updated full.database
* MODERNIZE: added dragging for fixed-menu windows
  They can be dragged by their title bars
* UNICODE: Added Greek to the default set
  Also made spelling of default-externalformats consistent with FILEIO
* FASLOAD: EOL conversion in FASL::READ-TEXT
  EOL's printed as LF's will be read as EOL
* LLREAD: Added meta as a CHARACTERSETNAME
  meta,a maps to 1,a now. But slowly propagating this to TEDIT, SEDIT, etc will make it easier to change the coding of meta characters, e.g. as part of a Unicode transition.
* APRINT FILEIO LLREAD: \OUTCHAR now a closed function
  Removed the macro
* LLKEY: call CHARCODE.DECODE directory in \KEYACTION1
  Minor cleanup, avoid typical user entry and APPLY*
* WHEELSCROLL: re-enable on AFTERMAKESYS/SYSOUT FORMS
  Also sets up mappings in the \COMMANDKEYACTIONS, whatever that is
* ABASIC: NILL and ZERO change from LAMBDA NOBIND to LAMBDA NIL
  So that things like Masterscope don't break
* MASTERSCOPE: Added WHEREIS as last-resort for CONTAINS
  Looks at the WHEREIS database, if present, for FNS and FUNCTIONS if it has no other information.
  . WHO CONTAINS ANY CALLING FOO
  works, but not the inverse:
  . WHO DOES FUM CONTAIN.
  We still need to figure out why the CONTAINS table isn't populated
* POSTSCRIPTSTREAM: use standard \OUTCHAR conventions
  Now uses generic \OUTCHAR to get the proper function from the stream (or default)
* Recompile with right EXPORTS.ALL
  Some of the macros weren't correct.
* Fix POSTSCRIPTSTREAM
  Cleaner separation between external \OUTCHAR and internal BOUT
* POSTSCRIPTSTREAM gets its own external format
* Minor fix
* Compile-time warning about EXPORTS.ALL
* MODERNIZE: Modern button fn has same args as the original
  For Notecards #343
* Fixed another glitch in the MODERNIZE arglist thing
  \TEDIT.BUTTONEVENTFN actually takes a second STREAM argument. I don't see where it is ever called with that. The modernize replacement binds that argument, but it isn't being passed to the original.
* FILEWATCH: added missing record field
* Update FILEWATCH.LCOM
* Eliminating record/type name conflicts
  Mostly just qualifying references, more work to get BIGBITMAP stuff out of ADISPLAY and to eliminate ambiguity of LINE record (now XXLINE in XXGEOM)
* Compile away open calls to \OUTCHAR, add loadups/full.database
  Mostly new LCOMS where \OUTCHAR calls were compiled open
* Remove garbage library/XCCS
  Old tools for reading wikipedia XCCS tables, sources/XCCS will deal with XCCS external format
* Next step: Remove open input-character calls, factor XCCS to separate file
  XCCS is the default, but can be swapped out (eventually) by setting a few variables, without recompiling everything
* Lots of residual cleanup for XCCS isolation
* Delete old file MACINTERFACE (migrated to MODERNIZE)
* Eliminate straggling NS calls: LAFITE, READINTERPRESS
* Typo
* READINTERPRESS: removed CHARSET
* MODERNIZE: Interface to control title-bar response (for Notecards)
* Many changes for external format name consistency
  Very close to the end of this
* Put :FORMAT in file info, fix TEDIT plaintext hardcopy
  I distributed :FORMAT :XCCS as the default marking, but somehow one of the variables seems to get revert during the loadup. This is correct, as far as it goes.
* Getting the format in the file-info
  This is all very twisty, different variables set in different places. It now seems to do the right thing, at least for new files. Marks them with :FORMAT :XCCS.
* Another fileinfo glitch
* CLIPBOARD -UNICODE: Make UTF8 to UTF-8 to match standards
* MODERNIZE: fix bug in MODERWINDOW
* External format as MAKEFILE option, LOAD applies the file's format
  (MAKEFILE 'XX '((FORMAT :UTF-8))) will dump XX as a UTF-8 file. LOAD will load it back to XCCS internal.
* Compilers respect DEFINE-FILE-INFO format
* MODERNIZE: little glitch
* Delete old FILEIO.LCOM
* More edge cases of external format thru MAKEFILE, PRETTY, PRETTYFILEINDEX etc.
* FILEBROWSER: Can SEE UTF-8 Lisp sourcefile
* INSPECT: Better macro for inspecting readtables
* recompile changed files and do new loadup

Co-authored-by: rmkaplan <ron.kaplan@post.harvard.edu>
71 lines
3.7 KiB
Plaintext
New architecture for character input-output and alternative external formats

Ron Kaplan, May 2021

The Medley system was built with the Xerox Character Code Standard (XCCS) as the target for multi-byte input and output and for the internal mapping of character codes to glyphs.

That standard is now quite out of date, and our goal is to move to more modern conventions like Unicode and UTF-8.

The coding conventions are embodied in macros that test a stream to see if it is XCCS and, if it is, do special open-coded processing (often with the help of locally bound variables for encoding information).

If it isn't XCCS, then the macros instead apply functions that are obtained from fields in the stream. This is optimized for the default XCCS setup: in that case a separate function call is avoided because the action itself is open coded.

The new architecture recognizes that there may be an advantage to specifying a system default for character processing that avoids function calls, but that doesn't depend on extra support (binding of special variables, as opposed to accessing stream fields on each call) to get that last measure of efficiency.

Thus, there are 4 generic macros corresponding to the 4 character IO operations:

    \INCCODE
    \OUTCHAR
    \BACKCHAR
    \PEEKCCODE

Each of these is defined to fetch a corresponding field from the stream (INCCODEFN, OUTCHARFN, BACKCHARFN, PEEKCCODEFN). If that field is NIL, then each of these passes to a corresponding default macro:

    \DEFAULTINCCODE
    \DEFAULTOUTCHAR
    \DEFAULTBACKCHAR
    \DEFAULTPEEKCCODE

These default macros can then be redefined to make a wholesale switch of the default encoding standard.

The macro \OUTCHAR, for example, is defined as
    if the stream has an OUTCHARFN, apply it. Otherwise do the \DEFAULTOUTCHAR
and so on for each of the others.

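The dispatch just described can be modeled roughly as follows. This is a Python sketch, not the Interlisp source: `Stream`, `out_char`, `default_out_char` and `utf8_out_char` are hypothetical stand-ins for the stream datatype, \OUTCHAR, \DEFAULTOUTCHAR and a non-default OUTCHARFN, and the "low byte only" default encoding is a placeholder rather than real XCCS.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Stream:
    # Toy stream: collects output bytes and may carry its own output function.
    outcharfn: Optional[Callable[["Stream", int], None]] = None
    out: List[int] = field(default_factory=list)


def default_out_char(stream: Stream, code: int) -> None:
    # Stand-in for \DEFAULTOUTCHAR. Placeholder encoding: low byte only.
    stream.out.append(code & 0xFF)


def out_char(stream: Stream, code: int) -> None:
    # Model of \OUTCHAR: use the stream's OUTCHARFN if set, else the default.
    if stream.outcharfn is not None:
        stream.outcharfn(stream, code)
    else:
        default_out_char(stream, code)


def utf8_out_char(stream: Stream, code: int) -> None:
    # Hypothetical non-default output function: write the code as UTF-8 bytes.
    stream.out.extend(chr(code).encode("utf-8"))


plain = Stream()                            # no OUTCHARFN: default path
custom = Stream(outcharfn=utf8_out_char)    # per-stream override
out_char(plain, ord("A"))
out_char(custom, 0x3B1)                     # Greek small letter alpha
```

In this model, switching the system-wide default amounts to redefining `default_out_char`, while streams that carry their own function are unaffected.
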
For the current XCCS default, \DEFAULTOUTCHAR is defined to call \XCCSOUTCHARFN.

The corresponding stream fields can be set directly, but the preferred interface is to wrap up the 4 functions for a given format in an EXTERNALFORMAT data structure. The function

    (\EXTERNALFORMAT stream formatname)

applies the information in the format to the stream. A particular (non-default) format can be specified as an optional parameter when a stream is opened, and each file device can have its own default external format. There is also a variable that holds the name of the system-wide default, currently :XCCS.

If the default external format is applied to a stream, the relevant function fields are set to NIL to kick off the default macro for that particular function; otherwise the function is copied from the external format to the stream.

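A minimal Python model of that rule, under stated assumptions: `ExternalFormat`, `Stream`, `apply_external_format`, `DEFAULT_FORMAT_NAME` and the `:OTHER` format are illustrative names, not the Medley definitions, and the functions stored in the formats are dummies.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: stands in for the variable holding the system-wide default name.
DEFAULT_FORMAT_NAME = ":XCCS"

FUNCTION_FIELDS = ("inccodefn", "peekccodefn", "backcharfn", "outcharfn")


@dataclass
class ExternalFormat:
    name: str
    inccodefn: Optional[Callable] = None
    peekccodefn: Optional[Callable] = None
    backcharfn: Optional[Callable] = None
    outcharfn: Optional[Callable] = None


@dataclass
class Stream:
    inccodefn: Optional[Callable] = None
    peekccodefn: Optional[Callable] = None
    backcharfn: Optional[Callable] = None
    outcharfn: Optional[Callable] = None


def apply_external_format(stream: Stream, fmt: ExternalFormat) -> None:
    # Default format: clear the fields so the default operations take over.
    # Any other format: copy its functions into the stream.
    is_default = fmt.name == DEFAULT_FORMAT_NAME
    for field_name in FUNCTION_FIELDS:
        value = None if is_default else getattr(fmt, field_name)
        setattr(stream, field_name, value)


def fake_in(stream):          # dummy input-side function
    return 0


def fake_out(stream, code):   # dummy output-side function
    return None


xccs = ExternalFormat(":XCCS", fake_in, fake_in, fake_in, fake_out)
other = ExternalFormat(":OTHER", fake_in, fake_in, fake_in, fake_out)

stream = Stream()
apply_external_format(stream, other)   # functions copied from the format
copied = stream.outcharfn
apply_external_format(stream, xccs)    # fields reset to None (NIL)
```

Note that even though the default format object carries functions, none of them end up in the stream; the stream relies on the default operations instead.
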
An external format has the following fields:

    NAME
    INCCODEFN
    PEEKCCODEFN
    BACKCHARFN
    OUTCHARFN
    EOL

The function (\INSTALL.EXTERNALFORMAT format) registers the given format under its name, so it can be retrieved when the name is given to \EXTERNALFORMAT.

If EOL is not NIL, then it is an end-of-line convention that will override whatever a stream might have had by default. (The value of EOL is one of the constants LF.EOLC, CR.EOLC, CRLF.EOLC.)

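Registration by name and the EOL override might be modeled like this. It is a sketch only: the registry dictionary, the string values used for the EOL constants, the `:DEMO-LF` and `:DEMO-RAW` format names, and the stream's initial convention are all assumptions for illustration, and the function fields are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Stand-ins for the constants LF.EOLC, CR.EOLC and CRLF.EOLC.
LF_EOLC, CR_EOLC, CRLF_EOLC = "LF", "CR", "CRLF"


@dataclass
class ExternalFormat:
    name: str
    eol: Optional[str] = None   # None models NIL: no override


@dataclass
class Stream:
    eol_convention: str = CR_EOLC   # assumed default for this toy stream


_FORMATS: Dict[str, ExternalFormat] = {}


def install_external_format(fmt: ExternalFormat) -> None:
    # Model of \INSTALL.EXTERNALFORMAT: register the format under its name.
    _FORMATS[fmt.name] = fmt


def set_external_format(stream: Stream, name: str) -> ExternalFormat:
    # Model of the lookup step in \EXTERNALFORMAT, plus the EOL override.
    fmt = _FORMATS[name]
    if fmt.eol is not None:
        stream.eol_convention = fmt.eol
    return fmt


install_external_format(ExternalFormat(":DEMO-LF", eol=LF_EOLC))
install_external_format(ExternalFormat(":DEMO-RAW"))   # EOL left as NIL

overridden = Stream()
set_external_format(overridden, ":DEMO-LF")    # convention becomes LF
untouched = Stream()
set_external_format(untouched, ":DEMO-RAW")    # convention stays as it was
```
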
The system now includes external formats for
    :XCCS (the global default)
    :THROUGH (untransformed bytes)

It probably would make sense to also include a :KEYBOARD external format, to generalize that as well.

UNICODE defines external formats for UTF8 with or without character translation, and also UTF16 (big-endian and little-endian). When we finally make the swap, we would make :UTF8 be the default, redefine the macros, and recompile all the callers.

The Japanese external formats that used to be included in the basic system are now provided by JAPANESE in the library.

Finally, there is another macro \INCHAR that applies \CHECKEOLC to the result of \INCCODE.
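The composition can be pictured with the following Python sketch. It assumes, without confirmation from the text above, that \CHECKEOLC maps the stream's end-of-line sequence to a single internal EOL code and that this code is 13 (CR); the `Stream` class and the function names are hypothetical.

```python
from typing import Iterable, List

CR, LF = 13, 10
EOL = 13   # assumption: the internal end-of-line code is CR


class Stream:
    # Toy input stream over a list of already-decoded character codes.
    def __init__(self, codes: Iterable[int], eolc: str) -> None:
        self.codes: List[int] = list(codes)
        self.pos = 0
        self.eolc = eolc   # "LF", "CR" or "CRLF"


def in_ccode(stream: Stream) -> int:
    # Model of \INCCODE: return the next character code.
    code = stream.codes[stream.pos]
    stream.pos += 1
    return code


def check_eolc(code: int, stream: Stream) -> int:
    # Assumed behavior of \CHECKEOLC: normalize the external EOL to EOL.
    if stream.eolc == "LF" and code == LF:
        return EOL
    if stream.eolc == "CR" and code == CR:
        return EOL
    if stream.eolc == "CRLF" and code == CR:
        at_lf = stream.pos < len(stream.codes) and stream.codes[stream.pos] == LF
        if at_lf:
            stream.pos += 1   # swallow the LF of a CR LF pair
            return EOL
    return code


def in_char(stream: Stream) -> int:
    # Model of \INCHAR: \CHECKEOLC applied to the result of \INCCODE.
    return check_eolc(in_ccode(stream), stream)


lf_stream = Stream([65, LF, 66], "LF")
lf_result = [in_char(lf_stream) for _ in range(3)]

crlf_stream = Stream([65, CR, LF, 66], "CRLF")
crlf_result = [in_char(crlf_stream) for _ in range(3)]
```

Both streams yield the same three codes here, which is the point of the normalization: callers of the \INCHAR model see one end-of-line code regardless of the convention used in the file.
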