Copyright 1999 by Craig A. Finseth. Contact the author with questions about distribution rights.
This web site contains the full text of the book "The Craft of Text Editing." That book was published in 1991 by Springer-Verlag & Co. By arrangement between the author and the publisher, the book version is now out of print and all rights have been returned to the author. Note that there may be some slight differences in typographic corrections between this version and the printed one.
If you wish to cite this work, please use the following URL:
http://www.finseth.com/craft
This book is also available in print form. It has ISBN 978-1-4116-8297-9 (10-digit: 1-4116-8297-1). It is available from Lulu and from Amazon.com.
If you should notice typos or formatting problems, please let me know. I am not, however, planning on revising or updating the book anytime soon. Typo corrections and minor changes will continue to be made indefinitely.
Here is a .pdf version that contains better bookmarking than the one on Lulu.
Here is a gzip'd .tar file that contains the complete work.
Here is a .gz file that contains a PostScript version of the complete work. This version is frozen as of June 2000 and will not reflect any corrections made after that time. Thanks to Fekete Krisztian for the conversion.
Here is a .gz file that contains a tar file of a LaTeX version of the complete work. This version is frozen as of June 2000 and will not reflect any corrections made after that time. Thanks to Fekete Krisztian for the conversion.
Preface
Introduction: What Is Text Editing All About?
One: Users
Two: User Interface Hardware
Three: Implementation Languages
Four: Editing Models
Five: File Formats
Six: The Internal Sub-Editor
Seven: Redisplay
Eight: User-Oriented Commands: The Command Loop
Nine: Command Set Design
Ten: Emacs-Type Editors
Epilogue
Appendix A: A Five-Minute Introduction to C
Appendix B: Emacs Implementations
Appendix C: The Emacs Command Set
Appendix D: The TECO Command Set
Appendix E: ASCII Chart
Bibliography
Book Index
The chapter quotes comprise the verse "Jabberwocky" by Lewis Carroll, from the work Through the Looking Glass.
Annex is a registered trademark of Xylogics.
CP/M is a registered trademark of Digital Research.
DEC, Tops-20, VT52, VT100, VT200 and VAX/VMS are registered trademarks of Digital Equipment Corp.
FinalWord and MINCE are registered trademarks of Mark of the Unicorn.
IBM and IBM PC are registered trademarks of IBM Corp.
Apple ][ is a registered trademark of Apple Computer, Inc.
Macintosh is a trademark licensed to Apple Computer, Inc.
MS/DOS is a registered trademark of Microsoft Corp.
TTY is a registered trademark of Teletype Corp.
UNIX is a registered trademark of AT&T.
Preface
  Questions to Probe Your Understanding
  Acknowledgements
Introduction: What Is Text Editing All About?
  1 The Basic Get_Line
    1.1 Version One
    1.2 Version Two
    1.3 Version Three
    1.4 Version Four
  2 The Forest
  Questions to Probe Your Understanding
One: Users
  1.1 User Categories
    1.1.1 Amount of Experience
    1.1.2 Type of Experience
  1.2 "Religion"
  1.3 User Goals
  1.4 Physiological Constraints
  1.5 Applying These Physiological Constraints
  1.6 Users Who Have Handicaps
  Questions to Probe Your Understanding
Two: User Interface Hardware
  2.1 Display Types
    2.1.1 TTY and Glass TTY
    2.1.2 Basic Displays
    2.1.3 Advanced Displays
    2.1.4 "Memory Mapped" Displays
    2.1.5 Graphics Displays
  2.2 Keyboards
    2.2.1 Special Function Keys
    2.2.2 Extra Shift Keys
    2.2.3 Key Placement
    2.2.4 Example Keyboards
  2.3 Graphical Input
    2.3.1 Touch Sensitive Display
    2.3.2 Tablet
    2.3.3 Mouse
    2.3.4 Trackball
    2.3.5 Joystick
    2.3.6 A Different Mouse
    2.3.7 Other Devices
    2.3.8 Conclusion
  2.4 Communications Path Issues
    2.4.1 Speed and Character Format
    2.4.2 Flow Control
    2.4.3 Echo Negotiation
    2.4.4 Fancy Modems
  Questions to Probe Your Understanding
Three: Implementation Languages
  3.1 General Considerations
    3.1.1 Availability and Implementation Quality
    3.1.2 Text Handling Power
    3.1.3 Support for Extensibility
    3.1.4 Large Project Support
    3.1.5 Efficiency
  3.2 Specific Language Notes
    3.2.1 TECO
    3.2.2 Lisp
    3.2.3 C
    3.2.4 PL/1
    3.2.5 Other Systems Languages
    3.2.6 Fortran
    3.2.7 Pascal
    3.2.8 Basic
    3.2.9 Ada
    3.2.10 Sine
    3.2.11 Custom Editor Languages
  Questions to Probe Your Understanding
Four: Editing Models
  4.1 One-Dimensional Array of Bytes
  4.2 Two-Dimensional Array of Bytes
  4.3 List of Lines
  4.4 Paged Models
  4.5 Objects
  4.6 Dealing with Real Text
  Questions to Probe Your Understanding
Five: File Formats
  5.1 Text Files
    5.1.1 Line Boundaries
    5.1.2 Line Contents
    5.1.3 End of File
  5.2 Binary Files
  5.3 Structured Files
  5.4 Where to Store the "Extra" Information
    5.4.1 In-Band
    5.4.2 Out-of-Band
    5.4.3 Conclusion
  5.5 The Additional Information
    5.5.1 Fonts, Sizes, Attributes
    5.5.2 Line, Paragraph, Page, and Other Formats
    5.5.3 Non-Text Objects
  5.6 Internationalization
  Questions to Probe Your Understanding
Six: The Internal Sub-Editor
  6.1 Basic Concepts and Definitions
  6.2 Internal Data Structures
  6.3 Procedure Interface Definitions
  6.4 Characteristics of Implementation Methods
    6.4.1 No Management
    6.4.2 Extra Space at the End
    6.4.3 Buffer Gap
      6.4.3.1 Multiple Gaps and Why They Don't Work
      6.4.3.2 The Hidden Second Gap
  6.5 Implementation Method Overview
  6.6 Buffer Gap
  6.7 Linked Line
  6.8 Paged Buffer Gap
  6.9 Other Methods
  6.10 Method Comparisons
    6.10.1 Storage
    6.10.2 Crash Recovery
    6.10.3 Efficiency of Editing
    6.10.4 Efficiency of Buffer/File I/O
    6.10.5 Efficiency of Searching
    6.10.6 Multiple Buffers
    6.10.7 Paged Virtual Memory
    6.10.8 Conclusions
  6.11 Editing Extremely Large Files
  6.12 Difference Files
  Questions to Probe Your Understanding
Seven: Redisplay
  7.1 Constraints
  7.2 Procedure Interface Definitions
    7.2.1 Editor Procedures
    7.2.2 Display Independent Procedures
  7.3 Considerations
    7.3.1 Status Line
    7.3.2 End of the Buffer
    7.3.3 Horizontal Scrolling
    7.3.4 Line Wrap
    7.3.5 Word Wrap
    7.3.6 Tabs
    7.3.7 Control Characters
    7.3.8 Proportionally Spaced Text
    7.3.9 Attributes, Fonts, and Scripts
    7.3.10 Breaking Out Between Lines
    7.3.11 Multiple Windows
  7.4 Redisplay Itself
    7.4.1 The Framer
    7.4.2 The Basic Algorithm
    7.4.3 Sub-Editor Interaction
    7.4.4 The Advanced Algorithm
    7.4.5 Redisplay for Memory-Mapped Displays
  Questions to Probe Your Understanding
Eight: User-Oriented Commands: The Command Loop
  8.1 The Core Loop: Read, Evaluate, Print
    8.1.1 The Evaluate Procedure
    8.1.2 Move by a Character
    8.1.3 Insert a Character
    8.1.4 Second-Level Dispatch
    8.1.5 Accept an Argument
    8.1.6 Philosophy
    8.1.7 A Minimalist Command Set Design
  8.2 Errors
    8.2.1 Internal Errors
    8.2.2 External Errors
    8.2.3 Exiting
  8.3 Arguments
    8.3.1 Numeric (Prefix) Arguments
    8.3.2 String (Suffix) Arguments
    8.3.3 Positional Arguments
    8.3.4 Selection Arguments
  8.4 Rebinding
    8.4.1 Rebinding Keys
    8.4.2 Rebinding Functions
  8.5 Modes
    8.5.1 Modes and Dynamic Rebinding
    8.5.2 Implementing Modes
  8.6 Changing Your Mind
    8.6.1 Command Set Design
    8.6.2 Kill Ring
    8.6.3 Undo
    8.6.4 An Undo Heresy
    8.6.5 Redo
  8.7 Macros
    8.7.1 Again
    8.7.2 Keystroke Recording
    8.7.3 Macro Languages
    8.7.4 Redisplay Interaction
  Questions to Probe Your Understanding
Nine: Command Set Design
  9.1 Responsiveness
  9.2 Consistency
  9.3 Permissiveness
  9.4 Progress
  9.5 Simplicity
  9.6 Uniformity
  9.7 Extensibility
  9.8 Modes
  9.9 Use of Language
  9.10 Guideline Summary
    9.10.1 Overall
    9.10.2 Modes
    9.10.3 Use of Language
  9.11 Structure Editors
  9.12 Programing Assistance
  9.13 Command Behavior
    9.13.1 Does Down Move the Point or the Text?
    9.13.2 Scrolling vs. Paging
    9.13.3 Page Breaks
    9.13.4 How Many Ways Can You Move by a Word?
      9.13.4.1 Moving by Words
      9.13.4.2 Deleting by Words
    9.13.5 Where Do Sentences and Paragraphs End?
    9.13.6 How to Search
    9.13.7 Commands to Handle Typos
      9.13.7.1 Capitalization Commands
      9.13.7.2 Twiddling
  Questions to Probe Your Understanding
Ten: Emacs-Type Editors
  10.1 "What Do You Mean, 'Emacs-type?' "
  10.2 The Command Set
  10.3 The Extended Environment
  10.4 Extensibility
  Questions to Probe Your Understanding
Epilogue
  Questions to Probe Your Understanding
Appendix A: A Five-Minute Introduction to C
  A.1 Case Conventions
  A.2 Data Types and Declarations
  A.3 Constants
  A.4 Pre-defined Constants
  A.5 Procedure Structure
  A.6 Statements
  A.7 Operators
  A.8 Standard Library Functions Used in This Book
  A.9 Non-Standard Library Functions Used in This Book
Appendix B: Emacs Implementations
Appendix C: The Emacs Command Set
  C.1 Notation
  C.2 Default GNU-Emacs Command List
    C.2.1 Base Commands
    C.2.2 Help Commands
    C.2.3 Control-X (^X) Commands
    C.2.4 Control-X 4 Commands
    C.2.5 Meta (^[) Commands
  C.3 The Author's Command Set
Appendix D: The TECO Command Set
  D.1 General notation:
  D.2 Commands
  D.3 E-Commands (most file commands are here)
  D.4 F-Commands
  D.5 Special Q-registers, names are of the form "..x"
  D.6 FS Variables
Appendix E: ASCII Chart
Bibliography
  1 Current
  2 Thesis
    2.1 Emacs-Type Editors
      2.1.1 ITS EMACS
      2.1.2 Lisp Machine Zwei
      2.1.3 Multics Emacs
      2.1.4 MagicSix TVMacs
      2.1.5 Other Emacs-Type Text Editors
    2.2 Non-Emacs Display Editors
    2.3 Structure Editors
    2.4 Other Editors
Book Index
Just over eleven years ago I was faced with selecting a topic for my thesis. At the time, I was a student at the Massachusetts Institute of Technology and was working on my bachelor's degree in Computer Science and Engineering. One of the degree requirements was a thesis, and you can't have a thesis without a topic.
During my four years at M.I.T., a new type of text editor had come into being and widespread use. This type of text editor was called "Emacs," and it was a major step forward in many ways. Implementations of this type of editor were appearing on many computer systems. Some people even used an implementation as the basis for their thesis. I took a different tack. The idea that I settled on for my thesis was a description of the technology that underlies all text editors, but with a special emphasis on Emacs-type editors. The thesis was written and published as a technical memo (Finseth 1980).
* * *
Ten years later, I was reading the USENET news group comp.editors, one of the many facets of that worldwide electronic bulletin board. A discussion thread had started up in which both sides of the discussion were citing my thesis as the authority in the field. Further inquiries (not by me: I was just reading along) showed that no one in that group was aware of any other document that described general text-editing technology.
My thesis was ten years old: it predated most personal computers and workstations. There had even been a chapter in an early draft that attempted to prove that it was not possible to implement an Emacs-type text editor on a small computer. (I invented a way, threw out the chapter, and with some friends started a software company to market such an editor. Oh well.) It was clearly time for a complete rewrite, and that rewrite is what you are reading now.
If you don't have a copy of my thesis (or the Technical Memo: the two have identical content), you won't miss anything. This book has all of the information from the earlier document, and is now updated. It also has a whole lot more. Every part has been completely rewritten and expanded, and major sections have been added.
As with my thesis, this book is written in an informal, almost chatty style. It is addressed directly to "you," who are assumed to care about how text editors are implemented. Be warned, however, that it also contains opinions about the "right" and "wrong" way of doing things and that these opinions hold that many of the current directions and trends are -- shall we say? -- not the "right" way. Keep in mind that you should not accept everything said here as the gospel truth; rather, understand why I say what I say and make your own informed judgment.
This book is addressed to anyone who implements large software systems or who wants to know the considerations that go into such systems. It focuses on text editors. Although not required, an understanding of programming will be helpful.
Each chapter ends with a set of questions and problems designed to probe your understanding of the material that was just presented. And, true to the Socratic method, some of these questions also introduce new material. The level of difficulty of the questions ranges from very easy to quite difficult, and each question is labelled to help you gauge how much effort is required. Just as with most programming issues, most questions have no single correct answer.
I would like to thank those people who helped me in various ways:
Plus, of course, all of those people that I have left out. Special thanks to my wife Ann and daughter Kari, who put up with my typing away all the time.
Craig A. Finseth
St. Paul, Minnesota
February 1991
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
In its most general form, text editing is the process of taking some input, changing it, and producing some output. Ideally, the desired changes would be made immediately and with no effort required beyond the mere thought of the change. Unfortunately, the ideal case is not yet achievable. We are thus consigned to using tools such as computers to effect our desired changes.
Computers have physical limitations. These limitations include the nature of user-interface devices; CPU performance; memory constraints, both physical and virtual; and disk capacity and transfer speed. Computer programs that perform text editing must operate within these limitations. This book examines those limitations, explores tradeoffs among them and the algorithms that implement specific tradeoffs, and provides general guidance to anyone who wants to understand how to implement a text editor or how to perform editing in general.
I do not present the complete source code to an editor, nor is the source code available on disk (at least from me: see Appendix B). For that matter, you won't even see a completely worked out algorithm. Rather, this book teaches the craft of text editing so that you can understand how to construct your own editor.
The first chapters discuss external constraints: human mental processes, file formats, and interface devices. Later chapters describe memory management, redisplay algorithms, and command set structure in detail. The last chapter explores the Emacs-type of editor. The Emacs-type of editor will also be used whenever a reference to a specific editor is required.
This range of topics is quite broad, and it is easy to lose sight of the forest with all of those trees. The remainder of this introduction will sketch the outlines of the forest by examining an editor-in-miniature: a get-line-of-input routine. We will start with a basic version of the routine, then make it more elaborate in a series of steps. By the end, you will see where the complexity of a text editor arises from.
The program examples are written in the ANSI version of the C language. Appendix A provides a brief introduction to the C language and explains all of the features used in examples.
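The Get_Line routine accepts these inputs: a prompt string, a buffer to hold the response, and the length of that buffer. It produces these outputs: a flag that is True if a response was accepted and False otherwise (returned as the value of the routine), and the response itself (returned in the supplied buffer).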
The editing performed by this routine is on the input buffer. This first version assumes that you are creating a new item from scratch each time.
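#include <ctype.h>    /* isprint */
#include <stdio.h>    /* printf */

/* FLAG, TRUE, FALSE, NUL, KEYENTER, KeyGet(), and Beep() are conventions
   of this book; Appendix A introduces the features used in the examples. */
FLAG Get_Line(char *prompt, char *buffer, int len)
{
    char *cptr = buffer;
    int key;

    if (len < 2) return(FALSE);    /* safety check */
    printf("%s: ", prompt);
    for (;;) {
        key = KeyGet();
        if (isprint(key)) {
            /* leave room for the terminating NUL */
            if (cptr - buffer >= len - 1) Beep();
            else {
                *cptr++ = key;
                printf("%c", key);
            }
        }
        else if (key == KEYENTER) {
            *cptr = NUL;
            printf("\n");
            return(TRUE);
        }
        else Beep();
    }
}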
Version One accepts input until the user presses the Enter key. If a user's input will overflow the input buffer, the input is discarded and the program will sound an error beep. Once the Enter key has been pressed, the program appends a NUL character to terminate the string and returns True. Non-printing characters other than Enter also cause the program to sound an error beep. Simple, straightforward, and useless, as there is no way for the user to correct any typing mistakes.
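To try Version One without a real keyboard, you can supply stand-ins for the pieces it assumes. The sketch below is one way to do that, not a definitive harness. The values given to FLAG, NUL, and KEYENTER, the scripted KeyGet, the counting Beep, and the run_script helper are all assumptions made for illustration, not definitions from this book. The prompt is also declared const here so that a string literal can be passed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-ins for the book's conventions (illustrative values). */
typedef int FLAG;
#define TRUE     1
#define FALSE    0
#define NUL      '\0'
#define KEYENTER '\r'    /* Carriage Return */

/* Hypothetical test doubles: keys come from a script, beeps are counted. */
const char *script = "";
int beeps = 0;

int KeyGet(void)
{
    assert(*script != NUL);    /* every script must end with KEYENTER */
    return (unsigned char)*script++;
}

void Beep(void)
{
    beeps++;
}

/* Version One; only the const on the prompt differs from the text. */
FLAG Get_Line(const char *prompt, char *buffer, int len)
{
    char *cptr = buffer;
    int key;

    if (len < 2) return(FALSE);    /* safety check */
    printf("%s: ", prompt);
    for (;;) {
        key = KeyGet();
        if (isprint(key)) {
            if (cptr - buffer >= len - 1) Beep();
            else {
                *cptr++ = key;
                printf("%c", key);
            }
        }
        else if (key == KEYENTER) {
            *cptr = NUL;
            printf("\n");
            return(TRUE);
        }
        else Beep();
    }
}

/* Feed a scripted sequence of key presses to Get_Line. */
FLAG run_script(const char *keys, char *buffer, int len)
{
    assert(keys != NULL && buffer != NULL);
    script = keys;
    beeps = 0;
    return Get_Line("Name", buffer, len);
}
```

With a four-byte buffer, run_script("abcde\r", buf, 4) leaves "abc" in buf and counts two beeps: one for each character that would have overflowed the buffer.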
Here is Version Two. It adds editing:
FLAG Get_Line(char *prompt, char *buffer, int len)
{
    char *cptr = buffer;
    int key;

    if (len < 2) return(FALSE);    /* safety check */
    printf("%s: ", prompt);
    for (;;) {
        key = KeyGet();
        if (isprint(key)) {
            if (cptr - buffer >= len - 1) Beep();
            else {
                *cptr++ = key;
                printf("%c", key);
            }
        }
        else {
            switch (key) {
            case KEYBACK:
                if (cptr > buffer) {
                    cptr--;
                    printf("\b \b");
                }
                break;
            case KEYENTER:
                *cptr = NUL;
                printf("\n");
                return(TRUE);
                /*break;*/
            default:
                Beep();
                break;
            }
        }
    }
}
Version Two starts developing problems that can no longer be swept under the rug.
Version One glossed over exactly what is meant by the Enter key. That's sort of okay. Most keyboards have only one key labelled "Enter" or "Return" or something similar. It almost always sends a Carriage Return character. The program can compare against just that character and almost always operate "correctly," i.e., as the user expects. However, most keyboards have at least two keys for erasing: Back Space and Delete. Some people and computer systems use one of these. Other people and computer systems use the other. (We will ignore any extra "erase" or "delete character" keys that you might find. For now.) The program can handle this problem in several ways:
(1) pick one of the two keys and accept only that one; (2) accept either key; (3) ask the operating system which key the user has chosen as the erase character; or (4) let the user tell the program which key to use.
If you picked the first option, just over half of your users will be upset with you. The second option is much better: almost all users will like you, and this part of your program need not be operating system specific at all. (I often select this option when writing small programs that should have a minimum of operating system-dependent code.) The third option is a fine solution. Most users will like you, and you are building on other work (i.e., the operating system) instead of reinventing the wheel.
If you picked the fourth option, you have already learned what an Emacs-type editor is about. Implicit in this option is recognizing that users should be able to control their environment as much as possible. Yes, it is more work to write such programs and, yes, it sometimes overlaps the existing operating system, but it can be well worth the effort.
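The choice among these options can be isolated in one small predicate, so that the rest of Get_Line never compares against a particular erase key. The sketch below is only an illustration. It assumes that the second option means accepting both Back Space and Delete and that the fourth means letting the user name the key. The erase_policy, erase_config, and is_erase_key names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_BS  0x08    /* Back Space */
#define ASCII_DEL 0x7F    /* Delete */

/* Hypothetical policy names; the mapping to the options is an assumption. */
enum erase_policy {
    ERASE_FIXED,     /* first option: one hard-wired key */
    ERASE_EITHER,    /* second option: accept both keys */
    ERASE_USER       /* fourth option: the user names the key */
};

struct erase_config {
    enum erase_policy policy;
    int key;         /* consulted for ERASE_FIXED and ERASE_USER only */
};

/* Return nonzero if `key` should erase the previous character. */
int is_erase_key(const struct erase_config *cfg, int key)
{
    assert(cfg != NULL);
    switch (cfg->policy) {
    case ERASE_EITHER:
        return key == ASCII_BS || key == ASCII_DEL;
    case ERASE_FIXED:
    case ERASE_USER:
        return key == cfg->key;
    }
    return 0;
}
```

A program that builds on the operating system could fill in the key field from the system's own setting; on POSIX systems, for instance, the erase character is the c_cc[VERASE] entry of the structure filled in by tcgetattr().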
Another problem appears in the statement:
printf("\b \b");
This statement is a crude attempt at erasing a character. As it turns out, there are pretty powerful conventions regarding how printing characters and newlines are handled by operating systems and output devices. These characters all move the cursor to the right or to the start of the next line. However, when you want the cursor to back up in any way or you wish to control it in any other way, you are on your own: there are no industry-wide conventions for specifying these operations. And, with no conventions to rely upon, your program has to implement a method of coping with the range of output devices.
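One common way of coping is to describe each kind of output device in a table and have the program look up the sequences it needs. The sketch below assumes a deliberately tiny table. The term_desc structure, the term_lookup and term_clear_line names, and the choice of three entries are all hypothetical; only the escape sequences themselves (ESC [ K on VT100-style terminals, ESC K on the VT52) are the devices' own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry per kind of output device (hypothetical layout). */
struct term_desc {
    const char *name;
    const char *to_start_of_line;    /* move the cursor to column 0 */
    const char *clear_to_eol;        /* erase to end of line, or NULL */
};

static const struct term_desc term_table[] = {
    { "vt100", "\r", "\033[K" },     /* ESC [ K */
    { "vt52",  "\r", "\033K"  },     /* ESC K */
    { "dumb",  "\r", NULL     },     /* no erase sequence available */
};

/* Find a device description by name; NULL if the device is unknown. */
const struct term_desc *term_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof term_table / sizeof term_table[0]; i++)
        if (strcmp(term_table[i].name, name) == 0)
            return &term_table[i];
    return NULL;
}

/* Return to column 0 and blank the line.  A device with no erase
   sequence gets `width` spaces written over the old text instead. */
void term_clear_line(const struct term_desc *t, int width, FILE *out)
{
    int i;

    assert(t != NULL && out != NULL);
    fputs(t->to_start_of_line, out);
    if (t->clear_to_eol != NULL)
        fputs(t->clear_to_eol, out);
    else {
        for (i = 0; i < width; i++) fputc(' ', out);
        fputs(t->to_start_of_line, out);
    }
}
```

The fallback for the "dumb" entry is itself a guess about the device: writing a blank into the last column makes some terminals wrap to the next line, so a real program would stop one column short or know the device better.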
Version Three assumes that the input buffer contains some text. This text is used for the response if the user just presses Enter (i.e., the text is the default value):
FLAG Get_Line(char *prompt, char *buffer, int len)
{
    char *cptr = buffer;
    FLAG waskey = FALSE;
    int key;

    if (len < 2) return(FALSE);    /* safety check */
    for (;;) {
        ToStartOfLine();
        ClearLine();
        printf("%s: %s", prompt, buffer);
        key = KeyGet();
        if (isprint(key)) {
            if (!waskey) {
                *buffer = NUL;
                waskey = TRUE;
            }
            if (cptr - buffer >= len - 1) Beep();
            else {
                *cptr++ = key;
                *cptr = NUL;
            }
        }
        else {
            switch (key) {
            case KEYBACK:
                if (!waskey) {
                    *buffer = NUL;
                    waskey = TRUE;
                }
                if (cptr > buffer) {
                    --cptr;
                    *cptr = NUL;
                    printf("\b \b");
                }
                break;
            case KEYENTER:
                printf("\n");
                return(TRUE);
                /*break;*/
            default:
                Beep();
                break;
            }
        }
    }
}
Version Three returns the supplied response if the user just presses the Enter key. Otherwise, the supplied response is erased completely the first time a printing key or Back Space is pressed. The only other changes worth noting are that the prompt has been moved to the inside of the loop and a few terminal interface routines have been added. The first one moves the "cursor" to the beginning of the line. The next clears the line.
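Version Four adds a number of features: insert and replace modes; cursor movement left, right, to the start, and to the end of the response; deletion of the character following the cursor; a quote key for entering control characters; and keys to clear the response, restore the default response, cancel the edit, and redisplay the prompt and response.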
This version of the routine also has a slight change to the interface: the addition of a separate default value parameter.
#include <string.h>    /* strlen, strcpy, memmove */

/* "default" is a C keyword, so the default value parameter is named defval. */
FLAG Get_Line(char *prompt, char *buffer, int len, char *defval)
{
    char *cptr = buffer;
    FLAG isinsert = TRUE;
    FLAG waskey = FALSE;    /* as in Version Three: default not yet touched */
    int key;

    /* safety check: the buffer must also be able to hold the default */
    if (len < 2 || strlen(defval) >= (size_t)len) return(FALSE);
    strcpy(buffer, defval);
    for (;;) {
        ToStartOfLine();
        ClearLine();
        printf("%s: %s", prompt, buffer);
        PositionCursor((int)strlen(prompt) + 2 + (int)(cptr - buffer));
        key = KeyGet();
        if (isprint(key)) {
            if (!waskey) {
                cptr = buffer;
                *cptr = NUL;
                waskey = TRUE;
            }
            if (isinsert) {
                /* compare lengths, not a pointer against a length */
                if (strlen(buffer) >= (size_t)(len - 1)) Beep();
                else {    /* move rest of line and insert */
                    memmove(cptr + 1, cptr, strlen(cptr) + 1);
                    *cptr++ = key;    /* the moved tail follows cptr */
                }
            }
            else {
                if (*cptr == NUL) {
                    /* end of input, so append to buffer */
                    if (strlen(buffer) >= (size_t)(len - 1))
                        Beep();
                    else {
                        *cptr++ = key;
                        *cptr = NUL;
                    }
                }
                else *cptr++ = key;    /* replace */
            }
        }
        else {
            switch (key) {
            case KEYBACK:
                if (!waskey) {
                    cptr = buffer;
                    *cptr = NUL;
                    waskey = TRUE;
                }
                if (cptr > buffer) {
                    /* close the gap; the rest of the line is kept */
                    xstrcpy(cptr - 1, cptr);
                    cptr--;
                }
                break;
            case KEYDEL:    /* delete the following char */
                if (cptr < buffer + strlen(buffer))
                    xstrcpy(cptr, cptr + 1);
                else Beep();
                break;
            case KEYENTER:
                printf("\n");
                return(TRUE);
                /*break;*/
            case KEYLEFT:
                if (cptr > buffer) cptr--;
                waskey = TRUE;
                break;
            case KEYRIGHT:
                if (cptr < buffer + strlen(buffer)) cptr++;
                waskey = TRUE;
                break;
            case KEYSTART:    /* move to start of response */
                cptr = buffer;
                waskey = TRUE;
                break;
            case KEYEND:    /* move to end of response */
                cptr = buffer + strlen(buffer);
                waskey = TRUE;
                break;
            case KEYQUOTE:    /* insert the next character,
                                 even if it is a control char */
                if (!waskey) {
                    cptr = buffer;
                    *cptr = NUL;
                    waskey = TRUE;
                }
                key = KeyGet();
                if (isinsert) {
                    if (strlen(buffer) >= (size_t)(len - 1))
                        Beep();
                    else {    /* move rest of line and insert */
                        memmove(cptr + 1, cptr,
                            strlen(cptr) + 1);
                        *cptr++ = key;
                    }
                }
                else {
                    if (*cptr == NUL) {
                        /* end of input, so append */
                        if (strlen(buffer) >= (size_t)(len - 1))
                            Beep();
                        else {
                            *cptr++ = key;
                            *cptr = NUL;
                        }
                    }
                    else *cptr++ = key;    /* replace */
                }
                break;
            case KEYCLEAR:    /* erase response */
                cptr = buffer;
                *cptr = NUL;
                waskey = TRUE;
                break;
            case KEYDEFAULT:    /* restore default response */
                strcpy(buffer, defval);
                cptr = buffer;
                waskey = FALSE;
                break;
            case KEYCANCEL:    /* abort out of editing */
                return(FALSE);
                /*break;*/
            case KEYREDISPLAY:    /* redisplay the prompt and resp */
                break;
            case KEYINSERT:    /* set insert mode */
                isinsert = TRUE;
                break;
            case KEYREPLACE:    /* set replace mode */
                isinsert = FALSE;
                break;
            default:
                Beep();
                break;
            }
        }
    }
}
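The insert and replace branches of Version Four appear twice, once for printing characters and once for the quote key. One way to see what they have in common is to pull the buffer manipulation out into a single routine that knows nothing about keys or displays. The sketch below is such a routine under stated assumptions: put_char and its offset-based interface are hypothetical and are not part of this book's code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: store `key` at offset `pos` of the NUL-terminated
   string in `buffer`, which can hold `len` bytes in all.  In insert mode
   the rest of the line moves right; in replace mode the character at
   `pos` is overwritten, except at the end of the line, where the key is
   appended.  Returns the new cursor offset, or -1 if the buffer is full. */
int put_char(char *buffer, int len, int pos, int key, int insert_mode)
{
    size_t used;

    assert(buffer != NULL && len >= 2);
    used = strlen(buffer);
    assert(pos >= 0 && (size_t)pos <= used);

    if (insert_mode || (size_t)pos == used) {
        /* The string grows by one, so check the capacity first. */
        if (used >= (size_t)(len - 1)) return -1;
        /* Shift the tail, including its NUL, right by one position. */
        memmove(buffer + pos + 1, buffer + pos, used - pos + 1);
    }
    buffer[pos] = (char)key;
    return pos + 1;
}
```

Because memmove carries the terminating NUL along with the tail, nothing needs to be re-terminated after the key is stored. A caller would beep when the routine returns -1 and otherwise advance its cursor to the returned offset.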
Version Four does all that was claimed for it, but not as well as one would like. In particular:
The examples presented in this chapter bumped into these problems:
These and other questions will be addressed in the remainder of this book.
Modify the latest version of Get_Line to accept only numeric responses. What sort of error messages should be given? (Easy)
Modify the latest version of Get_Line to accept only responses from a list that is passed in as a parameter. What sort of error messages should be given? (Easy)
What are two good formats for such a list? (Easy for those familiar with C, Medium otherwise)
What is the appropriate degree of control (key definitions, enable / disable features, etc.) that the calling program should have over the input editing? (Medium)
"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
The saying goes: "Business would be great if it weren't for customers." Well, programming would be easy if it weren't for users. In the simple case, there would be exactly one user for your program (yourself), and you would use it only once. Most programs, however, are used many times by many people. You must take those users into account when designing your program.
This chapter will only review those aspects of users that are most relevant to text editing: full discussions of users and design can and do fill many books in themselves, some of which are listed in the Bibliography. This chapter (and this book) does not address the question of non-people users.
Each user can be placed in a category. Each category is described in terms of the amount and the type of experience. It is important to understand users: each user creates a model of how a new program works based on his or her experience with other programs combined with the "hints" that your program's user interface gives to him or her. It is up to you to either match your program's behavior to your users' model(s) or to give them enough information so that they generate a model that is well-matched to your program.
The amount of experience that a user has is a point on a continuous scale. All users start with no experience and accumulate experience as they learn. Although the scale is continuous, I have divided it into five regions in order to simplify discussion. Also, this list is not intended as a self-rating scale: most of you who are reading this book will be programmers.
Neophyte users barely know what a computer is. They lack understanding of such "basic" terms as "file" and "file name" (the concepts behind these terms are actually quite sophisticated). This lack of understanding does not mean that they are unintelligent people, only that they have never had a reason to learn these concepts. If you are designing a program for this type of user, you may feel both blessed and cursed. Cursed because it can be so difficult, and blessed because this area of program design has such a pressing need for good designs.
Experience from the field of artificial intelligence can shed more light on this issue. AI researchers found it (comparatively) easy to write programs that can handle advanced mathematics such as freshman calculus. However, as the researchers pushed on to handle such easy (to most people) areas as filling in coloring books, the programming problems got harder and harder. Some of this difficulty is due to the fact that the task of teaching college-level courses is well understood--especially by college professors--but teaching coloring is not. For example, how many textbooks have you seen on "how to color"? More to the point, computers have been designed to process information in a certain way, one that is mathematically elegant, but not necessarily related to how people's minds work. As people write programs for more and more "basic" tasks, this difference becomes increasingly apparent.
Many programs have been (mis-)designed for neophyte users. They often offer a few simple commands, yet leave intact such difficult concepts as that of a "file." They solve the wrong problem, sort of like travelling to a place where a foreign language is spoken, and trying to communicate by speaking your native language slowly and distinctly. As a program designer, you must understand the thought structure of your users, and design programs that match that structure. The blessing comes from designing programs that are very different from "conventional" programs and which are well-matched to their users.
Novice users have used a computer before, perhaps for text editing, word processing, spreadsheet, or database applications. In any event, novice users have some familiarity with the idea of typing things into a box and seeing a response that somehow reflects their typing. They understand how a shift key works, that a lowercase letter 'l' is not the same as a digit '1', and so forth. They even have some understanding of the idea of "context": that keys do different things at different times. Users with this amount of experience are able to operate almost any program that has a good design and a decent manual.
Basic users are like novice users, only more so. They understand such programming concepts as thread of control, variables, and statements like "A = A + 1" (in fact, many people call such users "programmers"). These users can operate any program, even one with a poor design. Given source code to the program they are able to customize and extend it, albeit in what might be an awkward fashion.
Power users know one or more application programs thoroughly. They understand not only how to use those programs fully, but can often go beyond the bounds of what the original designers intended. They may write large programs, often in the form of application macros, but do not design these programs. These users understand the fine points of the programs that they use.
Programmer-level users understand the theory of programming. When writing a large program, they design the program before implementing it. They generalize, applying their experience and their knowledge of one program to guess how another program will operate.
The amount-of-experience scale is one-dimensional: people start at the beginning and proceed along the scale as they gain experience. The type-of-experience scale, by contrast, is more like a collection of baseball cards. A user can collect the experience types (cards) in any order, and two people with the same number of experience types (cards) may have no experience types (cards) in common.
The experience types do not necessarily carry over: experience gained on one type of system may or may not prove useful on another. Actually, experience gained on one system may make it more difficult to learn another. And, if users grow to like one type of system, they may then dislike another one, thus making any experience transfer problematic.
These experience types can have a major effect on the design of your programs, as it is usually important for new programs to appear and operate in a manner similar to existing programs. Thus, the (possibly bad) designs of those existing programs may have to be carried into the design of your program.
This section might also be titled "religious preference." In the computer field, "religion" is a technical term that refers to the usually irrational and extreme preference of one program, style, or method to another. Although you cannot really do anything about this phenomenon, you can keep it in mind when analyzing comments on your design.
It has been observed that people often "get religion" over the first application (for example, a word processor) that they use. I can't recall the number of people who have tried to convince me that the program that they just discovered (i.e., the first one they used) is the best one in the world. This form of "religion" is normal and derives from the facts that (1) the move from manual to automated methods (e.g., from typewriters to word processors) involves a major increase in capabilities: even the simplest word processor provides vastly more capabilities than does a typewriter, and (2) new users do not have the experience to realize that all programs (e.g., word processors) are not equal. This form of religion usually fades away over time as new users gain experience.
In a hauntingly close parallel to the "second system effect" (Brooks 1982), the "second program users" are the ones to watch out for. These people started using one program, then gave that program up in favor of a second one. The problem is that they think that since the second program is better than the first one (which it usually is), it must therefore be better than all the rest.
There is nothing in particular that you can do about users that feel religious about a program: rational arguments are in general ignored. You can, however, be aware that such users exist, and recognize when you are dealing with one.
Knowing your user's experience is essential, but a program design must incorporate knowledge of what task or tasks the user is trying to accomplish. For text editors, he or she might want to create:
The frequency of doing these tasks can range from occasionally to continuously. Different tasks can be performed by the same user with different frequencies.
The style of doing these tasks can also vary. One person may do all of one task, then start on the next. Another person may be frequently switching among two or more tasks.
Users are people. There are limits to what people can do. These limits must be considered when designing a program.
Hands have a limited reach. The very act of reaching for one key draws a hand away from other keys. Thus, commands that you expect to follow one another should be assigned with that constraint in mind. Function keys are often difficult to find and awkward to press. While there are almost always two shift keys, most keyboards only have one control (or equivalent) key and may only have one of other types of shift keys. Thus, it is difficult to press some shifted keys (such as control-P) with just one hand.
Non-keyboard devices such as mice draw a hand far away from the keyboard -- and you don't in general know whether it is the left or right hand that is drawn away. A sequence such as control-mouse button may be very difficult for some (i.e., left-handed) users to type.
Eyes can focus on a limited area of high resolution surrounded by a large area of lower resolution. However, areas of strong contrast such as reverse video are still visible in low-resolution areas. Blinking items are not only visible, but will draw the eye to them. "Status" displays should therefore change as quietly as possible so as not to draw the eye away from the text under edit. For example, it may make sense to place such status areas on the top part of the display if insert/delete line operations cause visible motion of the bottom part.
The mind (or brain), however, places the greatest constraints on editor design. It is only capable of processing a few thoughts ("instructions") per second. In order for users to be productive, it is important that these thoughts be directed as much as possible to useful editing operations. There are several things to consider regarding these thoughts.
First, mental effort (thought) is required to translate between the display representation of the text being edited and the user's internal representation. The WYSIWYG ("what you see is what you get") principle reduces this effort by reducing the amount of thought required. Note that in general WYSIWYG does not mean "fancy output on a graphics display." Rather, it means "it is what it appears to be, no more and no less."
Second, the mind has expectations: it sees (and in general senses) what it expects to see. In extreme cases, if something totally unexpected happens, it can take many seconds for the mind to even recognize that there is an unexpected image, in addition to the time required to process the image and make a decision. Thus, it is important for the program to anticipate what the mind will expect to see and to arrange the display accordingly.
Third, it takes mental effort to handle special cases. For example, if the delete operation deletes everything except for newlines, it takes effort to remember that difference and to monitor each command that is being given to ensure that it conforms to the restriction. Fourth, it takes mental effort to plan ahead. The design of the editor should make it easy for the user to change his or her mind.
Last, it takes mental effort to track modes. (Chapter 9 goes into modes in detail.) Each time a new mode is introduced, it takes mental effort to track the state of the mode and adds effort to the process of switching modes.
The mind's short-term memory can hold from five to seven "chunks" of information (Norman 1990). These chunks are organized in a cache-like form. When the chunk cache fills up, chunks must be stored in "main memory," a process that takes time. Considering that some of these chunks are used to remember what is being edited, why the editing is being done, and other such context, it becomes clear that the editor should be designed to use as few of these "chunks" as possible.
The mind is poor at thinking numerically. It is much easier to think in terms of "put that there" than "put object 12856 at location 83456." These last two points mean that the computer should do as much remembering as possible for the user.
Let us examine how these principles apply to a particular user: me. I select myself as the example for the simple reason that I understand how my mind works better than I understand anyone else's.
First, I almost always work with plain ASCII files. Hence, I can take advantage of WYSIWYG on even a simple ASCII terminal.
Second, the program/computer combination that I use can (mostly) keep up with my typing in real time.
Third, the Emacs command set that I use is very regular, so my mind need only keep track of a few special cases.
Fourth, the basic paradigm behind the Emacs command set is "move to desired position, make desired change." This paradigm applies even in the case where I made a mistake, as I simply add the mistake to the list of changes to be made and continue to apply the paradigm. I never have to change mental gears. The penalty for making a mistake is thus minimized.
Fifth, the program minimizes what I need to remember: the text being edited is there to be seen, exactly as is, and there are very few state variables to track. In addition, the Emacs command set is defined mainly in terms of objects (character, word, sentence, etc.) and has a convenient way of saying "some," "a lot," "a whole lot," and "a huge amount." (Various aspects of the Emacs command set are discussed in later chapters.)
Going beyond these principles, I have used the Emacs command set so long (thirteen years) that I quip that most of my editing is performed by my spinal cord and not my brain. Although this quip is not true since the spinal cord can only handle purely reflex actions, we will look closely at how my mind functions when editing text. The mind of any other experienced user should operate in a similar fashion.
As I write this text, part of my mind is articulating the point that I am trying to make, while another part is expanding those words into their component characters. Call these parts the "source process." Another part of my mind is translating those characters into finger motions. Call this part the "keystroke process." Other parts of my mind are reading the text as it appears on the screen, turning it back into words, and matching these words against the original word stream. Call this the "feedback process."
These three processes work in any sort of writing: using a computer, typewriter, or pen. All people who write use them. However, if the resulting text is to have few errors, one of two things must have happened: either the user made very few mistakes (thus minimizing the number of errors to be corrected) or the user must have written slowly, giving the feedback loop enough time to recognize an error before too much time has elapsed and the error becomes difficult to correct (such as an omitted character on the previous line or page).
With the advent of computers, and their ability to make seamless corrections, a third option appeared: a new, fast feedback loop. This loop operates by giving the keystroke process the ability to recognize that it made a mistake. This extra ability is not useful without seamless editing, as it takes a long time to use the eraser or correction tape. However, with (a lot of) practice, a fourth process can be "running:" the "editing process."
The editing process takes the feedback from the keystroke process and inserts editing commands into the character stream created by the source process. Here is an example of how this editing might operate to correct an error when writing the text "the quick red fox."
(Other users may have variations on this process. For example, they may always delete all of any word with an error and retype the word.) With the extra fast feedback loop, the fingers were kept typing at full speed all the time. Granted, an extra five characters were typed, but consider what would happen without the extra loop. It could well be that the entire phrase would have been typed before the error was noticed. The source process would have already started on the next phrase. When the feedback process notices the error, the smooth typing of characters would stop as the user's mind determines exactly which corrections are required and how to perform them. It must then start the pipeline going again. The stopping, correcting, and starting again takes several seconds. A fifty word-per-minute typist is typing about five characters per second. The Emacs correction string would take one second to type. There is thus a direct saving of some seconds and an indirect saving due to not having interrupted the smooth flow of thinking.
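The arithmetic behind this estimate can be sketched in a few lines of C. The sketch assumes the common convention that one "word" is five characters long. That convention, and the function names, are illustrative assumptions rather than anything defined in this book.

```c
#include <assert.h>

/* Assumed convention: one "word" is five characters long. */
#define CHARS_PER_WORD 5.0

/* Characters typed per second at the given words-per-minute rate. */
double chars_per_second(double wpm)
{
    assert(wpm > 0.0);
    return wpm * CHARS_PER_WORD / 60.0;
}

/* Seconds needed to type `nchars` characters at the given rate. */
double seconds_to_type(int nchars, double wpm)
{
    assert(nchars >= 0);
    return nchars / chars_per_second(wpm);
}
```

Under that assumption, a fifty word-per-minute typist produces a little over four characters per second, so the five extra characters of the correction string cost a bit more than one second. That is still well under the several seconds lost to stopping, correcting, and restarting.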
Note that the design of the command set played an important part in making this loop usable. For example, if no "go backward word" operation were available, the editing process would have to compute how many characters were in the "output buffer," an operation that is quite time-consuming (quick: how many letters in "brown"?) as well as not well matched to how the mind works.
Some recent industry trends illustrate how some "user friendly" designs clash with this editing process. Consider a typical, modern window system. In some ways, it acts to frustrate an experienced user. For example, when a user closes a modified file, the computer may put up a dialog box that says "Discard changes? Yes, No, Cancel" (or words to the same effect). This prompt will be displayed in a beautiful dialog box, neatly centered on the screen. Each response will have its own button. Unfortunately, even if the user is expecting the dialog box, he or she may have to wait for the system to catch up for these reasons:
For these reasons, an experienced user's editing process may be interrupted. These interrupts no doubt contribute to the feeling of sluggishness that many experienced users still feel when using such systems. The challenge is to design your program so that experienced users can productively use your program. The steps that you can take to facilitate this use include:
In general, the goal is for an experienced user to be able to accurately predict which responses will be required, and to reliably supply those responses in advance of the prompts. In this way, experienced users can continue to do their work, without being slowed down by the system.
When someone has a significantly reduced ability to do something, that person is considered to be handicapped in that area. The reduced ability might be physical, such as reduced hand motion or poor eyesight, or it might be mental, such as a reduced ability to remember things.
While the number of people who have severe handicaps in many areas is small, a large number of users have at least limited handicaps in a few areas. As it is important for programs to accommodate as wide a range of users as possible, programs must accommodate users with handicaps.
It is also important to keep in mind that those users that have severe and/or multiple handicaps can benefit greatly from the use of computers.
Sometimes, even users without a handicap benefit from designs intended to aid users with handicaps. For example, adding a wheelchair ramp to an old building also allows other people to roll heavy objects up the ramp instead of having to use stairs.
The main design principles to follow to take into account users with handicaps are:
It is not surprising that these are also good design rules for users without handicaps.
(Some of these questions refer to marketing decisions. A designer must also take into account those people who are not yet users. Remember that purchasers are "users" too.)
Consider the case where the higher you go in an organization, the less computer experience people have. Assume that product purchase decisions are made at a higher level than the product user. How does this inversion affect product design? Product marketing? (Medium)
Many product reviews include "feature checklists" or "scoreboards." These checklists in general include all features found in all related products. What are the pros and cons of these checklists for manufacturers? For users? (Medium)
I have observed that, all other things being equal, people will buy the more expensive of two application programs. Why? (Easy)
Productivity falls off as computer response time increases. However, the fall-off is not linear, but happens in a series of thresholds, where slight increases in response time cause large drops in productivity. Why do these thresholds exist? What information do you need about human physiology in order to calculate where the thresholds are? (Hard)
How would you design a program to best be used by someone with dyslexia? What about the entire computer system? It is okay to be extreme and to make it less usable by other people. (Medium)
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"
User interface hardware is the collection of devices you use when interacting with the computer. The currently available user interface hardware usually consists of a display screen for output and a keyboard and perhaps a mouse or other graphical input device for input. This chapter will first discuss the output side: the screen. It will then discuss the input side: the keyboard. Finally, it will discuss the communications paths that tie the two parts together.
In the old days (i.e., the early 1980s), almost all displays were part of character-based terminals. Differences in capabilities among the terminals were often crucial. These differences play an important part in the types of redisplay schemes that are workable (redisplay is discussed in Chapter 7). Thus, it is worth reviewing the old display types.
A TTY is the canonical printing terminal. Printing terminals have the property that what is once written can never be unwritten. A glass TTY is the same as a TTY except that it uses a screen instead of paper. It has no random cursor positioning, no way of backing up, and no way of changing what was displayed. Glass TTYs are quieter than printing terminals, though.
When a text editor is used on one of these displays, it usually maintains a very small window (e.g., one line) and either echoes only newly typed text or else constantly redisplays (i.e., reprints) that small window. Once a user is familiar with a display editor, however, it is possible -- in a crunch -- to edit from a terminal of this type, but this is not generally a pleasant way to work.
Although one would hope that this type of display was gone for good, it does crop up from time to time in poorly implemented window schemes. Some window schemes offer window interfaces that resemble printing terminals -- all too well.
You may encounter one other type of "write only" scheme: a Unix-style output stream. As an editor writer, you may want to check for this and either:
You may want to alter your output if you feel that the user wants to create some sort of "audit trail" type file. On the other hand, you would not want to alter your output if the user is attempting to diagnose problems by recording the data that is sent to the display.
A basic display has, as a bare minimum, some sort of cursor positioning. It will generally also have "clear to end of line" operation (put blanks on the screen from the current cursor position to the end of the line that the cursor is on) and "clear to end of screen" (ditto, but to the end of the screen) functions. These functions can be simulated, if necessary, by sending spaces and newlines. A typical basic terminal is (was) the DEC VT52.
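As a concrete illustration, here is a minimal sketch of cursor positioning on a VT52 and of simulating "clear to end of line" with spaces. The VT52 addresses the cursor with Escape 'Y' followed by the row and column, each offset by 32. The function names and the idea of building the output in a caller-supplied buffer are assumptions made for this example. A real editor would also have to know how the display treats a character written in the last column, which this sketch ignores.

```c
#include <assert.h>
#include <string.h>

/* Build the VT52 "direct cursor address" sequence: ESC Y row col,
   where row and col are zero-based and each is offset by 32.
   `out` needs room for 5 bytes. Returns the number of bytes to send. */
int vt52_goto(char *out, int row, int col)
{
    assert(row >= 0 && row < 24 && col >= 0 && col < 80);
    out[0] = '\033';
    out[1] = 'Y';
    out[2] = (char)(row + 32);
    out[3] = (char)(col + 32);
    out[4] = '\0';
    return 4;
}

/* Simulate "clear to end of line" on a display that lacks it:
   overwrite the rest of the line with spaces, then put the cursor
   back where it started. `out` needs room for (width - col) + 5 bytes.
   Returns the number of bytes to send. */
int simulate_clear_eol(char *out, int row, int col, int width)
{
    int n = width - col;

    assert(col >= 0 && col < width && width <= 80);
    memset(out, ' ', (size_t)n);
    n += vt52_goto(out + n, row, col);
    return n;
}
```

Note the cost: the native operation (Escape 'K' on the VT52) is two characters, while the simulation sends one character per column cleared plus four more to restore the cursor.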
Such displays are quite usable at higher speeds (for example, over a 9600 bps connection) but usability deteriorates rapidly as the speed decreases. It requires patience to use basic displays over a 1200 bps connection, and a dedication bordering on insanity to use them at 300 bps.
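To put numbers on these speeds, the following sketch estimates how long it takes to send a given number of characters over a serial line. It assumes ten bits on the wire per character (one start bit, eight data bits, one stop bit). That framing and the function name are assumptions for illustration.

```c
#include <assert.h>

/* Assumed framing: 1 start bit + 8 data bits + 1 stop bit. */
#define BITS_PER_CHAR 10

/* Seconds needed to send `nchars` characters at `bps` bits per second. */
double seconds_to_send(long nchars, long bps)
{
    assert(nchars >= 0 && bps > 0);
    return (double)nchars * BITS_PER_CHAR / (double)bps;
}
```

Under these assumptions, repainting a full 24-line by 80-column screen (1920 characters) takes about 2 seconds at 9600 bps, 16 seconds at 1200 bps, and 64 seconds at 300 bps.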
Advanced displays have all of the features of the basic displays, along with editing features such as "insert" and "delete line and/or character." These features can significantly reduce the amount of data sent to the display for common operations. A typical advanced (circa 1980) terminal is the DEC VT100. Most terminals currently manufactured are at least as powerful as this one.
There is a subtle difference among some of the advanced terminals. An "insert line" operation adds one or more blank lines at the cursor: the lines that "drop off" the bottom of the screen are lost. A "delete line" operation deletes one or more lines at the cursor: blank lines are inserted at the bottom. A "scroll window" operation (move lines x through y up/down n lines) affects only the specified lines: the other ones remain stationary.
The "scroll window" operation is more pleasing than the others to see when there is some stationary text being displayed at the bottom of the screen. With "insert/delete line," the appropriate number of lines must be deleted and then inserted; the text at the bottom thus moves within the display's memory. Such jumps are often visible to the user. With "scroll window," the whole thing is performed as one operation and the lines at the bottom do not jump.
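The jump can be seen in a small in-memory model of a display. This is only a sketch: the Screen type, its tiny dimensions, and the function names are hypothetical, and a real terminal performs these operations in its own memory in response to control sequences.

```c
#include <assert.h>
#include <string.h>

#define ROWS 6
#define COLS 16

typedef struct { char line[ROWS][COLS]; } Screen;

/* "Delete line": remove row `at`; the rows below move up and a
   blank line appears at the bottom of the screen. */
void delete_line(Screen *s, int at)
{
    int r;

    assert(at >= 0 && at < ROWS);
    for (r = at; r < ROWS - 1; r++)
        strcpy(s->line[r], s->line[r + 1]);
    s->line[ROWS - 1][0] = '\0';
}

/* "Insert line": open a blank line at row `at`; the rows below move
   down and the last row drops off the bottom. */
void insert_line(Screen *s, int at)
{
    int r;

    assert(at >= 0 && at < ROWS);
    for (r = ROWS - 1; r > at; r--)
        strcpy(s->line[r], s->line[r - 1]);
    s->line[at][0] = '\0';
}

/* Scroll rows top..bottom up by one line using only insert/delete line.
   Between the two steps, every row below `bottom` sits one row too high. */
void scroll_up_with_ins_del(Screen *s, int top, int bottom)
{
    assert(top >= 0 && top <= bottom && bottom < ROWS);
    delete_line(s, top);     /* text below `bottom` jumps up one row */
    insert_line(s, bottom);  /* ...and is pushed back down again */
}
```

The end result is the same as a one-step "scroll window," but the stationary text below the region is moved twice on the way there, which is the jump the user sees.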
This designation covers a wide range of displays. Their common characteristic is that display memory can be read or written at near-bus speeds. The display is usually built into the computer that is running the text editor. Many personal computers and workstations follow this design. But be warned: some computers have very fast display hardware, but the software that is used to interact with the display is very slow. It is probably better for a redisplay scheme to consider such displays to be "advanced" or even "basic." Examples of such displays are the ROM BIOS calls on the IBM PC and Sun workstations. In both cases, third-party drivers operate many times faster than the manufacturer-supplied ones.
The use of such fast displays has several implications for the redisplay process. First, many of the advanced features are typically not available. However, it may be possible to emulate the missing features quickly enough that the lack of advanced features is almost always not significant. Second, it may be possible to use the display memory as the only copy of the data on the screen. (This optimization is discussed in Chapter 7.) Third, if reading from the screen does not cause flicker but writing does, the screen can be read and the incremental redisplay process will run and compare the buffer against it, changing it only when necessary. Finally, if you can write to the screen without flicker, the redisplay process merely boils down to copying the buffer onto the screen, as copying is generally faster than comparing.
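The last two cases can be sketched as follows. The sketch treats the screen and the desired image as small character arrays. The array sizes and function names are assumptions for illustration, and the full redisplay algorithms are left to Chapter 7.

```c
#include <assert.h>
#include <string.h>

#define ROWS 4
#define COLS 20

/* Reading is cheap but writing flickers: compare each row of the
   desired image against the screen and rewrite only rows that differ.
   Returns the number of rows actually written. */
int redisplay_compare(char screen[ROWS][COLS], char wanted[ROWS][COLS])
{
    int r, written = 0;

    assert(screen != NULL && wanted != NULL);
    for (r = 0; r < ROWS; r++) {
        if (memcmp(screen[r], wanted[r], COLS) != 0) {
            memcpy(screen[r], wanted[r], COLS);
            written++;
        }
    }
    return written;
}

/* Writing does not flicker: skip the comparison and copy everything. */
void redisplay_copy(char screen[ROWS][COLS], char wanted[ROWS][COLS])
{
    assert(screen != NULL && wanted != NULL);
    memcpy(screen, wanted, (size_t)ROWS * COLS);
}
```

The comparing version touches the screen only where the text changed. The copying version does no thinking at all, which is the point when the hardware makes writes free.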
Most personal computer and workstation displays are actually bitmap-oriented graphics displays. Software is used to make them appear to display characters. With a graphics display -- and the appropriate software -- a program can not only display text, but display text using proportional spacing (where different letters take up different amounts of space), take advantage of different sizes, styles, and display fonts, and even incorporate graphical elements.
This section presents a review of salient keyboard features. Although most of us won't ever get the chance to design a keyboard, we all purchase keyboards, and more importantly we design programs with existing keyboards in mind.
The keyboard is the main way of telling the computer what to do. In some cases, it is the only way of doing so. Many thousands of characters will be entered in the course of a normal working session. Someone who types for a living (such as a typist, writer, or computer programmer) can easily type ten million characters each year.
The keyboard should thus be tailored for the ease of typing characters. While this statement might seem trite, there are a large number of keyboards on the market (i.e., most) which are pretty poor for entering characters. Below is a discussion of the various keyboard features and why they are or are not desirable.
N-KEY ROLL-OVER is a highly desirable feature. Having it means that you don't have to let go of one key before striking the next. The codes for the keys that you did strike will be sent out only once and in the proper order. (The n means that this roll-over operation will occur even though every key on the keyboard has been pressed before the first one is released.) The basic premise behind n-key roll-over is that you will not hit the same key twice in a row. Instead, you will hit a different key first and the reach for that key will naturally pull your finger off the initial one. N-key roll-over loosens the timing requirements regarding exactly when your finger has to come off the first key. Thus, typing errors are reduced. Note that n-key roll-over is of no help in typing double letters. Note also that shift keys are handled specially and are not subject to roll-over.
Some keyboards implement "2-key roll-over/n-key lockout." This means that only the first two keys of a continuous sequence will be sent and the rest ignored (until all keys are released). This "feature" is actually a way of turning the statement "we don't offer n-key roll-over" into a positive-sounding statement "we offer 2-key ..."
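The difference between the two schemes can be modeled with a short simulation. This is a sketch of the behavior as described above, not of any particular keyboard's firmware. The event encoding (a lowercase letter for a key going down, the matching uppercase letter for it coming up) and the function name are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Simulate a keyboard that sends a code for a newly pressed key only
   while fewer than `max_down` keys are already held. Once that limit
   is exceeded, further presses are ignored until all keys are released.
   In `events`, a lowercase letter means "key pressed" and the matching
   uppercase letter means "key released."
   Writes the codes sent to `out` and returns how many there were. */
size_t simulate_keyboard(const char *events, int max_down, char *out)
{
    int down = 0, locked = 0;
    size_t n = 0;

    assert(max_down > 0);
    for (; *events != '\0'; events++) {
        unsigned char c = (unsigned char)*events;

        if (islower(c)) {
            if (down >= max_down)
                locked = 1;            /* too many keys held: lock out */
            if (!locked)
                out[n++] = (char)c;
            down++;
        } else if (isupper(c)) {
            if (down > 0 && --down == 0)
                locked = 0;            /* all keys up: accept input again */
        }
    }
    out[n] = '\0';
    return n;
}
```

With the events "theTHE" (three keys pressed before any is released), a limit of 2 sends only "th," while a limit larger than the number of keys on the keyboard, which models n-key roll-over, sends "the."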
AUTO-REPEAT means that if a key is pressed and held down, the code for that key is sent repeatedly. It is a very desirable feature. It can cause problems (say, if you put something down on the keyboard), but such problems are worth living with. Older terminals sometimes followed typewriter design in that only certain keys would repeat (such as space, 'x', and dash). Repeating just these few keys is not useful. Other terminals repeat the printing characters but not the control characters. This is also not useful. As we will see later, it is the control characters that we are most likely to want to repeat.
There are three parameters associated with auto-repeat: the initial delay to the first repeat, the rate at which a key will repeat, and the acceleration of the repeat. Ideally, the user should be able to set these parameters. If they cannot be set, the values selected by the manufacturer become an additional consideration.
"TYPEABILITY" (I trust that the English language has not sunk to the point where this is considered to be a valid word) is the single most critical feature. It is simply the ability to type the useful characters without moving your fingers from the standard touch-typing position (the "asdf" and "jkl;" keys). As more and more people who use (computer) keyboards are touch typists and can thus type reasonably fast, they should not be slowed down by having to move their hands out of the basic position. It can take one or two seconds to locate and type an out-of-the-way key. The row above the digits is out of the way, as are numeric key pads and cursor control keys. One second is from three to ten characters of time (at 30 - 100 words per minute). Thus, it takes less time in general to type a four- or five-character command from the basic keyboard than to type one "special" key.
Because of the desire for typeability, it is worth at least considering doing away with such keys as Shift Lock or Caps Lock. They are rarely, if ever, used, and the keyboard space that they occupy is in high demand. (Yes, I realize that my anti-uppercase bias is showing here.)
Keyboard manufacturers have done other things that reduce typeability. Two examples are illustrative. First, the timing on the shift keys can be blown. The result of doing so is that when "Foo" is desired, "FOo," "fOo," and "foo" are as likely to result. The other example is having a small "sweet spot" on each key. Missing this "sweet spot" will cause both the desired and the adjoining key to fire or not. Thus, striking "i" could cause either "io" or nothing to be sent.
PACKAGING or physical keyboard design is also very important. Sharp edges near the keyboard or too tightly packed keys can cause errors and fatigue. Can the keyboard be positioned so as to be comfortable? Is there a palm ledge (this may be either good or bad)? Does the keyboard meet "ergonomic" standards? (In my experience, "ergonomic" standards equate to "hard to use.")
Keyboard manufacturers seem to have decided that a plethora of special keys is more useful than adding shift keys. Thus, you can get keyboards with Insert Line or "cursor up" or -- gasp -- PF1 (if not LF1, F1, and RF1). These keys, when pressed, will either do the function that they name, do something totally random, or send a (usually pre-defined and unchangeable) sequence of characters to the program.
With the advent of windowing systems, manufacturers have realized that the keyboard/display combination simply does not have the information required to properly perform the function locally. They have also decided that random operations don't sell devices well. This is actually a change from the terminals made a few years ago.
That leaves us with character sequences. Ideally, the sequences would be programmable. Thus, the editor could save the current set of programmed sequences (if any), load a set that would not interfere with any editing commands, then restore the user's settings upon exit. However, this is the real world and it is often the case that the sequences are not programmable.
Given this, the keys may or may not be useful. For example, the "cursor up" key might send Escape 'E'. You may wish this particular sequence to perform a "move to end of sentence" operation (I do). Thus, pressing the "cursor up" key will move you to the end of the sentence!
Okay, you say, I won't use Escape 'E' to move to the end of the sentence. You then look up all of the sequences that may be sent by function keys and design your command set around them. All is well and good until you try to use a different keyboard. Your new keyboard will in general use different sequences than the old one. The sequences may even conflict: for example, the "cursor down" key on the new terminal might send Escape 'E'.
We got into this situation for two reasons: a major one and a minor one. The minor one is easy to deal with. All that we have to do is tell the editor which keyboard we are using and have the editor perform any required adjustments. On UNIX systems, for example, the required information can be found in the /etc/termcap or terminfo facilities.
The major reason why we are in this situation is that the program cannot tell when we are pressing a function key and when we are typing the same sequence of characters explicitly. After all, there are only 128 or 256 possible characters, and they must be shared by regular keys and function keys.
Some systems that support directly attached terminals use timing information to make this determination. If a string of characters comes in with no delays between them, they assume (usually correctly) that it is a single function-key press. This timing approach does not work if the terminal (or other computer) is coming in via a network.
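A minimal sketch of this timing test follows. It assumes that the caller has recorded an arrival time in milliseconds for each character of the candidate sequence. The function name and the choice of threshold are illustrative assumptions, and any real threshold would have to be tuned to the connection speed.

```c
#include <assert.h>
#include <stddef.h>

/* Decide whether `n` characters arrived close enough together to be
   treated as a single function-key press.
   `arrival_ms[i]` is the arrival time of character i in milliseconds.
   Returns 1 if every gap is at most `max_gap_ms`, otherwise 0. */
int looks_like_function_key(const long *arrival_ms, size_t n, long max_gap_ms)
{
    size_t i;

    assert(arrival_ms != NULL && max_gap_ms >= 0);
    if (n < 2)
        return 0;   /* a lone character is just a character */
    for (i = 1; i < n; i++) {
        if (arrival_ms[i] - arrival_ms[i - 1] > max_gap_ms)
            return 0;   /* a pause: probably typed by hand */
    }
    return 1;
}
```

The weakness is visible in the code: the decision depends entirely on the gaps between arrivals, and a network can stretch or compress those gaps at will.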
The problem could best be solved by standardizing the character sequences sent by function keys so as to (1) have a single, obscure prefix (say, Escape, control-_) and (2) have a consistent syntax so that all devices can easily determine when the sequence is over. Command set designers would just have to live with the hole in the command set, but that would be a small price to pay.
Aside from the problems of compatibility with whatever software is being run, the placement of the function keys is also a problem. As was mentioned before, keys that are off to one side take a long time to hit. Thus, typing is slowed down considerably. The keys are best used for infrequently used functions or functions where the extra time is not a significant factor (e.g., Help).
There is yet one more problem. Additional keys are not free and so the number of them that you'll want to pay for is limited. However, it is desirable to have the ability to specify a large number of functions (i.e., have a large number of codes that can be specified by the user). The number of function keys required grows linearly with the number of codes.
The other way to increase the number of codes available to the user is to provide extra shift keys. Shift keys are keys that modify the actions of the other keys. Shift and Control are the two most common examples of such keys. The IBM PC has an Alt key, the Apple Macintosh has its "cloverleaf" key, and some terminals have a Meta key option.
As an example, a Meta key would set the top (value 128 decimal) bit of the character that is typed. Thus, while typing shift-A would send the code for uppercase A (65 decimal), meta-shift-A (often abbreviated as simply meta-A or ~A) would send the code 128 + 65 or 193 decimal. A user can thus specify 256 codes instead of the usual 128 from a full ASCII keyboard.
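In C, this encoding is a single bit operation. The macro and function names below are made up for this illustration.

```c
#include <assert.h>

#define META_BIT 0x80   /* top bit of an 8-bit code: 128 decimal */

/* Code sent when `ascii_code` is typed with the Meta key held down. */
int with_meta(int ascii_code)
{
    assert(ascii_code >= 0 && ascii_code < 128);
    return ascii_code | META_BIT;
}

/* Nonzero if the code was typed with the Meta key held down. */
int has_meta(int code)
{
    return (code & META_BIT) != 0;
}

/* The underlying 7-bit ASCII code, with the Meta bit removed. */
int strip_meta(int code)
{
    return code & ~META_BIT;
}
```

For example, with_meta('A') yields 193, and strip_meta(193) gives back 65, the code for uppercase A.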
The number of possible codes grows exponentially with the number of extra shift keys. Thus, 512, 1024, and even 2048 code keyboards (with 2, 3, or 4 extra shift keys) are conceivable. You will have to use system-dependent techniques to take advantage of this extra information.
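The contrast between the two growth rates is easy to compute. The sketch below assumes a full ASCII keyboard as the starting point, as in the text, and that each extra shift key contributes one additional bit. The function names are hypothetical.

```c
#include <assert.h>

#define ASCII_CODES 128L

/* Codes available when each extra shift key doubles the code space. */
long codes_with_shift_keys(int extra_shift_keys)
{
    assert(extra_shift_keys >= 0 && extra_shift_keys < 24);
    return ASCII_CODES << extra_shift_keys;
}

/* Codes available when each added function key supplies one new code. */
long codes_with_function_keys(int function_keys)
{
    assert(function_keys >= 0);
    return ASCII_CODES + function_keys;
}
```

Four extra shift keys give 2048 codes. Reaching the same number with function keys alone would take 1920 of them.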
Finding room on the basic keyboard for these extra shift keys is not easy. That is one reason why the removal of the Shift Lock key was suggested earlier. These keys must be on the basic keyboard in order to preserve touch-typeability.
A computer is not a typewriter. There are things that you do with a computer that simply do not apply to typewriters. Hence, a computer keyboard should have more keys than a typewriter, and yet these keys must be conveniently placed.
Several computer manufacturers have achieved good keyboard designs. Unfortunately, most of them have retired their good designs in favor of poor ones. (See the next section for examples.) Here are some of my criteria for good key placement:
These are the positions that have come to be accepted as standard for computer keyboards. However, some manufacturers have gotten scared that their computers might actually resemble computers. Thus, necessary keys such as Escape and Control get moved out to the far reaches of the keyboard, and "<" and ">" characters get moved from their convenient, traditional positions above "," and "." to who knows where.
Dvorak keyboards are an underground fad. Their proponents swear by them and claim significant performance improvements (i.e., you can type faster on them). As the story goes, the standard QWERTY layout was designed to slow typing on the early typewriters in order to keep the mechanism from jamming. And, since jamming is no longer a consideration, one can (and Dvorak did) design a layout that is "better." Regardless of the truth of the story (and I believe it to be true), all keyboard layouts can take advantage of the improvements in technology. For example, modern keyboards are actually a grid of switches. These switches are scanned electronically. Their travel, feel, and other characteristics can be adjusted as desired. They have been adjusted so that both key travel and effort are much reduced from old, manual typewriters. Hence, hand and finger motions are reduced overall and the benefits to be gained from switching layouts are thereby reduced.
Considering that there are hundreds of millions of existing keyboards that use the QWERTY layout, and that there are billions of people trained to use it, it becomes clear that only an enormous gain in productivity (e.g., greater than 100%) would be able to justify a switch to another layout. And while there are a number of isolated success stories, not even the proponents of Dvorak layouts offer any controlled studies that show the requisite gains (Norman 1990). Hence, these keyboards are not being adopted on a large scale.
This section will briefly review a number of widely available keyboards. The keyboards reviewed are the ones actually named: the review does not transfer to "clones." The comments are, of course, my personal opinions.
DEC VT100 terminal: The keyboard layout is excellent. The feel is klunky. The control keys don't repeat.
DEC VT200 terminal: The keyboard layout is poor (badly placed Escape key and "<" and ">" keys). The feel is pretty good.
IBM PC 83-key keyboard: This is the one sold with the original IBM PC. Its layout is almost excellent (the "\"/"|" key placement is a little weird). It makes a clacking sound which I happen to like although many people do not. The feel is excellent. If only they didn't try to "improve" it with...
IBM PC 101-key keyboard: This is the only one that you can get from IBM now, and it is enough in itself to keep me from buying a new IBM PC. The Escape and control keys are very poorly placed. The feel is excellent.
Apple Macintosh original "slab" keyboard: The layout isn't too bad if you consider that Apple intended this machine to be its own universe, and not try to incorporate outside software. On the whole, however, it suffers from not having quite enough keys (especially Escape), so that terminal emulator programs are awkward to use. The feel is fair.
Apple Macintosh "Standard" keyboard: Perfect layout, enough keys, great feel.
Apple Macintosh "Enhanced" keyboard: This keyboard is for people who like the IBM PC 101-key keyboard. Enough said.
Sun Microsystems SPARCstation keyboard: Excellent layout, poor feel, too many function keys.
Another way of interacting with a computer is by means of a graphical input device. The advantage of a graphical input device is that it can reduce the number of commands needed. Such a device is used for pointing at sections of the screen. It is possible to specify items (i.e., "operate on that") without having to specify the numerical address of the location or a command string to move there.
When a graphical input device is used, the screen is treated as one menu with the device pointing to one entry. A cursor is used to provide feedback to the user about which menu "item" is currently selected. There are usually one or more flags that can be specified conveniently from the device. These flags provide control information and are analogous to shift keys.
The basic way to use these devices is to track the position implied by the graphical input device with the cursor. When a signal is given, the action implied by the current position is performed. The screen is logically broken up into two or more sections. One section has the text that is being edited. Moving the cursor here provides a convenient way to move the point around; typing a character could cause it to be inserted wherever the cursor is. Other portions of the screen can specify menus of possible actions to select from. Graphical input is thus a very sophisticated way of specifying a position as an argument to a function.
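The idea of a position implying an action can be sketched as a simple hit test. This is only an illustration under assumed names: `struct region`, `region_at`, and the two region kinds are hypothetical, and a real editor would attach actions (move the point, run a menu command) to the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical screen sections: a text area plus a menu area. */
enum region_kind { REGION_NONE, REGION_TEXT, REGION_MENU };

struct region {
    enum region_kind kind;
    int left, top;        /* inclusive upper-left corner  */
    int right, bottom;    /* exclusive lower-right corner */
};

/* Return the kind of region containing (x, y), or REGION_NONE. */
enum region_kind region_at(const struct region *regions, size_t count,
                           int x, int y)
{
    size_t i;

    assert(regions != NULL || count == 0);
    for (i = 0; i < count; i++) {
        const struct region *r = &regions[i];
        if (x >= r->left && x < r->right && y >= r->top && y < r->bottom)
            return r->kind;
    }
    return REGION_NONE;
}
```

When the signal (for example, a button press) arrives, the editor would look up the region under the cursor and either move the point there or perform the selected menu action.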
The following sections discuss the advantages and disadvantages of a variety of graphical input devices. Bear in mind that the comments are generalizations: there are exceptions to each of the advantages and disadvantages mentioned.
A Touch Sensitive Display (TSD) is just what it sounds like. The screen is covered with a special transparent material (or a grid of LEDs and receptors or other devices) that you touch with your finger: the absolute (x,y) coordinates of where you touched are then reported. The only available flag is the "touch/no touch" flag. (Actually, experimental pressure-sensitive displays exist that report all three positions and three pressure axes.) The well-engineered touch-sensitive displays are quite pleasant to use for low-usage applications. For high-usage purposes such as text editing, it is tiresome to keep raising your hand to the screen, and your finger tends to cover the most interesting part of the display (i.e., the part that you are about to edit).
A tablet is a special surface that reports the position of the input device as an (x,y) coordinate. The input device can be a "puck" (a small box) or a special pen. At least one flag ("touch/no touch") is always available: some pucks have four, sixteen or even more extra flags. Tablets are very handy for converting paper documents such as maps into computer form. They are less useful for text editing, as they tend to be large and therefore require a long reach and a lot of uncluttered desk space.
A mouse is a small box on wheels (or, in some cases, on felt pads over a special pad). As you move it around on the floor, desk, books, a leg, or most anything else, it reports the relative movement of the mouse (i.e., "I was just moved n units up and m units left"). It can have several flags (buttons), although the correct number is one, as having extra buttons means that program designers will try to put extra functions on them. And, while the functions themselves are not a problem (I do advocate extra shift keys, after all), the presence of these functions usually implies a poor program design. Fortunately, if the mouse has extra buttons, the software can easily correct this "defect" just by making them all do the same thing.
A trackball is an upside-down ("dead") mouse. Instead of moving the wheels by moving the box, you spin the slightly larger wheel directly.
A joystick is a small stick mounted on a couple of potentiometers. They typically can report either absolute position, first derivative (relative movement) or second derivative (acceleration). As the stick is moved only over a small distance, it is difficult to construct one with good resolution and that avoids "stickiness" and "jumpiness." It is generally not as nice to use as a mouse or trackball. Flags are simulated by regular keyboard keys.
Finally, an imaginary but useful device should be considered. That device is a foot-operated mouse (perhaps called a "rat"?). Using your feet rather than your hand to operate the mouse solves one of the most nagging problems of any of these devices, which is that your hands must leave the keyboard with the usual, aforementioned results. Of course, this device makes it harder to edit with your feet up on your desk...
New types of input devices appear all the time. Thus, no listing of such devices can ever remain complete. An example of a recent such device is "pen" input. The points to remember are that each device should be judged on its own strengths and weaknesses and that the devices should be judged on how they help your users: not whether the devices are "neat" or "new."
These devices all assume a reasonably high bandwidth connection to the computer (say, 2400 bps or faster). If you have a slow-speed connection, the cursor tracking must be performed in the local display device, which must somehow be programmed with the knowledge of when to report events and what to do with the cursor (sometimes the cursor changes shape as it crosses from one part of the screen to another). In this way, it is possible to supply the necessary immediate feedback. A slow-speed connection would be quite satisfactory for communicating the significant events, but probably not satisfactory for the screen refresh that would follow, say, the selection of a menu.
This section covers a number of miscellaneous issues concerning the communications path between the computer and the display/keyboard device.
It almost goes without saying that the faster the communications path, the better. Consider it said.
It also almost goes without saying that a full-duplex communications path is necessary. Fortunately, we are long past the days when users were forced to wait until the computer let them type. Except on automated teller machines.
If the communications are over an asynchronous serial path, character format is an issue. The considerations are:
Operating system designers make the quite valid and reasonable assumption that they should be doing some processing of the input characters. Fortunately, they usually also offer the ability to turn such processing off. A text editor should follow these steps:
On entry to the editor:
On exit:
In this way, the text editor has complete control over what happens with the input characters. This places an extra burden on you as the writer of the editor, as you must replace the operating system handlers with versions of your own that mimic the existing functions. On the other hand, your versions will probably differ from the operating system versions in a number of crucial ways. For example, if the operating system lets you suspend your process (for example, under a Unix that supports job control), you need to restore the terminal and input processing parameters before you turn control back to the operating system. When resumed, you need to return the settings back to those used by the editor (noting any changes such as a new window size) and probably refresh the display. If you hadn't replaced the normal handlers, the user would find yours to be a very unfriendly program to use.
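A minimal sketch of this save, disable, and restore cycle follows. It assumes a POSIX system with the termios interface, which the text does not prescribe; the function names `tty_enter`, `tty_exit`, and `tty_suspend` are hypothetical, and the exact set of flags to clear is a judgment call that varies between editors.

```c
#include <assert.h>
#include <signal.h>
#include <termios.h>
#include <unistd.h>

static struct termios saved_tty;   /* settings found on entry */
static int tty_saved = 0;          /* nonzero once saved_tty is valid */

/* On entry: save the current settings, then turn off input processing. */
int tty_enter(int fd)
{
    struct termios raw;

    assert(fd >= 0);
    if (tcgetattr(fd, &saved_tty) == -1)
        return -1;                          /* not a terminal */
    tty_saved = 1;

    raw = saved_tty;
    raw.c_iflag &= ~(tcflag_t)(IXON | ICRNL | INLCR | ISTRIP);
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO | ISIG | IEXTEN);
    raw.c_cc[VMIN] = 1;                     /* read returns after one byte */
    raw.c_cc[VTIME] = 0;                    /* no read timeout */
    return tcsetattr(fd, TCSAFLUSH, &raw);
}

/* On exit: put back whatever was in effect on entry. */
int tty_exit(int fd)
{
    assert(fd >= 0);
    if (!tty_saved)
        return -1;
    return tcsetattr(fd, TCSAFLUSH, &saved_tty);
}

/* Suspend under job control: restore, stop, then re-enter when resumed. */
int tty_suspend(int fd)
{
    tty_exit(fd);
    raise(SIGTSTP);        /* default action stops the process here */
    return tty_enter(fd);  /* runs after we are continued; caller redraws */
}
```

Because `ISIG` is cleared, the suspend character arrives as ordinary input, so the editor itself decides when to call `tty_suspend`. After it returns, the editor would typically re-read the window size and refresh the display.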
The faster the communications path, the less time the display has to process each character. As the speed of the communications path is increased, a point will be reached when the display can no longer keep up in real time. This is the point at which flow control is required. There are three methods currently in use to implement flow control.
The first is in-band control. Two characters are reserved for flow control purposes, typically the control-S and the control-Q characters. The first is used to mean "hold on, I can't keep up and my buffer is almost full." The second means "okay, I've caught up and you can proceed." This method works for the most part, but has the annoying property of using up two valuable control characters. Using any control characters causes problems for some programs. For example, there exist some communications protocols that use all 256 characters and allow no characters to be reserved.
The second method is out-of-band control. This method uses a variety of mechanisms, none of which interfere with sending data. Examples of such methods are hardware "handshake" lines and network protocol mechanisms. This method is clearly superior to in-band.
The final method is flow control avoidance. This method takes advantage of the facts that displays take different amounts of time to process different characters and that some characters (called padding characters) take very little time to process. The program sends the data as a mix of useful characters and padding characters. The specific mix is computed so that the average time required to process each character is less than the time taken to send a character over the communications path and so that the terminal's input buffer does not overflow.
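For example, let's say that we have these (fairly typical) figures: sending any character over the communications path takes 1 msec, processing an ordinary character takes .6 msec, processing a newline takes 17 msec, and processing a padding character takes .1 msec.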
If we were just sending full lines of text to the display, we would send 81 characters in 81 msec. These 81 characters would take 80 * .6 msec + 17 msec = 65 msec to process. Hence, no padding would be required.
On the other hand, if we were just sending single-character lines of text to the display, we would send 2 characters in 2 msec. These 2 characters would take 1 * .6 msec + 17 msec = 17.6 msec to process. Padding would be required as the 17.6 msec processing time is greater than the 2 msec transmission time. As it turns out, 18 padding characters will be sufficient (1 * .6 msec + 17 msec + 18 * .1 msec = 19.4 msec, which is less than the 20 msec of transmission time). It is not difficult to calculate the correct number of padding characters required, given the character mix and the communications path speed.
This third method is the preferred method for text editors, as it works on any communications path (i.e., even those with no out-of-band flow control) and it allows full use of all input characters. If used over a network, it has the disadvantage of creating a modest additional amount of network traffic.
The ideal method would be for the editor to determine whether out-of-band flow control is used along the entire communications path. If such control is in use, no padding characters need to be sent. Unfortunately, it is usually not possible to reliably determine the type of flow control in use.
Echo negotiation was devised for the Multics computer system. It is a protocol for use by computer networks which can cut down on response time by reducing communications overhead. It is potentially useful in an environment where the user's terminal is at one node and the computer which is running the text editor is at another. In such an environment, it can take a long time to send a character back and forth, and yet it takes little more time to send many characters.
Echo negotiation requires that it be easy to describe exactly what is to be done with each character to a communications processor/terminal combination and that the combination be capable of doing enough of the editing to make it worthwhile.
Typically, echo negotiation can only be used when the editing point ("cursor") is at the end of a line. The text editor sends a list of approved characters to the terminal or other nearby processor. As long as the user types only those characters and does not reach the end of a screen line (thus necessitating a wrap), the terminal can safely echo the input characters to the display and hold onto the input text. When any non-approved character is typed (or the line fills up), the terminal reports all of the held input characters and the reason why the input was sent on (i.e., non-approved character or line wrap) to the text editor. The editor then processes the input data and the cycle repeats.
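The terminal's side of this cycle can be sketched as a small state machine. This is an illustration only, not the Multics implementation: the structure, the function names, and the fixed 256-character hold buffer are all assumptions.

```c
#include <assert.h>
#include <string.h>

enum report_reason { REPORT_NONE, REPORT_NOT_APPROVED, REPORT_LINE_FULL };

struct echo_state {
    unsigned char approved[256];  /* nonzero if the terminal may echo it */
    char held[256];               /* input echoed locally, not yet reported */
    int held_len;
    int column;                   /* current cursor column */
    int line_width;               /* columns on a screen line */
};

/* Start a cycle with the approved characters sent by the editor. */
void echo_init(struct echo_state *s, const char *approved,
               int column, int line_width)
{
    assert(s != NULL && approved != NULL);
    assert(line_width > 0 && line_width <= (int)sizeof s->held);
    memset(s, 0, sizeof *s);
    for (; *approved != '\0'; approved++)
        s->approved[(unsigned char)*approved] = 1;
    s->column = column;
    s->line_width = line_width;
}

/* Handle one typed character. Returns REPORT_NONE if it was echoed
   locally and held; otherwise held[0..held_len) must be sent to the
   editor together with the returned reason. */
enum report_reason echo_type(struct echo_state *s, unsigned char c)
{
    assert(s != NULL && s->held_len >= 0
           && s->held_len < (int)sizeof s->held);
    s->held[s->held_len++] = (char)c;

    if (!s->approved[c])
        return REPORT_NOT_APPROVED;
    if (s->column + 1 >= s->line_width)
        return REPORT_LINE_FULL;  /* echoing would require a wrap */
    s->column++;                  /* the terminal echoes c itself */
    return REPORT_NONE;
}

/* After the held text has been reported, forget it. */
void echo_reported(struct echo_state *s)
{
    assert(s != NULL);
    s->held_len = 0;
}
```

The editor is involved only when `echo_type` returns something other than `REPORT_NONE`; every other keystroke is handled entirely by the terminal.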
The Xylogics Annex terminal server incorporates an advanced version of echo negotiation called the LEAP Protocol. It incorporates all of the above design.
Both standard echo negotiation and the LEAP protocol suffer from the same problem. This problem is severe enough to call into question the desirability of using them at all: Echo negotiation is only potentially useful when the terminal is separate from the computer that is running the text editor and when the computer is overloaded. The principle behind echo negotiation is that waking up the text editor process for each character is inefficient. In extreme cases, the wake-up may take so long that input echoing is significantly delayed. The fix that echo negotiation offers is to perform the updates in batches, thus waking up the text editor process fewer times and thereby reducing overhead.
The problem with the fix is inherent in its own success. With no echo negotiation, input is echoed slowly but evenly (user typing is in general much slower than process-switching times) and the text-editing process tends to stay in memory. With echo negotiation, input is echoed quickly until the non-approved character is typed, then a (comparatively) long pause is encountered while the text-editing process must be woken up and possibly even swapped in (we are talking about a situation where resources are tight, after all). Even though the average per-character processing time might be lower, the variance in per-character times is much larger with echo negotiation. It is usually the case that the variance is so high that the system as a whole becomes unpleasant if not impossible to use. In one extreme test that I performed, I found the variance of times to be so great that editing was all but impossible: until you stopped typing for many seconds, you could never tell whether the computer had processed all of your input and hence couldn't safely continue typing (editing commands -- not new text to be inserted). In conclusion, echo negotiation is not a good feature to include.
High-speed modems (9600 bps and higher) are starting to become quite common. The main problem with them is that the advertising for them is focused around file-transfer protocols and dumping large quantities of text through them. The manufacturers add a variety of compression techniques to improve their modems' throughput in these areas.
However, text editing is interactive. Low response time is more important than high throughput. This is where the compression schemes implemented by the modems can cause problems. The simple solution is to turn off all such compression. Do not forget to turn off control-S / control-Q flow control while you're at it.
Devise at least three different ways of encoding cursor positioning coordinates. Which is the most extensible? (Easy)
Why can character-oriented displays handle blinking text more easily than graphics displays? Does it matter? (Easy)
If you could change one physical attribute of the display (e.g., size, phosphor) that you use most, what would it be? (Easy)
Give an example of an application that can make effective use of function keys. (Easy)
Devise an efficient, extensible encoding scheme for function keys. (Easy)
Some keyboards (such as that used by the IBM PC and compatible computers) assign a priority to shift keys and only pay attention to the highest priority key pressed. For example, pressing both Control and Shift gives the same code as does just pressing Control. Is this better or worse than giving a different code to the combination key presses? Why? (Easy)
How does the amount of buffering affect the need for padding? Does it matter where in the system additional buffering is placed? (Medium)
A fourth way to handle flow control used to be common practice but is no longer. It is called "ETX / ACK" after the codes for the characters that were used to implement it. In this method, the sender sends a block of text followed by an ETX character. It then waits for the receiver to return an ACK character. Why has this scheme dropped from favor? How does it interact with terminals on computer networks? (Medium)
He took his vorpal sword in hand:
Long time the manxome foe he sought --
The choice of implementation language has a major effect on the design of a text editor. In some environments, only one language is available. In such environments, you do the best that you can and your editor may end up different from what it would be if the ideal language were available. However, most environments offer at least two languages. You thus have a choice, and this chapter offers guidance in making that choice. Of course, this may be a choice between Scylla and Charybdis...
The general considerations in selecting a language to use for implementing a text editor are:
Each of these considerations will be explored in detail.
You can only use those languages that are supported on the system that your text editor is first implemented upon. Nonetheless, you should be thinking about the second, third, and later systems that your text editor will be ported to, and which languages all of those systems support in common.
In addition to the mere presence of a language processor on a system, you should take into consideration the quality of implementation of such systems. An implementation's speed of operation, quality of diagnostics, quality of code produced, and other such factors can make a large difference in the usability of the language on a particular system.
It may appear redundant to say that a text editor must handle text, but consider a spreadsheet program: most of its work is in handling control flow, figuring redisplay, and setting up to execute commands. Only a small fraction of its time is spent in the floating point instructions that most users think are the program's "real work."
At any given moment, a text editor -- or most any other similar interactive program -- is mainly doing all of the following:
Most of these operations involve processing text in some way or other. Text editors differ from other applications only in that the "executing the commands" item also involves manipulating text.
It is important to note that "text