A program is a detailed set of instructions read by both a human and a machine. The computer reads only the code, while the human concentrates on the comments. Good style pertains to both parts of a program. Well-designed, well-written code not only makes effective use of the computer, it also contains carefully constructed comments to help humans understand it. Well-designed, well-written code is a joy to debug, maintain, and enhance.
Good programming style begins with the effective organization of code. By using a clear and consistent organization of the components of your program, you make them more efficient, readable, and maintainable.
People have been writing books for hundreds of years, and during that time they have discovered how to organize the material to efficiently present their ideas. Standards have emerged. For example, if I asked you when this book was copyrighted, you would turn to the title page. That's where the copyright notice is always located.
A book's title page contains the name of the book, the author, and the publisher. On the reverse of the title page is the copyright page, where you find things like the printing history and Library of Congress information.
At the beginning of every well-documented program is a section known as the heading. It is, in effect, the title page of the program. The heading consists of a set of boxed comments that include the name of the program, the author, copyright, usage, and other important information. The heading comments are fully discussed in File Basics, Comments, and Program Headings.
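As a minimal sketch of such a heading, consider the boxed comment below. The program name (count_words), the placeholder author and copyright fields, and the function itself are hypothetical, invented only to show the layout; the exact set of fields is an assumption, not a fixed standard.

```c
/********************************************************
 * count_words -- Count the words in a string.          *
 *                                                      *
 * Author:    (your name here)                          *
 * Copyright: (year and owner here)                     *
 *                                                      *
 * Usage:                                               *
 *     count_words(text) returns the number of          *
 *     whitespace-separated words in text.              *
 *                                                      *
 * All names and fields here are hypothetical.          *
 ********************************************************/
#include <ctype.h>

int count_words(const char *text)
{
    int count = 0;      /* Number of words seen so far */
    int in_word = 0;    /* Nonzero while we are inside a word */

    for (; *text != '\0'; ++text) {
        if (isspace((unsigned char)*text)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            ++count;
        }
    }
    return count;
}
```

A reader who opens this file learns what the code is for and how to call it before reading a single statement.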
A program should have a table of contents as well, listing the location of each function. This is difficult and tedious to produce by hand; however, it can be produced quite easily by a number of readily available tools, as discussed later in this chapter.
Technical books are divided into chapters, each one covering a single subject. Generally, each chapter in a technical book consists of a chunk of material that a reader can read in one sitting, although this is not a rule: Donald Knuth's highly regarded 624-page Fundamental Algorithms (Addison-Wesley, Reading, MA, 1968) contains only two chapters.
Similarly, a program is divided into modules, each a single file containing a set of functions designed to do some specific job. A short program may consist of just one module, while larger programs can contain 10, 20, or more. Module design is further discussed later in this chapter.
Each chapter in a technical book is typically divided into several sections. A section covers a smaller body of information than a chapter. Sections in this book are identified by section heads in bold letters, making it easy for a reader to scan a chapter for a particular subject.
Just as a book chapter can contain several sections, a program module may contain several functions. A function is a set of instructions designed to perform a single focused task. A function should be short enough that a programmer can easily understand the entire function.
A technical book should have a good index that lists every important subject or keyword in the book and the pages on which it can be found. The index is extremely important in a technical book because it provides quick access to specific information.
A program of any length should have a cross reference, which lists the program variables and constants, along with the line numbers where they are used. A cross reference serves as an index for the program, aiding the programmer in finding variables and determining what they do. A cross reference can be generated automatically by one of the many cross reference tools, such as xref or cref.
A glossary is particularly important in a technical book. Each technical profession has its own language, and this is especially true in computer programming (e.g., set COM1 to 1200,8,N,1 to avoid PE and FE errors). A reader can turn to the glossary to find the meaning of these special words. Every C program uses its own set of variables, constants, and functions. These names change from program to program, so a glossary is essential. Producing one by hand is impractical, but as you'll see later in this chapter, with a little help from some strategically placed comments, you can easily generate a glossary automatically.
Some of the program components described above can be generated automatically. Consider the table of contents, for example. On UNIX systems, the ctags program will create such a table. Also, there is a public domain program, called cpr, that does the job for both DOS and UNIX.
A cross reference can also be generated automatically by one of the many cross reference tools, such as xref or cref. However, you can also generate a cross reference one symbol at a time. Suppose you want to find out where total_count is located. The command grep searches files for a string, so typing a command such as grep -n total_count *.c lists each line that contains total_count, along with its line number. Passing the output of grep -l to vi, as in vi `grep -l total_count *.c`, invokes the vi editor on the list of files that contain the word total_count. Then you can use the vi search command to locate total_count within a file. The commands next (:next) and rewind (:rew) will flip through the files. See your vi and UNIX manuals for more details.
Borland C++ and Borland's Turbo-C++ have a version of grep built in to the Integrated Development Environment (IDE). By using the command Alt-Space you can bring up the tools menu, then select grep and give it a command line, and the program will generate a list of references in the message window. The file corresponding to the current message window line is displayed in the edit window. Going up or down in the message window changes the edit window. With these commands, you can quickly locate every place a variable is used.
You can also partially automate the process of building a glossary, which is a time-consuming task if performed entirely by hand. The trick is to put a descriptive comment after each variable declaration. That way, when the maintenance programmer wants to know what total_count means, all he or she has to do is look up the first time total_count is mentioned in the cross reference, locate that line in the program, and read:
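A commented declaration of this kind might look like the following sketch. The meaning given to total_count here, and the names error_count, record_item, get_total_count, and get_error_count, are hypothetical and chosen only for illustration.

```c
/* Each declaration carries its own glossary entry as a comment. */
static int total_count = 0;   /* Number of items processed so far */
static int error_count = 0;   /* Number of items rejected as invalid */

/* record_item -- Count one item; valid is nonzero if it passed its checks. */
void record_item(int valid)
{
    ++total_count;

    if (!valid)
        ++error_count;
}

/* Accessors so other modules can read the counters. */
int get_total_count(void)
{
    return total_count;
}

int get_error_count(void)
{
    return error_count;
}
```

With the comment sitting on the declaration line, the cross reference entry for total_count leads straight to its definition in plain English.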
Another analogy to books is helpful here. Consider the documentation for a piece of equipment like a laser printer. This typically consists of two manuals: the Operator's Guide and the Technical Reference Manual.
The Operator's Guide describes how to use the laser printer. It includes information like what the control panel looks like, how to put in paper, and how to change the toner. It does not cover how the printer works.
A user does not need to know what goes on under the covers. As long as the printer does its job, it doesn't matter how it does it. When the printer stops working, the operator calls in a technician, who uses the information in the Technical Reference Manual to make repairs. This manual describes how to disassemble the machine, test the internal components, and replace broken parts.
The public interface of a module is like an Operator's Guide. It tells the programmer and the computer how to use the module. The public interface of a module is called the "header file." It contains data structures, function prototypes, and #define constants, which are needed by anyone using the module. The header file also contains a set of comments that tells a programmer how to use the module.
The private section, the actual code for the module, resides in the .c file. A programmer who uses the module never needs to look into this file. Some commercial products even distribute their modules in object form only, so nobody can look in the private section.
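The split between public and private parts can be sketched with a small, hypothetical stack module. Both parts are shown in one listing for brevity; in practice the first part would live in a header file (assumed here to be stack.h) and the second in stack.c. All names are invented for illustration.

```c
/* ---------- stack.h: public interface (hypothetical) ---------- */
#define STACK_MAX 100   /* Maximum number of items a stack can hold */

struct stack {
    int count;              /* Number of items currently on the stack */
    int data[STACK_MAX];    /* The items themselves */
};

/* stack_init -- Make a stack empty. Call this before any other function. */
void stack_init(struct stack *the_stack);

/* stack_push -- Add an item. Returns 1 on success, 0 if the stack is full. */
int stack_push(struct stack *the_stack, int item);

/* stack_pop -- Remove the top item into *item. Returns 1, or 0 if empty. */
int stack_pop(struct stack *the_stack, int *item);

/* ---------- stack.c: private implementation (hypothetical) ---------- */
void stack_init(struct stack *the_stack)
{
    the_stack->count = 0;
}

int stack_push(struct stack *the_stack, int item)
{
    if (the_stack->count >= STACK_MAX)
        return 0;

    the_stack->data[the_stack->count] = item;
    ++the_stack->count;
    return 1;
}

int stack_pop(struct stack *the_stack, int *item)
{
    if (the_stack->count <= 0)
        return 0;

    --the_stack->count;
    *item = the_stack->data[the_stack->count];
    return 1;
}
```

A programmer using the module needs only the first half: the comments there say what each function does and what it returns, without revealing how.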
Because a library is a collection of modules, you could use a collection of header files to interface with the outside world. The advantage to this is that a program brings in only the function and data definitions it needs, and leaves out what it doesn't use.
This approach can result in a lot of #includes. One of the problems with this system is that it is very easy to forget one of the #include statements. Also, it is possible to have redundant #includes. For example, suppose the header file Xm/Label.h requires Xm/Separator.h and contains an internal #include for it, but the program itself also includes it. In this case, the file is included twice, which makes extra, unnecessary work for the compiler.
Also, it is very easy to forget which include files are needed and which to leave out. I've often had to go through a cycle of compile and get errors, figure out which include file is missing, and compile again.
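One common convention for keeping a redundant include harmless is the include guard. The text above does not describe this technique, so treat the following as an illustrative sketch: the header name separator.h, the guard macro SEPARATOR_H, and the structure are all hypothetical. The header's contents are pasted twice to simulate what the compiler sees when the file is included twice.

```c
/* ---- first #include "separator.h" ---- */
#ifndef SEPARATOR_H
#define SEPARATOR_H
struct separator {
    int width;      /* Width of the separator line, in pixels */
};
#endif /* SEPARATOR_H */

/* ---- second, redundant #include "separator.h" ---- */
#ifndef SEPARATOR_H     /* Already defined, so this copy is skipped */
#define SEPARATOR_H
struct separator {
    int width;      /* Width of the separator line, in pixels */
};
#endif /* SEPARATOR_H */

/* Only one definition of struct separator survives preprocessing. */
int separator_width(const struct separator *sep)
{
    return sep->width;
}
```

Without the guard, the second copy would redefine struct separator and the compile would fail. With it, the redundant include costs only the time needed to skip over the repeated text.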
The other extreme is to put every definition into a single header file, as Microsoft Windows does with windows.h, so that a program needs only one #include statement. This is much simpler than the multiple include file approach taken by the X Window System. Also, there is no problem with loading a header file twice because there is only one file and only one #include statement.
The problem is that this file is 3,500 lines long, so even short 10-line modules bring in 3,500 lines of include file. This makes compilation slower. Borland and Microsoft have tried to get around this problem by introducing "precompiled" headers, but it still takes time to compile Windows programs.
Borland's Turbo Vision library (TV) uses a different method. The programmer puts #define statements in the code to tell the TV header which functions will be used. This is followed by one #include directive.
The file tv.h brings in additional include files as needed. (The #defines determine what is needed.) One advantage over multiple include files is that the files are included in the proper order, which eliminates redundant includes.
This system has another advantage in that only the data that's needed is brought in, so compilation is faster. The disadvantage is that if you forget to put in the correct #define statements, your program won't compile. So while being faster than the all-in-one strategy, it is somewhat more complex.
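The mechanism can be sketched in a single listing. The macro names USES_COUNTER and USES_TIMER, the simulated header all.h, and the structures are hypothetical stand-ins, not Turbo Vision's actual names; the point is only that a #define placed before the include selects which definitions are brought in.

```c
/* The program announces what it needs before the single include. */
#define USES_COUNTER

/* ---- start of simulated all.h (hypothetical) ---- */
#ifdef USES_COUNTER
struct counter {
    int value;      /* Current value of the counter */
};

static void counter_add(struct counter *the_counter, int amount)
{
    the_counter->value += amount;
}
#endif /* USES_COUNTER */

#ifdef USES_TIMER   /* Not requested, so the compiler skips this part */
struct timer {
    long ticks;     /* Number of ticks since the timer started */
};

static void timer_tick(struct timer *the_timer)
{
    ++the_timer->ticks;
}
#endif /* USES_TIMER */
/* ---- end of simulated all.h ---- */

/* count_three -- Use the one facility that was requested. */
int count_three(void)
{
    struct counter tally = { 0 };

    counter_add(&tally, 1);
    counter_add(&tally, 2);
    return tally.value;
}
```

If count_three also tried to use struct timer without a matching #define USES_TIMER, the compile would fail, which is the disadvantage the text describes.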
Part of what makes books readable is good paragraphing. Books are broken up into paragraphs and sentences. A sentence forms one complete thought, and multiple sentences on a single subject form a paragraph.
Similarly, a C program consists of statements, and multiple statements on the same subject form a conceptual block. Since "conceptual block" is not a recognized technical term, you may just as well call them paragraphs. In this book, paragraphs are separated from each other by a blank line. You can separate paragraphs in C in the same way.
Omitting paragraphs in code creates ugly, hard-to-read programs. If you've ever tried reading a paper without paragraphing, you realize how easy it is to get lost. Paragraph-less programming tends to cause the program to get lost:
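As a rough sketch of the difference, the hypothetical function below is written twice, first without paragraphs and then with them. The function names and the rectangle example are assumptions made for illustration.

```c
/* Without paragraphs: every statement runs into the next. */
int rect_area_dense(int x1, int y1, int x2, int y2)
{
    int temp;
    if (x1 > x2) {
        temp = x1;
        x1 = x2;
        x2 = temp;
    }
    if (y1 > y2) {
        temp = y1;
        y1 = y2;
        y2 = temp;
    }
    return (x2 - x1) * (y2 - y1);
}

/* With paragraphs: blank lines separate each step of the job. */
int rect_area(int x1, int y1, int x2, int y2)
{
    int temp;       /* Scratch variable used for swapping */

    /* Make sure x1 is the left edge */
    if (x1 > x2) {
        temp = x1;
        x1 = x2;
        x2 = temp;
    }

    /* Make sure y1 is the bottom edge */
    if (y1 > y2) {
        temp = y1;
        y1 = y2;
        y2 = temp;
    }

    /* Compute the area from the ordered corners */
    return (x2 - x1) * (y2 - y1);
}
```

Both versions compile to the same thing. Only the second tells the reader at a glance that the function has three steps.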
Note that the paragraphs here are not defined by the syntax of the language, but by the semantics of the program. Statements are grouped together if they belong together logically. That judgement is made by the programmer.
Good paragraphing improves the aesthetics, and hence the readability, of a program. But there are also aesthetic issues at the level of the sentence or, in C, the statement. A statement expresses a single thought, idea, or operation. Putting each statement on a line by itself makes the statement stand out and reserves the use of indentation for showing program structure.
Figuring out code that crams several statements onto each line is like extracting a fossil from a rock formation. You must take out your hammer and chip at it again and again until something coherent emerges. This kind of programming obscures the control flow of the program. It hides statement beginnings and endings and provides no paragraph separations.
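The contrast can be sketched with a hypothetical function written twice: once crammed onto a single line, and once with each statement on its own line. The function names are invented for illustration.

```c
/* Crammed: the loop, the test, and the return all hide on one line. */
int sum_odd_crammed(int n)
{ int sum = 0; int i; for (i = 1; i <= n; ++i) { if (i % 2 == 0) continue; sum += i; } return sum; }

/* One statement per line: indentation now shows the structure. */
int sum_odd(int n)
{
    int sum = 0;    /* Running total of the odd numbers */
    int i;          /* Current number being examined */

    for (i = 1; i <= n; ++i) {
        if (i % 2 == 0)
            continue;
        sum += i;
    }
    return sum;
}
```

In the second version the beginning and end of each statement are obvious, and the body of the loop can be picked out by its indentation alone.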
In clearly written English there are limits on the optimum length of a sentence. We've all suffered through the sentence that runs on and on, repeating itself over and over; or, through a structure whose complexity demonstrates more the confusion than the cleverness of the author (although it should be noted that, as in the present example, a demonstration of confusion can be the whole point), just get all bollixed up.