
Make sure you understand what really goes on when you insert a seemingly innocuous function call into the time-critical portions of your code. In this case that means knowing how DOS and the C/C++ file-access libraries do their work. In other words, know the territory!
LISTING 1.4 L1-4.C
/*
 * Program to calculate the 16-bit checksum of the stream of bytes
 * from the specified file. Obtains the bytes one at a time via
 * getc(), allowing C to perform data buffering.
 */
#include <stdio.h>
#include <stdlib.h>   /* for exit() */

main(int argc, char *argv[]) {
   FILE *CheckFile;
   int Byte;
   unsigned int Checksum;

   if ( argc != 2 ) {
      printf("usage: checksum filename\n");
      exit(1);
   }

   if ( (CheckFile = fopen(argv[1], "rb")) == NULL ) {
      printf("Can't open file: %s\n", argv[1]);
      exit(1);
   }

   /* Initialize the checksum accumulator */
   Checksum = 0;

   /* Add each byte in turn into the checksum accumulator */
   while ( (Byte = getc(CheckFile)) != EOF ) {
      Checksum += (unsigned int) Byte;
   }

   /* Report the result */
   printf("The checksum is: %u\n", Checksum);

   exit(0);
}
Know When It Matters
The last section contained a particularly interesting phrase: the time-critical portions of your code. Time-critical portions of your code are those portions in which the speed of the code makes a significant difference in the overall performance of your program, and by significant, I don't mean that it makes the code 100 percent faster, or 200 percent, or any particular amount at all, but rather that it makes the program more responsive and/or usable from the user's perspective.
Don't waste time optimizing non-time-critical code: set-up code, initialization code, and the like. Spend your time improving the performance of the code inside heavily-used loops and in the portions of your programs that directly affect response time. Notice, for example, that I haven't bothered to implement a version of the checksum program entirely in assembly; Listings 1.2 and 1.6 call assembly subroutines that handle the time-critical operations, but C is still used for checking command-line parameters, opening files, printing, and the like.
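In outline, that division of labor amounts to declaring the time-critical routine as an external function and calling it from the C driver. The ChecksumChunk name and prototype below are invented for this sketch; they are not the actual interface used by Listings 1.2 and 1.6.

/* Hypothetical boundary between C and assembly: the assembly module
 * would export ChecksumChunk(), which adds BufferLength bytes starting
 * at Buffer into the running checksum and returns the updated total.
 * The C side keeps the argument checking, file opening, block reading,
 * and printing; only this one routine is hand-tuned. */
extern unsigned int ChecksumChunk(unsigned char *Buffer,
                                  unsigned int BufferLength,
                                  unsigned int Checksum);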
If you were to implement any of the listings in this chapter entirely in hand-optimized assembly, I suppose you might get a performance improvement of a few percent, but I rather doubt you'd get even that much, and you'd sure as heck spend an awful lot of time for whatever meager improvement does result. Let C do what it does well, and use assembly only when it makes a perceptible difference.
Besides, we don't want to optimize until the design is refined to our satisfaction, and that won't be the case until we've thought about other approaches.
Always Consider the Alternatives
Listing 1.4 is good, but let's see if there are other (perhaps less obvious) ways to get the same results faster. Let's start by considering why Listing 1.4 is so much better than Listing 1.1. Like read(), getc() calls DOS to read from the file; the speed improvement of Listing 1.4 over Listing 1.1 occurs because getc() reads many bytes at once via DOS, then manages those bytes for us. That's faster than reading them one at a time using read(), but there's no reason to think that it's faster than having our program read and manage blocks itself. Easier, yes, but not faster.
Consider this: Every invocation of getc() involves pushing a parameter, executing a call to the C library function, getting the parameter (in the C library code), looking up information about the desired stream, unbuffering the next byte from the stream, and returning to the calling code. That takes a considerable amount of time, especially by contrast with simply maintaining a pointer to a buffer and whizzing through the data in the buffer inside a single loop.
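To make that contrast concrete, here is a rough sketch of reading and managing blocks ourselves; it is not one of the chapter's numbered listings. The BlockChecksum name, the 32K BUFFER_SIZE, and the DOS-era open()/read() calls (via <io.h> and <fcntl.h>) are choices made for this illustration only; other compilers and platforms use different headers.

/*
 * Sketch of the block-at-a-time alternative: read() is called once per
 * 32K block, and the bytes are then summed with nothing more than a
 * pointer and a counter inside a single tight loop.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <io.h>               /* open()/read()/close() under DOS compilers */

#define BUFFER_SIZE  0x8000   /* read 32K bytes at a time */

static unsigned char WorkingBuffer[BUFFER_SIZE];

unsigned int BlockChecksum(char *FileName) {
   unsigned char *WorkingPtr;
   unsigned int Checksum = 0;
   int Handle, BytesRead;

   if ( (Handle = open(FileName, O_RDONLY | O_BINARY)) == -1 ) {
      printf("Can't open file: %s\n", FileName);
      exit(1);
   }

   /* One DOS call per block, instead of one getc() call per byte */
   while ( (BytesRead = read(Handle, WorkingBuffer, BUFFER_SIZE)) > 0 ) {
      WorkingPtr = WorkingBuffer;
      /* Whiz through the block with just a pointer and a counter */
      while ( BytesRead-- ) {
         Checksum += (unsigned int) *WorkingPtr++;
      }
   }
   close(Handle);
   return Checksum;
}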
There are four reasons that many programmers would give for not trying to improve on Listing 1.4:
1. The code is already fast enough.
2. The code works, and some people are content with code that works, even when it's slow enough to be annoying.
3. The C library is written in optimized assembly, and it's likely to be faster than any code that the average programmer could write to perform essentially the same function.
4. The C library conveniently handles the buffering of file data, and it would be a nuisance to have to implement that capability.
I'll ignore the first reason, both because performance is no longer an issue if the code is fast enough and because the current application does not run fast enough: 13 seconds is a long time. (Stop and wait for 13 seconds while you're doing something intense, and you'll see just how long it is.)
The second reason is the hallmark of the mediocre programmer. Know when optimization matters, and then optimize when it does!