
|
 |
|
Michael Abrash's Graphics Programming Black Book, Special Edition
by Michael Abrash
The Coriolis Group
ISBN: 1576101746 Pub Date: 07/01/97
|
Chapter 16 There Aint No Such Thing as the Fastest Code
Lessons Learned in the Pursuit of the Ultimate Word Counter
I remember reading an overview of C++ development tools for Windows in a past issue of PC Week. In the lower left corner was the familiar box listing the 10 leading concerns of corporate buyers when it comes to C++. Boiled down, the list looked like this, in order of descending importance to buyers:
- 1. Debugging
- 2. Documentation
- 3. Windows development tools
- 4. High-level Windows support
- 5. Class library
- 6. Development cycle efficiency
- 7. Object-oriented development aids
- 8. Programming management aids
- 9. Online help
- 10. Windows development cycle automation
Is something missing here? You bet your maximum gluteus somethings missingnowhere on that list is there so much as one word about how fast the compiled code runs! Im not saying that performance is everything, but optimization isnt even down there at number 10, below online help! Ye gods and little fishes! We are talking here about people who would take a bus from LA to New York instead of a plane because it had a cleaner bathroom; who would choose a painting from a Holiday Inn over a Matisse because it had a fancier frame; who would buy a Yugo instead ofwell, hell, anythingbecause it had a nice owners manual and particularly attractive keys. We are talking about people who are focusing on means, and have forgotten about ends. We are talking about people with no programming souls.
Counting Words in a Hurry
What are we to make of this? At the very least, we can safely guess that very few corporate buyers ever enter optimization contests. Most of my readers do, however; in fact, far more than I thought ever would, but that gladdens me to no end. I issued my first optimization challenge in a Pushing the Envelope column in PC TECHNIQUES back in 1991, and was deluged by respondents who, one might also gather, do not live by PC Week.
That initial challenge was sparked by a column David Gerrold wrote (also in PC TECHNIQUES ) concerning the matter of counting the number of words in a document; David turned up some pretty interesting optimization issues along the way. David did all his coding in Pascal, pointing out that while an assembly language version would probably be faster, his Pascal utility worked properly and was fast enough for him.
It wasnt, however, fast enough for me. The logical starting place for speeding up word counting would be Davids original Pascal code, but Im much more comfortable with C, so Listing 16.1 is a loose approximation of Davids word count program, translated to C. I left out a few details, such as handling comment blocks, partly because I dont use such blocks myself, and partly so we can focus on optimizing the core word-counting code. As Table 16.1 indicates, Listing 16.1 counts the words in a 104,448-word file in 4.6 seconds. The file was stored on a RAM disk, and Listing 16.1 was compiled with Borland C++ with all optimization enabled. A RAM disk was used partly because it returns consistent timesno seek times, rotational latency, or cache to muddy the watersand partly to highlight word-counting speed rather than disk access speed.
|
Listing
| Time to Count Words
|
|
16.1 (C)
| 4.6 seconds
|
16.2 & 16.3 (C+ASM)
| 2.4 seconds
|
16.2 & 16.4 (C+ASM w/lookup)
| 1.6 seconds
|
These are the times taken to search a file containing 104,448 words, timed from a RAM disk on a 20 MHz 386.
|
Table 16.1 Word count timings.
|
|
LISTING 16.1 L16-1.C
/* Word-counting program. Tested with Borland C++ in C
compilation mode and the small model. */
#include <stdio.h>
#include <fcntl.h>
#include <sys\stat.h>
#include <stdlib.h>
#include <io.h>
#define B UFFER_SIZE 0x8000 /* largest chunk of file worked
with at any one time */
int main(int, char **);
int main(int argc, char **argv) {
int Handle;
unsigned int BlockSize;
long FileSize;
unsigned long WordCount = 0;
char *Buffer, CharFlag = 0, PredCharFlag, *BufferPtr, Ch;
if (argc != 2) {
printf(usage: wc <filename>\n);
exit(1);
}
if ((Buffer = malloc(BUFFER_SIZE)) == NULL) {
printf(Cant allocate adequate memory\n);
exit(1);
}
if ((Handle = open(argv[1], O_RDONLY | O_BINARY)) == -1) {
printf(Cant open file %s\n, argv[1]);
exit(1);
}
if ((FileSize = filelength(Handle)) == -1) {
printf(Error sizing file %s\n, argv[1]);
exit(1);
}
/* Process the file in chunks */
while (FileSize > 0) {
/* Get the next chunk */
FileSize -= (BlockSize = min(FileSize, BUFFER_SIZE));
if (read(Handle, Buffer, BlockSize) == -1) {
printf(Error reading file %s\n, argv[1]);
exit(1);
}
/* Count words in the chunk */
BufferPtr = Buffer;
do {
PredCharFlag = CharFlag;
Ch = *BufferPtr++ & 0x7F; /* strip high bit, which some
word processors set as an
internal flag */
CharFlag = ((Ch >= a) && (Ch <= z)) ||
((Ch >= A) && (Ch <= Z)) ||
((Ch >= 0) && (Ch <= 9)) ||
(Ch == \);
if ((!CharFlag) && PredCharFlag) {
WordCo u nt++;
}
} while (BlockSize);
}
/* Catch the last word, if any */
if (CharFlag) {
WordCount++;
}
printf(\nTotal words in file: %lu\n, WordCount);
return(0);
}
|