Christopher B. Browne's Home Page

8. Liars, **** Liars, Statisticians, and Benchmarks

Benchmarks are a problem, and while the gratuitous use of benchmarks is most notable in computing, the effects of poor use of benchmarks extend well beyond computing. The problem also applies nicely to school grades and the use of standardized tests.

When they are useful, benchmarks commonly have only a small area in which they can be validly applied.

See also Open Directory - Computers: Performance and Capacity: Benchmarking

8.1. BogoMIPS - Largely Bogus, Very Useful In Narrow Context

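When Linux systems boot up, they display a "BogoMIPS" value. This value specifically reflects how quickly the CPU gets through a particular busy-wait timing loop that the kernel calibrates at boot time; it says little about how fast the machine will run any real workload.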

BogoMIPS are a nice example of a fairly "good" benchmark in that within the very name the reader gets some indication that the values are somehow "Bogus."
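To make the idea of "timing a loop that does nothing" concrete, here is a minimal sketch in Python. It is not the kernel's calibration code and its numbers are not comparable to the BogoMIPS value printed at boot; the function name bogo_loops_per_second and the iteration count are purely illustrative assumptions.

```python
import time


def bogo_loops_per_second(iterations: int = 1_000_000) -> float:
    """Time an empty busy loop and return iterations completed per second.

    Hypothetical helper for illustration only; the Linux kernel calibrates
    its own delay loop in a different way and in a different language.
    """
    start = time.perf_counter()
    for _ in range(iterations):
        pass  # deliberately does no useful work
    elapsed = time.perf_counter() - start
    return iterations / elapsed


if __name__ == "__main__":
    rate = bogo_loops_per_second()
    print(f"about {rate / 1e6:.1f} million empty loop iterations per second")
```

Running this twice on the same machine will likely give somewhat different answers, and the result mostly measures the Python interpreter rather than the hardware, which is rather the point: the number is only meaningful within the narrow context of the loop being timed.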

8.2. IQ - Is it Intelligence?

Similarly, "intelligence quotient" tests are very good at indicating how good people are at taking "intelligence quotient" tests. Extending this to try to indicate some "actual intelligence" is rather misleading. And even if we assume this can be done successfully, extending this to provide expected performance in other areas is not likely to provide much useful information, because "intelligence" represents too many distinct things to readily measure in the single metric of "an IQ of 114."

And even if the IQ of "114" was an accurate reflection of "intelligence," the relationship between this and anything practical is highly questionable.

(Note that the correct interpretation of "an IQ of 114" is usually something like: An IQ of 114 indicates that the subject's results on a particular test were roughly 0.9 standard deviations higher than the mean, assuming the common scaling of a mean of 100 and a standard deviation of 15.)
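The arithmetic behind that kind of interpretation can be sketched as follows. The scaling (mean 100, standard deviation 15) is an assumption, since different tests use different standard deviations, and the percentile step additionally assumes that scores are normally distributed. The function names are illustrative.

```python
from statistics import NormalDist


def iq_z_score(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Return how many standard deviations a score lies above the mean."""
    return (iq - mean) / sd


def iq_percentile(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Return the fraction of a normal population scoring at or below iq."""
    return NormalDist(mu=mean, sigma=sd).cdf(iq)


if __name__ == "__main__":
    # Assumed scaling: mean 100, standard deviation 15.
    print(f"z-score:    {iq_z_score(114):.2f}")     # (114 - 100) / 15
    print(f"percentile: {iq_percentile(114):.1%}")
```

Under these assumptions a score of 114 sits a little under one standard deviation above the mean, ahead of a bit more than four fifths of test takers. That is all the number says; it is a statement about position on one test, not about anything practical.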

8.3. How Fast/Slow is X?

People argue over whether the X Window System is horribly slow or not.

The answer depends so much on what applications are being run, what displays are serving them, and the nature of the network, that making snap judgements about its speed or lack thereof is foolish.

On the Thesis that X is Big/Bloated/Obsolete and Should Be Replaced discusses this issue in greater detail.

8.4. How fast is that CPU?

"MIPS" is commonly described by terms such as "Meaningless Indicator of Processing Speed."

Salespeople love to sell customers on the number of MHz of processing speed that a CPU provides. Unfortunately for the typical customer, that number will only be vaguely related to the overall performance of the computer system. System performance typically depends on the other components as well, such as the amount of memory and the speed of the disk subsystem.

If the other components are not "souped up" correspondingly, doubling the processor speed may have little or no impact on overall system throughput. As Eric Raymond's Building the Perfect Box suggests, most Linux users will likely get the very best system performance by buying a CPU that is a couple of "generations" behind the latest and greatest (e.g., spending $150 on a P133 rather than $500 on a P200), and spending the difference on more memory and SCSI support.
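One way to put rough numbers on this is Amdahl's law, which is brought in here as an illustration and is not something the sources above use. If only a fraction of the total run time is actually spent waiting on the CPU, speeding up the CPU only shrinks that fraction. The 30% figure below is a made-up assumption, not a measurement.

```python
def overall_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the CPU-bound part gets faster.

    cpu_fraction: share of total run time spent CPU-bound (0.0 to 1.0).
    cpu_speedup:  factor by which the CPU-bound part is accelerated.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


if __name__ == "__main__":
    # Hypothetical workload: 30% of the time CPU-bound, the rest spent
    # waiting on memory and disk. Doubling CPU speed helps only modestly.
    print(f"{overall_speedup(0.30, 2.0):.2f}x")  # 1 / (0.70 + 0.15)
    # Only a fully CPU-bound workload sees the whole factor of two.
    print(f"{overall_speedup(1.00, 2.0):.2f}x")
```

In the hypothetical 30% case, a CPU that is twice as fast makes the whole system less than 20% faster, which is why money spent on memory or disk can easily buy more.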

With the complexity of modern computer systems, the speed of the CPU is falling in overall importance.

Intel obviously doesn't want you to believe that...

8.5. The Balance

The web page on the Reiserfs File System presents a very interesting mix. The author presents benchmarks comparing the Linux ext2 file system to his own that is based on balanced trees.

Since both systems provide fairly good performance on "average" sorts of files, it proves necessary to benchmark primarily using cases that would be considered unrealistic in practice.

One conclusion that can be drawn from the situation is that most people probably won't care to "upgrade" from ext2 to reiserfs any time soon, at least not for performance reasons. ext2 is certainly more mature.

Another is that benchmarks must be both carefully constructed and carefully interpreted. I believe that Hans Reiser does so; he has been pretty honest in not using his benchmarks to claim superiority for his file system except in those respects where it actually provides better performance.

