Random computery things

Comparison of processor performances


There are many factors which affect the real-world performance of a processor. For scientific applications the most relevant one is probably the CPU's microarchitecture (along with its clock-rate), which determines how fast a number-cruncher the machine is. The compiler one uses may also play an important role in performance, and, depending on the particular application, cache and memory access limitations may become crucial bottlenecks. All of this ignores any form of parallelism.

Any measure of actual performance for a given code is by necessity a particular combination of different performance factors, and as such it can never be guaranteed to be transferable to other codes, or even to different uses of the same code. But it does give a rough idea of the performance, so here goes my CASINO-based comparison.

Name                            Details       Architecture       Year  FMS      Performance
DEC Alpha EV5.6                 -             AXP (64-bit)       1996  -        73±11%
DEC Alpha EV6                   -             AXP (64-bit)       1998  -        162±17%
DEC Alpha EV6.7                 -             AXP (64-bit)       1999  -        159±11%
Pentium III (Coppermine)        650MHz model  P6 (32-bit)        1999  6.8.3    95±9%
Pentium III (Coppermine)        866MHz model  P6 (32-bit)        1999  6.8.10   94±10%
Celeron (Coppermine)            900MHz model  P6 (32-bit)        2000  6.8.10   78±9%
Intel(R) Pentium(R) 4 1.5       Willamette    NetBurst (32-bit)  2001  15.0.7   95±4%
Intel(R) Pentium(R) 4 1.7       Willamette    NetBurst (32-bit)  2001  15.0.10  92±5%
Intel(R) Pentium(R) 4 1.9       Willamette    NetBurst (32-bit)  2001  15.1.2   90±6%
Intel(R) Pentium(R) 4 2.53      Northwood     NetBurst (32-bit)  2002  15.2.4   95±3%
Intel(R) Pentium(R) 4 2.4       Northwood     NetBurst (32-bit)  2002  15.2.7   84±5%
Intel(R) Xeon(TM) 2.4           Prestonia     NetBurst (32-bit)  2002  15.2.7   92±6%
Intel(R) Pentium(R) 4 2.8       Northwood     NetBurst (32-bit)  2002  15.2.9   100±1% [ref]
Intel(R) Xeon(TM) 3.6           Irwindale     NetBurst (64-bit)  2005  15.4.3   99±8%
Intel(R) Pentium(R) 4 650       Prescott 2M   NetBurst (64-bit)  2005  15.4.3   90±3%
Intel(R) Pentium(R) D 830       Smithfield    NetBurst (64-bit)  2005  15.6.2   109±4%
AMD Athlon(tm) 64 3500+         Winchester    K8 (64-bit)        2004  15.47.0  141±4%
Intel(R) Xeon(R) 5160           Woodcrest     Core (64-bit)      2006  6.15.6   193±4%
Intel(R) Core(TM)2 E6600/E6700  Conroe        Core (64-bit)      2006  6.15.6   200±7%


  1. The errors are standard errors, obtained (in most cases) from four different CASINO calculations run just once on each system, and may well be underestimated.
  2. The 'performances' are per core, and (supposedly) independent of the clock-rate. Examples of use:
    • To compare the estimated performance of four 3GHz Woodcrest cores against that of a 667MHz EV6.7 processor, one would do
          4 × 3000 × (193±4) ÷ [ 667 × (159±11) ]
      revealing that the four Woodcrest cores are expected to be about 22±2 times faster than the Alpha on a perfectly parallel task. (Note that error propagation becomes inaccurate with error bars as large as the ones above.)
    • It would take a (4.8±0.2)GHz Pentium 4 to reach the absolute performance of one core of a 2.4GHz Core 2 Duo, indicating a change for the better in Intel's product line.
    • The 64-bit Athlon 3500+, which ran at 2.2GHz, is roughly equivalent to a (3.4±0.2)GHz 64-bit Pentium 4. It seems that the '3500+' corresponds to an actual measure of relative speed.
  3. The excellent performance per cycle of the Alphas is noteworthy, particularly given that they were designed over a decade ago.
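
The arithmetic in the examples under note 2 can be reproduced with a short Python sketch. It assumes the errors are independent and combines the relative errors in quadrature (first-order propagation); the text does not say which method was actually used, although this one gives the quoted figures. The function name `scaled_ratio` is made up for illustration. The Pentium 4 values are those of the 2.8GHz reference chip (100±1) and of the 64-bit Pentium 4 650 (90±3), which appear to be the table rows the examples refer to.

```python
import math


def scaled_ratio(perf_a, err_a, perf_b, err_b, scale=1.0):
    """Return (value, error) for scale * perf_a / perf_b.

    Assumption: the two errors are independent, so the relative
    errors are added in quadrature (first-order propagation).
    """
    value = scale * perf_a / perf_b
    rel_err = math.hypot(err_a / perf_a, err_b / perf_b)
    return value, abs(value) * rel_err


# Four 3GHz Woodcrest cores (193±4) against one 667MHz EV6.7 (159±11).
speedup, speedup_err = scaled_ratio(193, 4, 159, 11, scale=4 * 3000 / 667)

# Clock (GHz) a Pentium 4 (100±1, the reference row) would need to match
# one core of a 2.4GHz Core 2 Duo (200±7).
p4_ghz, p4_ghz_err = scaled_ratio(200, 7, 100, 1, scale=2.4)

# Clock (GHz) a 64-bit Pentium 4 (90±3) would need to match
# a 2.2GHz Athlon 64 3500+ (141±4).
p4_64_ghz, p4_64_ghz_err = scaled_ratio(141, 4, 90, 3, scale=2.2)

print(f"{speedup:.0f}±{speedup_err:.0f}")        # 22±2
print(f"{p4_ghz:.1f}±{p4_ghz_err:.1f}")          # 4.8±0.2
print(f"{p4_64_ghz:.1f}±{p4_64_ghz_err:.1f}")    # 3.4±0.2
```

With relative errors of up to 15% in the table, the first-order formula is only a rough guide, which is the caveat already made in the first example.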

Evolution of the CASINO source code


Using archived copies of CASINO and a silly script to count non-comment, non-blank lines in Fortran 90 source files, I've produced the following graph.


Which is scary. Disregarding the backflow jump of early 2005 and the subsequent 2006 tidy-up for v2.0, the growth is accelerating. This may sound fine: doesn't it show a steady increase in the number of implemented features? Certainly, yes, but it also indicates that we really need to start caring seriously about code structure, modularity, sustainability, etc. The Potential for Disaster must grow roughly exponentially with code size (especially for codes written by physicists, I would expect).
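
For reference, a line count of the kind described above could look something like the following Python sketch. This is not the script that was actually used. It assumes free-form Fortran 90, where a comment line is one whose first non-blank character is `!`, and it assumes the sources carry a `.f90` extension; the names `count_code_lines` and `count_tree` are made up for illustration.

```python
from pathlib import Path


def count_code_lines(text):
    """Count lines that are neither blank nor pure '!' comment lines.

    Assumption: free-form Fortran 90. Lines with code followed by a
    trailing comment still count as code.
    """
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("!"):
            count += 1
    return count


def count_tree(root, pattern="*.f90"):
    """Sum the count over all files matching `pattern` below `root`.

    Both the directory layout and the extension are assumptions.
    """
    return sum(
        count_code_lines(path.read_text(errors="replace"))
        for path in Path(root).rglob(pattern)
    )
```

Running `count_tree` on each archived copy in turn would give one point per version. Note that this simple rule also treats directive lines starting with `!$` as comments.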