Useless Microoptimizations Homepage Forum Don't get confused, this is just my homepage, not really a message board. I implemented it as a forum for reasons you can find here.
The user CPU time results are pretty usable; the wall clock results are partly rough. That is particularly true for those tests that don't use much CPU time, for example the pure file transfer http tests. These vary greatly and I haven't had enough time yet to run them often enough.
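To make the distinction between the two measurements concrete, here is a minimal sketch of how user CPU time and wall clock time can be taken for one child process. It is not the benchmark suite's own code; the helper name `time_command` is hypothetical and the sketch assumes a Unix system where Python's `resource` module is available.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (wall_seconds, user_cpu_seconds) for the child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # waits for the child, so its usage gets accounted
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return wall, after.ru_utime - before.ru_utime


if __name__ == "__main__":
    # A sleeping child: wall clock time passes, but very little user CPU time is used.
    wall, user = time_command([sys.executable, "-c", "import time; time.sleep(0.2)"])
    print(f"wall={wall:.3f}s user={user:.3f}s")
```

A test that mostly waits on I/O behaves like the sleeping child here: its user CPU time is small and stable, while its wall clock time depends on everything else the machine is doing, which is one plausible reason such tests need many repetitions.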
Joined: 09 Feb 2005 Posts: 114 Location: Boston, MA, USA
Posted: Sat Oct 01, 2005 8:51 am
The graphs have been updated with my 3800+ running with memory speeds simulating the Opteron with its pc2100.
That way you can see what effect the bigger cache of the Opteron has, as well as other differences between the architectures (the numbers fluctuate a little differently from what you would expect for cache size reasons alone).
Overall it says that the bigger cache is pretty much useless except for low-quality video. Not even the big C++ compilations in the Mozilla subdirectories show much of an effect. Money is obviously better spent elsewhere.
The Linux kernel compilation shows some improvement from the bigger cache, which is interesting because the FreeBSD kernel build does not. Still, a 7% saving is all the bigger cache buys you. For the same money you can usually buy a CPU clocked 10% higher, which will show the same improvement for this test but much better improvements for the others.
The different performance characteristics of the Linux and FreeBSD kernel builds are probably rooted in the include file structure. FreeBSD's include files are much more straightforward; Linux's are deeply nested, and actual language constructs expand via long detours through macros and typedefs.
I should delete the old Opteron run (the 2 GB one); it is erratic. It was taken when the suite was on FreeBSD-6.0-beta2, and while I didn't see a difference in timings at the time, it is now apparent that subsequent little fiddlings threw these numbers off.
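As a rough back-of-the-envelope check on the clock comparison, the sketch below estimates how much run time a higher clock saves. The function name and the `cpu_bound_fraction` parameter are illustrative assumptions, not measurements from the charts; the model simply assumes that only the CPU-bound part of a run scales with clock speed.

```python
def time_saved_by_clock(clock_gain, cpu_bound_fraction=1.0):
    """Fraction of run time saved when the clock rises by clock_gain (0.10 = 10%).

    Assumption: only the CPU-bound fraction of the run speeds up with the clock;
    the rest (memory and I/O waits) stays the same.
    """
    new_time = (1.0 - cpu_bound_fraction) + cpu_bound_fraction / (1.0 + clock_gain)
    return 1.0 - new_time


if __name__ == "__main__":
    # Fully clock-bound run: a 10% higher clock saves about 9% of the time.
    print(f"{time_saved_by_clock(0.10):.3f}")
    # Run that is assumed to be 80% clock-bound: the saving drops to roughly 7%.
    print(f"{time_saved_by_clock(0.10, 0.8):.3f}")
```

Under these assumptions a 10% clock increase lands in the same 7% to 9% range as the cache gain on the Linux kernel build, while also helping the tests where the cache does nothing.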
Posted: Sat Oct 01, 2005 10:32 am
I have a half-verified stable overclock of the 3200+ @ 2.7 GHz with 1:1 RAM coming in, first results in the charts. [ETA: not stable]
Now that I have cleaned up the Opteron results and the results of the 3800+ simulating the Opteron's RAM, some differences become apparent:
The real Opteron outperforms the 3800+ on some of the tests where the foreground is running a CPU eater (Lisp) and the background is plain http.
However, the 3800 X2, despite its super-slow RAM and its 512 KB cache, shreds the real Opteron when there is a huge number of plain http (no CPU eating) processes in both the foreground and the background, with no CPU eaters around.
An interpretation:
The bigger cache helps a lot to "defend" a foreground CPU eater against light-CPU, I/O-intensive backgrounds.
However, something in the architecture of the X2, or in the Via socket 939 chipset, really helps extreme multitasking with lots of I/O and little CPU. (Note, however, that most I/O here is simulated, since I use sparse files on disk and the localhost interface for http.)
I will add a run where 16 and 256 background plain http fetchers are running with the Lisp in the foreground. That should shed some light on this.
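For readers wondering how on-disk I/O can be "simulated": assuming the files mentioned above are sparse files, one common way to create one is to extend an empty file to the desired size without writing any data. The sketch below is an illustration only, with a hypothetical helper name; it is not how the test files in this suite were necessarily made, and whether the result is actually stored as a hole depends on the filesystem.

```python
import os
import tempfile


def make_sparse_file(path, size_bytes):
    """Create a file of size_bytes without writing any data to it."""
    with open(path, "wb") as f:
        # Extending an empty file leaves a gap that reads back as zero bytes;
        # filesystems with sparse-file support store it as a hole, not as blocks.
        f.truncate(size_bytes)
    return os.stat(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        info = make_sparse_file(os.path.join(tmp, "payload.bin"), 64 * 1024 * 1024)
        print("apparent size:", info.st_size)
```

Serving such a file over the localhost interface moves a lot of bytes through the http stack while hardly touching the disk, so the test stresses the kernel and the memory system rather than the drive.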
Posted: Mon Oct 03, 2005 11:02 am
Results added:
Full Pentium-M 1.3 GHz run
Pentium-4 2.8 Northwood
Pentium-4 2.8 Northwood with hyperthreading
Results for a working run of the 3800+ at 2.4 GHz in the Asus board with an SMP kernel
The Pentium-4 results are mostly as expected, but the hyperthreaded variant shows a huge advantage in the tests that don't involve any CPU eaters, which just put huge numbers of data transactions through http together (the "plain http" tests). You need the "wall clock" charts to see this, not the user CPU time.
If you mix CPU eaters and plain http, then overall you get a big advantage out of Hyperthreading, but only for the CPU eater. The background transactions are slower with Hyperthreading than without. The advantage for the foreground CPU eater is huge, though.
[Note that there is a typo in the chart labels at this time. The run with Hyperthreading off is the one which says "1 CPU". Although it says "HT on", it has Hyperthreading turned off in the BIOS and the single-CPU kernel is running. I'll correct the label ASAP.]
Posted: Wed Oct 05, 2005 3:52 pm
New results for a dual 2.4 GHz Opteron with DDR400 in a Thunder K8W board.
New results for 2.8 GHz Xeon with Hyperthreading (= 4 logical CPUs) coming in. Looks very decent for some of the non-CPU intensive stuff like many plain http connections at the same time.
Chart reorganization:
New reference machine is X2 3800+ SMP in the Asus board.
Omit incomplete runs of the 3800+ in the Asus board. I will get new results to compare memory bandwidth and timings soon. I have to do that in the DFI board because the Asus board is too random when fiddling with the RAM.
Posted: Tue Oct 11, 2005 4:27 pm
A lot of different timings for X2 3800+ @ 2.4 GHz and 200 MHz RAM coming in.
Short analysis: spend your money on things other than fancy RAM. Flowers to overclock your girlfriend have a better chance of making your computer noticeably faster.
I will post a writeup with actual RAM recommendations soon.
Posted: Thu Jun 22, 2006 7:52 pm
Results for an Opteron 875 x4 (4 dual-core CPUs = 8 cores) and a Pentium-M at 1.73 GHz (Inspiron laptop) are in, as are results for a Sempron to show how much the smaller cache hurts.