AMD's dual core Opteron & Athlon 64 X2 - Server/Desktop Performance Preview
by Anand Lal Shimpi, Jason Clark & Ross Whitehead on April 21, 2005 9:25 AM EST- Posted in
- CPUs
"Order Entry" Stress Test: Measuring Enterprise Class Performance
One complaint that we've historically received regarding our Forums database test was that it isn't strenuous enough for some of the Enterprise customers to make a good decision based on the results. In our infinite desire to please everyone, we worked very closely with a company that could provide us with a truly Enterprise Class SQL stress application. We cannot reveal the identity of the Corporation that provided us with the application because of non-disclosure agreements in place. As a result, we will not go into specifics of the application, but rather provide an overview of its database interaction so that you can grasp the profile of this application, and understand the results of the tests better (and how they relate to your database environment).
We will use an Order Entry system as an analogy for how this test interacts with the database. All interaction with the database is via stored procedures. The main stored procedures used during the test are:
sp_AddOrder - inserts an Order
sp_AddLineItem - inserts a Line Item for an Order
sp_UpdateOrderShippingStatus - updates a status to "Shipped"
sp_AssignOrderToLoadingDock - inserts a record to indicate from which Loading Dock the Order should be shipped
sp_AddLoadingDock - inserts a new record to define an available Loading Dock
sp_GetOrderAndLineItems - selects all information related to an Order and its Line Items
The above is only intended as an overview of the stored procedure functionality; obviously, the stored procedures perform other validation and audit operations.
Each Order had a random number of Line Items, ranging from one to three. The Line Items chosen for an Order were also randomized, drawn from a pool of approximately 1500 line items.
Each test was run for 10 minutes and was repeated three times. The average of the three runs was used. The ratio of reads to writes was maintained at 10 reads for every write. We debated for a long while about which ratio of reads to writes would best serve the benchmark, and we decided that there was no correct answer. So, we went with 10.
The application was developed using C#, and all database connectivity was accomplished using ADO.NET and 20 threads - 10 for reading and 10 for inserting.
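The workload shape described above can be sketched roughly as follows. This is not the benchmark application, which was written in C# and is covered by a non-disclosure agreement; it is a minimal Python illustration under stated assumptions. The function names `make_order` and `make_schedule` are hypothetical, the pool is taken to be exactly 1500 items, line items within an order are assumed to be distinct, and one order insert (with its line items) is counted as a single write.

```python
import random

LINE_ITEM_POOL_SIZE = 1500  # article says "approximately 1500"; exact size is assumed
READS_PER_WRITE = 10        # read/write ratio stated in the article


def make_order(rng, order_id):
    """Build one hypothetical order with one to three randomly chosen line items."""
    count = rng.randint(1, 3)  # randint is inclusive on both ends
    # sample() picks distinct items; whether the real test allowed repeats is unknown
    items = rng.sample(range(1, LINE_ITEM_POOL_SIZE + 1), count)
    return {"order_id": order_id, "line_items": items}


def make_schedule(rng, writes):
    """Return (stored procedure name, order) pairs with 10 reads per write."""
    schedule = []
    orders = []
    for order_id in range(1, writes + 1):
        order = make_order(rng, order_id)
        orders.append(order)
        # Assumption: an order plus its line items counts as one write
        schedule.append(("sp_AddOrder", order))
        for _ in range(READS_PER_WRITE):
            # Reads target any order that has already been inserted
            schedule.append(("sp_GetOrderAndLineItems", rng.choice(orders)))
    return schedule


if __name__ == "__main__":
    rng = random.Random(2005)  # fixed seed so the sketch is repeatable
    for name, order in make_schedule(rng, writes=2):
        print(name, order["order_id"], order["line_items"])
```

The real test spread this work across 20 ADO.NET threads; the sketch only generates the sequence of calls and does not touch a database or model concurrency.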
To ensure that IO was not the bottleneck, each test was started with an empty database that had been expanded in advance so that auto-grow activity did not occur during the test. Additionally, a gigabit switch was used between the client and the server. During the execution of the tests, no other applications or monitoring software were running on the server. Task Manager, Profiler, and Performance Monitor were used when establishing the baseline for the test, but never during execution of the tests.
At the beginning of testing on each platform, both the server and client workstation were rebooted to ensure a clean and consistent environment. The database was always copied to the 8-disk RAID 0 array with no other files present to ensure that file placement and fragmentation were consistent between runs. In between each of the three tests, the database was deleted, and the empty one was copied again to the clean array. SQL Server was not restarted.
144 Comments
liebremx - Thursday, April 21, 2005 - link
Anand, great reading as always.
I have an observation:
On the 'Development Performance - Compiling Firefox' section you write
"This particular test is only single threaded, ..."
Why not launch a multithreaded build?
"make -j3 -f client.mk build_all"
Jalf - Thursday, April 21, 2005 - link
Makes good sense for AMD to keep their (server) dualcore chips pricey. AMD has limited manufacturing capacity, and they have the best singlecore solution. In other words, they might as well keep the dualcore prices high, to a) make more money in cases where people are willing to fork over lots of money, and b) keep people who are on a budget interested in their singlecore offerings, at least until their new fab goes online.
GentleStream - Thursday, April 21, 2005 - link
I have some comments about the Firefox compile test. First, thanks a lot for including it. Now I have some comments about it. First, you are using GNU make and it supports parallel compiles. So, you should be able to replace the line:
make -f client.mk build_all
with the line:
make -j 2 -f client.mk build_all
to perform a parallel compile using 2 processors. The -j option specifies how many jobs make runs at once, which you would normally match to the processors or threads you are using. You can do parallel compiles on a single processor machine as well as multi-processor or multi-core machines. It is often the case that using -j 2 or -j 3 on a single processor machine will give the best results because it allows CPU computation and I/O to overlap.
You don't say whether you did a debug or optimized build. I would recommend doing both the debug and optimized builds and reporting the results of both. When doing parallel optimized compiles, you may want to make sure you are not swapping although for the server tests it looks like you have plenty of memory - 4 GBytes. I did not see immediately how much memory you were using for the X2 tests. Anyway, I would recommend doing both debug and optimized compiles with -j n where n is 1, 2, 3, and 4 or perhaps just 1, 2, and 4. Since compiles are essential to development work and also embarrassingly parallel, this should provide a really good comparison of the multitasking capabilities of these systems.
Hope you can do this or at least some of it and thanks a lot for adding a really good compile test to your test suite.
Dave
michaelpatrick33 - Thursday, April 21, 2005 - link
The server market is where AMD is headed to get large margins on their chips. With Supermicro joining the AMD camp (they must have seen the performance of the Opteron dualcore, blinked their eyes and said, "we're in") Dell is left alone holding Intel-only product lines. Intel will not have a response on the server front until Q1 2006. That is troubling for Intel because it gives AMD six months of market buildup and Fab36 time to come online and increase volume tremendously. It should be interesting.
Imagine a 4800+ on a 939 DFI board running at 2-2-2-8 1t timings versus the P4 Extreme dualcore. Drooling just thinking about having either processor, but especially the AMD
erwos - Thursday, April 21, 2005 - link
"AMD would probably have problems delivering a lower cost dual core in quantities."
This is exactly it. Why should AMD let demand outstrip supply? Just jack up the price until you've got just enough demand to consume your supply.
I mean, yes, I'd love an Athlon64 X2 5000+ with 1mb of cache for ~$250, but that's life. AMD stockholders should be pleased with this decision.
There's also the impending move to socket M2 to consider... the Athlon64 X2 makes sense for people with very low-end A64's, but M2 is going to be the better upgrade path for FX and/or 3800+ users. I would be surprised to see any 939 Athlon64's past 5200+.
eetnoyer - Thursday, April 21, 2005 - link
While our desires as desktop users are for high volumes of X2s at low prices, we have to balance that with what AMD as a company needs to survive...money. AMD is currently capacity constrained with regard to dual-core CPUs with only Fab30. They have entered into agreements with both IBM and Chartered for additional capacity (probably on the lower end chips), but that won't come online until late this year, just before production starts to ramp at Fab36.
In the meantime, AMD has stated that their order of priority goes Server -> Mobile -> Desktop with the profitability motive in mind. For most users that will be heavily into the multi-tasking benefits of dual-core CPUs, spending $5xx for the low-end X2 vs $1000 for the PEE 840 will be a no-brainer. Seeing how that is a small minority of users, AMD can reasonably supply the demand for them while still maintaining the highest level of availability of dual-core Opterons at much better ASPs. Remember that AMD wants to capture as much market share in the server market as possible while Intel has no response.
As a share-holder, I hope that the demand for dual-core Opteron is deafening based on the incredible price/performance ratio (thus limiting their ability to produce X2 in high quantity). As a middle-of-the-road desktop user, I'm quite content with my mildly OC'd A64 for the next year or two.
ksherman - Thursday, April 21, 2005 - link
w00t! I'll have to read it later tho...
MrHaze - Thursday, April 21, 2005 - link
Certainly impressive.
I think it is important to remember that the "Athlon64 X2" was actually an Opteron running ECC RAM at 2T on a less-than-stable motherboard. I think it is best to think of this as a comparison of Intel's dual cores, AMD's single cores, and a hog-tied Athlon64 X2.
Makes you wonder how an actual X2 with fast memory on a fast motherboard will perform.
Regardless, I'm really excited about the upgrade potential, and I hope that AMD sticks with socket 939 for a long while.
Mr.Haze
kirbalo - Thursday, April 21, 2005 - link
Great review Anand...Thanks for fixing your gaming bar charts...they were wacked before!
Tapout1511 - Thursday, April 21, 2005 - link
Sure would have been nice if they had included a single core A64 at 2.2GHz w/ 1MB cache (3500+ right?) to illustrate instances where the extra core was useful and when it wasn't.
Oh well.