AMD's B3 Stepping Phenom Previewed, TLB Hardware Fix Tested
by Anand Lal Shimpi on March 12, 2008 12:00 AM EST - Posted in CPUs
The first B3 Stepping Phenom
We managed to get our hands on a 2.2GHz engineering sample B3 stepping Phenom.
AMD will begin shipping production B3 Phenoms later this quarter, presumably at higher clock speeds than the 2.2GHz - 2.3GHz launch parts. Our B3 sample was very similar to our B2 chip in that we could get it stable at 2.6GHz but didn't have much luck getting it to run comfortably any faster. We suspect that it'll take a move to 45nm before AMD can really start to push the clock speed on Phenom.
To get an idea of how much of a performance hit the software/BIOS TLB fix incurs, we took a small selection of our normal CPU tests and ran them with the patch enabled and disabled on a B2 stepping Phenom 9600 (2.3GHz):
Configuration | SYSMark 2007 | DivX | CineBench R10 | 3dsmax 9 | WinRAR |
AMD Phenom 9600 (B2 Stepping) - TLB Fix Disabled | 117 | 74.3 fps | 7396 | 7.20 | 1348 KB/s |
AMD Phenom 9600 (B2 Stepping) - TLB Fix Enabled | 105 | 72.0 fps | 7031 | 6.47 | 367 KB/s |
Performance Impact | -10.3% | -3.1% | -4.9% | -10.1% | -72.8% |
The smallest performance impact was a meager 3.1% reduction, but we suspect that 10%+ would be far more typical. WinRAR is a particularly extreme case where performance dropped by over 70%, which AMD indicated would happen given the heavy memory access nature of file decompression applications.
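The "Performance Impact" row can be reproduced from the two rows above it. The following is a minimal sketch, not part of the original test procedure: the scores are transcribed from the table, the `results` and `percent_change` names are chosen here for illustration, and every column is treated as higher-is-better, which matches the signs shown in the table.

```python
# Scores transcribed from the table above: (TLB fix disabled, TLB fix enabled).
# Assumption: every benchmark is higher-is-better, as the table's signs imply.
results = {
    "SYSMark 2007": (117, 105),
    "DivX (fps)": (74.3, 72.0),
    "CineBench R10": (7396, 7031),
    "3dsmax 9": (7.20, 6.47),
    "WinRAR (KB/s)": (1348, 367),
}


def percent_change(baseline, patched):
    """Relative change from baseline to patched, in percent."""
    return (patched - baseline) / baseline * 100.0


# Round to one decimal place, the precision used in the table.
impact = {
    name: round(percent_change(off, on), 1)
    for name, (off, on) in results.items()
}

for name, pct in impact.items():
    print(f"{name:15s} {pct:+.1f}%")
```

Each result is the change relative to the unpatched score, so a negative number means the fix made that test slower.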
The new B3 stepping Phenom shouldn't perform any differently from a B2 stepping chip with the TLB fix disabled, but to confirm, we ran the most extreme test once more:
Configuration | WinRAR |
AMD Phenom 9600 (B2 Stepping) - TLB Fix Disabled | 1348 KB/s |
AMD Phenom 9600 (B2 Stepping) - TLB Fix Enabled | 367 KB/s |
AMD Phenom B3 @ 2.3GHz | 1357 KB/s |
As expected, all is good with B3. The TLB Fix option actually disappeared from the Gigabyte 780G's BIOS upon inserting a B3 chip; it's like the problem never existed.
Final Words
With the TLB erratum fixed in B3, AMD is one step closer to a competitive Phenom part. Unfortunately, Phenom still suffers from low clock speeds, and that's something AMD will be working on in the coming months. It will take a combination of higher clock speeds and very competitive pricing to really save Phenom.
29 Comments
aguilpa1 - Thursday, March 13, 2008 - link
It has already been done. There are tons of sites that have already benchmarked the Phenom (errata fix disabled) against the Core 2. Fixing the TLB via hardware doesn't magically make it any faster. There is only a slight increase, but it's not significant.
Redoing all the benchmarks just to prove a slight increase but still lagging behind overall is just beating a dead horse at this point.
crimson117 - Wednesday, March 12, 2008 - link
Clock for Clock is an irrelevant metric. So what if 2.0GHz on a C2D is faster than 2.0GHz on a Phenom?
Performance per Dollar or Performance per Watt are much more relevant metrics.
backtomac - Wednesday, March 12, 2008 - link
All those metrics are important. Each individual will place a different importance on each metric.
flipmode - Wednesday, March 12, 2008 - link
Says you. It's relevant to at least two people here.
JarredWalton - Thursday, March 13, 2008 - link
Actually, I'd say clock-for-clock is one of the worst comparisons to make, short of two things:
1) If available clock speeds are similar (they're not - Core 2 Quad tends to have about a 33% advantage in clock speed)
2) If you want to look purely at the architectural performance
While item two looks interesting at first, you have to remember that architecture and design ultimately have a large impact on clock speed. Which is better: more pipeline stages and higher clock speeds, or fewer pipeline stages with lower clock speeds? If you think you know the answer, go work for Intel or AMD. In truth, there is no correct answer - both approaches have merits, and so we end up with a balancing act.
Pentium 4 (NetBurst) is often regarded as going too far in the way of pipeline stages. While Prescott certainly had some problems due to the pipeline stage count, Northwood and the current Penryn are actually not that far off in terms of stages. The difference is that Penryn (and Core 2 in general) has made numerous changes to the underlying architecture that make the pipeline stage count less important now.
Clock for clock, I'd imagine an updated 486 core could compete very well in today's market. That is, IF you could actually make such a core. Just think about it: four pipeline stages, give it some more cache, add in SSE and x64 support, put two or four cores on a chip, and then run that sucker at 3.0GHz! But each stage in the old 486 requires so much work to be done that you could never actually get such a design to scale to 2.0GHz on current technology, let alone 3-4GHz.
So when someone says clock-for-clock comparisons are irrelevant, I largely tend to agree. Why don't we do a "clock-for-clock" comparison of a tractor-trailer diesel engine and a formula one engine? Or a "clock-for-clock" comparison of apples and oranges? The latter takes things to an extreme to illustrate a point, but in the case of the former all you really could end up determining is that large diesel engines and racing engines are vastly different.
K10 and Penryn might not be quite so different, but they are dissimilar in enough ways that the best way to compare them really ends up being a large selection of real world performance metrics. Sure, comparing a 2.4GHz Penryn and a 2.4GHz Phenom X4 gives us some idea of how the designs match up, but at the end of the day what really matters is price, performance, stability/reliability, and power requirements (the latter also impacting noise).
flipmode - Sunday, March 16, 2008 - link
Whether or not there is value in comparing IPC is pretty subjective. I happen to disagree with you - I find it valuable, at least for the time being while both AMD and Intel are offering CPUs at comparable clockspeeds (1.6GHz to 3.2GHz, generally). If AMD's were all less than 2.5GHz and Intel's were all more than 2.5GHz then it would be much less useful info to me to know how they performed at the same clockspeed since they didn't operate at the same clockspeed. But it's not the end of the world if Anandtech chooses not to look at such things.
mindless1 - Friday, March 14, 2008 - link
Clock for clock is quite relevant because prices change and people overclock. It doesn't mean someone only picks which has more performance per MHz or which has higher MHz or any such thing. Rather, within a family it is quite relevant to know how it performs clock for clock; then the user does the math to further evaluate other alternatives.
murphyslabrat - Wednesday, March 12, 2008 - link
However, it does give a foundation for comparing prices and clockspeeds not explicitly compared. It also helps to evaluate potential gain from overclocking.
You are right, there are better methods. This one (clock-for-clock performance), while not a very valuable metric in and of itself, does allow better extrapolation.
Cygni - Wednesday, March 12, 2008 - link
It's called a PREview for a reason. ;) I'm sure there will be an AT rundown of the chip later. This short blurb is only to tell us about the TLB fix.