Intel CEO Paul Otellini's memorable "shame on us... mea culpa, we screwed up" March 2007 speech to Morgan Stanley investors came after his company's marketing fog machine could no longer conceal the truth that, depending on your point of view, Intel was peddling technology that it knew to be somewhere between four and eight years behind AMD's. AMD told you so, and so did I, but Intel's marketing is capable of overpowering reason. Intel manages to thrive by setting expectations that match its technology, and raising those expectations every two years by just enough to make you see your Intel-based PC or server as wanting. Otellini got stuck apologizing because AMD got a chance to show buyers Opteron's potential. The market's expectations followed, as they naturally will when people buy technology that never needs replacing. Given the choice between buying well and buying often, the market chose the former.
Intel's smokescreen is back in overdrive. Those who do a light amount of homework before buying are getting Intel's same old message: Higher clock speeds, bigger cache, manufacturing process shrink, and faster front-side bus make the world go 'round. That latest speed bump makes your one-year-old computer look pretty sad (on paper). And when Intel goes "tock," it's rip-and-replace time to get those extra cores and the broader bus. Intel put the cherry on top by getting everyone worked up over CPU power draw to the exclusion of total system power draw. Intel sets the market's agenda. It tells buyers what matters.
AMD designs technology that will enable the workloads that you'll be running in two or three years. It strikes many as improbable when I tell them that AMD-based hardware, servers in particular, gets faster over time as operating system and application developers start unlocking the potential of the platform. When I say this, I may not take enough care to point out that AMD is committed to raising that potential between major revisions of its CPUs and whole system platforms. Intel can't catch up because AMD presents a moving target with meaningful point enhancements between major architecture revisions. AMD ticks and tocks as well, but it's the market that swings AMD's pendulum.
AMD is getting bolder about letting the market, in this case IT, know that even through the gathering fog, AMD has a clear picture of what matters most to system buyers. You don't hear much about it these days, but price/performance matters. AMD's record-setting results with Quad-Core Opteron on SPECweb2005 set a realistic bar for server performance, a record that, notably, Intel misses by a hair. But Quad-Core Opteron comes in 41 percent lower in cost than quad-core Intel Xeon in two-socket servers. AMD servers cost less to build. Whether these savings will be passed on to you as a lower total system price is up to the OEM and its tendency to maintain artificial price parity between its similar Intel and AMD offerings. Not that I'm suggesting there's any pressure to do that.
AMD's not handing performance per watt to Intel. AMD's published benchmarks show quad-core AMD systems skunking Intel Core 2 Xeon on floating-point synthetic benchmarks by margins of between 13 and 50 percent, but Quad-Core Opteron lags Core 2 Xeon's integer performance by an impressive margin. Intel's butt-kicking compiler scored quad-core Xeon an earnest 20 percent lead over Quad-Core Opteron on SPECint_rate2006 (peak). The best AMD-targeted compiler from Portland Group couldn't close the gap. But interestingly, when the playing field was leveled a bit by using the gcc open source compilers, AMD pulled to within 9 percent of Core 2 Xeon on SPECint_rate2006 (base). The likelihood that you'll encounter architecture-optimized applications in the wild is mighty slim, but AMD gets candor points for showing this shortcoming of its own making. If AMD cares about closing the integer benchmark gap, AMD needs to contribute benchmark-winning optimizations to GNU.
AMD counters the gearhead-level speeds and feeds derived from synthetic benchmarks with IT-relevant load metrics. In Web transactions, virtualization, and parallel workloads, Quad-Core Opteron outperforms quad-core Xeon by margins of 9 to 16 percent. But there's a point worth noting: AMD scored these wins with a 2.3GHz CPU and 2MB of Level 3 cache. Intel lost out to AMD with quad-core Xeon CPUs running at 2.83GHz with 12MB of cache. The configuration differences between the two architectures give AMD what you'd call headroom. AMD is holding manufacturing process shrink, CPU clock speed, bigger cache, and additional cores as cards to play on IT's behalf when the time is right.
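To put a rough number on that headroom, here is a minimal back-of-the-envelope sketch in Python. It takes only the figures quoted above (a 9 to 16 percent lead, 2.3GHz versus 2.83GHz) and normalizes them per clock. The function name `per_clock_advantage` is made up for illustration, and treating throughput as proportional to clock speed is a simplifying assumption of mine, not something AMD's benchmarks claim.

```python
def per_clock_advantage(margin, winner_ghz, loser_ghz):
    """Ratio of the winner's throughput per GHz to the loser's.

    margin is the winner's overall lead as a fraction (0.09 for 9 percent).
    Assumes, crudely, that throughput can be compared per GHz of clock.
    """
    return (1.0 + margin) * (loser_ghz / winner_ghz)


OPTERON_GHZ = 2.3   # Quad-Core Opteron clock quoted above
XEON_GHZ = 2.83     # quad-core Xeon clock quoted above

for margin in (0.09, 0.16):
    ratio = per_clock_advantage(margin, OPTERON_GHZ, XEON_GHZ)
    print(f"{margin:.0%} overall lead -> {ratio - 1:.0%} more work per GHz")
```

On those inputs the per-clock gap works out to roughly 34 to 43 percent, which is the slack the headroom argument leans on. It says nothing about how real workloads would scale with a faster clock or a bigger cache.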
The time is right. Later this year, AMD will roll out "Shanghai," a Quad-Core Opteron built on a new 45-nanometer process, matching Intel's in scale while using a simpler method. Shanghai raises the ceiling on CPU clock speed to a level that AMD didn't disclose, and lowers power at idle by 20 percent. That's a ridiculous metric for a two-socket server, but in an eight-socket server, the likelihood that a socket will be idle is higher. AMD surprised me by borrowing a page from Intel's playbook, tripling its Level 3 CPU cache from 2MB to 6MB. That will make a serious difference in the performance of applications optimized for Intel CPUs.
I was particularly struck by AMD's claim that Shanghai would deliver 25 percent faster times for world switch (switching from one guest OS instance to another) than the present Quad-Core Opteron. This, combined with a 10 percent boost in memory bandwidth, will give AMD a leg up in virtualization.
Shanghai marks the server debut of the coherent HyperTransport 3 (cHT3) bus. cHT3 is faster and more scalable than the HyperTransport 1 bus implemented in present Quad-Core Opteron servers, which probably contributes to reduced world switch time and increased memory bandwidth, both measures that are sensitive to the speed of the interconnects among CPUs.
The Shanghai CPU, which AMD projects will be available this year, will be a drop-in replacement for Quad-Core Opteron. Given where the economy is likely to be when Shanghai shows up, chip swap-upgradeable servers are a really smart investment.
I've saved the best part for last. If AMD hewed to Intel's "tick tock" strategy, which dictates a substantial architecture revision (tock) every other year, then with Shanghai, 2008 will certainly go down as an AMD "tock." In 2009, an architecture revision code-named "Istanbul" will carry AMD's 45-nanometer Opteron to six cores. In 2010, AMD will knock Intel's tocks off: A 12-core large-scale enterprise CPU named "Magny-Cours" is slated for the first half of that year, and will deliver on AMD's big iron availability and reliability strategy. A six-core edition of this CPU, "Sao Paulo," will roll out at the same time. I'm setting up a separate briefing for these parts, because believe me, they're game changers.
AMD is always cautious about projecting too far ahead, fearing that system buyers might suspect Intel-like obsolescence by design. AMD is smart to come out swinging by laying out present and future technology despite the risk, but amid the excitement over architectures to come, the value of AMD's long-term commitment to buyers can't be set aside. The Quad-Core Opteron server you buy today will upgrade to Six-Core, and even when 2010 comes around, 2008's Quad-Core Opteron servers will remain state of the art relative to Intel. New systems based on that platform will still be sold, and parts will remain plentiful. Isn't it nice to see clearly again?