There's a good deal that's special about AMD's new Shanghai server CPU. It's fabulous science, fun for those of us who get dewy-eyed over the prospect of a 25 percent faster world switch time and immersion lithography. It makes the x86 battle interesting again because it carries AMD into territory that it must fight hard to win--the two-socket (2P) server space--and where innovation is sorely needed. AMD beat Intel's next-generation server architecture to market while closing performance, price, and power efficiency gaps between Core 2 and Shanghai. Just as it did in the old days, AMD now claims that its best outruns Intel's best despite having a lower clock speed.
Shanghai, the name given to AMD's 45-nanometer quad-core Opteron, pulled into port well ahead of schedule, affording AMD an opportunity it rarely has. Shanghai confronts Intel's forward-reaching marketing (Nehalem in 2009) with product that customers can buy now. (See "The Nehalem CPU's secret weapon.") System makers started receiving Shanghai in volume quantities in October.
While OEMs are notoriously tight-lipped about release schedules, several will undoubtedly tap the buzz of AMD's Shanghai launch, putting an array of Shanghai systems in the market before year's end. Intel's now-famous "we have no one to beat but ourselves" line will have to be rewritten for Nehalem's debut. By the time Nehalem bows, Shanghai will have been in the wild long enough that Intel won't be able to use Barcelona (AMD's 65nm quad-core Opteron preceding Shanghai) benchmarks in competitive marketing.
Intel's messaging is all about the future, but AMD takes an interesting view that's more in line with the perspective of buyers: Squeeze the longest possible life out of the gear you bought two years ago, and keep the machine you buy today upgradable to state-of-the-art performance with nothing but a CPU swap. Shanghai uses the same 1,207-pin socket (Socket F) as dual-core Opteron, and that's not incidental. You can drop dual-core Opterons in a Shanghai server, or Shanghai CPUs in a dual-core Opteron server. As long as you're using the manufacturer's newest BIOS, the chips will just work. AMD is committed to continuous support for Socket F through the lifespan of Istanbul, its planned six-core CPU. Self-sufficiency and investment protection make a nice couple, and it'll be a pleasure to see those values return to the 2P space.
Shanghai represents AMD's first major speed update in a while, with the clock ceiling raised from 2.3GHz to 2.7GHz across the entire Shanghai product line. The average power consumption of even the fastest Shanghai CPU remains the same as Barcelona's 75 watts, while 55-watt and 105-watt parts will appear in 2009. A 105-watt Shanghai brings to mind a factory-overclocked CPU specially tuned for sci/tech, high-performance computing, and workstations. That's just my guess. I think that AMD wants to make it clear that while it is taking a fresh run at the 2P market, it still rules the roost in high-performance x86 computing.
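For a rough sense of what those numbers mean, here is a back-of-the-envelope sketch in Python. It assumes the 2.3GHz ceiling belongs to Barcelona and that both chips sit at the same 75-watt average power figure, as the paragraph above implies. The function names are mine, and clock speed per watt is only a crude stand-in for real performance per watt.

```python
# Back-of-the-envelope arithmetic on the clock and power figures quoted above.
# Assumption: 2.3GHz is the old (Barcelona) ceiling, 2.7GHz the new (Shanghai)
# ceiling, and both are rated at the same 75-watt average power.

BARCELONA_GHZ = 2.3
SHANGHAI_GHZ = 2.7
AVERAGE_POWER_WATTS = 75.0


def clock_uplift_percent(old_ghz, new_ghz):
    """Percentage increase in clock speed going from old_ghz to new_ghz."""
    return (new_ghz - old_ghz) / old_ghz * 100.0


def mhz_per_watt(ghz, watts):
    """Clock speed delivered per watt, in MHz/W (a crude efficiency proxy)."""
    return ghz * 1000.0 / watts


uplift = clock_uplift_percent(BARCELONA_GHZ, SHANGHAI_GHZ)
old_efficiency = mhz_per_watt(BARCELONA_GHZ, AVERAGE_POWER_WATTS)
new_efficiency = mhz_per_watt(SHANGHAI_GHZ, AVERAGE_POWER_WATTS)

print(f"Clock uplift: {uplift:.1f} percent")
print(f"Old ceiling: {old_efficiency:.1f} MHz per watt")
print(f"New ceiling: {new_efficiency:.1f} MHz per watt")
```

The uplift works out to a little over 17 percent more clock in the same power envelope. That says nothing about per-clock improvements, which is where AMD's "lower clock speed, faster chip" claim would have to come from.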
Shanghai took advantage of a smaller manufacturing process (thinner wires, smaller transistors) to make room for a healthy 6MB of Level 3 cache while supplying each core with an independent 512KB Level 2 cache. The precision of AMD's immersion lithography process reduces transistor power leakage, giving rise to AMD's claim of a 35 percent reduction in idle power consumption relative to Barcelona. Lowering the idle power floor makes the dynamic power management capabilities first seen in Barcelona really shine.
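The cache and idle-power figures are easy to total up. The sketch below adds the four per-core L2 caches to the shared L3 and applies AMD's claimed 35 percent idle reduction to a baseline. The 100-watt baseline is purely hypothetical, chosen to make the percentage easy to read; it is not a measured Barcelona figure.

```python
# Totals for the cache sizes quoted above, plus the claimed idle-power saving.
# Assumption: four cores per chip (Shanghai is a quad-core part).

KB = 1024
MB = 1024 * KB

CORES_PER_CHIP = 4
L2_PER_CORE_BYTES = 512 * KB
L3_SHARED_BYTES = 6 * MB

total_l2_bytes = CORES_PER_CHIP * L2_PER_CORE_BYTES
total_l2_l3_bytes = total_l2_bytes + L3_SHARED_BYTES


def idle_power_after_reduction(baseline_watts, reduction_percent=35.0):
    """Idle power left after applying a percentage reduction to a baseline."""
    return baseline_watts * (1.0 - reduction_percent / 100.0)


# Hypothetical baseline for illustration only, not a measured value.
HYPOTHETICAL_BASELINE_IDLE_WATTS = 100.0
shanghai_idle = idle_power_after_reduction(HYPOTHETICAL_BASELINE_IDLE_WATTS)

print(f"Total L2 per chip: {total_l2_bytes // MB}MB")
print(f"L2 plus L3 per chip: {total_l2_l3_bytes // MB}MB")
print(f"Idle power from a 100-watt baseline: {shanghai_idle:.1f} watts")
```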
Another feature new to Shanghai is Smart Fetch, which allows cores to spend more time in a halted state by copying cores' L1 and L2 cache contents to Level 3 cache before halting them. AMD says that this happens transparently and that it lowers CPU power consumption by up to 21 percent, but I hope to see it surface as runtime down-coring, in which unneeded cores can be powered down under user control. The way to take a 2P Shanghai system sublimely green would be to down-core it to two cores (one per socket), powering up additional cores as needed.
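To make the idea concrete, here is a toy Python model of the behavior described above. It is a conceptual sketch, not AMD's hardware design: the `Core` and `Socket` classes and the `down_core` function are hypothetical names of my own, caches are plain dictionaries, and I assume a halted core's private cache is emptied once its contents have been copied to the shared L3.

```python
# Toy model of Smart Fetch-style halting and of user-controlled down-coring.
# Everything here is illustrative; real hardware does this in silicon.


class Core:
    def __init__(self, core_id):
        self.core_id = core_id
        self.private_cache = {}  # stands in for this core's L1 and L2 contents
        self.halted = False


class Socket:
    def __init__(self, num_cores=4):
        self.cores = [Core(i) for i in range(num_cores)]
        self.l3 = {}  # shared Level 3 cache, visible to every core

    def halt(self, core_id):
        """Copy the core's private cache contents to L3, then halt the core."""
        core = self.cores[core_id]
        if core.halted:
            return
        self.l3.update(core.private_cache)
        core.private_cache.clear()  # assumption: private caches are emptied
        core.halted = True

    def wake(self, core_id):
        self.cores[core_id].halted = False

    def read(self, core_id, address):
        """Look in the core's private cache, then in the shared L3."""
        core = self.cores[core_id]
        if core.halted:
            raise RuntimeError("a halted core cannot issue reads")
        if address in core.private_cache:
            return core.private_cache[address]
        if address in self.l3:
            core.private_cache[address] = self.l3[address]
            return core.private_cache[address]
        return None  # a miss; real hardware would go to main memory

    def active_cores(self):
        return [core for core in self.cores if not core.halted]


def down_core(sockets, keep_per_socket=1):
    """Halt all but the first keep_per_socket cores in each socket."""
    for socket in sockets:
        for core in socket.cores[keep_per_socket:]:
            socket.halt(core.core_id)


# A 2P system: two sockets with four cores each.
system = [Socket(), Socket()]
system[0].cores[2].private_cache[0x1000] = "working set"

down_core(system, keep_per_socket=1)
print(sum(len(s.active_cores()) for s in system))  # cores still running
print(system[0].read(0, 0x1000))  # core 0 finds core 2's data in L3
```

The point of the copy step shows up in the last line: data that lived in a now-halted core's private cache is still reachable by the remaining core through L3, so the halted core has no reason to wake up.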
In discussions about Shanghai, AMD refers to Barcelona the same way that Intel once tipped its hat to the short-lived 32-bit Core Duo (Pentium M) CPU. AMD credits Barcelona for shouldering the "heavy lifting" in Shanghai's design, which was considerably sweetened by process shrink and other enhancements. The way Barcelona went down made no one happy -- not engineers, not management, and certainly not OEMs. Shanghai should set all of that right again with a newfound commitment to stay in close touch with OEMs and major accounts.