The current approach taken by x86 CPUs -- to stuff as many processor cores and as much cache memory as will fit on one chip -- will prove impossible to scale beyond a certain point. And adding more big, hot processor cores may not be the best fit for server roles that call for managing large workloads over long periods of time.
Over the next several years, I believe that AMD, IBM, Intel, and Sun will gather around an objective epitomized by Sun's UltraSPARC T2 (Niagara 2) CPU: an approach to maximizing throughput that Sun refers to as CMT (Chip Multi Threading).
The x86 world has seen the likes of chip multi-threading on a smaller scale with Intel's Pentium 4 (NetBurst) architecture. Intel's Hyper-Threading split one physical CPU into two logical processors, but it was implemented on a complex and unwieldy processor architecture.
Intel shelved Hyper-Threading when it scrapped NetBurst in favor of the Core microarchitecture, but Intel's move to a simpler CPU core paves the way for Hyper-Threading's return. To actually implement multiple threads, however, Intel may have to forgo its obsession with equipping each new generation of CPU with more cores, bigger caches, and faster clock speeds.
It could take years, but Intel and AMD will go wide on threading. All CPU makers will, because unlike a processor core, which has to be wired into cache, memory, and I/O, hardware threading splits the gross resources provided by cores in a way that suits virtualization particularly well. And unlike virtualization extensions, which require that system software be aware of the means by which the CPU supports multiple virtual partitions, chip multi-threading requires nothing beyond the support for multiple discrete processors that is already built into OSes and virtualization solutions.
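The point above is that hardware threads are transparent to software: each one shows up to the OS as just another logical processor. As a minimal illustrative sketch (not tied to any particular CPU), the Python program below sizes a thread pool from the OS's logical-processor count; the same code runs unchanged whether those logical CPUs are full cores or hardware threads sharing a core's execution resources.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# os.cpu_count() reports *logical* processors. On an SMT/CMT chip,
# each hardware thread appears to the OS as a discrete CPU, so
# software sizes itself the same way regardless of whether those
# CPUs are backed by full cores or by threads within a core.
logical_cpus = os.cpu_count() or 1

def work(n):
    # Placeholder per-processor workload.
    return sum(i * i for i in range(n))

# One worker per logical processor; the OS scheduler spreads them
# across whatever cores and hardware threads the chip provides.
with ThreadPoolExecutor(max_workers=logical_cpus) as pool:
    results = list(pool.map(work, [100_000] * logical_cpus))

print(f"{logical_cpus} logical CPUs, {len(results)} results")
```

Nothing in the program distinguishes a hardware thread from a core; that distinction is absorbed entirely by the chip and the OS scheduler.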
In the end, the biggest gains in server throughput will come from moving more of that thread-management work out of software and into the CPU itself.