Mar 29
Intel preps meaty and modular ‘Core 3’ platform
Due before year’s end, the next-gen ‘Nehalem’ architecture succeeds Core 2 with up to eight cores on a single processor, and much more.
Intel is set to shift into overdrive when the new ‘Nehalem’ microarchitecture is released towards the end of this year. A radical revision of the current Core microarchitecture, Nehalem will be the blueprint for almost all Intel-based computers - desktops, notebooks and servers - from 2008 until at least 2010.
Those systems will be powered by anywhere from two to eight cores on a single processor, compared to Intel’s current quad-core solution of strapping two dual-core chips together. Each individual core will be able to run two simultaneous threads, and sitting atop the per-core Level 2 cache will be a shared slab of Level 3 cache accessible to all cores.
Sneak peek: Intel chief Paul Otellini shows an early Nehalem wafer
(Nehalem derives its codename from the river of the same name in Oregon, close to the Intel centre where the architecture and companion next-gen processors were designed. It’ll get a real name closer to launch, and we’re tipping it to be christened ‘Core 3’. While the current microarchitecture is officially just called Core, the processors built on that foundation are now known as Core 2. Nehalem provides Intel with what must be a welcome chance to clean up the resulting brand confusion.)
“First the tick, now the tock,” said Stephen Smith, Intel VP and director of the Digital Enterprise Group, referring to Intel’s strategy of delivering processors and microarchitecture in alternate steps across a two-year cycle.
“For the beginning of each silicon generation we have a processor that’s derived from the prior generation, and that’s the tick. Then we have a product that’s the larger step, what we call the tock, which is a new microarchitecture to really bring on a leadership level of performance.”
Last year’s 45nm Penryn processor was the tick, and now Intel is gearing up for the tock as Nehalem, which Smith describes as “our ground-up new design optimised for 45nm technology”.
Buy one, get one free: Intel’s Stephen Smith says we won’t see a single-core Nehalem chip. Even the entry level processor will be dual-core
Beyond its mandatory newer-faster-better bits of technology, Nehalem is “a scalable design”, explains Smith. “It uses a number of building blocks that allows us to build different product configurations, so we can compose various products out of these blocks. Four core is the first to go into production, we also expect to build a monolithic eight core design in 2009 with additional bandwidth.”
Just don’t go looking for a single core Nehalem chip: that’s now ancient history, says Smith. “The lowest implementation we have in mind is a two core design”.
Modular microarchitecture: Nehalem’s design treats key system components as building blocks which Intel can assemble in almost any configuration
Among those building blocks are two new pieces of processor plumbing: an integrated memory controller and a dedicated point-to-point processor link called QuickPath Interconnect. Well, they’re new for Intel, although AMD already employs both features (for QuickPath Interconnect, read ‘HyperTransport’) as well as Level 3 cache. In those areas, Nehalem is very much Intel’s catch-up ploy.
Nehalem’s memory controller will sit on the individual processor die rather than on an external ‘northbridge’ hub, and will support only DDR3 (DDR2 chips need not apply), with three memory channels per socket to further boost memory bandwidth and reduce latency.
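To put a rough number on what three channels could mean, here is a back-of-envelope sketch in Python. Intel has not quoted memory speeds here, so the DDR3-1066 transfer rate is purely our assumption; the 64-bit (8-byte) channel width is standard for DDR3 modules. The function name is ours, and the result is a theoretical peak rather than a measured figure.

```python
BYTES_PER_TRANSFER = 8  # a standard DDR3 channel is 64 bits wide


def peak_bandwidth_gb_s(mega_transfers_per_s, channels):
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB) for one socket."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels / 1000


# ASSUMPTION: DDR3-1066 (about 1066 million transfers per second per channel).
ASSUMED_MT_S = 1066

for channels in (1, 2, 3):
    peak = peak_bandwidth_gb_s(ASSUMED_MT_S, channels)
    print(f"{channels} channel(s): {peak:.1f} GB/s peak")
```

Peak bandwidth scales linearly with channel count in this model, which is why a third channel matters; real-world throughput will depend on the memory actually fitted and on access patterns.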
All cached up: Each core in a Nehalem-class processor gets 256KB of L2 cache while sharing anywhere from 4MB to 24MB of L3 cache
This in turn allows Intel to re-engineer Nehalem’s caching. Level 2 cache has been promoted from being per-processor to per-core, but the efficiency of the integrated controller allows this L2 cache to be substantially downsized to just 256KB for each core. A fat Level 3 cache starting at 4MB is shared by all cores in the processor.
A key difference in the approaches of AMD and Intel is that Intel’s Level 3 ‘inclusive cache hierarchy’ dictates that any data sitting in the per-core L1 or L2 caches must also be present in the shared L3 cache. “If one core needs a piece of data it only has to go as far as checking Level 3, because all the data from all cores is sitting in Level 3,” explains Ronak Singhal, one of Nehalem’s lead engineers. “There’s no latency in fetching data direct from the other core caches”.
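The inclusive rule Singhal describes can be illustrated with a toy Python model. This is only a sketch under heavy assumptions: the class and method names are ours, the caches are unbounded sets with no capacity limits, and there are no writes or coherence states, so it is in no way Intel’s implementation. It shows just one property: because every line in a core’s private cache is also in L3, a miss in L3 means no other core holds the line, so a lookup never has to query the other cores.

```python
class InclusiveHierarchy:
    """Toy model of an inclusive L3: private lines are always present in L3 too."""

    def __init__(self, num_cores):
        # One set per core stands in for that core's private L1/L2 caches.
        self.private = [set() for _ in range(num_cores)]
        self.l3 = set()  # shared by all cores

    def load(self, core, address):
        """Return where the lookup was satisfied: 'private', 'l3' or 'memory'."""
        if address in self.private[core]:
            return "private"
        if address in self.l3:
            source = "l3"
        else:
            # An L3 miss means no core has the line, so go straight to memory.
            source = "memory"
            self.l3.add(address)  # fill L3 first to keep the hierarchy inclusive
        self.private[core].add(address)
        return source

    def evict_from_l3(self, address):
        """Dropping a line from L3 must also drop it from every private cache."""
        self.l3.discard(address)
        for cache in self.private:
            cache.discard(address)

    def is_inclusive(self):
        """Check that every private cache is a subset of L3."""
        return all(cache <= self.l3 for cache in self.private)


caches = InclusiveHierarchy(num_cores=4)
print(caches.load(0, 0x40))  # first touch by core 0 comes from memory
print(caches.load(1, 0x40))  # core 1 finds the same line in the shared L3
```

The price of this simplicity shows up in `evict_from_l3`: a line pushed out of the shared cache has to be removed from the private caches as well, and L3 capacity is partly spent duplicating what the cores already hold.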
The size of the L3 cache “will depend on the number of cores in the product” says Singhal. It’s expected that the dual-core Nehalem processors will start with 4MB of L3 cache, quad-cores will get 8MB and the eight-core servers could get as much as 24MB to play with.
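For a sense of scale, the short Python sketch below adds the 256KB of per-core L2 to the expected L3 sizes for each configuration. The L3 figures are the expectations quoted above, not confirmed specifications (the eight-core number is an upper bound), L1 is left out because no size is given, and the names are ours.

```python
L2_PER_CORE_KB = 256

# Expected shared L3 size in MB, keyed by core count (8-core figure is "up to").
EXPECTED_L3_MB = {2: 4, 4: 8, 8: 24}


def total_cache_mb(cores):
    """Combined L2 + L3 capacity in MB for a given core count (L1 excluded)."""
    l2_total_mb = cores * L2_PER_CORE_KB / 1024
    return l2_total_mb + EXPECTED_L3_MB[cores]


for cores in sorted(EXPECTED_L3_MB):
    print(f"{cores} cores: {total_cache_mb(cores)} MB of L2 + L3")
```

On these numbers L3 dominates: the per-core L2 contributes only 0.5MB, 1MB and 2MB respectively to the totals.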
Under the covers: this Nehalem processor shows four 45nm cores, L3 cache, integrated memory controller and the QuickPath Interconnect pipeline all baked onto a single die
The other key piece of plumbing for multi-processor systems is QuickPath Interconnect, which provides a dedicated pathway between CPU sockets in order to boost dataflow between processors, as well as running between the CPU and the IO hub.
The modular design of Nehalem treats QPI as just another building block, allowing Intel to configure high-end systems with multiple QPI pathways.
So when can we expect all this geeky goodness? Intel signed off on Nehalem’s design last year and the first silicon hits production in the fourth quarter of this year. The only product at launch will be a six-core server powerplant.
Things hot up in 2009, with the arrival of the quad-core ‘Bloomfield’ desktop chip within the first three months, followed in Q2 by mainstream dual-core desktop (Havendale) and mobile (Auburndale) processors plus a quad-core extreme mobile chip (Clarksfield).
Nehalem will also be the foundation underpinning Intel’s sixth-generation Centrino platform, codenamed Calpella and due to launch in early 2009, perhaps simultaneously with the mobile processors. (The fifth-gen Montevina platform, now christened Centrino 2 and due around the middle of this year, will use 45nm Penryn-class processors but is built on the Core architecture.)
