Graphcore unveiled its third-generation intelligence processing unit (IPU), the first processor to be built using 3D wafer-on-wafer (WoW) technology.
Codenamed the Bow IPU, Graphcore’s new AI processor achieves up to 40% better performance and 16% better power efficiency than the previous (non-WoW, but otherwise identical) product, launched in 2020.
“Wafer-on-wafer technology sets a direction in terms of where Graphcore is heading,” said Graphcore CEO Nigel Toon. “We’ve been working very closely with TSMC on this technology, developing this over the last two years. We’ve been in intensive production qualification over the last 12 months with very detailed testing for reliability, and we’re now at the stage where this technology is ready for full volume production.”
Graphcore plans to drastically improve its price/performance metrics by offering the new parts at the same price as the old ones. Customers can switch over to Bow IPUs without making any software changes, the company said.
Graphcore also announced that it will use future generations of WoW IPU to build a product it calls the Good Computer, an ultra-intelligence AI supercomputer capable of 10 ExaFLOPS, in response to customer demand.
Graphcore bills the new Bow IPU as “the highest performance production AI processor in the world today.” Each Bow IPU chip offers 350 TeraFLOPS of mixed-precision AI compute. The processor has the same 1472 independent processor cores and the same 900MB of in-processor SRAM as the previous-generation Colossus Mk2 IPU chip, but it runs around 40% faster than its predecessor (1.85 GHz instead of 1.325 GHz), hence the up to 40% performance improvement.
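As a quick sanity check on the "around 40%" figure, the clock uplift can be computed directly from the two frequencies quoted above. This is a minimal sketch; the function name is illustrative and not taken from any Graphcore tool.

```python
def clock_uplift(new_ghz: float, old_ghz: float) -> float:
    """Return the fractional clock-speed increase of new_ghz over old_ghz."""
    return new_ghz / old_ghz - 1.0

# Frequencies quoted in the article: Bow IPU vs. Colossus Mk2 IPU.
uplift = clock_uplift(1.85, 1.325)
print(f"Clock uplift: {uplift:.1%}")  # prints "Clock uplift: 39.6%"
```

Since the core count and memory are unchanged, the clock ratio is the natural upper bound on the speedup, which is consistent with the "up to 40%" wording.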
Graphcore said its customers are seeing up to a 40% improvement in time to train across a range of models. Figures published by Graphcore show speedups of between 1.29x and 1.39x across a range of workloads including image classification (including vision transformers), object detection, text to image, graph networks, natural language processing, and speech recognition.
Power efficiency (performance per watt) also improved by 9% to 16% across a smaller range of workloads, according to Graphcore’s figures.
Intended to power AI training and inference at large scale, Graphcore’s Bow IPUs can be combined into very large multi-chip systems. A 256-chip BowPod-256 system offers 89 PetaFLOPS, while a 1024-chip BowPod-1024 offers 350 PetaFLOPS.
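The system-level figures can be roughly reproduced by multiplying the per-chip number by the chip count. The sketch below assumes peak compute scales linearly with the number of chips and uses the 350 TeraFLOPS per-chip figure quoted in this article; the function name is illustrative.

```python
TFLOPS_PER_CHIP = 350  # per-chip mixed-precision figure quoted in the article

def pod_petaflops(chips: int, tflops_per_chip: float = TFLOPS_PER_CHIP) -> float:
    """Peak PetaFLOPS for a pod, assuming simple linear scaling per chip."""
    return chips * tflops_per_chip / 1000  # 1 PetaFLOPS = 1000 TeraFLOPS

for chips in (256, 1024):
    print(f"{chips} chips: {pod_petaflops(chips):.1f} PetaFLOPS")
```

The simple product gives 89.6 PetaFLOPS for 256 chips, in line with the quoted 89, and 358.4 PetaFLOPS for 1024 chips, slightly above the quoted 350; the gap presumably reflects rounding in the published figures.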
Graphcore is TSMC’s lead customer for the foundry’s wafer-on-wafer (WoW) technology.
WoW chips feature two wafers bonded together: a wafer of processor die, and a wafer of power delivery die. The power delivery wafer contains deep trench capacitors, similar to those used to store information in DRAM, which act as a charge reservoir and are connected to the transistors on the processor die at very low impedance.
“This allows the transistors to operate much more quickly at good power efficiency, so the net effect on the Bow IPU processor is to increase its clock speed,” said Graphcore CTO Simon Knowles. The gain comes despite using the same processor design and the same process technology (TSMC 7nm) for the processor die.
WoW relies on two key technologies.
Hybrid bonding allows two wafers to be bonded together, metal sides together, without any interstitial bumps.
“It’s like a kind of cold weld,” said Knowles. “The advantage of doing it this way is an extremely high density of interconnect between the wafers.”
The other key technology is a new type of through-silicon via (TSV) called a back-side TSV (BTSV), which allows connection to layers inside the wafer “sandwich.”
WoW is distinct from chip-on-wafer technologies used in the industry to mount memory die on top of processor die; Knowles said the differences result in finer connection pitch for WoW, though he didn’t reveal what the pitch is. Knowles ascribed the finer pitch to the ease of aligning two full wafers rather than two individual die, and the ability to use an ion etch process for BTSVs because the power delivery wafer is extremely thin. The thin wafer, “thinner than cling film,” is so thin that it is transparent and floppy, so bonding it to the thicker wafer before thinning allows the thicker wafer to act as a mechanical support during subsequent process steps. This wouldn’t be possible with individual die, he said.
“We’ve been working with TSMC as their vanguard customer on this technology for about two years now,” said Knowles. “An enormous amount of work has been done to make this a production technology, and I’m sure any of our competitors who are starting today will take a good long time to get to where we are.”
Same software
The new Bow IPU processor is “100% software compatible” with existing customer code, as the behavior of the processor is identical to the previous generation (it just runs faster and more efficiently). Bow is supported by Graphcore’s Poplar low-level compiler and SDK, which is compatible with many higher-level frameworks, including PyTorch, TensorFlow, Keras, Lightning, Halo, PaddlePaddle and more.
As with previous generations of IPU, Graphcore’s Bow IPU will be offered as a 4-IPU, 1.4 PetaFLOPS, 1U server blade. Graphcore has relied on price-performance metrics in its previous product pitches, rather than chip-to-chip comparisons, and this time is no different. The company compares its BowPod-16 (16 IPU chips, 5.6 PetaFLOPS, $149,995) to an Nvidia DGX-A100 (8 GPU chips, 5 PetaFLOPS, $299,000), which is a similar physical size. Graphcore claims a TCO advantage based on this comparison.
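To make the price-performance comparison concrete, the list price per peak PetaFLOPS can be worked out from the figures quoted above. This is a rough sketch only: it uses headline peak numbers, which may not be measured at the same numerical precision on both systems, and it ignores the power, cooling and support costs that a real TCO calculation would include. The names used are illustrative.

```python
from dataclasses import dataclass

@dataclass
class System:
    name: str
    petaflops: float   # headline peak compute quoted in the article
    price_usd: float   # list price quoted in the article

    def usd_per_petaflops(self) -> float:
        """List price divided by headline peak PetaFLOPS."""
        return self.price_usd / self.petaflops

bow_pod16 = System("BowPod-16", 5.6, 149_995)
dgx_a100 = System("Nvidia DGX-A100", 5.0, 299_000)

for system in (bow_pod16, dgx_a100):
    print(f"{system.name}: ${system.usd_per_petaflops():,.0f} per PetaFLOPS")

# How many times more the DGX-A100 costs per headline PetaFLOPS.
ratio = dgx_a100.usd_per_petaflops() / bow_pod16.usd_per_petaflops()
print(f"Ratio: {ratio:.2f}x")
```

On these headline numbers the BowPod-16 comes out at a little over twice the peak compute per dollar, though delivered performance on real workloads is what ultimately determines TCO.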
The Bow IPU machines and Bow-Pod systems are being offered at the same price as their previous-gen equivalents, despite rising wafer costs, the use of twice as many wafers, and more complex packaging processes. How is the company able to do this?
“It really comes down to the economics of production scale,” Toon said, adding that intensive production qualification processes take into account improved learning on the redundancy Graphcore builds into its chips. “All of this combines to allow us to be able to deliver this and more advanced technology at the same cost,” he said.
Toon said Graphcore may choose to reduce the price of previous-gen systems going forward.
Graphcore also announced a roadmap product which it calls the Good Computer, after 1960s computer science pioneer Jack Good. Good proposed an “ultra-intelligent” machine, that is, a computer with more intelligence than a human brain. Graphcore’s Good Computer will use scale to achieve this. Future-gen IPUs, which Graphcore anticipates will use WoW processes to stack processor die, will be used to build a 10-ExaFLOPS system with 8192 Graphcore chips. It will support models with up to 500 trillion parameters.
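Dividing the headline Good Computer figures by the chip count gives a sense of what each future-generation chip would need to deliver. The sketch below assumes compute and model parameters are spread evenly across all 8192 chips, which is a simplification and not something Graphcore has stated; the variable names are illustrative.

```python
SYSTEM_FLOPS = 10e18    # 10 ExaFLOPS, as quoted in the article
SYSTEM_PARAMS = 500e12  # 500 trillion parameters, as quoted in the article
CHIPS = 8192            # chip count quoted in the article
BOW_TFLOPS = 350        # current Bow IPU per-chip figure quoted in the article

# Assumption: an even split of compute and parameters across chips.
per_chip_pflops = SYSTEM_FLOPS / CHIPS / 1e15
per_chip_params_billion = SYSTEM_PARAMS / CHIPS / 1e9
vs_bow = per_chip_pflops * 1000 / BOW_TFLOPS

print(f"Per chip: {per_chip_pflops:.2f} PetaFLOPS")
print(f"Per chip: {per_chip_params_billion:.0f} billion parameters")
print(f"Relative to a Bow IPU: {vs_bow:.1f}x")
```

Under that assumption, each chip would need roughly 1.2 PetaFLOPS, about three and a half times the per-chip figure of today's Bow IPU, which fits with the plan to stack processor die using WoW.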
“We’ll exceed the parametric capacity of the human brain, and thereby we hope, represent a huge step on the path to the invention of ultra-intelligence,” Knowles said.
Knowles was clear that the Good Computer is not intended to be a one-off; it will be a commercial product retailing for around $120 million, and is a direct response to what customers are asking for today.
“Now that we have the technology working after two years of close development with TSMC, we’re ready for this step… And it will be a very potent step,” he said.
The Good Computer should be on the market by 2024.
Bow IPUs are already in use at the US Department of Energy’s Pacific Northwest National Laboratory for applications including cybersecurity and computational chemistry, as well as with a handful of other customers.
US cloud service provider Cirrascale is making Bow Pod systems available today as part of its Graphcloud IPU bare metal service, while G-Core Labs in Europe will launch cloud instances in Q2 2022. Kingsoft Cloud will launch an IPU service in China, and NHN is in the process of building out an IPU cloud offering in Korea.
Bow IPU chips are in volume production and systems are shipping to customers now.