17.2 C
Casper
Saturday, May 25, 2024

Google Announces the Cloud TPU v5p, Its Most Powerful AI Accelerator Yet

Must read

Google promises these chips are significantly faster than the v4 TPUs. The team claims that the v5p features a 2x improvement in FLOPS and 3x improvement in high-bandwidth memory.

Google announced the launch of its new Gemini large language model (LLM). With that, the company also launched its new Cloud TPU v5p, an updated version of its Cloud TPU v5e, which launched into general availability earlier this year. A v5p pod comprises 8,960 chips and is backed by Google’s fastest interconnect yet, with up to 4,800 Gpbs per chip.

Unsurprisingly, Google promises that these chips are significantly faster than the v4 TPUs. The team claims that the v5p features a 2x improvement in FLOPS and 3x improvement in high-bandwidth memory. That’s a bit like comparing the new Gemini model to the older OpenAI GPT 3.5 model. After all, Google already moved the state of the art beyond the TPU v4. In many ways, though, v5e pods were a bit of a downgrade from the v4 pod, with only 256 v5e chips per pod versus 4096 in the v4 pods and a total of 197 TFLOPs 16-bit floating point performance per v5e chip versus 275 for the v4 chips. For the new v5p, Google promises up to 459 TFLOPs of 16-bit floating point performance, backed by the faster interconnect.

Google says all of this means the TPU v5p can train a large language model like GPT3-175B 2.8 times faster than the TPU v4 — and do so more cost-effectively, too (though the TPU v5e, while slower, actually offers more relative performance per dollar than the v5p).

“In our early stage usage, Google DeepMind and Google Research have observed 2X speedups for LLM training workloads using TPU v5p chips compared to the performance on our TPU v4 generation,” writes Jeff Dean, chief scientist, Google DeepMind and Google Research. “The robust support for ML Frameworks (JAX, PyTorch, TensorFlow) and orchestration tools enables us to scale even more efficiently on v5p. With the 2nd generation of SparseCores, we also see significant improvement in the performance of embeddings-heavy workloads. TPUs are vital to enabling our largest-scale research and engineering efforts on cutting-edge models like Gemini.”

The new TPU v5p isn’t generally available yet, so developers must contact their Google account manager to get on the list.

More articles

Latest news