According to EETimes, Intel has also been talking about its data centre AI strategy for the first time since acquiring AI accelerator company Habana Labs.
Cooper Lake is the first tranche of parts in the third generation of Intel's flagship Xeon Scalable CPU line for the data centre. With more than 35 million Xeon Scalable processors deployed, the series is the clear market leader, and the only mainstream data centre CPU offering that includes deep learning acceleration.
Cooper Lake supports bfloat16 (BF16), a number format invented by Google that is becoming a standard for AI training because it offers a good balance of compute efficiency and prediction accuracy. This update to DL Boost brings the industry's first x86 support for BF16, which joins the vector neural network instructions (VNNI) already part of DL Boost.
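To make the format concrete: BF16 is simply the top half of an FP32 value, which is why conversion between the two is cheap in hardware. Here is a minimal Python sketch of that relationship (illustrative only, not Intel code; the function names are ours, and real hardware typically rounds rather than truncates):

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Truncate an FP32 value to its 16-bit BF16 pattern.

    BF16 keeps FP32's sign bit and all 8 exponent bits but only the
    top 7 mantissa bits, so it covers the same dynamic range as FP32
    at reduced precision -- the property that makes it attractive
    for training.
    """
    bits, = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_f32(b: int) -> float:
    """Widen a BF16 pattern back to FP32 by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

pi = 3.14159265
print(bf16_bits_to_f32(f32_to_bf16_bits(pi)))  # 3.140625: same range, fewer mantissa bits
```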
Intel’s figures have Cooper Lake at up to 1.93X the training performance and 1.87X the inference performance of second-generation parts for image classification. For natural language processing training, performance is 1.7X that of the previous generation.
Other features include Speed Select, which allows control over the base and turbo frequencies of specific cores to maximise the performance of high-priority workloads.
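Speed Select itself is configured through Intel's own tooling and firmware, but the idea of steering frequency headroom toward priority cores can be loosely illustrated with the generic Linux cpufreq sysfs interface (assuming a system that exposes it; the core numbers and 2 GHz cap below are hypothetical):

```python
from pathlib import Path

def cap_max_freq_khz(cpu: int, khz: int) -> None:
    """Cap one core's maximum frequency via the generic Linux cpufreq
    sysfs interface (requires root). This is not the Speed Select
    interface itself; it only illustrates per-core frequency policy."""
    Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_max_freq").write_text(str(khz))

# Hypothetical policy: cap "background" cores 4-7 at 2.0 GHz so the
# cores running high-priority work keep more power and thermal headroom.
for cpu in range(4, 8):
    cap_max_freq_khz(cpu, 2_000_000)
```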
Cooper Lake is intended to make AI training and inference more widely deployable on general-purpose CPUs. It targets four- to eight-socket implementations, and the 11 SKUs announced today are already shipping to customers: Facebook has announced a server design based on the parts, and Alibaba, Baidu, Tencent and others are adopting them too. General OEM availability is expected in the second half of 2020. Ice Lake, the third-generation Xeon Scalable processor family for one- and two-socket implementations, is expected to be available later this year.
The fourth generation of the Xeon Scalable family, Sapphire Rapids, is being tested. These devices will use the new Advanced Matrix Extensions (AMX) instruction set, whose specification is due to be published this month.
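The specification is not yet public, but matrix engines of this kind share a common shape: small two-dimensional tiles of low-precision inputs multiplied and accumulated into a wider type. A reference-semantics sketch in Python (tile sizes and names are our assumptions, not taken from the AMX spec):

```python
import numpy as np

def tile_matmul_accumulate(C: np.ndarray, A: np.ndarray, B: np.ndarray) -> None:
    """Reference semantics of one tile multiply-accumulate: C += A @ B.

    Matrix engines of this style multiply low-precision tiles (e.g.
    BF16 or INT8) and accumulate into a wider type (FP32 or INT32),
    so precision is only reduced on the inputs, never the running sum.
    """
    C += A.astype(np.float32) @ B.astype(np.float32)

# One 16x16 tile update; numpy has no native bfloat16, so float16
# stands in for the narrow input type here.
A = np.random.rand(16, 16).astype(np.float16)
B = np.random.rand(16, 16).astype(np.float16)
C = np.zeros((16, 16), dtype=np.float32)
tile_matmul_accumulate(C, A, B)
```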
Stratix 10 NX
Intel’s first AI-optimised FPGA, the Stratix 10 NX, embeds a new AI compute engine Intel calls the AI Tensor Block. The new engine delivers up to 15X the INT8 performance of today’s Stratix 10 MX for AI workloads.
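The tensor block's INT8 arithmetic follows the same multiply-accumulate pattern VNNI exposes on the CPU side: many 8-bit products summed into a 32-bit accumulator so intermediate results cannot overflow. A tiny sketch of the semantics (shapes and the function name are illustrative):

```python
import numpy as np

def int8_dot_accumulate(acc: int, a: np.ndarray, b: np.ndarray) -> int:
    """Dot product of two INT8 vectors accumulated at 32-bit width.

    Widening to int32 before multiplying mirrors what the hardware
    does: 8-bit inputs, a wide accumulator, no intermediate overflow.
    """
    return acc + int(np.dot(a.astype(np.int32), b.astype(np.int32)))

a = np.array([127, -128, 64, 3], dtype=np.int8)
b = np.array([2, 1, -2, 100], dtype=np.int8)
print(int8_dot_accumulate(0, a, b))  # 127*2 - 128*1 - 64*2 + 3*100 = 298
```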
Intel’s cunning plan appears to be to offer AI acceleration across its entire data centre range. AI has been a success for Intel: some customers use Xeon for workloads such as recommendation engines, where AI is one stage of a long and complex pipeline, while FPGA and Habana customers benefit from having a partner that can add dedicated AI acceleration to their existing Xeon infrastructure.