"Nissan Infosec Under Scrutiny: Breach Affects Over 50K US Employees!"
Nissan's information security practices are back in the spotlight following a breach impacting over 50,000 US employees. Join us as we delve into the details of this security incident, its implications, and what it means for Nissan's cybersecurity measures.
Qualcomm has found an unexpected ally in its push into AI infrastructure: server CPU designer Ampere Computing.
Announced during Ampere's annual strategy and roadmap update on Thursday, the collaboration centers on a 2U machine that pairs eight Qualcomm AI 100 Ultra accelerators for machine-learning inference with 192 Ampere CPU cores. Seven of those nodes fit in a standard 12.5kW rack, for a total of 56 AI accelerators and 1,344 CPU cores, without the need for costly liquid cooling, according to Ampere.
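As a quick sanity check on those figures, here is a minimal back-of-the-envelope sketch of the rack math. It assumes seven of the 2U nodes per rack and treats the 12.5kW figure as the whole-rack power budget; both are inferences from the numbers above rather than Ampere's stated configuration.

```python
# Back-of-the-envelope rack math for the Ampere/Qualcomm 2U node.
# Assumptions (not from Ampere): 7 nodes per rack, 12.5 kW is the whole-rack budget.

ACCELERATORS_PER_NODE = 8      # Qualcomm AI 100 Ultra cards per 2U machine
CPU_CORES_PER_NODE = 192       # Ampere CPU cores per 2U machine
NODES_PER_RACK = 56 // ACCELERATORS_PER_NODE   # 56 accelerators per rack => 7 nodes
RACK_POWER_KW = 12.5

accelerators = NODES_PER_RACK * ACCELERATORS_PER_NODE   # 56
cpu_cores = NODES_PER_RACK * CPU_CORES_PER_NODE         # 1,344
power_per_node_kw = RACK_POWER_KW / NODES_PER_RACK      # roughly 1.79 kW per 2U node

print(f"{NODES_PER_RACK} nodes -> {accelerators} accelerators, {cpu_cores} CPU cores, "
      f"~{power_per_node_kw:.2f} kW per node")
```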
Ampere and its partner Oracle have demonstrated that running the large language models (LLMs) behind popular chatbots on CPUs is feasible, albeit with limitations. Because of their limited memory bandwidth, CPUs are best suited to smaller models, typically in the seven-to-eight-billion-parameter range, run at small batch sizes for a modest number of concurrent users.
Qualcomm's AI 100 accelerators address this constraint with higher memory bandwidth, letting them handle inference on larger models or at higher batch sizes. Generating each output token requires streaming the model's full set of weights through the chip, so the rate at which that data can be read from memory largely dictates throughput.
Why Qualcomm? Nvidia's GPUs, Intel's Gaudi, and AMD's Instinct product lines dominate discussions around AI chips for the datacenter, while Qualcomm comes up far less often; the company is better known for its AI strategy in smartphones and notebooks.
However, Qualcomm does have a presence in the datacenter with its AI 100 series accelerators, including the latest Ultra-series parts introduced last fall. These slim, single-slot PCIe cards are designed for inference on LLMs, and with a 150W power rating they draw far less than the high-wattage accelerators from AMD and Nvidia.
Qualcomm claims a single AI 100 Ultra can handle models of up to 100 billion parameters, while a pair of them can support GPT-3-scale models of 175 billion parameters.
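Those capacity claims line up with simple weight-storage arithmetic. The sketch below estimates how much memory a model's weights occupy at different precisions and compares that against the 128GB on each card (see the spec below); the bits-per-weight values and the assumption that weights dominate the footprint (ignoring KV cache and activations) are illustrative, not Qualcomm's published methodology.

```python
# Rough weight-memory estimate: params * bits_per_weight / 8 bytes.
# Illustrative only: ignores KV cache, activations, and per-block scaling overheads.

CARD_MEMORY_GB = 128  # LPDDR4x capacity per AI 100 Ultra

def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate storage for the model weights alone, in gigabytes."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params, claim in [(100, "one card"), (175, "two cards")]:
    for bits in (16, 8, 6, 4):  # FP16, 8-bit, MX6, MX4
        gb = weight_footprint_gb(params, bits)
        cards_needed = int(-(-gb // CARD_MEMORY_GB))  # ceiling division
        print(f"{params}B params @ {bits}-bit: about {gb:,.0f} GB "
              f"(~{cards_needed} card(s)); Qualcomm's claim: {claim}")
```

The takeaway is that the headline figures only work out once the weights are compressed below 16-bit precision, which is exactly where the MX formats discussed further down come in.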
In terms of inference performance, the 64-core AI 100 Ultra delivers 870 TOPS at INT8 precision and is equipped with 128GB of LPDDR4x memory offering 548GB/s of bandwidth. Memory bandwidth is crucial for scaling AI inference to larger batch sizes.
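To illustrate why that bandwidth figure matters, the sketch below applies a common rule of thumb for memory-bound decoding: each generated token requires reading every weight once, so tokens per second is roughly bandwidth divided by the model's weight footprint. The formula and the example model sizes are assumptions for illustration, not Qualcomm benchmark numbers.

```python
# Rule-of-thumb decode throughput for a memory-bandwidth-bound accelerator:
#   tokens/sec ~= memory_bandwidth / bytes_of_weights_read_per_token
# Ignores compute limits, batching, and cache effects; illustrative only.

BANDWIDTH_GBPS = 548.0  # AI 100 Ultra LPDDR4x bandwidth, per the spec above

def decode_tokens_per_sec(params_billion: float, bytes_per_weight: float) -> float:
    weight_bytes = params_billion * 1e9 * bytes_per_weight
    return BANDWIDTH_GBPS * 1e9 / weight_bytes

for params in (8, 70, 100):           # example model sizes (assumed)
    for bpw in (2.0, 1.0, 0.75):      # FP16, 8-bit, roughly MX6 bytes per weight
        rate = decode_tokens_per_sec(params, bpw)
        print(f"{params}B model @ {bpw} bytes/weight: ~{rate:.1f} tokens/s per stream")
```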
To work around memory bandwidth limitations, Qualcomm has leaned on software optimizations such as speculative decoding and micro-scaling (MX) formats. Speculative decoding uses a smaller draft model to propose output tokens, which the larger model then verifies and corrects, potentially improving throughput and efficiency. MX formats, such as MX6 and MX4, shrink a model's memory footprint by compressing its weights to lower precision.
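For intuition, here is a minimal, framework-agnostic sketch of the speculative decoding loop described above. The `draft_model` and `target_model` callables and the greedy acceptance rule are placeholders for illustration; production implementations (including Qualcomm's) verify all draft positions probabilistically in a single batched pass on the large model rather than one position at a time.

```python
from typing import Callable, List

# Minimal greedy speculative-decoding loop (illustrative, not Qualcomm's implementation).
# draft_model / target_model: given a token sequence, return the next token (greedy).
Token = int
NextTokenFn = Callable[[List[Token]], Token]

def speculative_decode(prompt: List[Token],
                       draft_model: NextTokenFn,
                       target_model: NextTokenFn,
                       max_new_tokens: int = 64,
                       draft_len: int = 4) -> List[Token]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        # 1. The cheap draft model proposes a short run of tokens.
        draft: List[Token] = []
        for _ in range(draft_len):
            draft.append(draft_model(out + draft))

        # 2. The large target model checks each proposal; keep the longest agreeing prefix.
        #    (In practice this verification is done in one batched forward pass.)
        accepted = 0
        for i in range(len(draft)):
            if target_model(out + draft[:i]) == draft[i]:
                accepted += 1
            else:
                break
        out.extend(draft[:accepted])

        # 3. On a mismatch, the target model supplies the next token itself,
        #    so the loop always makes progress and the output matches the large model.
        if accepted < len(draft):
            out.append(target_model(out))
    return out[:len(prompt) + max_new_tokens]
```

The appeal of the technique is that the expensive model's work is amortized over several proposed tokens per step, while the final output remains whatever the large model would have produced on its own.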
Qualcomm's AI 100 Ultra accelerators offer an alternative to GPUs from Nvidia for larger scale AI inferencing, providing a more energy-efficient solution.
Collaborations between Qualcomm and other companies, such as Ampere and Cerebras, aim to address AI training and inferencing challenges. Cerebras, known for its waferscale training chips, is working with Qualcomm so that the models it trains come out already tuned for Qualcomm's inference-side software optimizations.
Other members of the AI Platform Alliance, like Furiosa, are also developing inference accelerators. Furiosa's RNGD accelerator, fabricated on a TSMC 5nm process, delivers up to 512 teraFLOPS at 8-bit precision and 1,024 TOPS at INT4, backed by 48GB of high-bandwidth HBM3 memory.
Qualcomm's involvement in the AI Platform Alliance highlights its commitment to building an ecosystem for AI acceleration. While Qualcomm's AI 100 Ultra accelerators are available today, future developments such as Armv9 and SME2 matrix-math support on the CPU side could further enhance AI inferencing capabilities.
Overall, collaborations within the AI Platform Alliance aim to leverage each member's expertise to advance AI infrastructure and address the diverse challenges in AI training and inferencing.