Nvidia pushes into a wider application ecosystem

The BlueField DPU architecture and DOCA SDK provide a strategic platform for expanding the company's reach.

Nvidia is extending its solution footprint far beyond artificial intelligence (AI) and gaming, venturing broadly across the entire computing ecosystem into mobility and the next-generation cloud data center.

Nvidia’s ambitions in this regard are clear from its pending acquisition of Arm and from CEO Jensen Huang’s positioning of the company as a “full-stack computing” provider. Demonstrating that he’s putting substantial R&D dollars behind this vision, Huang announced the rollout of the company’s new BlueField “data processing unit (DPU)” chip architecture at the virtual Nvidia GPU Technology Conference (GTC) this month.

Accelerating diverse workloads through programmable CPU offload

Strategically, the BlueField DPU builds on two of Nvidia’s boldest recent acquisitions. The new hardware is built around Arm’s CPU architecture. It also incorporates high-speed interconnect technology that Nvidia gained through its recent acquisition of Mellanox.

Marking the company’s evolution beyond a GPU-centric product architecture, Nvidia’s new DPU architecture is a high-performance, multicore SoC (system on chip). BlueField DPUs incorporate software-programmable data-processing engines that can accelerate a wide range of AI, networking, virtualization, security, storage, and other enterprise workloads.

As the foundation of server-based intelligent network interface controllers, DPUs offload workloads from CPUs while efficiently parsing, processing, and transferring high volumes of data at line speeds. In addition to their CPU-offload acceleration benefits, Nvidia’s DPUs can strengthen data center security because the Arm cores embedded within them provide an added level of isolation between security services and CPU-executed applications.

Announced at this latest GTC were the following versions of this new DPU SoC family:

  • Nvidia BlueField-2: Due to be included in new systems from Nvidia server hardware partners in 2021, this architecture features all capabilities of the Nvidia Mellanox ConnectX-6 Dx SmartNIC. It incorporates programmable Arm cores, supports data transfer rates of 200Gbps, and provides hardware offloads to accelerate key data center tasks. It speeds up security, networking, and storage tasks, including isolation, root of trust, key management, RDMA/RoCE (RDMA over Converged Ethernet), GPUDirect, elastic block storage, and data compression. It includes a controller for managing high-performance back-end NVMe (Non-Volatile Memory Express) storage, all-flash arrays, and hyperconverged systems. A single BlueField-2 DPU can offload data center workloads from as many as 125 CPU cores, thereby freeing up cycles to process other enterprise applications.
  • Nvidia BlueField-2X: Under development and due to become available in 2021, this adds an Nvidia Ampere architecture GPU to BlueField-2 for in-network computing with CUDA and Nvidia AI. It includes all the key features of BlueField-2 and leverages Nvidia’s third-generation Tensor Cores for real-time AI-driven security analytics. It can identify abnormal traffic indicative of theft of confidential data. It can also analyze encrypted traffic at line rate, introspect traffic to identify malicious activity, and automatically trigger security features and automated responses.

Nvidia also announced that it will launch next-generation BlueField-3 and BlueField-3X DPUs in 2022, and BlueField-4 in 2023. In the latter generation, Nvidia will integrate the GPU and Arm cores at the silicon level. The company promised that BlueField-4 will boost the DPU’s processing speeds 1000 times beyond BlueField-2X and 600 times beyond BlueField-3X.
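Taken together, the two multiples quoted above also imply how the intermediate generations compare with each other. The following is a minimal sketch of that arithmetic, not an Nvidia benchmark. It assumes both multiples refer to the same (unspecified) performance metric, and the function and constant names are hypothetical.

```python
from fractions import Fraction

# Speedup multiples quoted above for the 2023 part.
# Assumption: both multiples measure the same, unspecified metric.
SPEEDUP_VS_2X = 1000  # versus BlueField-2X
SPEEDUP_VS_3X = 600   # versus BlueField-3X


def implied_ratio(speedup_vs_old: int, speedup_vs_new: int) -> Fraction:
    """Return how much faster the newer baseline is than the older one.

    If target = a * old and target = b * new, then new / old = a / b.
    """
    if speedup_vs_old <= 0 or speedup_vs_new <= 0:
        raise ValueError("speedup multiples must be positive")
    return Fraction(speedup_vs_old, speedup_vs_new)


ratio_3x_over_2x = implied_ratio(SPEEDUP_VS_2X, SPEEDUP_VS_3X)
print(f"BlueField-3X vs. BlueField-2X: {ratio_3x_over_2x} (~{float(ratio_3x_over_2x):.2f}x)")
```

Under that assumption, the quoted figures would put BlueField-3X at roughly five-thirds the speed of BlueField-2X, with most of the promised gain arriving in the 2023 generation.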

Building a robust ecosystem around the DPU accelerator architecture

As it evolves its hardware platform into a DPU-centric architecture in support of new enterprise applications, Nvidia is also making sure that it fully integrates its BlueField/DOCA accelerators into the Arm partner ecosystem.

Signaling that strategy at GTC, the vendor announced that it will help Arm partners go to market with full-stack solution platforms that consist of GPU-enabled as well as DPU-enabled networking, storage and security technologies. It has engaged Arm partners to create full-stack solutions for high-performance computing, cloud, edge and PC opportunities. Also, it is porting its AI and RTX engines to Arm, so that they address a much larger market than the x86 platforms on which Nvidia has traditionally run.

Partners are essential to Nvidia’s plans to support a wider range of enterprise application workloads than just AI on its new DPU product family. Integral to Nvidia’s land-and-expand strategy is DOCA (Data Center Infrastructure-on-a-Chip Architecture), a new architecture and software development kit for data center infrastructure.

Currently available only to early-access partners, the DOCA SDK enables developers to program applications on BlueField-accelerated data center infrastructure services and to offload CPU workloads to BlueField DPUs. The new offering builds out Nvidia’s enterprise developer tools, complementing the CUDA programming model that enables development of GPU-accelerated applications. In addition, the SDK is fully integrated into the Nvidia NGC catalog of containerized software, thereby encouraging third-party application providers to develop, certify, and distribute DPU-accelerated applications.

Several leading software vendors (VMware, Red Hat, Canonical, and Check Point Software Technologies) announced plans at GTC to integrate their wares with the new DPU/DOCA acceleration architecture in the coming year. In addition, Nvidia announced that several leading server manufacturers, including Asus, Atos, Dell Technologies, Fujitsu, Gigabyte, H3C, Inspur, Lenovo, Quanta/QCT, and Supermicro, plan to integrate the DPU into their respective products in the same timeframe.

Although there was no specific Arm tie-in to Huang’s announcement that Microsoft is adopting Nvidia AI on Azure to bring GPU-accelerated smart experiences to its cloud-based Microsoft Office suite, it would not be surprising if, in coming years, more of the mobile experience in this and other Office apps were accelerated locally by leveraging DPU-offload technology.

Enabling Nvidia solutions to lessen their dependency on GPU-centric functionality

Nvidia’s product teams are wasting no time incorporating the DPUs’ CPU-offload acceleration into their solutions. Most notably, Huang announced that the Nvidia EGX AI edge-server platform is evolving to combine the Nvidia Ampere architecture GPU and BlueField-2 DPU on a single PCIe card.

Although there was no specific BlueField DPU tie-in to Nvidia Jetson, the company’s Arm-based SoC for AI robotics, one should expect that the DOCA SDK will advance to support development of these applications, which are a hot growth field for Nvidia’s core platforms. It’s also a safe bet that the company will use its new hardware and SDK to accelerate its Omniverse platform for collaborative 3-D content production, its Jarvis platform for conversational AI, and its new Maxine platform for cloud-native, AI-accelerated video streaming.

Maintaining momentum

Nvidia’s new BlueField DPU architecture and DOCA SDK provide a strategic platform for broadening its reach into enterprise, service provider, and consumer opportunities of all types.

By enabling hardware-accelerated CPU-offload of diverse workloads, the DPU architecture provides Nvidia with a clear path for converging the new DOCA programming models with its CUDA AI development framework and NGC catalog of containerized cloud solutions. This will enable the company to provide both its own product teams and solution partners with the hardware and software platforms needed to accelerate a full range of application and infrastructure workloads from cloud to edge.

As it awaits the eventual approval of its proposed acquisition of Arm, Nvidia will need to prove this new architecture to its existing partner ecosystem. If DPU technology falls short of Nvidia’s aggressive performance promises, that deficiency could sour relations with Arm’s vast array of licensees, all of whom rely heavily on its CPU architecture and would benefit from more seamless integration with Nvidia’s market-leading AI technology.

Clearly Nvidia cannot afford to lose momentum in the cloud-to-edge microprocessor wars just when it has begun to pull away from archrival and CPU-powerhouse Intel.

Copyright © 2020 IDG Communications, Inc.
