
Redefining Secure AI Infrastructure with NVIDIA BlueField Astra for NVIDIA Vera Rubin NVL72

7 January 2026 at 17:00

Large-scale AI innovation is driving unprecedented demand for accelerated computing infrastructure. Training trillion-parameter foundation models, serving them with disaggregated architectures, and processing inference workloads at massive throughput all push data center design to the limits. To keep up, service providers need infrastructure that not only scales but also delivers stronger security…

Source

Introducing NVIDIA BlueField-4-Powered Inference Context Memory Storage Platform for the Next Frontier of AI

6 January 2026 at 17:30

AI‑native organizations increasingly face scaling challenges as agentic AI workflows drive context windows to millions of tokens and models scale toward trillions of parameters. These systems currently rely on agentic long‑term memory for context that persists across turns, tools, and sessions so agents can build on prior reasoning instead of starting from scratch on every request.

Source

Scaling Power-Efficient AI Factories with NVIDIA Spectrum-X Ethernet Photonics

6 January 2026 at 16:59

NVIDIA is bringing the world’s first optimized Ethernet networking with co-packaged optics to AI factories, enabling scale-out and scale-across on the NVIDIA Rubin platform with NVIDIA Spectrum-X Ethernet Photonics, the flagship switch for multi-trillion-parameter AI infrastructure. This blog post explores key optimizations and innovations in the protocol and hardware of Spectrum-X Ethernet…

Source

Inside the NVIDIA Rubin Platform: Six New Chips, One AI Supercomputer

5 January 2026 at 22:20

AI has entered an industrial phase. What began as systems performing discrete AI model training and human-facing inference has evolved into always-on AI factories that continuously convert power, silicon, and data into intelligence at scale. These factories now underpin applications that generate business plans, analyze markets, conduct deep research, and reason across vast bodies of…

Source

Next-Generation AI Factory Telemetry with NVIDIA Spectrum-X Ethernet

11 December 2025 at 19:03

As AI data centers rapidly evolve into AI factories, traditional network monitoring methods are no longer sufficient. Workloads continue to grow in complexity and infrastructures scale rapidly, making real-time, high-frequency insights critical. The need for effective system monitoring has never been greater. This post explores how high-frequency sampling and advanced telemetry techniques…

Source
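The "high-frequency sampling" the post refers to typically means reading monotonically increasing hardware counters (bytes, packets) at short intervals and deriving rates from the deltas. A minimal plain-Python sketch of that idea; the sampling interval and counter names are illustrative assumptions, not Spectrum-X specifics:

```python
# Derive per-interval rates from closely spaced samples of a
# monotonically increasing counter, as a telemetry collector would.

def rates_from_samples(samples):
    """samples: list of (timestamp_s, counter_value) pairs, in time order.
    Returns one rate (counts/second) per consecutive pair."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        # Guard against duplicate timestamps from bursty polling.
        rates.append((c1 - c0) / dt if dt > 0 else 0.0)
    return rates
```

Sampling twice per second, a counter advancing from 0 to 500 to 1500 yields rates of 1000 and then 2000 counts per second; the higher the sampling frequency, the finer the bursts the collector can resolve.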

Enhancing Communication Observability of AI Workloads with NCCL Inspector

10 December 2025 at 21:45

When using the NVIDIA Collective Communication Library (NCCL) to run a deep learning training or inference workload that uses collective operations (such as AllReduce, AllGather, and ReduceScatter), it can be challenging to determine how NCCL is performing during the actual workload run. This post introduces the NCCL Inspector Profiler Plugin, which addresses this problem. It offers a way for…

Source
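The collectives named in the post above (AllReduce, AllGather, ReduceScatter) have precise semantics worth pinning down. A plain-Python sketch of what each one computes, modeling only the data movement, not NCCL's GPU ring/tree implementations:

```python
# Each function takes a list of per-rank buffers and returns the
# per-rank results after the collective completes.

def all_reduce(rank_buffers, op=sum):
    """Element-wise reduction across ranks; every rank gets the full result."""
    reduced = [op(vals) for vals in zip(*rank_buffers)]
    return [list(reduced) for _ in rank_buffers]

def all_gather(rank_buffers):
    """Concatenation of every rank's buffer; every rank gets the whole thing."""
    gathered = [x for buf in rank_buffers for x in buf]
    return [list(gathered) for _ in rank_buffers]

def reduce_scatter(rank_buffers, op=sum):
    """Element-wise reduction across ranks, then one equal chunk per rank."""
    reduced = [op(vals) for vals in zip(*rank_buffers)]
    chunk = len(reduced) // len(rank_buffers)
    return [reduced[i * chunk:(i + 1) * chunk] for i in range(len(rank_buffers))]
```

For example, `all_reduce([[1, 2], [3, 4]])` leaves every rank holding `[4, 6]`; an AllReduce is equivalent to a ReduceScatter followed by an AllGather, which is how many bandwidth-optimal implementations structure it.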

Improve AI-Native 6G Design with the NVIDIA Aerial Omniverse Digital Twin

9 December 2025 at 17:00

AI-native 6G networks will serve billions of intelligent devices, agents, and machines. As the industry moves into new spectrum bands like FR3 (7–24 GHz), radio physics becomes far more sensitive, shifting the network from a static infrastructure to a dynamic, living system. This shift demands a fundamental change in how we design, build, and optimize 6G systems. Traditional "build and test"…

Source

AWS Integrates AI Infrastructure with NVIDIA NVLink Fusion for Trainium4 Deployment

2 December 2025 at 16:00

As demand for AI continues to grow, hyperscalers are looking for ways to accelerate deployment of specialized AI infrastructure with the highest performance. Announced today at AWS re:Invent, Amazon Web Services has collaborated with NVIDIA to integrate its AI infrastructure with NVIDIA NVLink Fusion, a rack-scale platform that lets industries build custom AI rack infrastructure with NVIDIA NVLink scale-up…

Source

NVIDIA NVQLink Architecture Integrates Accelerated Computing with Quantum Processors

17 November 2025 at 22:31

Quantum computing is entering an era where progress will be driven by the integration of accelerated computing with quantum processors. The hardware that controls and measures a quantum processing unit (QPU) faces demanding computational requirements, from real-time calibration to quantum error-correction (QEC) decoding. Useful quantum applications will require QEC and calibration at scales only…

Source
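The QEC decoding mentioned above maps measured error syndromes to corrections under tight real-time latency budgets. As a toy illustration of the idea only, here is a decoder for the 3-qubit bit-flip repetition code, modeled classically; real decoders handle far larger codes and are not this simple:

```python
# Two parity checks (qubits 0,1 and qubits 1,2) form the syndrome;
# the lookup table names the single bit flip most likely to explain it.

SYNDROME_TO_CORRECTION = {
    (0, 0): None,  # no error detected
    (1, 0): 0,     # flip on qubit 0
    (1, 1): 1,     # flip on qubit 1
    (0, 1): 2,     # flip on qubit 2
}

def measure_syndrome(bits):
    """Parity of neighboring pairs, a classical stand-in for Z0Z1 and Z1Z2."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

def decode(bits):
    """Apply the correction implied by the measured syndrome."""
    correction = SYNDROME_TO_CORRECTION[measure_syndrome(bits)]
    corrected = list(bits)
    if correction is not None:
        corrected[correction] ^= 1
    return corrected
```

Any single bit flip is corrected (`decode([1, 0, 0])` recovers `[0, 0, 0]`); the hard part in practice is doing this for large surface codes within the microseconds a QPU control loop allows, which is exactly the latency pressure the post describes.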

Fusing Communication and Compute with New Device API and Copy Engine Collectives in NVIDIA NCCL 2.28

11 November 2025 at 00:06

The latest release of the NVIDIA Collective Communications Library (NCCL) introduces a groundbreaking fusion of communication and computation for higher throughput, reduced latency, and maximized GPU utilization across multi-GPU and multi-node systems. NCCL 2.28 focuses on GPU-initiated networking, device APIs for communication-compute fusion, copy-engine-based collectives, and new APIs for…

Source

Building Scalable and Fault-Tolerant NCCL Applications

10 November 2025 at 21:29

The NVIDIA Collective Communications Library (NCCL) provides communication APIs for low-latency and high-bandwidth collectives, enabling AI workloads to scale from just a few GPUs on a single host to thousands of GPUs in a data center. This post discusses NCCL features that support run-time rescaling for cost optimization, as well as minimizing service downtime from faults by dynamically removing…

Source
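The rescaling described above amounts to this: when a rank fails or is removed, the survivors form a smaller communicator with dense, renumbered ranks so collectives can continue. NCCL has real primitives in this area (e.g. communicator splitting); the sketch below models only the rank bookkeeping, not the networking, and its class and method names are illustrative:

```python
# Minimal model of shrinking a communicator after rank failure.

class Communicator:
    def __init__(self, world):
        # Dense rank ids 0..n-1 map positionally to surviving global ids.
        self.ranks = list(world)

    @property
    def size(self):
        return len(self.ranks)

    def shrink(self, failed):
        """Return a new, smaller communicator excluding failed global ids."""
        failed = set(failed)
        survivors = [r for r in self.ranks if r not in failed]
        return Communicator(survivors)
```

Starting from four ranks and dropping global rank 2 leaves a three-rank communicator over `[0, 1, 3]`; the old communicator is discarded rather than patched in place, mirroring how collectives require every participant to agree on the membership before the next operation.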

Streamline Complex AI Inference on Kubernetes with NVIDIA Grove

10 November 2025 at 14:00

Over the past few years, AI inference has evolved from single-model, single-pod deployments into complex, multicomponent systems. A model deployment may now consist of several distinct components: prefill, decode, vision encoders, key-value (KV) routers, and more. In addition, entire agentic pipelines are emerging, where multiple such model instances collaborate to perform reasoning, retrieval…

Source
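The prefill/decode split mentioned above separates two workload phases: one component processes the whole prompt once and hands off its state (the KV cache), and a separate component then generates tokens from that state. A toy sketch of the handoff, with all names and the token rule invented for illustration, nothing here is the Grove API:

```python
# Disaggregated inference roles as plain functions: prefill runs once
# per request, decode runs once per generated token on the handed-off state.

def prefill(prompt_tokens):
    """Process the full prompt; return the handoff state (stand-in KV cache)."""
    return {"kv_cache": list(prompt_tokens)}

def decode_step(state):
    """Generate one token from the current state (toy rule: last token + 1)."""
    token = state["kv_cache"][-1] + 1
    state["kv_cache"].append(token)
    return token

def generate(prompt_tokens, n_tokens):
    state = prefill(prompt_tokens)  # could run on a dedicated prefill pod
    return [decode_step(state) for _ in range(n_tokens)]  # decode pod(s)
```

The point of splitting them in production is that prefill is compute-bound and decode is memory-bandwidth-bound, so the two phases scale better on separately sized pools, which is why orchestrators treat them as distinct components.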

Enabling Multi-Node NVLink on Kubernetes for NVIDIA GB200 NVL72 and Beyond

10 November 2025 at 14:00

The NVIDIA GB200 NVL72 pushes AI infrastructure to new limits, enabling breakthroughs in training large language models and running scalable, low-latency inference workloads. Increasingly, Kubernetes plays a central role in deploying and scaling these workloads efficiently, whether on-premises or in the cloud. However, rapidly evolving AI workloads, infrastructure requirements…

Source

Streamline AI Infrastructure with NVIDIA Run:ai on Microsoft Azure

30 October 2025 at 17:10

Modern AI workloads, ranging from large-scale training to real-time inference, demand dynamic access to powerful GPUs. However, Kubernetes environments have limited native support for GPU management, which leads to challenges such as inefficient GPU utilization, lack of workload prioritization and preemption, limited visibility into GPU consumption, and difficulty enforcing governance and quota…

Source
