NVIDIA Technical Blog
Inside the NVIDIA Rubin Platform: Six New Chips, One AI Supercomputer 5 January 2026 at 22:20

AI has entered an industrial phase. What began as systems performing discrete AI model training and human-facing inference has evolved into always-on AI factories that continuously convert power, silicon, and data into intelligence at scale. These factories now underpin applications that generate business plans, analyze markets, conduct deep research, and reason across vast bodies of…

NVIDIA Technical Blog
NVIDIA CUDA-X Powers the New Sirius GPU Engine for DuckDB, Setting ClickBench Records 15 December 2025 at 17:18

By: Xiangyao Yu

Sirius, an open-source GPU-native SQL engine, achieved a new performance record on ClickBench, a widely used analytics benchmark. Developed by the University of Wisconsin-Madison with support from NVIDIA engineers, Sirius brings GPU-accelerated analytics to DuckDB. DuckDB has seen rapid adoption among organizations such as DeepSeek, Microsoft, and Databricks due to its simplicity, speed…

NVIDIA Technical Blog
Accelerating Real-Time Financial Decisions with Quantitative Portfolio Optimization 2 December 2025 at 18:51

By: Peihan Huo

Financial portfolio optimization is a difficult yet essential task that has been consistently challenged by a trade-off between computational speed and model complexity. Since the introduction of Markowitz Portfolio Theory 70 years ago, robust analysis beyond basic mean-variance—such as large-scale simulations, multistep optimizations, or richer risk measures—was too slow for dynamic decision…

NVIDIA Technical Blog
Achieve CUTLASS C++ Performance with Python APIs Using CuTe DSL 13 November 2025 at 20:30

By: Brandon Sun

CuTe, a core component of CUTLASS 3.x, provides a unified algebra for describing data layouts and thread mappings, and abstracts complex memory access patterns into composable mathematical operations. While CUTLASS 3.x and CuTe have empowered kernel developers to achieve peak performance on Tensor Cores through intuitive abstractions, the extensive use of C++ templates has resulted in high…
