
Inside the NVIDIA Rubin Platform: Six New Chips, One AI Supercomputer

5 January 2026 at 22:20

AI has entered an industrial phase. What began as systems performing discrete AI model training and human-facing inference has evolved into always-on AI factories that continuously convert power, silicon, and data into intelligence at scale. These factories now underpin applications that generate business plans, analyze markets, conduct deep research, and reason across vast bodies of…


NVIDIA CUDA-X Powers the New Sirius GPU Engine for DuckDB, Setting ClickBench Records

15 December 2025 at 17:18

Sirius, an open-source GPU-native SQL engine, achieved a new performance record on ClickBench, a widely used analytics benchmark. Developed by the University of Wisconsin-Madison with support from NVIDIA engineers, Sirius brings GPU-accelerated analytics to DuckDB. DuckDB has seen rapid adoption among organizations such as DeepSeek, Microsoft, and Databricks due to its simplicity, speed…


Accelerating Real-Time Financial Decisions with Quantitative Portfolio Optimization

2 December 2025 at 18:51

Financial portfolio optimization is a difficult yet essential task that has been consistently challenged by a trade-off between computational speed and model complexity. Since the introduction of Markowitz Portfolio Theory 70 years ago, robust analysis beyond basic mean-variance (such as large-scale simulations, multistep optimizations, or richer risk measures) was too slow for dynamic decision…


Achieve CUTLASS C++ Performance with Python APIs Using CuTe DSL

13 November 2025 at 20:30

CuTe, a core component of CUTLASS 3.x, provides a unified algebra for describing data layouts and thread mappings, and abstracts complex memory access patterns into composable mathematical operations. While CUTLASS 3.x and CuTe have empowered kernel developers to achieve peak performance on Tensor Cores through intuitive abstractions, the extensive use of C++ templates has resulted in high…
