
How to Train an AI Agent for Command-Line Tasks with Synthetic Data and Reinforcement Learning

15 January 2026 at 16:00
What if your computer-use agent could learn a new Command Line Interface (CLI) and operate it safely, without ever writing files or free-typing shell commands? In Part 1 of our series on building a computer-use agent, we built a custom Bash computer-use agent using NVIDIA Nemotron in just one hour. In this sequel, we'll take it further by teaching the same reasoning model with no prior…

Source

Accelerating Long-Context Inference with Skip Softmax in NVIDIA TensorRT-LLM

16 December 2025 at 21:00
For machine learning engineers deploying LLMs at scale, the equation is familiar and unforgiving: as context length increases, attention computation costs explode. Whether you're dealing with retrieval-augmented generation (RAG) pipelines, agentic AI workflows, or long-form content generation, the quadratic complexity of attention remains a primary bottleneck. This post explains a technique known as…
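
The snippet below is not from the post; it is a minimal numpy sketch of standard scaled dot-product attention, included only to show where the quadratic cost comes from: the score matrix is (n, n) in sequence length n, regardless of head dimension.

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention for one head.

    The intermediate score matrix has shape (n, n), so both its
    memory and compute grow quadratically with sequence length n.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)             # (n, n) score matrix
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)        # row-wise softmax
    return w @ v

rng = np.random.default_rng(0)
n, d = 16, 8
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(q, k, v)
print(out.shape)  # (16, 8)
```

Doubling the context length quadruples the size of the score matrix, which is exactly the term that techniques like the one described in the post target.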

Source

Model Quantization: Concepts, Methods, and Why It Matters

24 November 2025 at 19:23
AI models are becoming increasingly complex, often exceeding the capabilities of available hardware. Quantization has emerged as a crucial technique to address this challenge, enabling resource-intensive models to run on constrained hardware. The NVIDIA TensorRT and Model Optimizer tools simplify the quantization process, maintaining model accuracy while improving efficiency.
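
As a toy illustration of the idea (not the TensorRT or Model Optimizer API), here is symmetric per-tensor int8 quantization in numpy: weights are stored as int8 plus one floating-point scale, trading a bounded rounding error for a 4x smaller footprint than fp32.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: int8 weights + one fp32 scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an fp32 approximation of the original weights."""
    return q.astype(np.float32) * scale

w = np.array([[-1.5, 0.2, 0.9],
              [0.0, -0.4, 1.5]], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(np.max(np.abs(w - w_hat)))  # rounding error, at most scale / 2
```

Production toolchains add per-channel scales, calibration on representative data, and quantization-aware fine-tuning to keep accuracy; this sketch only shows the core precision-for-efficiency trade.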

Source

Building an Interactive AI Agent for Lightning-Fast Machine Learning Tasks

7 November 2025 at 17:44
Data scientists spend a lot of time cleaning and preparing large, unstructured datasets before analysis can begin, often requiring strong programming and statistical expertise. Managing feature engineering, model tuning, and consistency across workflows is complex and error-prone. These challenges are amplified by the slow, sequential nature of CPU-based ML workflows…

Source

Democratizing Large-Scale Mixture-of-Experts Training with NVIDIA PyTorch Parallelism

6 November 2025 at 17:00
Training massive mixture-of-experts (MoE) models has long been the domain of a few advanced users with deep infrastructure and distributed-systems expertise. For most developers, the challenge wasn't building smarter models; it was scaling them efficiently across hundreds or even thousands of GPUs without breaking the bank. With NVIDIA NeMo Automodel, an open-source library within NVIDIA NeMo…
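
For readers new to MoE, the following is a hypothetical single-token sketch (not NeMo Automodel code) of the core routing idea: a learned gate scores all experts, only the top-k are evaluated, and their outputs are mixed by the renormalized gate weights. Sparse activation is what makes these models cheap to run but hard to parallelize at scale.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through its top-k experts, mixing by gate weight."""
    logits = x @ gate_w                     # one score per expert
    top = np.argsort(logits)[-k:]           # indices of the k best-scoring experts
    mix = softmax(logits[top])              # renormalize over the chosen experts
    return sum(w * experts[i](x) for w, i in zip(mix, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
# Each "expert" is just a linear map in this toy example.
expert_weights = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in expert_weights]
gate_w = rng.standard_normal((d, n_experts))

x = rng.standard_normal(d)
y = moe_forward(x, experts, gate_w, k=2)
print(y.shape)  # (8,)
```

At scale the experts live on different GPUs, so the gather/scatter implied by the top-k routing becomes a distributed-communication problem, which is the part libraries like NeMo Automodel automate.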

Source
