
How to Train an AI Agent for Command-Line Tasks with Synthetic Data and Reinforcement Learning

15 January 2026 at 16:00
What if your computer-use agent could learn a new Command Line Interface (CLI) and operate it safely, without ever writing files or free-typing shell commands? In Part 1 of our series on building a computer-use agent, we built a custom Bash computer-use agent using NVIDIA Nemotron in just one hour. In this sequel, we'll take it further by teaching the same reasoning model with no prior…

Source

Accelerating Long-Context Inference with Skip Softmax in NVIDIA TensorRT-LLM

16 December 2025 at 21:00
For machine learning engineers deploying LLMs at scale, the equation is familiar and unforgiving: as context length increases, attention computation costs explode. Whether you're dealing with retrieval-augmented generation (RAG) pipelines, agentic AI workflows, or long-form content generation, the quadratic complexity of attention remains a primary bottleneck. This post explains a technique known as…
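
The snippet below is not from the post; it is a minimal numpy sketch of standard scaled dot-product attention, included only to show where the quadratic cost comes from: the score matrix is (n, n) in sequence length n, regardless of head dimension.

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention for one head.

    The intermediate score matrix has shape (n, n), so both its
    memory and compute grow quadratically with sequence length n.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)             # (n, n) score matrix
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)        # row-wise softmax
    return w @ v

rng = np.random.default_rng(0)
n, d = 16, 8
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(q, k, v)
print(out.shape)  # (16, 8)
```

Doubling the context length quadruples the size of the score matrix, which is exactly the term that techniques like the one described in the post target.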

Source

Model Quantization: Concepts, Methods, and Why It Matters

24 November 2025 at 19:23
AI models are becoming increasingly complex, often exceeding the capabilities of available hardware. Quantization has emerged as a crucial technique to address this challenge, enabling resource-intensive models to run on constrained hardware. The NVIDIA TensorRT and Model Optimizer tools simplify the quantization process, maintaining model accuracy while improving efficiency.
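
As a toy illustration of the idea (not the TensorRT or Model Optimizer API), here is symmetric per-tensor int8 quantization in numpy: weights are stored as int8 plus one floating-point scale, trading a bounded rounding error for a 4x smaller footprint than fp32.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: int8 weights + one fp32 scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an fp32 approximation of the original weights."""
    return q.astype(np.float32) * scale

w = np.array([[-1.5, 0.2, 0.9],
              [0.0, -0.4, 1.5]], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(np.max(np.abs(w - w_hat)))  # rounding error, at most scale / 2
```

Production toolchains add per-channel scales, calibration on representative data, and quantization-aware fine-tuning to keep accuracy; this sketch only shows the core precision-for-efficiency trade.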

Source

Building an Interactive AI Agent for Lightning-Fast Machine Learning Tasks

7 November 2025 at 17:44
Data scientists spend a lot of time cleaning and preparing large, unstructured datasets before analysis can begin, often requiring strong programming and statistical expertise. Managing feature engineering, model tuning, and consistency across workflows is complex and error-prone. These challenges are amplified by the slow, sequential nature of CPU-based ML workflows…

Source

Democratizing Large-Scale Mixture-of-Experts Training with NVIDIA PyTorch Parallelism

6 November 2025 at 17:00
Training massive mixture-of-experts (MoE) models has long been the domain of a few advanced users with deep infrastructure and distributed-systems expertise. For most developers, the challenge wasn't building smarter models; it was scaling them efficiently across hundreds or even thousands of GPUs without breaking the bank. With NVIDIA NeMo Automodel, an open-source library within NVIDIA NeMo…
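
For readers new to MoE, the following is a hypothetical single-token sketch (not NeMo Automodel code) of the core routing idea: a learned gate scores all experts, only the top-k are evaluated, and their outputs are mixed by the renormalized gate weights. Sparse activation is what makes these models cheap to run but hard to parallelize at scale.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through its top-k experts, mixing by gate weight."""
    logits = x @ gate_w                     # one score per expert
    top = np.argsort(logits)[-k:]           # indices of the k best-scoring experts
    mix = softmax(logits[top])              # renormalize over the chosen experts
    return sum(w * experts[i](x) for w, i in zip(mix, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
# Each "expert" is just a linear map in this toy example.
expert_weights = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in expert_weights]
gate_w = rng.standard_normal((d, n_experts))

x = rng.standard_normal(d)
y = moe_forward(x, experts, gate_w, k=2)
print(y.shape)  # (8,)
```

At scale the experts live on different GPUs, so the gather/scatter implied by the top-k routing becomes a distributed-communication problem, which is the part libraries like NeMo Automodel automate.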

Source
