
Sion Power’s Licerion cells exceed 500 Wh/kg for defense and aerospace


Sion Power is expanding its Licerion® lithium-metal battery program to supply cells and battery systems for US defense and aerospace. The cells are engineered to exceed 500 Wh/kg, up to 200 Wh/kg more than current advanced lithium-ion technology, even lithium-ion cells enhanced with silicon anodes.

The platform covers both primary (single-discharge) and secondary (rechargeable) configurations. Target applications include long-endurance UAS, tactical and counter-UAS drones, missile and loitering munition platforms, autonomous maritime and ground vehicles, and space systems. Sion Power operates a 110,000 sq ft cell manufacturing facility in Tucson, Arizona, and says it can demonstrate cells and integrated battery systems today, with initial product shipments expected in late 2026.

Lithium-metal anodes store substantially more energy per kilogram than graphite because lithium metal is lighter and more electrochemically active. For weight-constrained platforms, moving from the 300–350 Wh/kg of advanced Li-ion to 500+ Wh/kg translates directly into longer endurance and expanded payload capacity. Sion Power’s expansion also responds to US policy momentum: NDAA provisions support domestic battery supply chains and highlight demand for American-manufactured advanced cells.
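
To see what the jump means in practice, here is a back-of-the-envelope sketch. The specific-energy figures come from the article; the 2 kWh mission energy budget is an assumed, illustrative number, and pack-level integration overheads are ignored:

    #include <cstdio>

    int main() {
        const double energy_wh = 2000.0;          // assumed mission energy budget
        const double li_ion_wh_per_kg = 350.0;    // advanced Li-ion, per the article
        const double li_metal_wh_per_kg = 500.0;  // Licerion target, per the article

        const double m_li_ion = energy_wh / li_ion_wh_per_kg;     // ~5.7 kg
        const double m_li_metal = energy_wh / li_metal_wh_per_kg; // 4.0 kg

        std::printf("Li-ion pack:   %.1f kg\n", m_li_ion);
        std::printf("Li-metal pack: %.1f kg\n", m_li_metal);
        std::printf("Mass saved:    %.1f kg (%.0f%% lighter)\n",
                    m_li_ion - m_li_metal,
                    100.0 * (1.0 - m_li_metal / m_li_ion));
        return 0;
    }

For a fixed energy budget the pack gets roughly 30% lighter; equivalently, for a fixed pack mass the platform carries roughly 43% more energy, which is where the endurance claims come from.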

“Our lithium-metal technology provides the step-change in energy density required to support longer-range missions, increased flight duration and higher payload capability while maintaining a U.S.-based manufacturing capability aligned with national security priorities,” said Pamela Fletcher, CEO of Sion Power.

“By combining high-energy lithium-metal chemistry with advanced battery pack engineering, Sion Power enables defense integrators to unlock two to three times increases in mission endurance, significantly extended operational range and dramatically higher payload capacity compared with conventional lithium-ion and lithium-polymer batteries used in today’s unmanned systems,” said Tracy Kelley, chief science officer at Sion Power.

Source: Sion Power

  •  

The certified BMS trap: why it might not actually protect your battery


Off-the-shelf controllers with safety certifications are giving e-mobility engineers a false sense of security.

An off-the-shelf BMS with a third-party functional safety certification sounds like a solved problem. SIL-rated, ASIL-rated, ready to drop into your e-mobility battery pack. But according to Rich Byczek, Global Chief Engineer for Batteries at Intertek, that certification probably doesn’t cover what you think it covers.

“Certified BMS systems, meaning certified systems that have functional safety certifications from a third party, don’t necessarily address these functions,” Byczek told Charged during a recent webinar (now available to watch on demand). “They just look at the controller as a more generic electrical system.”

The problem: most certifications evaluate the controller hardware against a general integrity standard (IEC 61508, ISO 26262 or ISO 13849). They verify that the electronics are reliable. They don’t verify that the controller monitors individual cell voltages, manages cell-level temperature limits or handles the specific failure modes of lithium-ion chemistry.

Fuses don’t protect at the cell level

The gap is sharpest with passive protection. A pack-level fuse can interrupt a gross overcurrent event, but it’s blind to an individual cell in a series string being driven past its voltage limits. That requires active, per-cell monitoring, and a generic certified controller may not have the inputs and outputs to deliver it.

For e-mobility systems specifically, Byczek stressed that the failure modes and effects analysis (FMEA) must evaluate overvoltage, undervoltage, overcharge, overdischarge, over- and under-temperature, short circuit and excessive current, all at the cell level. “We look at those at the cell level, not only at the macro or battery pack level,” he said.
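
A minimal sketch of what cell-level supervision means in code, with illustrative limits rather than any particular chemistry’s datasheet values. The point it demonstrates is the one above: a series string can show an unremarkable pack voltage while one cell is driven past its limit, which is exactly the condition a pack-level fuse cannot see:

    #include <cstdio>
    #include <vector>

    struct CellReading {
        double voltage_v;
        double temp_c;
    };

    enum class Fault { None, OverVoltage, UnderVoltage, OverTemp, UnderTemp };

    // Check each cell individually; a pack-level fuse sees none of this.
    Fault check_cell(const CellReading& c) {
        if (c.voltage_v > 4.25) return Fault::OverVoltage;   // illustrative limits
        if (c.voltage_v < 2.50) return Fault::UnderVoltage;
        if (c.temp_c   > 60.0)  return Fault::OverTemp;
        if (c.temp_c   < -20.0) return Fault::UnderTemp;
        return Fault::None;
    }

    int main() {
        // A 10s series string: pack voltage reads 37.25 V, which looks
        // unremarkable, even though cell 4 is past its overvoltage limit.
        std::vector<CellReading> cells(10, {3.65, 25.0});
        cells[4].voltage_v = 4.40;

        double pack_v = 0.0;
        for (const auto& c : cells) pack_v += c.voltage_v;
        std::printf("Pack voltage: %.2f V (fine at pack level)\n", pack_v);

        for (size_t i = 0; i < cells.size(); ++i)
            if (check_cell(cells[i]) != Fault::None)
                std::printf("Cell %zu outside limits -> open contactor\n", i);
        return 0;
    }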

This is a different world from portable devices, where legacy standards like IEC 62133 rely on type tests and single-fault evaluations. Those standards were designed for products a user could set down and walk away from.

E-mobility doesn’t work that way. “You’re literally riding on top of that battery, potentially going at a fairly high speed,” said Byczek. “You can’t just get away from it.”

Start with the FMEA, not the certificate

The fix isn’t complicated, but it does require work. Start with an FMEA that covers every safety-critical function your BMS must perform, at the cell level. Then verify that your controller (certified or not) actually has the architecture to deliver each one. A certified controller is a starting point, not a finish line.
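
One way to make that verification concrete is a simple coverage check: enumerate the functions the FMEA requires, then diff them against what the controller actually provides. A minimal sketch with hypothetical function names and hypothetical datasheet contents:

    #include <cstdio>
    #include <set>
    #include <string>

    int main() {
        // Cell-level functions the FMEA identified (per the article's list).
        const std::set<std::string> required = {
            "cell_overvoltage", "cell_undervoltage", "cell_overtemp",
            "cell_undertemp", "overcurrent", "short_circuit"};

        // What the certified off-the-shelf controller actually delivers
        // (hypothetical: a generic controller monitoring only pack-level I/O).
        const std::set<std::string> provided = {
            "pack_overvoltage", "overcurrent", "short_circuit"};

        for (const auto& fn : required)
            if (!provided.count(fn))
                std::printf("GAP: %s not covered by controller\n", fn.c_str());
        return 0;
    }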

The standards themselves can be mixed and matched. SIL, ASIL and Performance Levels don’t map one-to-one, but regulators accept cross-framework approaches as long as your risk assessment demonstrably covers every identified hazard. For a BMS, you’re typically targeting SIL 2, ASIL B or PL c, but the specific level matters less than proving your system can fail safely when a sensor drifts, a resistor opens or a communication link drops.

For teams pivoting from automotive EV programs into adjacent markets like forklifts, floor scrubbers and personal mobility devices, this is the adjustment that matters most. The batteries may be smaller, but the safety obligations are not.

Watch the full webinar: Rich Byczek’s complete presentation on applying functional safety to e-mobility battery systems is available on demand.

  •  

ENNOVI patents adhesive-free lamination for battery cell contacting systems


ENNOVI has secured a German patent for its adhesive-free lamination technology for battery cell contacting systems (CCS). The laser-based process eliminates the adhesives used in conventional hot and cold lamination, and the company says the technology is already validated—meaning OEMs can adopt it without having to prove out the manufacturing process themselves.

CCS components connect and integrate individual cells within a battery module, typically combining busbars, voltage sense lines and the physical laminate layers that hold them together. Conventional CCS lamination bonds those layers using adhesives in hot or cold press processes. ENNOVI’s laser lamination achieves the same bond without adhesive material. The technology supports cylindrical, prismatic and soft pouch cell architectures. With this patent, ENNOVI now offers three lamination options (hot, cold and adhesive-free) for its CCS designs, giving battery engineers a process choice matched to their cell format.

The patent’s main commercial argument is risk reduction. Developing a new lamination process in-house takes time and carries qualification uncertainty; using a pre-validated, patented technology lets engineering teams skip that work. ENNOVI supports co-development and tailored engineering engagement, which it says allows OEM partners to maintain control over their product roadmaps.

The technology was developed at ENNOVI’s Advanced Solutions Engineering Center in Neckarsulm, which includes prototyping, testing and R&D capabilities. The facility holds ISO 9001:2015 and TISAX certifications—the latter covering automotive supply chain data security requirements.

“Automotive OEMs and battery manufacturers can design in the unique features of adhesive-free lamination, reduce engineering risk by using a technology that is already validated, rather than reinventing it,” said Randy Tan, Product Portfolio Director for Energy Systems at ENNOVI.

Source: ENNOVI

  •  

Purpose-Built for AI: The Shift Toward Modular Data Center Infrastructure

Originally posted on Compu Dynamics.

Discover how AI is transforming mission-critical infrastructure. From modular data center design and liquid cooling to extreme power density and purpose-built AI facilities, Steve Altizer, President and CEO of Compu Dynamics, covers it all in this recent conversation.

At PTC 2026 in Hawaii, Isabel Paradis of HOT TELECOM sat down with Altizer to discuss how AI is reshaping the way modular data centers are designed, now and in the future.

AI Is Rewriting the Rules of Data Center Design

AI is transforming data centers. While many are still trying to shoehorn AI workloads into traditional designs, that approach is only going to last a few more years. Hyperscalers are leading the way into an AI‑centric future, where liquid cooling – once a specialty – is now becoming standard across the industry.

Retrofitting conventional colo or cloud facilities for AI is not ideal: it’s not as cost-effective as a purpose-built design. Yet building AI-only facilities also carries risk, because repurposing that heavy investment later is difficult. The industry is therefore moving toward modular infrastructure, which allows for hybrid, purpose-built AI facilities that remain flexible enough to serve a range of customers.

To continue reading, please click here.

The post Purpose-Built for AI: The Shift Toward Modular Data Center Infrastructure appeared first on Data Center POST.

  •  

AI’s Overlooked Bottleneck: Why Front-End Networks Are Crucial to AI Data Center Performance

By Mike Hodge, AI Solutions Lead, Keysight Technologies

We’re in the heart of the AI gold rush, and everyone wants to capitalize on the next big thing. Large language models, multimodal systems, and domain-specific AI workloads are moving from experimentation to production at scale. Across industries, enterprises are building their own proprietary models or integrating pre-trained ones to power applications ranging from video analytics to highly specialized inference services.

This shift has triggered a new wave of infrastructure investment. But while GPUs and accelerators dominate the conversation, scaling AI platforms has produced a less obvious constraint: front-end network performance. In increasingly distributed, multi-tenant AI environments, the ability to move data efficiently into (and across) platforms has become just as critical as raw compute density.

New AI platforms mean new expectations for infrastructure

AI infrastructure is no longer the exclusive domain of a handful of hyperscalers. A growing class of service providers has begun offering end-to-end AI platforms where compute, storage, networking, and orchestration are delivered as a service. Their value proposition is straightforward: customers bring data and models, while the platform handles the complexity of building, operating, and maintaining large-scale data center deployments.

Service models like these, however, place extraordinary demands on networking. Unlike traditional cloud workloads, AI jobs are defined by massive, sustained data movement and tight coupling between data pipelines and compute utilization. GPUs cannot perform at peak efficiency unless data arrives on time, in the right order, and at predictable speeds.

As a result, network performance is now one of the primary determinants of training, inference, and infrastructure efficiency.

The eye of the storm is moving from the fabric to the front end

AI infrastructure discussions often focus on back-end fabrics: high-bandwidth, low-latency interconnects between GPUs, for example. These fabrics are indeed essential, but they are only part of the picture.

Before training or inference ever begins, data must first traverse the front-end network. This occurs in several ways, but some of the most common paths include:

  • From remote object stores or on-premises repositories into the data center
  • From ingress points into virtual machines or containers
  • From storage into GPU-attached hosts

This is where north-south traffic (external to internal) intersects with east-west traffic (host-to-host and service-to-service). And in AI environments, these flows are not occasional spikes. They are sustained, high-throughput, latency-sensitive streams that run continuously throughout the lifecycle of a job.

When front-end networks underperform, the consequences are costly and immediate: idle accelerators, elongated training windows, unpredictable inference latency, and poor multi-tenant isolation.
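
A toy model makes the cost concrete. Assume a fixed amount of data consumed per training step and no compute/transfer overlap (real pipelines prefetch, so treat these assumed numbers as illustrative bounds rather than measurements):

    #include <cstdio>

    int main() {
        const double batch_gb = 8.0;   // assumed data consumed per training step
        const double compute_s = 0.5;  // assumed GPU time per step at full speed
        const double net_gbps[] = {25.0, 100.0, 400.0}; // front-end throughput

        for (double gbps : net_gbps) {
            double transfer_s = batch_gb * 8.0 / gbps;  // GB -> Gb -> seconds
            // Without overlap, each step takes compute time plus transfer time.
            double util = compute_s / (compute_s + transfer_s);
            std::printf("%5.0f Gb/s front end -> GPU busy %.0f%% of the time\n",
                        gbps, 100.0 * util);
        }
        return 0;
    }

Under these assumptions the GPU sits idle most of the time at 25 Gb/s (~16% busy) and only approaches healthy utilization (~76%) at 400 Gb/s, which is the article’s point in miniature: the network, not the accelerator, sets the ceiling.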

Why traditional network validation falls short

Most cloud networks were designed around general-purpose workloads: web services, databases, and transactional systems with relatively modest bandwidth demands and fluctuating traffic patterns punctuated by the occasional spike.

AI workloads, on the other hand, break these assumptions. On the front end, AI traffic is characterized by:

  • Extremely large data transfers, often using jumbo frames
  • Long-lived connections, sustained over hours or days
  • Millions of concurrent sessions in multi-tenant environments
  • Tight latency and jitter tolerances to avoid starving accelerators

Conventional network testing approaches — such as synthetic benchmarks, isolated link tests, or small-scale simulations — are unable to replicate this behavior. As a result, many issues only surface once customer workloads are already running, which also happens to be when the cost of remediation is highest.

The need for realistic workload emulation

Optimizing front-end AI networks requires the ability to reproduce real workload behavior at scale. That means emulating both north-south and east-west traffic patterns simultaneously, across distributed environments and under sustained load.

For north-south paths, this includes verifying that large datasets can be reliably pulled from diverse external sources into local storage, with consistent throughput, predictable latency and no silent data loss. Transfers like these are essential, as any inefficiency propagates directly into longer training times and underutilized GPUs.

For east-west paths, the challenge shifts to connection density, latency, and scalability. Once workloads are running, virtual machines and services exchange data continuously. Sometimes within the same host, sometimes across racks, and sometimes across geographically separated data centers. Modern AI platforms increasingly rely on SmartNICs and offload technologies to make this feasible, so these components must also be validated under realistic connection rates and protocol behavior.

Without large-scale, workload-accurate testing, subtle bottlenecks — such as rule-processing limits, connection-tracking inefficiencies, or unexpected latency spikes — can remain hidden until production traffic exposes them.

Front-end optimization is a competitive differentiator

In response, the most advanced AI platform operators are shifting left: validating their front-end networks before customers ever deploy workloads. Along the way, their proactive approach is changing the economics of AI infrastructure.

Stress-testing networks under real-world conditions offers a range of benefits for network operators:

  • Identifying performance cliffs at high line rates
  • Understanding how different layers of the stack interact under load
  • Resolving scaling limitations in NICs, virtual networking, or storage paths
  • Delivering predictable performance across tenants and geographies

It’s not just about improving peak throughput. It’s about building confidence that platforms perform as expected under peak pressure. And in a market where AI workloads are expensive, time-sensitive, and strategically important, this confidence becomes a differentiator. Customers may never see the network directly, but they feel its impact in faster training cycles, lower inference latency, and fewer production surprises.

Looking ahead: front-end networks and the next generation of AI

AI workloads continue to evolve. Microservices-based architectures, distributed inference pipelines, and increasingly stateful services are placing even more emphasis on low-latency, high-availability front-end connectivity. At the same time, data is becoming more geographically distributed, pushing platforms to span multiple regions and network domains.

In this environment, front-end networks are no longer a supporting actor. They are a core component of AI system design. That means they must be engineered, validated, and optimized with the same rigor applied to compute and accelerators.

The lesson is clear: operators cannot optimize AI infrastructure by focusing on GPUs alone. The performance, efficiency, and reliability of tomorrow’s AI platforms will be defined just as much by how well they move data as by how fast they process it.

The post AI’s Overlooked Bottleneck: Why Front-End Networks Are Crucial to AI Data Center Performance appeared first on Data Center POST.

  •  

AI Workloads and the Implications for High-Density Data Centre Design

AI workloads are pushing data centre infrastructure towards higher rack densities, new cooling strategies and greater power demand. Jamie Darragh, Data Centre Director, Europe, at global data centre engineering design consultancy Black & White Engineering, examines the design implications for the next generation of facilities.

AI and high-performance computing are placing new demands on data centre infrastructure. Rack densities are increasing; facilities are being delivered at larger scale and operators are under pressure to support workloads that consume far greater levels of power and generate far higher heat loads than conventional cloud environments.

Independent forecasts underline the pace of expansion. Gartner estimates global data centre electricity consumption will rise from around 448 TWh in 2025 to roughly 980 TWh by 2030, driven largely by AI-optimised computing infrastructure. Within that growth, AI servers alone are expected to account for close to 44% of data centre power consumption by the end of the decade.

For our engineering teams, these workloads are altering the practical limits of traditional infrastructure design. Rack densities of 100–200 kW and beyond are now appearing in project specifications, particularly where large AI training clusters are planned. These loads influence every part of the building environment, from electrical distribution and cooling capacity to structural loading and cable management.

Designing for extreme density

Under these conditions, air cooling alone becomes difficult to sustain across entire facilities. Liquid cooling is therefore increasingly included in the baseline design of new data centres rather than introduced later as a specialist solution. This cooling method is becoming increasingly favourable due to its higher specific thermal capacity compared with air, which enables more efficient heat transfer and removal. Direct-to-chip and rack-level systems are being designed alongside air cooling so facilities can accommodate different densities and equipment types across the same site.
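
The thermal-capacity argument is easy to quantify. Using standard property values for air and water near room temperature, and an assumed 10 K coolant temperature rise, the heat carried per unit volumetric flow, Q = ρ · V̇ · cp · ΔT, differs by more than three orders of magnitude:

    #include <cstdio>

    int main() {
        const double dT = 10.0;           // assumed coolant temperature rise, K
        const double rho_air = 1.18;      // kg/m^3, air at ~25 C
        const double cp_air = 1005.0;     // J/(kg K)
        const double rho_water = 997.0;   // kg/m^3, water at ~25 C
        const double cp_water = 4180.0;   // J/(kg K)

        // Heat carried away by 1 m^3/s of each coolant at a 10 K rise:
        double q_air = rho_air * cp_air * dT;        // ~11.9 kW
        double q_water = rho_water * cp_water * dT;  // ~41.7 MW

        std::printf("Air:   %.1f kW per m^3/s\n", q_air / 1e3);
        std::printf("Water: %.1f MW per m^3/s\n", q_water / 1e6);
        std::printf("Ratio: ~%.0fx\n", q_water / q_air);
        return 0;
    }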

The introduction of liquid systems requires careful coordination between disciplines. Facilities must manage environments where air and liquid cooling operate together, supported by monitoring platforms, safety controls and operational procedures capable of supporting both approaches.

Some IT chips require different liquid cooling temperatures than those used in air-cooling systems, creating technical hurdles for the overall heat rejection system and requiring precise control of the cooling circuit temperature. Another engineering challenge lies in integrating these systems with power distribution, control platforms and maintenance strategies rather than selecting one cooling method over another.

Higher density also narrows operational tolerance. Commissioning becomes more demanding and redundancy strategies require more detailed modelling. Infrastructure must be capable of supporting peak compute demand while maintaining efficiency when loads are lower, placing greater emphasis on flexible electrical and mechanical systems.

The scale of development is also increasing. Buildings that once delivered a few megawatts of capacity are now part of campus-scale developments where multiple data halls contribute to facilities delivering hundreds of megawatts. Data centres are increasingly planned and delivered as long-term infrastructure assets rather than individual projects.

This environment encourages repeatable design and industrialised delivery methods. Developers and investors expect predictable construction schedules and consistent performance across multiple sites. As a result, engineering teams are placing greater emphasis on modular infrastructure systems and digital design methods that allow mechanical and electrical systems to be configured and deployed repeatedly.

Power, control and operational intelligence

Power availability is also becoming a determining factor in project planning. In many regions, grid connection capacity is now one of the main constraints on new development. Gartner has warned that by 2027 as many as 40% of AI data centres could face operational limits because of power availability.

Developers are therefore engaging more closely with utilities during early feasibility stages and exploring complementary infrastructure such as on-site generation and energy storage. In some cases, data centres are also being designed to contribute to wider grid stability through demand response and energy management capability.

Artificial intelligence is also beginning to influence how facilities themselves are operated. Machine-learning systems are already being used in some environments to optimise airflow patterns, cooling plant performance and power distribution using live operational data.

The next stage will see more widespread use of integrated control platforms and digital twins capable of modelling facility behaviour in real time. These systems allow operators to simulate infrastructure performance under different load conditions, test operational changes and identify maintenance requirements before faults occur.

Environmental performance remains another constraint as compute density increases. Higher workloads place additional pressure on energy supply while raising questions around water consumption, construction materials and waste heat recovery. Planning authorities and investors are increasingly looking for measurable improvements in efficiency and carbon reporting before approving new developments. Sustainability therefore sits alongside power and cooling as a central engineering consideration rather than a secondary design feature.

Taken together, these conditions create a more complex design environment for data centre infrastructure. Higher compute densities, power constraints and new operational technologies require mechanical, electrical and digital systems to be considered together from the earliest design stages.

Facilities intended to support AI workloads must accommodate far greater performance requirements than earlier generations of data centres while remaining adaptable as infrastructure technologies and operating practices continue to develop.

# # #

About the Author

Jamie Darragh is Data Centre Director, Europe at Black & White Engineering. He leads the delivery of complex, mission-critical projects across the region, with a focus on technical quality, design coordination and strong client relationships. A Chartered Engineer and member of CIBSE and the IET, Jamie has worked across Europe, the Middle East and the UK since 2005. He brings a clear, practical approach to engineering challenges, combining technical expertise with commercial awareness. He is committed to developing teams that work collaboratively and perform at a high level. Jamie has received several industry awards recognising both his technical capability and his impact on the built environment, including ‘Engineer of the Year’ at leading Middle East industry awards.

The post AI Workloads and the Implications for High-Density Data Centre Design appeared first on Data Center POST.

  •  

VIDEO: Grid-forming, LDES and solar hybrids in Europe, with Envision Energy’s Michael Koller

Envision Energy's director of energy storage solutions, Michael Koller, speaks with Energy-Storage.news at Energy Storage Summit 2026 in London, UK.

  •  

AI and cooling: toward more automation

AI is increasingly steering the data center industry toward new operational practices, where automation, analytics and adaptive control are paving the way for “dark” — or lights-out, unstaffed — facilities. Cooling systems, in particular, are leading this shift. Yet despite AI’s positive track record in facility operations, one persistent challenge remains: trust.

In some ways, AI faces a similar challenge to that of commercial aviation several decades ago. Even after airlines had significantly improved reliability and safety performance, making air travel not only faster but also safer than other forms of transportation, it still took time for public perceptions to shift.

That same tension between capability and confidence lies at the heart of the next evolution in data center cooling controls. As AI models improve in performance and become better understood, transparent and explainable, the question is no longer whether AI can manage operations autonomously, but whether the industry is ready to trust it enough to turn off the lights.

AI’s place in cooling controls

Thermal management systems, such as CRAHs, CRACs and airflow management, represent the front line of AI deployment in cooling optimization. Their modular nature enables the incremental adoption of AI controls, providing immediate visibility and measurable efficiency gains in day-to-day operations.

AI can now be applied across four core cooling functions:

  • Dynamic setpoint management. Continuously recalibrates temperature, humidity and fan speeds to match load conditions (a minimal control-loop sketch follows this list).
  • Thermal load forecasting. Predicts shifts in demand and makes adjustments in advance to prevent overcooling or instability.
  • Airflow distribution and containment. Uses machine learning to balance hot and cold aisles and stage CRAH/CRAC operations efficiently.
  • Fault detection, predictive and prescriptive diagnostics. Identifies coil fouling, fan oscillation, or valve hunting before they degrade performance.
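
As a rough illustration of the first function, here is a minimal proportional control loop that trims CRAH fan speed toward a supply-air setpoint. Production systems layer forecasting and machine learning on top of far more sophisticated control, so the gain, limits and one-line plant model here are purely illustrative assumptions:

    #include <algorithm>
    #include <cstdio>

    int main() {
        const double target_c = 24.0;   // supply-air setpoint (assumed)
        const double kp = 8.0;          // proportional gain, %/K (assumed)
        double fan_pct = 50.0;          // current fan command
        double supply_c = 27.0;         // current supply-air reading

        for (int step = 0; step < 5; ++step) {
            double error = supply_c - target_c;
            // Raise fan speed when too warm, lower it when too cool.
            fan_pct = std::clamp(fan_pct + kp * error, 20.0, 100.0);
            // Crude stand-in for the plant: more airflow pulls supply temp down.
            supply_c -= 0.02 * (fan_pct - 50.0);
            std::printf("step %d: fan %.0f%%, supply %.2f C\n",
                        step, fan_pct, supply_c);
        }
        return 0;
    }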

A growing ecosystem of vendors is advancing AI-driven cooling optimization across both air- and water-side applications. Companies such as Vigilent, Siemens, Schneider Electric, Phaidra and Etalytics offer machine learning platforms that integrate with existing building management systems (BMS) or data center infrastructure management (DCIM) systems to enhance thermal management and efficiency.

Siemens’ White Space Cooling Optimization (WSCO) platform applies AI to match CRAH operation with IT load and thermal conditions, while Schneider Electric, through its Motivair acquisition, has expanded into liquid cooling and AI-ready thermal systems for high-density environments. In parallel, hyperscale operators, such as Google and Microsoft, have built proprietary AI engines to fine-tune chiller and CRAH performance in real time. These solutions range from supervisory logic to adaptive, closed-loop control. However, all share a common aim: improve efficiency without compromising compliance with service level agreements (SLAs) or operator oversight.

The scope of AI adoption

While IT cooling optimization has become the most visible frontier, conversations with AI control vendors reveal that most mature deployments still begin at the facility water loop rather than in the computer room. Vendors often start with the mechanical plant and facility water system because these areas present fewer variables, such as temperature differentials, flow rates and pressure setpoints, and can be treated as closed, well-bounded systems.

This makes the water loop a safer proving ground for training and validating algorithms before extending them to computer room air cooling systems, where thermal dynamics are more complex and influenced by containment design, workload variability and external conditions.

Predictive versus prescriptive: the maturity divide

AI in cooling is evolving along a maturity spectrum — from predictive insight to prescriptive guidance and, increasingly, to autonomous control. Table 1 summarizes the functional and operational distinctions among these three stages of AI maturity in data center cooling.

Table 1: Predictive, prescriptive and autonomous AI in data center cooling

Most deployments today stop at the predictive stage, where AI enhances situational awareness but leaves action to the operator. Achieving full prescriptive control will require not only a deeper technical sophistication but also a shift in mindset.

Technically, it is more difficult to engineer because the system must not only forecast outcomes but also choose and execute safe corrective actions within operational limits. Operationally, it is harder to trust because it challenges long-held norms about accountability and human oversight.

The divide, therefore, is not only technical but also cultural. The shift from informed supervision to algorithmic control is redefining the boundary between automation and authority.

AI’s value and its risks

No matter how advanced the technology becomes, cooling exists for one reason: maintaining environmental stability and meeting SLAs. AI-enhanced monitoring and control systems support operating staff by:

  • Predicting and preventing temperature excursions before they affect uptime.
  • Detecting system degradation early and enabling timely corrective action.
  • Optimizing energy performance under varying load profiles without violating SLA thresholds.

Yet efficiency gains mean little without confidence in system reliability. It is also important to clarify that AI in data center cooling is not a single technology. Control-oriented machine learning models, such as those used to optimize CRAHs, CRACs and chiller plants, operate within physical limits and rely on deterministic sensor data. These differ fundamentally from language-based AI models such as GPT, where “hallucinations” refer to fabricated or contextually inaccurate responses.

At the Uptime Network Americas Fall Conference 2025, several operators raised concerns about AI hallucinations — instances where optimization models generate inaccurate or confusing recommendations from event logs. In control systems, such errors often arise from model drift, sensor faults, or incomplete training data, not from the reasoning failures seen in language-based AI. When a model’s understanding of system behavior falls out of sync with reality, it can misinterpret anomalies as trends, eroding operator confidence faster than it delivers efficiency gains.

The discomfort is not purely technical; it is also human. Many data center operators remain uneasy about letting AI take the controls entirely, even as they acknowledge its potential. In AI’s ascent toward autonomy, trust remains the runway still under construction.

Critically, modern AI control frameworks are being designed with built-in safety, transparency and human oversight. For example, Vigilent, a provider of AI-based optimization controls for data center cooling, reports that its optimizing control switches to “guard mode” whenever it is unable to maintain the data center environment within tolerances. Guard mode brings on additional cooling capacity (at the expense of power consumption) to restore SLA-compliant conditions. Typical triggers include rapid drift or temperature hot spots. There is also a manual override option, which enables the operator to take control, informed by monitoring and event logs.

This layered logic provides operational resiliency by enabling systems to fail safely: guard mode ensures stability, manual override guarantees operator authority, and explainability, via decision-tree logic, keeps every AI action transparent. Even in dark-mode operation, alarms and reasoning remain accessible to operators.
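
The layering reduces to a simple priority rule: operator override first, guard mode whenever tolerances are violated, AI optimization otherwise. A schematic sketch of that rule (an illustration of the concept, not Vigilent’s implementation):

    #include <cstdio>

    enum class Mode { AiOptimized, Guard, ManualOverride };

    Mode select_mode(bool operator_override, bool within_tolerance) {
        if (operator_override) return Mode::ManualOverride; // operator authority wins
        if (!within_tolerance) return Mode::Guard;          // fail safe, log the reason
        return Mode::AiOptimized;                           // optimize for efficiency
    }

    int main() {
        struct { bool override_; bool in_tol; const char* note; } cases[] = {
            {false, true,  "steady state -> AI optimizes for efficiency"},
            {false, false, "hot spot / rapid drift -> guard mode adds cooling"},
            {true,  false, "operator takes control -> manual override"},
        };
        for (const auto& c : cases) {
            Mode m = select_mode(c.override_, c.in_tol);
            std::printf("%s (mode %d)\n", c.note, static_cast<int>(m));
        }
        return 0;
    }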

These frameworks directly address one of the primary fears among data center operators: losing visibility into what the system is doing.

Outlook

Gradually, the concept of a dark data center, one operated remotely with minimal on-site staff, has shifted from being an interesting theory to a desirable strategy. In recent years, many infrastructure operators have increased their use of automation and remote-management tools to enhance resiliency and operational flexibility, while also mitigating low staffing levels. Cooling systems, particularly those governed by AI-assisted control, are now central to this operational transformation.

Operational autonomy does not mean abandoning human control; it means achieving reliable operation without the need for constant supervision. Ultimately, a dark data center is not about turning off the lights; it is about turning on trust.


The Uptime Intelligence View

AI in thermal management has evolved from an experimental concept into an essential tool, improving efficiency and reliability across data centers. The next step — coordinating facility water, air and IT cooling liquid systems — will define the evolution toward greater operational autonomy. However, the transition to “dark” operation will be as much cultural as it is technical. As explainability, fail-safe modes and manual overrides build operator confidence, AI will gradually shift from copilot to autopilot. The technology is advancing rapidly; the question is how quickly operators will adopt it.

The post AI and cooling: toward more automation appeared first on Uptime Institute Blog.

  •  

Accelerating Diffusion Models with an Open, Plug-and-Play Offering


Recent advances in large-scale diffusion models have revolutionized generative AI across multiple domains, from image synthesis to audio generation, 3D asset creation, molecular design, and beyond. These models have demonstrated unprecedented capabilities in producing high-quality, diverse outputs across various conditional generation tasks. Despite these successes…

Source

  •  

How to Unlock Local Detail in Coarse Climate Projections with NVIDIA Earth-2


Global climate models are good at the big picture—but local climate extremes, like hurricanes and typhoons, often disappear in the details. Those patterns are still there—you just need the right tools to unlock them in high-resolution climate data. Using NVIDIA Earth‑2, this blog post shows you how to downscale coarse climate projections into higher-resolution, bias‑corrected fields—revealing…

Source

  •  

Streamlining CUB with a Single-Call API


The C++ template library CUB is a go-to for high-performance GPU primitive algorithms, but its traditional “two-phase” API, which separates memory estimation from allocation, can be cumbersome. While this programming model offers flexibility, it often results in repetitive boilerplate code. This post explains the shift from this API to the new CUB single-call API introduced in CUDA 13.1…
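
For readers unfamiliar with the pattern the excerpt refers to, this is roughly what the long-standing two-phase idiom looks like with cub::DeviceReduce::Sum: the first call only sizes temporary storage, the caller allocates it, and the identical call is repeated to do the work. The excerpt does not show the new single-call signature, so only the classic form is sketched here:

    #include <cub/cub.cuh>
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        const int n = 1 << 20;
        int *d_in = nullptr, *d_out = nullptr;
        cudaMalloc(&d_in, n * sizeof(int));
        cudaMalloc(&d_out, sizeof(int));
        cudaMemset(d_in, 0, n * sizeof(int));  // placeholder input data

        // Phase 1: pass nullptr to query the required temp storage size.
        void* d_temp = nullptr;
        size_t temp_bytes = 0;
        cub::DeviceReduce::Sum(d_temp, temp_bytes, d_in, d_out, n);

        // Phase 2: allocate, then repeat the identical call to run the reduction.
        cudaMalloc(&d_temp, temp_bytes);
        cub::DeviceReduce::Sum(d_temp, temp_bytes, d_in, d_out, n);

        cudaDeviceSynchronize();
        std::printf("reduction done (temp storage: %zu bytes)\n", temp_bytes);
        cudaFree(d_temp); cudaFree(d_in); cudaFree(d_out);
        return 0;
    }

It is this duplicated call and manual allocation dance that the post describes the CUDA 13.1 single-call API as eliminating.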

Source

  •  

How to Write High-Performance Matrix Multiply in NVIDIA CUDA Tile


This blog post is part of a series designed to help developers learn NVIDIA CUDA Tile programming for building high-performance GPU kernels, using matrix multiplication as a core example. Before you begin, be sure your environment meets the requirements listed in the quickstart…

Source

  •  

Learn How NVIDIA cuOpt Accelerates Mixed Integer Optimization using Primal Heuristics


NVIDIA cuOpt is a GPU-accelerated optimization engine designed to deliver fast, high-quality solutions for large, complex decision-making problems. Mixed integer programming (MIP) is a technique for solving problems that can be modeled by a set of linear constraints, with some of the variables restricted to integer values. The types of problems that can be modeled as MIP are numerous and…
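
Written out, the generic form the excerpt describes is the standard MIP formulation, where I indexes the integer-restricted variables (the objective sense and variable bounds vary by model):

    \begin{aligned}
      \min_{x} \quad & c^{\top} x \\
      \text{s.t.} \quad & A x \le b, \\
      & x_i \in \mathbb{Z} \ \text{for } i \in I, \qquad x_j \ge 0 \ \text{for } j \notin I
    \end{aligned}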

Source

  •  

Building Generalist Humanoid Capabilities with NVIDIA Isaac GR00T N1.6 Using a Sim-to-Real Workflow 


To make humanoid robots useful, they need cognition and loco-manipulation that span perception, planning, and whole-body control in dynamic environments. Building these generalist robots requires a workflow that unifies simulation, control, and learning for robots to acquire complex skills before transferring into the real world. In this post, we present NVIDIA Isaac GR00T N1.6…

Source

  •  

Build and Orchestrate End-to-End SDG Workflows with NVIDIA Isaac Sim and NVIDIA OSMO 


As robots take on increasingly dynamic mobility tasks, developers need physics-accurate simulations that translate across environments and workloads. Training robot policies and models to do these tasks requires a large amount of diverse, high-quality data, which is often expensive and time-consuming to collect in the physical world. Therefore, generating synthetic data at scale using cloud…

Source

  •  

Simplify Generalist Robot Policy Evaluation in Simulation with NVIDIA Isaac Lab-Arena


Generalist robot policies must operate across diverse tasks, embodiments, and environments, requiring scalable, repeatable simulation-based evaluation. Setting up large-scale policy evaluations is tedious and manual. Without a systematic approach, developers need to build high-overhead custom infrastructure, yet task libraries remain limited in complexity and diversity.

Source

  •  

AI Factories, Physical AI, and Advances in Models, Agents, and Infrastructure That Shaped 2025


2025 was another milestone year for developers and researchers working with NVIDIA technologies. Progress in data center power and compute design, AI infrastructure, model optimization, open models, AI agents, and physical AI redefined how intelligent systems are trained, deployed, and moved into the real world. These posts highlight the innovations that resonated most with our readers.

Source

  •  

Accelerating AI-Powered Chemistry and Materials Science Simulations with NVIDIA ALCHEMI Toolkit-Ops


Machine learning interatomic potentials (MLIPs) are transforming the landscape of computational chemistry and materials science. MLIPs enable atomistic simulations that combine the fidelity of computationally expensive quantum chemistry with the scaling power of AI. Yet, developers working at this intersection face a persistent challenge: the lack of a robust, Pythonic toolbox for GPU…

Source

  •  

Solving Large-Scale Linear Sparse Problems with NVIDIA cuDSS


Solving large-scale problems in Electronic Design Automation (EDA), Computational Fluid Dynamics (CFD), and advanced optimization workflows has become the norm as chip designs, manufacturing, and multi-physics simulations have grown in complexity. These workloads push traditional solvers and require unprecedented scalability and performance. The NVIDIA CUDA Direct Sparse Solver (cuDSS) is built…

Source

  •  

Simulate Robotic Environments Faster with NVIDIA Isaac Sim and World Labs Marble


Building realistic 3D environments for robotics simulation has traditionally been a labor-intensive process, often requiring weeks of manual modeling and setup. Now, with generative world models, you can go from a text prompt to a photorealistic, simulation-ready world in a fraction of the time. By combining NVIDIA Isaac Sim, an open source robotics reference framework, with generative models such as…

Source

  •