AI and ML workloads have changed the physics of the data center.
Racks that once ran comfortably at 5–10 kW are now climbing toward 50 kW and beyond. For next-generation accelerators, direct liquid cooling and other liquid architectures are quickly shifting from “interesting” to “essential.”
But choosing cooling technology is only half the battle. The real differentiator is how well your liquid cooling environment is managed over its entire lifecycle—from commissioning to day-to-day monitoring and remediation.
This is where Guardian operates: as your onsite cooling management partner for AI & ML data centers.
Why data centers need an onsite cooling management partner
Industry guidance from ASHRAE TC 9.9 and others is clear: as liquid cooling moves into mainstream data centers, operators must pay closer attention to coolant chemistry, materials compatibility, and long-term reliability.
At the same time, the Open Compute Project (OCP) ecosystem is rapidly developing new open standards for liquid-cooled racks and manifolds to support 1 MW-class AI deployments.
In practice, that creates three challenges for operators:
- Complexity at the rack and loop level: More components (manifolds, CDUs, quick-connects, sensors) mean more potential failure points.
- Stricter safety and compliance expectations: From ASHRAE guidance to OCP and vendor-specific requirements, the margin for error around leaks, conductivity, and environmental health is shrinking.
- Skills and staffing gaps: Most sites weren’t staffed with liquid cooling specialists. AI build-outs are happening faster than teams can hire and train.
An onsite cooling management partner fills this gap—working inside your facility, under your change controls, to keep liquid cooling infrastructure within safe operating envelopes.
Guardian’s Liquid Cooling Lifecycle Services
1. Commissioning & Validation
Guardian supports the commissioning and validation of liquid-cooled systems in production environments:
- Verifying correct installation of manifolds, cold plates, CDUs, and sensors
- Flushing and filling loops according to OEM and site specifications
- Performing pressure and leak tests, documenting baseline readings
- Validating that monitoring and alarm thresholds are configured correctly
- Reducing the risk that provisioning (bringing racks online) fails because of problems rooted in the liquid cooling system
For operators, this means new AI rows and pods pass into production with documented, repeatable commissioning procedures, reducing thermal risk from day one.
2. Preventative Maintenance Schedule (PM)
Cooling stability is not a “set and forget” exercise.
Guardian provides scheduled preventative maintenance for liquid-cooled environments, including:
- Periodic inspection of fittings, quick-connects, and hoses
- Filter replacement and inspection of strainers and strain-relief components
- Verification of pump operation, delta-T, and flow rates
- Health checks on sensors and controls tied to your DCIM/BMS
Instead of reacting to alarms, your team gains a planned maintenance rhythm aligned with ASHRAE and OEM recommendations—integrated into your existing change windows.
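As a rough illustration of why delta-T and flow rate are checked together, the heat a loop removes can be estimated from the two readings. The sketch below is a minimal example, not a Guardian procedure: the function name, the default density, and the default specific heat are assumptions (approximate values for plain water), and a glycol mix or engineered coolant would need the properties from its own data sheet.

```python
def heat_removed_kw(flow_lpm, supply_c, return_c,
                    density_kg_m3=997.0, cp_j_per_kg_k=4180.0):
    """Estimate heat carried away by a coolant loop, in kW.

    Uses Q = m_dot * cp * delta_T. The default density and specific heat
    are approximate values for water and are assumptions for illustration.
    """
    if flow_lpm < 0:
        raise ValueError("flow rate must be non-negative")
    delta_t = return_c - supply_c                    # temperature rise across the load, K
    volume_flow_m3_s = flow_lpm / 60.0 / 1000.0      # L/min -> m^3/s
    mass_flow_kg_s = volume_flow_m3_s * density_kg_m3
    return mass_flow_kg_s * cp_j_per_kg_k * delta_t / 1000.0  # W -> kW


if __name__ == "__main__":
    # 60 L/min with a 10 K rise works out to roughly 41.7 kW with the defaults.
    print(f"{heat_removed_kw(60.0, 30.0, 40.0):.1f} kW")
```

If the estimate falls well below the known rack load at a normal delta-T, that would point toward a flow or sensor problem worth inspecting during the PM visit.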
3. Fluid Management & Remediation
Coolant is the lifeblood of liquid cooling systems. If chemistry drifts, risk increases.
Guardian provides end-to-end fluid management, including:
- Sampling and testing coolant for conductivity, pH, corrosion inhibitors, and contamination
- Managing top-offs and replacements in line with OEM and chemistry provider guidance (e.g., low-conductivity coolants used in electrified systems and sensitive electronics)
- Remediation after incidents: draining, cleaning, and refilling affected loops; coordinating safe handling and disposal of contaminated fluids through approved environmental channels
This gives data centers a clear playbook for keeping fluids in spec and responding quickly when something goes wrong.
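Keeping fluids in spec comes down to comparing each lab result against an agreed range. The following is a minimal sketch of that comparison. The parameter names and every numeric limit in it are hypothetical placeholders, not values from ASHRAE, an OEM, or a chemistry provider, and real limits would come from the coolant's own specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float


# Hypothetical placeholder limits for illustration only.
# Replace with the ranges published by the OEM or chemistry provider.
EXAMPLE_LIMITS = {
    "conductivity_us_cm": Limit(0.0, 10.0),
    "ph": Limit(7.0, 9.5),
    "inhibitor_ppm": Limit(500.0, 2000.0),
}


def out_of_spec(sample, limits):
    """Return a list of findings for missing or out-of-range parameters."""
    findings = []
    for name, limit in limits.items():
        value = sample.get(name)
        if value is None:
            findings.append(f"{name}: missing")
        elif not (limit.low <= value <= limit.high):
            findings.append(
                f"{name}: {value} outside [{limit.low}, {limit.high}]"
            )
    return findings


if __name__ == "__main__":
    sample = {"conductivity_us_cm": 25.0, "ph": 8.0}
    for finding in out_of_spec(sample, EXAMPLE_LIMITS):
        print(finding)
```

An empty result means the sample is within the configured ranges. Anything else becomes an input to the top-off, replacement, or remediation decision.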
4. Onsite Monitoring & Response
Guardian can embed technicians onsite full time or have them visit on a recurring schedule to:
- Monitor key performance indicators (temperatures, flow, pressures, conductivity)
- Respond to cooling events and system alarms under your incident workflow
- Support physical inspections during changes, migrations, and hardware swaps
Instead of relying solely on remote dashboards, you get eyes and hands in the room focused on cooling resiliency for AI & ML workloads.
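One common way to turn KPI readings into events worth a technician's attention is to require several consecutive out-of-range samples before raising one, so a single noisy reading does not trigger a callout. The sketch below shows that idea only. The function name, the limit, and the sample count are assumptions for illustration, not part of any Guardian, DCIM, or BMS product.

```python
def sustained_breach(readings, high_limit, min_consecutive):
    """Return True if readings exceed high_limit for at least
    min_consecutive samples in a row."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    run = 0
    for value in readings:
        # Extend the current run on a breach, reset it otherwise.
        run = run + 1 if value > high_limit else 0
        if run >= min_consecutive:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical return-temperature samples in degrees C.
    temps = [41.0, 46.2, 46.8, 47.1, 44.0]
    print(sustained_breach(temps, high_limit=45.0, min_consecutive=3))
```

The threshold and the number of consecutive samples are site decisions. A shorter window reacts faster, and a longer one filters out more sensor noise.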
Safety & Compliance: Aligning with ASHRAE and OCP Expectations
Safety is non-negotiable when you introduce liquids into high-density electrical environments.
Guardian’s approach aligns with:
- ASHRAE liquid cooling guidance (e.g., TC 9.9 white papers on the emergence and expansion of liquid cooling in mainstream data centers and evolving 2024 guidelines).
- Evolving best practices across the Open Compute Project ecosystem, where OCP and ASHRAE are collaborating on liquid cooling performance and resilience requirements for AI data centers.
Guardian wraps these standards into field-tested operating procedures, documentation, and reporting so your risk, EHS, and compliance teams get the assurance they require.
What “direct” really means for data centers
A lot of vendors talk about liquid cooling. Fewer are ready to stand in your white space and own day-to-day operational outcomes.
“Direct” for Guardian means:
- Integrating into your change management, safety, and EHS processes
- Providing trained cooling technicians who specialize in liquid cooling environments
- Reporting in the language your leadership expects: uptime, risk, ESG, and compliance
You keep control of architecture, OEM selection, and long-term capacity planning. Guardian focuses on making sure your liquid cooling systems perform reliably every day.
Next step for AI & ML data centers
If you’re planning or expanding liquid cooling for AI/ML workloads, there are three practical actions:
- Audit your current state. Document where liquid is in the environment today (or will be soon), including architectures (DLC, rear door, immersion), fluids, and monitoring.
- Define clear cooling roles and responsibilities. Decide what your internal team owns vs. what an onsite cooling management partner like Guardian should handle.
- Pilot a lifecycle program on a single AI cluster. Start with one high-value AI/ML cluster and establish the full lifecycle: commissioning, maintenance, fluid management, and monitoring, with Guardian embedded in the process.
For your AI & ML clusters, what’s the single biggest concern you have about moving deeper into liquid cooling—safety, skills, long-term reliability, or something else?
