Summary
Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcomed. Please reach out to info@qcon.ai with any comments or concerns.
The presentation, "Designing AI Platforms for Reliability: Tools for Certainty, Agents for Discovery," was delivered by Aaron Erickson, who founded the Applied AI Lab for DGX Cloud at NVIDIA. The talk focuses on building AI platforms that combine deterministic precision with probabilistic exploration to achieve both reliability and adaptability.
Key Concepts Discussed:
- Dual-Layer Approach: Combining deterministic systems for precise, high-stakes operations such as security and compliance, with probabilistic agents to address complex and evolving challenges.
- Practical Applications: Examples include detecting anomalous clusters and health agents that debate diagnostic hypotheses, showing how reliable outcomes and adaptable systems can be integrated in practice.
- AI Agents and Tools: Effective AI agents are those with access to useful tools, governed by guardrails, and capable of learning from feedback loops to enhance accuracy and reliability over time.
- Agent Types: The talk distinguishes worker agents, which manage tasks across numerous records, from ruminative agents, which analyze continuously to improve task execution.
- Real-World Implementation: NVIDIA's Llo11yPop project, which applies AI agents to govern GPU resources at global scale, was highlighted as an example of these principles in practice.
Conclusion: The presentation emphasized that AI platforms of the future will need to be both precise and explorative, balancing deterministic tools for certainty with agents designed for discovery. This balanced approach is essential for developing more capable and trustworthy AI systems that can operate effectively in uncertain and rapidly evolving environments.
This is the end of the AI-generated content.
Modern AI platforms don’t have to choose between deterministic precision and probabilistic exploration—they need both. Deterministic tools provide the certainty required for high-stakes operations like transactions, security, and compliance, while probabilistic agents bring adaptability and discovery to complex, evolving problems. In this talk, we’ll explore how to design platforms that combine these modes effectively: long-running agents grounded by frequent truth checks, tools that guarantee reliable outcomes where variability is unacceptable, and hybrid systems that thrive in uncertainty when the right tool for the job is probabilistic reasoning. Using real-world examples—from detecting anomalous clusters to health agents debating diagnostic hypotheses—we’ll show how this dual-layer approach leads to platforms that are not only more capable, but also more trustworthy.
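One way to picture the dual-layer idea is the minimal Python sketch below. It is an illustration under stated assumptions, not code from the talk or from any NVIDIA project: the `Reading` type, the 5% idle threshold, and the random `propose_suspects` function (a stand-in for a real probabilistic agent) are all hypothetical. The agent layer is free to guess, but a node only counts as anomalous once the deterministic tool confirms it, which plays the role of the frequent truth check.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    node: str        # hypothetical node name
    gpu_util: float  # GPU utilization in percent, 0-100


def deterministic_check(reading: Reading, low: float = 5.0) -> bool:
    """Tool layer: the same input always yields the same answer.

    The 5% threshold is an assumed value chosen for illustration.
    """
    return reading.gpu_util < low


def propose_suspects(readings, rng, k=3):
    """Agent layer stand-in: returns candidate nodes and may be wrong.

    A real agent would reason over telemetry; here we only sample at
    random to mimic a non-deterministic proposer.
    """
    nodes = [r.node for r in readings]
    return rng.sample(nodes, k=min(k, len(nodes)))


def investigate(readings, rng, rounds=5):
    """Run the agent for several rounds, grounding every guess with the tool."""
    by_node = {r.node: r for r in readings}
    confirmed = set()
    for _ in range(rounds):
        for node in propose_suspects(readings, rng):
            # Truth check: a guess is kept only if the tool agrees.
            if deterministic_check(by_node[node]):
                confirmed.add(node)
    return confirmed


if __name__ == "__main__":
    sample = [
        Reading("node-a", 92.0),
        Reading("node-b", 1.5),
        Reading("node-c", 60.0),
        Reading("node-d", 88.0),
    ]
    print(investigate(sample, random.Random(0)))
```

Whatever the agent proposes, the confirmed set can never contain a node the tool rejects, so variability stays confined to which nodes get examined and never reaches the final verdict.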
Speaker
Aaron Erickson
Senior Manager and Founder of the DGX Cloud Applied AI Lab @NVIDIA, Previously Engineer @ThoughtWorks, VP of Engineering @New Relic, CEO and Co-Founder @Orgspace
Aaron Erickson founded the Applied AI Lab for DGX Cloud at NVIDIA, which specializes in building foundation models and agentic systems to solve broad industry problems like time series-based anomaly detection. Previously, he held engineering leadership roles at ThoughtWorks and New Relic before founding Orgspace, a startup that pioneered generative AI–driven organizational design. He is the author of The Nomadic Developer and Professional F# 2.0, and most recently launched NVIDIA’s Llo11yPop project, applying AI agents to govern GPU resources at global scale.