
John Robert
John Robert is a Senior Data & Machine Learning Engineer at Sunnic Lighthouse (Enerparc AG), working at the intersection of data engineering, machine learning, and cloud platforms. He previously contributed to autonomous driving projects at Bosch and Mercedes-Benz, gaining hands-on experience with safety-critical AI systems. John is a frequent speaker at conferences such as PyCon, PyData, and Data Native, with a focus on AI safety, transparency, and trustworthy machine learning. Outside of work, he enjoys traveling and building projects such as PetWorld+ and SpreadsheetProAI.
Senior Data and AI platform at Enerparc AG, Germany
john-robert-587907103
Metrics That Actually Matter: Evaluating AI Agents Beyond Success Rate
AI Coding Summit 2026
Upcoming

This talk challenges the overreliance on success rate and introduces a more practical, safety-aware framework for evaluating AI agents. Drawing on real deployment scenarios, we explore metrics that better capture agent usefulness and reliability, including time-to-completion, tool-call efficiency, error recovery rate, and cost per successful task.

The session will show how teams unintentionally optimize the wrong metrics, leading to hidden failure modes such as excessive retries, inefficient tool usage, silent errors, and escalating operational costs. We will connect these issues to broader AI safety concerns, highlighting how poor evaluation practices can create misleading confidence in agent behavior.

Attendees will leave with concrete evaluation strategies, a clearer understanding of trade-offs in agent design, and practical guidance on building metrics that reflect real-world performance, not just benchmark wins.
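
As a rough illustration of the four metrics named in the abstract, the sketch below computes them from a list of run records. The abstract does not define the metrics, so everything here is an assumption for illustration: the `AgentRun` record, its fields, the `evaluate` function, and the formulas themselves (for example, treating tool-call efficiency as the ratio of the fewest calls needed to the calls actually made on successful runs).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AgentRun:
    """One agent attempt at a task (hypothetical log record)."""
    succeeded: bool
    seconds: float         # wall-clock time for the run
    tool_calls: int        # tool calls actually made
    min_tool_calls: int    # fewest calls a reference solution would need
    errors: int            # errors encountered during the run
    errors_recovered: int  # errors the agent recovered from
    cost_usd: float        # total spend for the run


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning NaN instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(runs: list[AgentRun]) -> dict[str, float]:
    """Compute success rate plus four complementary metrics (assumed definitions)."""
    wins = [r for r in runs if r.succeeded]
    return {
        "success_rate": _ratio(len(wins), len(runs)),
        # Average duration of successful runs only.
        "mean_time_to_completion_s": mean(r.seconds for r in wins) if wins else float("nan"),
        # 1.0 means no wasted calls; lower values mean retries or detours.
        "tool_call_efficiency": _ratio(
            sum(r.min_tool_calls for r in wins), sum(r.tool_calls for r in wins)
        ),
        # Share of all errors, across all runs, that the agent recovered from.
        "error_recovery_rate": _ratio(
            sum(r.errors_recovered for r in runs), sum(r.errors for r in runs)
        ),
        # Failed runs still cost money, so total spend is divided by successes.
        "cost_per_successful_task_usd": _ratio(sum(r.cost_usd for r in runs), len(wins)),
    }


# Made-up example data: two successes (one clean, one with many retries) and one failure.
runs = [
    AgentRun(True, 40.0, 4, 4, 0, 0, 0.10),
    AgentRun(True, 120.0, 16, 4, 3, 3, 0.50),
    AgentRun(False, 200.0, 20, 4, 5, 1, 0.90),
]
metrics = evaluate(runs)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With this made-up data the success rate is two out of three, yet tool-call efficiency is 0.4 and the cost per successful task includes the spend on the failed run, which is the kind of gap a single success-rate number would hide.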