The Secret Life of Azure: The Judge and the Jury
Building reliable systems with automated evaluation

#Azure #AIAgents #Evaluation #LLMAsAJudge

🎧 Audio Edition: Prefer to listen? Check out the expanded AI podcast version of this deep dive on YouTube.

📺 Video Edition: Prefer to watch? Check out the 7-minute visual explainer on YouTube.

Evaluation & Quality

The whiteboard was filled with the flowcharts from our last session, but Timothy was staring at a set of logs with a frustrated expression.

"Margaret," Timothy said, "the system is running, but the quality is inconsistent. Sometimes the Extraction Agent misses a field, and then the Inventory Agent tries to log null data. It's a chain reaction of errors. I can't sit here and manually check every single execution trace."

Margaret picked up a green marker and drew a new box that sat outside the main workflow, connected to the output of every agent.

"That's because you're treating the outp...