Why evaluating agents is different from evaluating prompts or models. This episode sets the stage for the series: the shifting ground (models, users, inputs and outputs) and what makes a good agent eval.