Writing evals is a core skill for making AI products that actually work. Evals are our "definition of what good looks like". They are harder to get right than they seem, yet they are not rocket science at all: anyone can learn to write evals.
This is what we came for: some hands-on eval writing.
What are evals, why do we need them, and why isn't this just QA?
This is the fun part: writing evals together, hands-on.
Evals can be tricky, and it's easy to make mistakes that are very expensive in terms of quality, end result, and cost.
One reason evals are tricky is that it can be hard to define what good looks like when working in a team, as we are.
There are no evals without data sets. How do we create solid data sets? How many data points are enough? What about synthetic data?
“This is opening up all sorts of new neural pathways for me to see under the hood more of how the sausage is made! 🙏”
“Very timely at my enterprise software company as evaluation of AI features scales.”
“Everything I know about evals is from Peter's talk, which is why I'm back to find out more!”