Evals 101

Writing evals is a core skill for building AI products that actually work. Evals are our "definition of what good looks like". They are harder to get right than they seem, yet they are not rocket science: anyone can learn to write them.

What will we discuss?

  • What evals are (it's not rocket science)
  • How to write your first eval
  • The structure of a good evaluation
  • When to use or not use evals
  • How to set up evals in a project
  • Which tools to use, and when to use them
  • How to create useful data sets

Who is this for?

  • Anyone who is building things with AI, whether that is an internal tool, an agent, a product or a workflow
  • Anyone interested in learning how to do evals in a practical, no-nonsense way

1. Let's write some evals

Episode 1.1 · 3:37
Evals intro: set up your accounts

This is what we came for: some hands-on eval writing.

Episode 1.2 · 11:11
An introduction to evals

What are evals, why do we need them, and why isn't this just QA?

Episode 1.3 · 13:53
Let's write an eval together

This is the fun part: writing evals together, hands-on.

Episode 1.4 · 7:12
Eval Tips and Common Mistakes

Evals can be tricky, and it's easy to make mistakes that are very expensive in terms of quality, end result and cost.

Episode 1.5 · 15:15
How we define what Good looks like

One reason evals are tricky is that it can be hard to define what Good looks like when working in a team, as we are.

Episode 1.6 · 8:12
Creating Data Sets

There are no evals without data sets. How do we create solid data sets? How many data points are enough? What about synthetic data?

What participants say

“This is opening up all sorts of new neural pathways for me to see under the hood more of how the sausage is made! 🙏”
Maya Lindgren · Senior UX Designer
“Very timely at my enterprise software company as evaluation of AI features scales.”
Daniel Reyes · Principal Product Designer
“Everything I know about evals is from Peter's talk, which is why I'm back to find out more!”
Anneke Visser · UX Researcher
