Open Seminar Series
Access the knowledge
We've assembled world-class researchers to teach in the International Programme on AI Evaluation — but we don't want to gatekeep their expertise.
If we want AI evaluation to become a real discipline, the knowledge can't stay in a room of 40. Through the Open Seminar Series, we're opening select lectures from the programme to anyone who wants to learn.
What to expect
As our first cohort works through the programme's modules, we’ll open selected sessions to the wider community. These are live lectures by leading researchers — the same sessions our students attend — covering topics from uncertainty estimation to red-teaming to AI governance.
Sessions are free and delivered live on Zoom.
Meta-Evaluations: The Scientific and Institutional Foundations of AI Evaluation — Patricia Paskov
We use evaluations to judge AI systems.
But how do we know if those evaluations are reliable?
This session introduces meta-evaluation — assessing the quality, validity, and trustworthiness of AI evaluation methods, benchmarks, and audits.
With Patricia Paskov (RAND).
Sociotechnical Approach to AI Evaluation — Laura Weidinger
We often evaluate AI systems in isolation.
But their most important effects happen in the real world.
This session introduces the sociotechnical approach to AI evaluation — examining how systems behave when they interact with users, institutions, and society.
With Laura Weidinger (Google DeepMind).
(Past) Alignment Evaluation — Dr. Xiaoyuan Yi
Most alignment tests check if a model knows the “right answer.” But alignment is about behaviour.
This session explores new approaches to evaluating alignment — from static benchmarks to dynamic and adaptive methods.
With Xiaoyuan Yi (Microsoft Research Asia).
(Past) Evaluating AI Agents — Dr. Cozmin Ududec
AI systems don’t just answer questions anymore.
They act, plan, and make decisions over time.
This session explores how to evaluate agentic systems, where performance depends on sequences of actions, not single outputs.
With Cozmin Ududec (UK AI Security Institute).
(Past) Measurement Theory for AI Evaluation: Cognitive Constructs, IRT, and Latent Factor Models — Dr. Sanmi Koyejo
What are we actually measuring when we evaluate AI?
This session introduces tools from measurement theory — including cognitive constructs, Item Response Theory (IRT), and latent factor models — to better understand model capabilities.
With Sanmi Koyejo (Stanford University).
(Past) Evaluating Multi-Agent / Social Systems — Prof. Joel Z. Leibo
What happens when AI systems interact — with each other and with us?
This session explores evaluation in multi-agent environments, where behaviour emerges from interaction, not just individual models.
With Joel Z. Leibo (Google DeepMind).
(Past) Uncertainty Estimation — Prof. Thomas Dietterich
When an AI makes a prediction, how much should we trust it?
This session explores methods for estimating uncertainty, from understanding its sources to detecting when a model is out of its depth.
With Thomas Dietterich (Oregon State University).
Want to go deeper?
The International Programme on AI Evaluation trains 40 selected participants each year in a comprehensive 150-hour programme.