Why We’re Building a Science of AI Evaluation

As artificial intelligence rapidly transforms the world, we need to ask not only “What can it do?” but also “Should it?” and “How do we know?”

For our first post on the AI Evaluation blog, we’re proud to feature Pablo Moreno Casares, one of the programme’s Academic Directors.

In this article, Pablo explains the thinking behind the programme: why it was created, how it differs from existing initiatives, and the kind of people it hopes to train.

His post, originally published on LessWrong and the EA Forum, outlines the motivations, structure, and ambition that guide our mission: to establish AI evaluation as a formal academic discipline.


Summary

I am helping set up a new academic skilling-up program centred on AI evaluations and their intersection with AI safety. Our goal is to train the people who will determine whether AI is safe and beneficial. The training covers the various methodologies and tools available, and how to use them.

You can learn more at https://ai-evaluation.org/ and apply here.

Background

Available programs to skill up in AI safety broadly fall into a few categories: bootcamps (e.g. ML4Good or ARENA), online courses (e.g. BlueDot Impact), and mentorship-based research programs (e.g. MATS). We believe there is a niche, in both content and format, that is not currently being filled: that is why we are launching the AI evaluation program, an academic program focused on evaluations.

On the format side, we believe that as AI safety matures into a well-defined research area, it should have academic courses. We are launching this pilot with the intention of converting it into a full academic master's program in the coming academic year. While it is possible to do great work without an academic background, we believe academic credentials may help provide the signal and reliability needed to shape the development of AI and inform key decision-makers.

On the content side, we believe that AI evaluation is one of the main sets of techniques available for making decisions about AI. However, it is also a set of techniques that is frequently conflated: “evaluation” might refer to benchmarking, but also to red-teaming. This leads to gaps in knowledge about the available methodologies and about how to combine them to address their individual limitations. Our aim is to provide a comprehensive understanding of those methodologies, of the tools participants may leverage (e.g. Inspect, Hugging Face; see the sketch below), and of adjacent areas.
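To make the distinction concrete, here is a minimal sketch of what a benchmark-style evaluation looks like in Inspect (the open-source inspect_ai framework). The two-question dataset and the model identifier are illustrative placeholders, not part of the curriculum; they are only meant to show the shape of an automated evaluation.

    # A minimal benchmark-style evaluation in Inspect (inspect_ai).
    # The inline dataset and model name below are illustrative
    # placeholders; a real evaluation would use a curated dataset.
    from inspect_ai import Task, task, eval
    from inspect_ai.dataset import Sample
    from inspect_ai.solver import generate
    from inspect_ai.scorer import match

    @task
    def capital_cities():
        return Task(
            dataset=[
                Sample(input="What is the capital of France? Answer in one word.",
                       target="Paris"),
                Sample(input="What is the capital of Japan? Answer in one word.",
                       target="Tokyo"),
            ],
            solver=generate(),  # query the model once per sample
            scorer=match(),     # check the target string against the output
        )

    # Run the evaluation (requires an API key for the chosen provider).
    eval(capital_cities(), model="openai/gpt-4o")

A red-teaming evaluation, by contrast, would replace the fixed dataset and scorer with adversarial probing; understanding when each methodology applies, and how they complement each other, is precisely what the program aims to teach.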

Curriculum

Participants will explore the following modules, each with an evaluation activity designed to provide hands-on experience.

  • Introduction to AI Evaluation

  • AI Architectures: Large Language Models and Beyond

  • Metrics and Experimental Methodology

  • Benchmarks, Leaderboards, and Competitions

  • Red-teaming Evaluations

  • Construct-Based Evaluation

  • Mechanistic Interpretability

  • AI Alignment and Control Evaluations

  • Governance, Policy, and Regulation of AI

  • Real-world Evaluations: Societal Impacts of AI

There will also be small workshops on specific applications of AI evaluations, such as CBRN- and cybersecurity-oriented evaluations. The program culminates in a Capstone Project, in which teams design and carry out an original evaluation study under the guidance of expert mentors, aiming to produce publishable-quality work.

Who is this program for?

The main target profiles of this program are (i) graduate-level students, and (ii) people who may occupy decision-making positions, including technical researchers in AI safety teams who want a solid understanding of the full range of AI evaluation techniques. If in doubt, we encourage you to apply.

The program is composed of a virtual component (February to April 2026) and a one-week in-person event (May 2026, in Valencia). It should require about a quarter of the workload of a typical master's program, taken over one semester: approximately 20 hours of work per week.

Acknowledgements

We would like to thank the teaching faculty, who make the magic happen, as well as Open Philanthropy, for betting on us.