Our Mission

Our mission is simple yet urgent: to train the people who will determine whether AI is safe and beneficial. This is what AI evaluation is all about.

AI is moving faster than our ability to understand and evaluate its capabilities and risks. Right now, governments, companies, and researchers don’t have a shared language or tools to make sure these systems are safe, reliable, and aligned with human values.

That’s where this programme comes in. The International Programme on AI Evaluation: Capabilities & Safety is here to close that gap. We are creating a new academic discipline and preparing the next generation of specialists with the technical and policy expertise needed to test, monitor, and guide the safe development of AI.

Our Vision

We believe AI evaluation should become a cornerstone of how AI is built and used worldwide: a universal discipline taught in every top university, embedded in every lab and regulatory body, and serving as a safeguard so that advanced AI truly benefits humanity.

This programme is the first step towards establishing the world’s first MSc in AI Evaluation, and building a lasting home for this field - a place where researchers, practitioners, and policymakers can speak the same language and shape the future of AI together.

Meet the Team

  • José Hernández-Orallo, Steering Academic Director of the AI Evaluation programme. Research Director at the Leverhulme Centre for the Future of Intelligence, University of Cambridge, and Professor at Technical University of Valencia.

    José Hernández-Orallo

    Steering Academic Director

    Research Director at the Leverhulme Centre for the Future of Intelligence, University of Cambridge, and Professor at Technical University of Valencia. His work spans AI, machine learning, and intelligence measurement, with more than 200 publications and several books, including ‘The Measure of All Minds’ (Cambridge University Press, PROSE Award 2018). His research has been featured in The Economist, WSJ, FT, New Scientist, and Nature. He co-founded AI Evaluation Digest, and has advised European and global initiatives on AI governance. He is a EurAI Fellow and member of AAAI, CLAIRE, and ELLIS.

    LinkedIn | Scholar | Website

  • Xiting Wang, Academic Director of the AI Evaluation programme. Associate Professor at Renmin University of China and former Principal Researcher at Microsoft Research Asia.

    Xiting Wang

    Academic Director

    Associate Professor at Renmin University of China and former Principal Researcher at Microsoft Research Asia. Her research focuses on explainable and trustworthy AI, with 50+ publications and technologies deployed in Microsoft Bing and Microsoft News. She has served as area chair for IJCAI and AAAI, keynote speaker at SIGIR workshops, and editorial board member of Visual Informatics. She is an IEEE Senior Member and recipient of AAAI’s Best SPC Award.

    Scholar

  • Pablo Moreno Casares, Academic Director of the AI Evaluation programme. Quantum Algorithm Scientist at Xanadu.ai, with a PhD in Quantum Computation from Universidad Complutense de Madrid and an MSc in Theoretical and Mathematical Physics from the University of Oxford.

    Pablo Moreno Casares

    Academic Director

    Quantum Algorithm Scientist at Xanadu.ai, where he develops quantum algorithms with applications to chemistry. He is the creator of TFermion, a widely used software library for resource estimation in quantum phase estimation, published in the journal Quantum. Pablo holds a PhD in Quantum Computation and Information from Universidad Complutense de Madrid and an MSc in Theoretical and Mathematical Physics from the University of Oxford.

    LinkedIn | Scholar

  • Jindong Wang, Academic Director of the AI Evaluation programme. Assistant Professor at William & Mary and faculty member of the Future of Life Institute. Previously a Senior Researcher at Microsoft Research Asia.

    Jindong Wang

    Academic Director

    Assistant Professor at William & Mary and faculty member of the Future of Life Institute. Previously a Senior Researcher at Microsoft Research Asia, he works on machine learning, foundation models, and AI for social science. Among the world’s Top 2% Highly Cited Scientists (h-index 52), he has authored 60+ papers, a book on transfer learning, and leads open-source projects with 20K+ stars. He serves as area chair for ICML, NeurIPS, ICLR, and other major venues.

    LinkedIn | Scholar | Website

  • Joseph Castellano, Strategic Lead at the AI Evaluation programme. Safeguards Policy and Enforcement Analyst at Anthropic, and Executive Research Assistant at the Valencian Research Institute for Artificial Intelligence (VRAIN).

    Joseph Castellano

    Strategic Lead

    Safeguards Policy and Enforcement Analyst at Anthropic, and Executive Research Assistant at the Valencian Research Institute for Artificial Intelligence (VRAIN). He has also been a Visiting Research Fellow at the Centre for the Governance of AI (Oxford). With a Master of International Public Policy from Johns Hopkins SAIS and over 15 years of experience spanning government, academia, and industry, Joe specializes in AI governance, quantum, and cybersecurity.

    LinkedIn

  • Andreina Gómez Torres, Programme Manager for the AI Evaluation: Capabilities and Safety international programme. Experience in programme and operations management across NGOs, startups, government, and international organizations.

    Andreina Gómez Torres

    Programme Manager

    Over 13 years of experience in programme and operations management across NGOs, startups, government, and international organizations. Andreina has designed and scaled hybrid educational programmes serving thousands of learners across four continents. With a Master’s in International Relations from Syracuse University, she is committed to building rigorous, inclusive pathways into AI evaluation and safety.

    LinkedIn

Faculty

Our programme has the support of faculty from leading universities, including the University of Cambridge, Stanford, Princeton, Beijing Normal University, Renmin University of China, William & Mary, and the Technical University of Valencia.

Confirmed faculty also come from key institutions, research organizations, and companies such as the EU AI Office, the UK AI Safety Institute, CAIS, FAR AI, RAND, Epoch AI, Apollo Research, Redwood Research, Microsoft Research, and Google DeepMind.

Learn more about our Programme