
Evaluating Multi-Agent / Social Systems — Prof. Joel Z. Leibo

Most AI evaluation focuses on individual models — one system, one benchmark. But AI systems increasingly operate alongside other agents and alongside us. How do you evaluate an AI that cooperates, competes, deceives, or negotiates? That's a fundamentally different problem, and one that becomes more urgent as AI agents are deployed in the real world.

Prof. Leibo pioneered work on this question at Google DeepMind, creating Melting Pot, a benchmark suite of 50+ environments and 256 test scenarios for evaluating social generalization in AI agents, and Concordia, a platform for simulating social interactions between language-model-based agents. This talk explores how we evaluate AI not as isolated tools, but as social actors.

Dr. Joel Z. Leibo is a Senior Staff Research Scientist at Google DeepMind and a visiting professor at King's College London. He specializes in multi-agent reinforcement learning and the development of human-compatible artificial intelligence.

He holds a PhD from MIT in computational neuroscience and machine learning, and his research draws on insights from human biological and cultural evolution to inform AI development.

He is particularly interested in applying theories of cooperation from cultural evolution and institutional economics to create ethical and effective AI systems.

Want to join this session?

Sign up to register for the session and get notified about upcoming lectures.
