People

Researchers, practitioners, and key figures in the AI alignment field.

Stuart Russell

Professor, UC Berkeley

Computer scientist and co-author of Artificial Intelligence: A Modern Approach, the standard AI textbook.

Eliezer Yudkowsky

Research Fellow, MIRI

Co-founder of MIRI, writer on rationality and AI risk.

Dario Amodei

CEO, Anthropic

Former VP of Research at OpenAI, co-founder of Anthropic.

Paul Christiano

Founder, ARC

Former OpenAI researcher, pioneer of RLHF and iterated amplification.

Jan Leike

Alignment Lead, Anthropic

Former DeepMind and OpenAI researcher, co-author of key RLHF papers.

Chris Olah

Co-founder, Anthropic

Pioneer of neural network interpretability and mechanistic understanding.

Neel Nanda

Research Scientist, Google DeepMind

Leading researcher in mechanistic interpretability and developer of TransformerLens.

Sam Bowman

Research Scientist, Anthropic

NLP researcher focused on language model evaluation and safety benchmarks.

Beth Barnes

Founder, METR

Former ARC researcher focused on AI evaluations and dangerous capability testing.

Amanda Askell

Character Lead, Anthropic

Philosopher leading Claude's character development, bridging ethics and AI.

Nick Bostrom

Philosopher, Oxford

Author of Superintelligence, founder of Future of Humanity Institute.

Yoshua Bengio

Professor, Mila

Turing Award winner, deep learning pioneer, founder of Mila, and AI safety advocate.

Ilya Sutskever

Co-founder, SSI

Former Chief Scientist at OpenAI, co-founder of Safe Superintelligence Inc.

Connor Leahy

CEO, Conjecture

Co-founder of EleutherAI, AI safety researcher and advocate.

Dan Hendrycks

Director, CAIS

Creator of MMLU and other AI benchmarks, leads Center for AI Safety.