People
Researchers, practitioners, and key figures in the AI alignment field.
Stuart Russell
Professor, UC Berkeley
Computer scientist and co-author of Artificial Intelligence: A Modern Approach, the standard AI textbook.
Eliezer Yudkowsky
Research Fellow, MIRI
Co-founder of MIRI, writer on rationality and AI risk.
Dario Amodei
CEO, Anthropic
Former VP of Research at OpenAI, co-founder of Anthropic.
Paul Christiano
Founder, ARC
Former OpenAI researcher, pioneer of RLHF and iterated amplification.
Jan Leike
Alignment Lead, Anthropic
Former DeepMind and OpenAI researcher, co-author of key RLHF papers.
Chris Olah
Co-founder, Anthropic
Pioneer of neural network interpretability and mechanistic understanding.
Neel Nanda
Research Scientist, Google DeepMind
Leading researcher in mechanistic interpretability and developer of TransformerLens.
Sam Bowman
Research Scientist, Anthropic
NLP researcher focused on language model evaluation and safety benchmarks.
Beth Barnes
Founder, METR
Former ARC researcher focused on AI evaluations and dangerous capability testing.
Amanda Askell
Character Lead, Anthropic
Philosopher leading Claude's character development, working at the intersection of ethics and AI.
Nick Bostrom
Philosopher, Oxford
Author of Superintelligence, founder of the Future of Humanity Institute.
Yoshua Bengio
Professor, Mila
Turing Award winner, deep learning pioneer, AI safety advocate.
Ilya Sutskever
Co-founder, SSI
Former Chief Scientist at OpenAI, co-founder of Safe Superintelligence Inc.
Connor Leahy
CEO, Conjecture
Co-founder of EleutherAI, AI safety researcher and advocate.