People

Researchers, practitioners, and key figures in the AI alignment field.

Stuart Russell

Professor, UC Berkeley

Computer scientist and co-author of Artificial Intelligence: A Modern Approach, the standard AI textbook.

Eliezer Yudkowsky

Research Fellow, MIRI

Co-founder of MIRI, writer on rationality and AI risk.

Dario Amodei

CEO, Anthropic

Former VP of Research at OpenAI, co-founder of Anthropic.

Paul Christiano

Founder, ARC

Former OpenAI researcher, pioneer of RLHF and iterated amplification.

Jan Leike

Alignment Lead, Anthropic

Former DeepMind and OpenAI researcher, co-author of key RLHF papers.

Chris Olah

Co-founder, Anthropic

Pioneer of neural network interpretability and mechanistic understanding.

Neel Nanda

Research Scientist, Google DeepMind

Leading researcher in mechanistic interpretability and developer of TransformerLens.

Sam Bowman

Research Scientist, Anthropic

NLP researcher focused on language model evaluation and safety benchmarks.

Beth Barnes

Founder, METR

Former ARC researcher focused on AI evaluations and dangerous capability testing.

Amanda Askell

Character Lead, Anthropic

Philosopher leading Claude's character development, bridging ethics and AI.

Nick Bostrom

Philosopher, Oxford

Author of Superintelligence, founder of Future of Humanity Institute.

Yoshua Bengio

Professor, Mila

Turing Award winner, deep learning pioneer, founder of Mila, and AI safety advocate.

Ilya Sutskever

Co-founder, SSI

Former Chief Scientist at OpenAI, co-founder of Safe Superintelligence Inc.

Connor Leahy

CEO, Conjecture

Co-founder of EleutherAI, AI safety researcher and advocate.

Dan Hendrycks

Director, CAIS

Creator of MMLU and other AI benchmarks, leads Center for AI Safety.