Stuart Russell
Stuart Russell is a British computer scientist and professor at UC Berkeley, known for co-authoring the standard AI textbook "Artificial Intelligence: A Modern Approach" and for his work on AI safety and value alignment.
Career
Academia
Russell has been a professor at UC Berkeley since 1986. In 2016 he founded the Center for Human-Compatible AI (CHAI) to research approaches to building beneficial AI.
AI Safety Advocacy
Russell has been a prominent public voice on AI risk, arguing that the standard model of AI optimization is fundamentally flawed and proposing alternative approaches based on uncertainty about human preferences.
Key Contributions
- AI Textbook: "Artificial Intelligence: A Modern Approach" (with Peter Norvig) - the most widely used AI textbook
- Inverse Reinforcement Learning: Techniques for inferring a reward function from observed behavior (see the sketch after this list)
- Cooperative Inverse Reinforcement Learning (CIRL): Framework for human-AI cooperation
- CHAI: Founded the Center for Human-Compatible AI at Berkeley
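Inverse reinforcement learning works backwards from behavior to the reward function that best explains it. The following is a minimal sketch of that idea under simplifying assumptions, not code from Russell or CHAI: a one-step choice setting, a small set of invented candidate reward functions, and a demonstrator assumed to be Boltzmann-rational (choosing options with probability proportional to exp(beta * reward)).

```python
"""Toy Bayesian-style inverse reinforcement learning (IRL) sketch.

We observe a demonstrator repeatedly choosing among a few options and infer
which hypothetical candidate reward function best explains those choices.
All option names, reward values, and the rationality parameter are invented
for illustration.
"""
import math

OPTIONS = ["coffee", "tea", "water"]

# Candidate reward functions: hypotheses about what the demonstrator values.
CANDIDATE_REWARDS = {
    "likes_coffee": {"coffee": 1.0, "tea": 0.2, "water": 0.1},
    "likes_tea":    {"coffee": 0.2, "tea": 1.0, "water": 0.1},
    "indifferent":  {"coffee": 0.5, "tea": 0.5, "water": 0.5},
}

BETA = 3.0  # rationality: higher means the demonstrator more reliably picks the best option


def choice_likelihood(choice, rewards, beta=BETA):
    """P(choice | rewards) under a Boltzmann (softmax) choice model."""
    weights = {option: math.exp(beta * rewards[option]) for option in OPTIONS}
    return weights[choice] / sum(weights.values())


def posterior_over_rewards(observed_choices):
    """Start from a uniform prior over candidate reward functions and
    multiply in the likelihood of each observed choice, renormalizing."""
    posterior = {name: 1.0 / len(CANDIDATE_REWARDS) for name in CANDIDATE_REWARDS}
    for choice in observed_choices:
        for name, rewards in CANDIDATE_REWARDS.items():
            posterior[name] *= choice_likelihood(choice, rewards)
        norm = sum(posterior.values())
        posterior = {name: p / norm for name, p in posterior.items()}
    return posterior


if __name__ == "__main__":
    demonstrations = ["coffee", "coffee", "tea", "coffee"]
    for name, prob in posterior_over_rewards(demonstrations).items():
        print(f"P({name} | demonstrations) = {prob:.3f}")
```

In this toy run the posterior concentrates on "likes_coffee", since that hypothesis makes the observed choices most probable; real IRL methods apply the same inference to sequential behavior in full Markov decision processes.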
Views on AI Safety
Russell argues that the fundamental problem with AI is the "standard model" of optimization: giving AI systems fixed objectives to maximize. Because any fixed objective is likely to be mis-specified, a sufficiently capable system may pursue it in unintended ways and have no incentive to accept correction. He proposes instead that AI systems should remain uncertain about human preferences and seek to learn them from human behavior.
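A minimal sketch of this idea follows. It is an illustration, not code from Russell or CHAI: the reward hypotheses, their probabilities, and the deferral rule are all invented. The agent keeps a distribution over candidate reward functions, acts to maximize expected reward under that distribution, and defers to the human when some plausible hypothesis says the chosen action would be badly wrong.

```python
"""Sketch of an agent that is uncertain about the human's reward function:
rather than maximizing one fixed objective, it weighs several hypotheses
about what the human wants and asks the human when acting looks risky."""

ACTIONS = ["make_coffee", "make_tea", "do_nothing"]

# Hypotheses about the human's preferences, with the agent's current beliefs.
REWARD_HYPOTHESES = {
    "wants_coffee": ({"make_coffee": 1.0, "make_tea": -0.5, "do_nothing": 0.0}, 0.6),
    "wants_tea":    ({"make_coffee": -0.5, "make_tea": 1.0, "do_nothing": 0.0}, 0.4),
}


def expected_reward(action):
    """Average reward of an action over the agent's belief distribution."""
    return sum(prob * rewards[action] for rewards, prob in REWARD_HYPOTHESES.values())


def choose_action(defer_threshold=0.3):
    """Pick the action with highest expected reward, but ask the human instead
    if some hypothesis considers that action much worse than its own favorite:
    a crude stand-in for valuing preference information over acting now."""
    best = max(ACTIONS, key=expected_reward)
    for rewards, prob in REWARD_HYPOTHESES.values():
        regret = max(rewards.values()) - rewards[best]
        if prob * regret > defer_threshold:
            return "ask_human"
    return best


if __name__ == "__main__":
    print("Expected rewards:", {a: round(expected_reward(a), 2) for a in ACTIONS})
    print("Chosen action:", choose_action())
```

In this toy run the agent's best guess is to make coffee, but the "wants_tea" hypothesis is probable enough, and dislikes that action enough, that the agent asks the human instead; this deferral-under-uncertainty behavior is the intuition behind frameworks such as CIRL.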
His book "Human Compatible" (2019) presents this vision and argues for rebuilding AI on new foundations that make machines inherently beneficial.
Books
- "Artificial Intelligence: A Modern Approach" (1995, with Peter Norvig)
- "Human Compatible: Artificial Intelligence and the Problem of Control" (2019)