Superintelligence: Paths, Dangers, Strategies

Book · Foundational Work
Published: 2014
Publisher: Oxford University Press
Pages: 352
Topic: Existential Risk from AI

Superintelligence: Paths, Dangers, Strategies is a 2014 book by philosopher Nick Bostrom that examines the risks posed by artificial superintelligence and explores potential strategies for ensuring beneficial outcomes. The book is widely considered a foundational text in the field of AI safety and helped popularize concerns about advanced AI among the broader public and tech industry.

Summary

Part I: What is Superintelligence?

Bostrom examines different paths to superintelligence, including artificial intelligence, whole brain emulation, biological cognitive enhancement, and brain-computer interfaces. He argues that once human-level machine intelligence is achieved, the transition to superintelligence could be rapid, potentially taking only days or weeks in a fast-takeoff scenario.

Part II: The Superintelligent Will

The book introduces the concept of "instrumental convergence"—the idea that almost any goal a superintelligent system might have would lead it to pursue certain subgoals like self-preservation, goal-content integrity, cognitive enhancement, and resource acquisition. This section also discusses the "orthogonality thesis": intelligence and final goals are independent, meaning a superintelligent system could pursue almost any objective.

Part III: The Control Problem

Bostrom examines potential strategies for controlling superintelligent systems, including capability control (limiting what the AI can do) and motivation selection (shaping what the AI wants to do). He introduces the concept of corrigibility and discusses why value alignment is fundamentally difficult.

Key Concepts Introduced

  • Instrumental Convergence: Most goals imply similar subgoals
  • Orthogonality Thesis: Intelligence and goals are independent
  • Treacherous Turn: An AI behaving cooperatively while weak, then pursuing its true goals once it is powerful enough to resist correction
  • Value Lock-in: The risk of permanently cementing suboptimal values
  • Intelligence Explosion: Rapid recursive self-improvement
  • The Control Problem: The challenge of maintaining meaningful control over superintelligent systems

Impact and Reception

Superintelligence became a bestseller and was praised by figures including Bill Gates and Elon Musk. The book is credited with:

  • Bringing AI safety concerns into mainstream discourse
  • Influencing major tech leaders to take existential AI risk seriously
  • Helping establish AI alignment as a legitimate research field
  • Providing conceptual vocabulary still used in the field today

Critics have argued the book overestimates the likelihood of rapid capability jumps and underestimates the difficulty of achieving human-level AI in the first place.


Last updated: December 7, 2025