AlignmentWiki — Zero Sum & AI Alignment Research

Amanda Askell

RoleCharacter Lead, Anthropic

Known ForClaude's character development

PreviousPhilosophy PhD, NYU

FieldAI Ethics, Philosophy

Amanda Askell is a philosopher and AI researcher at Anthropic, where she leads work on Claude's character and personality development. Her work bridges philosophy of mind, ethics, and practical AI alignment.

Background

Askell holds a PhD in philosophy from New York University, with research focusing on ethics and metaethics. Her philosophical background informs her approach to developing AI systems with robust values and beneficial behaviors.

Work at Anthropic

At Anthropic, Askell is responsible for defining Claude's character - the personality traits, values, and behavioral patterns that make Claude helpful, harmless, and honest. Her approach involves iterative empirical refinement through extensive conversations with AI systems to identify and cultivate desirable traits.

Key Contributions

Development of Claude's character guidelines
Research on AI honesty and epistemic humility
Work on avoiding sycophancy in AI assistants
Framework for AI systems that can maintain values under pressure

Philosophy and AI

Askell has written about the intersection of philosophy and AI development, arguing that philosophical rigor is essential for creating AI systems that behave ethically. She emphasizes the importance of careful thinking about what we want AI systems to value and how to instill those values.

Amanda Askell

Background

Work at Anthropic

Key Contributions

Philosophy and AI

See Also