AlignmentWiki — Zero Sum & AI Alignment Research

ARC

TypeResearch Institute

Founded2021

FocusAlignment Research & Evals

The Alignment Research Center (ARC) is a nonprofit research organization founded by Paul Christiano in 2021. ARC focuses on theoretical alignment research and developing evaluations to assess AI capabilities and safety.

Overview

ARC conducts both theoretical research on AI alignment and practical work on evaluating AI systems for dangerous capabilities. The organization has become known for its work on AI evaluations ("evals") that test whether models can perform tasks that would be concerning from a safety perspective.

Research Areas

Theoretical Alignment

Research on foundational questions in AI alignment, including scalable oversight, eliciting latent knowledge, and AI-assisted alignment research.

AI Evaluations

Development of evaluations to test AI systems for dangerous capabilities, including autonomous replication, acquiring resources, and deceptive behavior.

Key Contributions

Eliciting Latent Knowledge (ELK): A research agenda focused on getting AI systems to report what they actually know
AI Evals: Framework for evaluating dangerous capabilities in frontier models
ARC Prize: Competition for solving the ARC-AGI benchmark

Key People

Paul Christiano - Founder

Alignment Research Center (ARC)