Alignment Wiki

Beth Barnes

Role: Founder, METR

Known For: AI Evaluations

Focus: Dangerous Capability Testing

Beth Barnes is the founder of METR (Model Evaluation and Threat Research), an organization focused on evaluating AI systems for dangerous capabilities. She previously led ARC Evals at the Alignment Research Center.

Career

Alignment Research Center (2022-2023)

At ARC, Barnes founded and led ARC Evals, developing methods to test whether AI models could perform dangerous tasks such as acquiring resources, replicating themselves, or evading shutdown.

METR (2023-present)

METR continues and expands the evaluation work begun at ARC Evals, partnering with AI labs to test frontier models before deployment and developing standardized evaluation protocols.

Key Contributions

Dangerous Capability Evaluations: Methods to test if AI can perform harmful tasks
Autonomous Replication Testing: Checking whether AI can copy itself and acquire resources
Pre-deployment Testing: Evaluating models before public release
Evaluation Methodology: Standardizing how we test AI capabilities
Lab Partnerships: Working with major AI companies on safety testing

Evaluation Philosophy

Barnes argues that we need concrete tests for dangerous capabilities:

Don't just theorize about risks—test for them
Develop evaluations before capabilities emerge
Create standardized protocols labs can adopt
Make results comparable across models

Notable Evaluations

Testing GPT-4 for autonomous replication ability
Evaluating models for manipulation capabilities
Assessing ability to acquire resources independently
Testing resistance to shutdown attempts