Beth Barnes

Role: Founder, METR
Known For: AI Evaluations
Previous: ARC Evals
Focus: Dangerous Capability Testing

Beth Barnes is the founder of METR (Model Evaluation and Threat Research), an organization focused on evaluating AI systems for dangerous capabilities. She previously led ARC Evals at the Alignment Research Center.

Career

Alignment Research Center (2022-2023)

At ARC, Barnes founded and led ARC Evals, developing methods to test whether AI models could perform dangerous tasks such as acquiring resources, replicating themselves, or evading shutdown.

METR (2023-present)

METR continues and expands the evaluation work begun at ARC Evals, partnering with AI labs to test frontier models before deployment and developing standardized evaluation protocols.

Key Contributions

  • Dangerous Capability Evaluations: Methods to test if AI can perform harmful tasks
  • Autonomous Replication Testing: Testing whether AI can copy itself and acquire resources
  • Pre-deployment Testing: Evaluating models before public release
  • Evaluation Methodology: Standardizing how we test AI capabilities
  • Lab Partnerships: Working with major AI companies on safety testing

Evaluation Philosophy

Barnes argues that we need concrete tests for dangerous capabilities:

  • Don't just theorize about risks; test for them
  • Develop evaluations before capabilities emerge
  • Create standardized protocols labs can adopt
  • Make results comparable across models

Notable Evaluations

  • Testing GPT-4 for autonomous replication ability
  • Evaluating models for manipulation capabilities
  • Assessing ability to acquire resources independently
  • Testing resistance to shutdown attempts

Last updated: November 28, 2025