METR (Model Evaluation & Threat Research)
METR (Model Evaluation & Threat Research), formerly known as ARC Evals, is an organization focused on evaluating AI systems for dangerous capabilities. Founded by Beth Barnes, METR develops and conducts evaluations to assess AI risks before and during deployment.
Overview
METR spun out of the Alignment Research Center (ARC) to focus specifically on the evaluation problem. The organization works with AI labs to assess whether models have dangerous capabilities, such as autonomous replication, deception, or the ability to acquire resources.
Key Work
Dangerous Capability Evaluations
METR develops standardized tests to assess whether AI systems can perform potentially dangerous tasks like hacking, manipulation, or autonomous operation. These evaluations help labs understand model capabilities before deployment.
Red Teaming
The organization conducts adversarial testing of AI systems, attempting to elicit harmful behaviors or find ways systems could be misused.
Task Frameworks
METR has developed frameworks for assessing AI agent capabilities on realistic tasks, including coding, research, and autonomous operation.
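One common shape for such a framework is a task definition pairing instructions with an automated scorer, plus a harness that runs an agent against each task and aggregates results. The sketch below is purely illustrative and assumes nothing about METR's actual interface; the names `Task`, `score`, and `run_evaluation` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of an agent-evaluation harness; not METR's real API.

@dataclass
class Task:
    name: str
    instructions: str
    # Maps the agent's output to a score in [0, 1].
    score: Callable[[str], float]

def run_evaluation(agent: Callable[[str], str],
                   tasks: list[Task]) -> dict[str, float]:
    """Run the agent on each task's instructions and collect per-task scores."""
    return {t.name: t.score(agent(t.instructions)) for t in tasks}

# Toy example: a task that checks whether the agent's reply contains a token.
tasks = [
    Task(
        name="echo_check",
        instructions="Reply with the word DONE.",
        score=lambda out: 1.0 if "DONE" in out else 0.0,
    )
]
results = run_evaluation(lambda prompt: "DONE", tasks)
# results == {"echo_check": 1.0}
```

Separating the task specification from the harness lets the same agent be scored on many tasks, and automated scorers make runs repeatable across models.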
Industry Collaboration
METR has conducted evaluations for major AI labs, including OpenAI, Anthropic, and Google DeepMind, providing independent assessments of model capabilities ahead of major releases.