Scale AI to set the Pentagon’s path for testing and evaluating large language models

Published on February 20, 2024

The company will create a comprehensive test and evaluation (T&E) framework for generative AI within the Defense Department.

The Pentagon’s Chief Digital and Artificial Intelligence Office (CDAO) tapped Scale AI to produce a trustworthy means for testing and evaluating large language models that can support — and potentially disrupt — military planning and decision-making.

According to a statement the San Francisco-based company shared exclusively with DefenseScoop, the outcomes of this new one-year contract will supply the CDAO with “a framework to deploy AI safely by measuring model performance, offering real-time feedback for warfighters, and creating specialized public sector evaluation sets to test AI models for military support applications, such as organizing the findings from after action reports.”
