
The U.S. government will now be reviewing the most advanced AI models before they go public.
The U.S. Center for AI Standards and Innovation (CAISI) recently announced formal agreements with Google DeepMind, Microsoft, and Elon Musk’s xAI that will allow the U.S. government to evaluate their AI models before they are publicly available.
This move extends a federal oversight framework that already includes OpenAI and Anthropic, whose earlier agreements with CAISI were renegotiated to reflect updated directives from Commerce Secretary Howard Lutnick and America’s AI Action Plan.
CAISI, established in 2025 under the Trump administration, sits within the U.S. Department of Commerce and serves as the industry’s primary point of contact for frontier AI model evaluation. Its mandate focuses on national security risks, specifically cybersecurity, biosecurity, and chemical weapons threats, rather than broader public safety considerations alone.
When CAISI evaluates a model, it does not always test the version the public will see. Developers provide versions with safety guardrails stripped away so the government can assess what the model is truly capable of when pushed to its limits.
From Voluntary Commitments to Formal Agreements
This is not a sudden policy turn. The groundwork was laid under the Biden administration, which in July 2023 secured voluntary commitments from seven major AI companies to conduct security testing before releasing new models. That approach, however, depended entirely on companies’ willingness to comply; the current framework takes a more formal route.
In October 2024, the White House issued National Security Memorandum NSM-25, which defines “frontier AI models” as general-purpose systems near cutting-edge performance whose capabilities present elevated risks to national security. The Trump administration then built on this through an AI Action Plan released in July 2025 that assigned 17 specific tasks to CAISI. The new agreements with Google DeepMind, Microsoft, and xAI reflect those directives directly.
Why the Urgency
Part of what is driving this push is how capable frontier AI models have become in sensitive domains. Concerns that AI can identify and exploit software vulnerabilities at unprecedented speed, posing significant cybersecurity threats, have accelerated the initiative.
The scope of these reviews is also expanding beyond U.S. companies. In April 2026, CAISI evaluated DeepSeek V4 Pro, a Chinese open-weight model, showing that the center’s mandate now extends to foreign AI systems alongside commercial models from private-sector developers.
Beyond CAISI’s announced agreements, multiple reports indicate the White House has been weighing the creation of a separate AI vetting system. That process is still ongoing.
What This Means Going Forward
Pre-release evaluations will affect how quickly AI companies can bring new models to market. Vetting could introduce delays as models undergo review by CAISI, a consideration organizations must now build into their development planning.
Compliance requirements are widening, and while the precise thresholds that trigger mandatory review are not yet fully set, the direction is clear: the days when a company could build and ship a frontier AI model without any government visibility are coming to a close.