
The U.S. government will now be reviewing the most advanced AI models before they go public.
The U.S. Center for AI Standards and Innovation (CAISI) recently announced formal agreements with Google DeepMind, Microsoft, and Elon Musk’s xAI that will allow the U.S. government to evaluate their AI models before they are publicly available.
This move extends a federal oversight framework that already includes OpenAI and Anthropic, whose earlier agreements with CAISI were renegotiated to reflect updated directives from Commerce Secretary Howard Lutnick and America’s AI Action Plan.
CAISI, established in 2025 under the Trump administration, sits within the U.S. Department of Commerce and serves as the industry’s primary point of contact for frontier AI model evaluation. Its mandate focuses on national security risks, specifically cybersecurity, biosecurity, and chemical weapons threats, rather than broader public safety considerations alone.
When CAISI evaluates a model, it does not always test the version the public will see. Developers provide versions with safety guardrails stripped away so the government can assess what the model is truly capable of when pushed to its limits.
From Voluntary Commitments to Formal Agreements
This is not a sudden policy turn. The groundwork was laid under the Biden administration, which in July 2023 secured voluntary commitments from seven major AI companies to conduct security testing before releasing new models. That approach, however, depended entirely on companies’ willingness to comply; the current framework takes a more formal route.
In October 2024, the White House issued National Security Memorandum NSM-25, which defines “frontier AI models” as general-purpose systems near cutting-edge performance whose capabilities present elevated risks to national security. The Trump administration then built on this through an AI Action Plan released in July 2025 that assigned 17 specific tasks to CAISI. The new agreements with Google DeepMind, Microsoft, and xAI reflect those directives directly.
Why the Urgency
Part of what is driving this push is how capable frontier AI models have become in sensitive domains. Concerns that AI can identify and exploit software vulnerabilities at unprecedented speed, posing significant cybersecurity threats, have accelerated the initiative.
The scope of these reviews is also expanding beyond U.S. companies. In April 2026, CAISI evaluated DeepSeek V4 Pro, a Chinese open-weight model, showing that the center’s mandate now extends to foreign AI systems alongside commercial models from private-sector developers.
Beyond CAISI’s announced agreements, multiple reports indicate the White House has been weighing the creation of a separate AI vetting system. That process is still ongoing.
What This Means Going Forward
Pre-release evaluations will affect how quickly AI companies can bring new models to market. Vetting could introduce delays as models undergo review by CAISI, a consideration organizations must now build into their development planning.
Compliance requirements are widening, and while the precise thresholds that trigger mandatory review are not yet fully set, the direction is clear: the days when a company could build and ship a frontier AI model without any government visibility are coming to a close.