Google, Microsoft, and xAI Agree to US Government Pre-Release AI Model Reviews


Pre-release government AI review just went from a two-company arrangement to an industry standard — and formal executive order language may follow.

Google, Microsoft, and xAI have agreed to submit frontier AI models to the US Commerce Department's Center for AI Standards and Innovation (CAISI) for evaluation before public launch. The announcements, made on May 5, expand a program that has operated quietly since 2024, when OpenAI and Anthropic became CAISI's first voluntary partners. Those two companies have since renegotiated their arrangements to align with priorities in President Trump's AI Action Plan, released in July 2025. With five of the six leading frontier AI developers now enrolled, CAISI's model evaluation program has moved from an informal pilot to what practitioners are calling a de facto industry standard for responsible model release.

What CAISI's Evaluation Actually Does

CAISI operates under the National Institute of Standards and Technology (NIST) within the Commerce Department. The center's AI evaluations focus on two primary domains: capability assessment — measuring what a model can do, including in dual-use categories such as biological synthesis guidance, cybersecurity exploitation, and autonomous replication — and safety stress-testing, which probes how models respond to adversarial inputs, jailbreak attempts, and edge cases in sensitive topic areas. Since 2024, CAISI has completed more than 40 evaluations of AI models, including state-of-the-art unreleased models from OpenAI and Anthropic, according to Commerce Department officials.

The evaluations are conducted before a model is released to the public, but they are not structured as a formal approval gate: CAISI has no authority to block a launch. Its practical leverage is reputational. If CAISI finds significant risk indicators in a pre-release evaluation, it shares the findings with the developer, who then faces a choice: delay release to address the issues, or launch knowing that a federal agency has documented concerns. The program functions more like peer review than regulatory clearance — influential, but not binding under current law.

The expansion of the program to Google, Microsoft, and xAI closes a notable gap. Google's Gemini family and xAI's Grok are now subject to the same pre-launch government visibility as GPT and Claude, creating a more symmetric information environment for the US government's national security assessments. Microsoft's inclusion is slightly more complex — Microsoft is both a large independent model developer and the primary distributor of OpenAI's models — and its evaluation scope is expected to cover both its own Azure AI models and any frontier OpenAI models it deploys first on Azure infrastructure.

What the Trump AI Action Plan Requires and What Is Still Missing

Trump's AI Action Plan, released in July 2025 as a follow-on to his first-term AI executive orders, directs CAISI to lead national security-related AI model assessments and to be embedded within a broader AI evaluation ecosystem spanning the Defense Department, the Intelligence Community, and civilian agencies. The Action Plan stopped short of mandating company participation, relying instead on voluntary agreements — which is why the expansion to Google, Microsoft, and xAI required individual negotiations rather than a blanket executive order.

Reports from multiple outlets indicate the Trump administration is considering a formal executive order that would create a structured government review process, potentially with timelines, documentation requirements, and defined risk thresholds. A White House official told media that any announcement would come directly from President Trump and characterized speculation about specific executive order language as premature. The absence of a formal legal mandate means the program's durability depends on continued voluntary participation — a structural weakness if a major developer decides the reputational or competitive costs of pre-release access outweigh the benefits.

One conspicuous absence: Meta. The company's Llama open-weight models — released without access restrictions — represent a fundamentally different risk model that the current CAISI evaluation framework was not designed to address. An open-weight frontier model, once released, cannot be pre-screened in a meaningful way because anyone can download and run it without going through an evaluation channel. Meta's participation would require a different framework, likely focused on post-release monitoring and incident reporting rather than pre-release evaluation.

What to Watch

The executive order question is the near-term hinge point: if Trump signs an order codifying pre-release review requirements before a major model launch from any of the five participating companies, the voluntary program crystallizes into a binding framework with precedent for extension to additional developers. Watch whether Meta receives or responds to an invitation to join — the political and regulatory pressure to include the world's most downloaded open-weight models will intensify as Llama 5 approaches release. And watch the CAISI evaluation reports: so far, the results of the 40+ evaluations have not been published. If that changes — even in summary form — it would create the first systematic public dataset on frontier model risk profiles, reshaping how investors, regulators, and users assess AI companies.

Trump admin moves further into AI oversight — will test Google, Microsoft and xAI models
CNBC covers the expansion of CAISI's pre-release model evaluation program to Google, Microsoft, and xAI, and the status of potential executive order language from the White House.
AI Firms Agree to Give US Early Access to Evaluate Their Models
Bloomberg's coverage of the Commerce Department program expansion, including details on OpenAI and Anthropic's renegotiated agreements and the 40+ evaluation milestone.

