We are expanding our risk domains and refining our risk assessment process.
Breakthroughs in AI are changing our daily lives, from advances in mathematics, biology, and astronomy to enabling the potential of personalized education. As we build increasingly powerful AI models, we are committed to developing technology responsibly and taking an evidence-based approach to staying ahead of emerging risks.
Today, we are publishing the third iteration of our Frontier Safety Framework (FSF), our most comprehensive approach yet to identifying and mitigating severe risks from advanced AI models.
This update builds on ongoing collaboration with industry, academia, and government experts. It also incorporates lessons learned from previous implementations and evolving frontier AI safety best practices.
Major updates to the framework
Addressing the risk of harmful manipulation
This update introduces a Critical Capability Level (CCL) focused on harmful manipulation: specifically, AI models with powerful manipulative capabilities that could be misused to systematically and substantially change beliefs and behaviors in identified high-stakes contexts over the course of interactions with the model, reasonably resulting in additional expected harm at severe scale.
This addition builds on and operationalizes the research we have done to identify and evaluate the mechanisms that drive manipulation in generative AI. We will continue to invest in this area to better understand and measure the risks associated with harmful manipulation.
Adapting our approach to misalignment risks
We have also expanded the framework to address potential future scenarios where misaligned AI models might interfere with operators' ability to direct, modify, or shut down their operations.
While previous versions of the framework included an exploratory approach centered on instrumental reasoning CCLs (i.e., warning levels specific to when an AI model begins to reason deceptively), this update provides further protocols for our machine learning R&D CCLs, which focus on models that could accelerate AI research and development to potentially destabilizing levels.
In addition to the misuse risks arising from these capabilities, there are also misalignment risks stemming from a model's potential for undirected action at these capability levels, and from the likelihood that such models will be integrated into AI development and deployment processes.
To address the risks posed by CCLs, we conduct safety case reviews before external launches when the relevant CCLs are reached. These reviews involve detailed analyses demonstrating how risks have been reduced to manageable levels. Because large-scale internal deployments of models at the advanced machine learning R&D CCLs can also pose risk, we are now expanding this approach to cover such deployments.
Sharpening our risk assessment process
Our framework is designed to address risks in proportion to their severity. In particular, we have sharpened our CCL definitions to identify the critical threats that warrant the most rigorous governance and mitigation strategies. We apply safety and security mitigations before models reach the relevant CCL thresholds, as part of our standard model development approach.
Finally, this update provides more detail on our risk assessment process. Building on our core early-warning evaluations, we describe how we conduct holistic assessments that include systematic risk identification, comprehensive analyses of model capabilities, and explicit determinations of risk acceptability.
Advancing our commitment to frontier safety
This latest update to our Frontier Safety Framework represents our continued commitment to taking a scientific and evidence-based approach to tracking and staying ahead of AI risks as capabilities advance toward AGI. By expanding our risk domains and strengthening our risk assessment process, we aim to ensure that transformative AI benefits humanity while minimizing potential harms.
Our framework will continue to evolve based on new research, input from stakeholders, and lessons learned from implementation. We remain committed to working collaboratively across industry, academia, and government.
The path to beneficial AGI requires not only technological breakthroughs, but also robust frameworks to mitigate risks along the way. We hope our updated Frontier Safety Framework contributes meaningfully to this collective effort.

