Joint exploration of collaborative approaches for responsible assessment and governance of AI across the value chain
In October 2024, Partnership on AI (PAI) and MLCommons virtually convened practitioners and policy analysts to assess the current state of general-purpose AI assessment and the challenges to its adoption. This workshop explored collaborative approaches to responsibly assess and manage AI across the value chain. Against a backdrop of unprecedented regulatory activity, including more than 300 federal bills, 600 state-level bills, an AI executive order in the US, and a new draft general purpose AI code of practice in the EU, this workshop will explore how stakeholders can regulate provided an opportunity to consider whether to do so. AI ecosystems contribute to safe adoption and accountability.
From underlying model providers to downstream stakeholders, the discussion highlighted that each stakeholder has unique roles, obligations and guidance needs, highlighting the limitations of a ‘one-size-fits-all’ approach. It has become. PAI’s recent work on decentralized risk mitigation strategies for open infrastructure models, along with the development of AI safety benchmarks by MLCommons, highlights the need for assessment and governance practices tailored to each actor’s unique role. Masu. These efforts blend prescriptive guidance with technical assessments to support a comprehensive approach to AI safety.
Tiered role-based evaluation approach
The first point is the importance of a layered, role-specific assessment approach across the AI ecosystem. Workshop participants learned that different actors, including underlying model providers, model adapters that fine-tune these models, model hubs, and model integrators, play unique roles that require customized evaluation strategies. I recognized it. This includes a comprehensive evaluation of the model immediately after its initial training using both benchmarks and adversarial testing. For model adapters, post-training evaluations and adjustments during fine-tuning are essential to discover new risks and ensure safety.
Participants emphasized a proactive, multi-layered approach, with early benchmarking and regular red teaming working together to support safety, reliability, and compliance as AI technology advances. Benchmarking establishes an early foundation for continuous improvement, identifies performance gaps, and ensures that models meet required safety standards. Red teaming or adversarial testing is important to combine with benchmarking because it helps discover vulnerabilities that aren’t obvious through standard testing and stress-tests a model’s resilience to exploitation. Frequent red teaming allows developers to stay ahead of emerging risks, especially in the rapidly evolving field of generative AI, and proactively address potential misuse cases before widespread deployment. . Continuous testing and re-evaluation procedures were also recommended for model integrators, especially to manage dynamic changes that occur as generative AI models are updated.
Discussions emphasized that each actor has unique roles, duties and needs for guidance, highlighting the limitations of a ‘one-size-fits-all’ approach.
Adaptability of evaluation method
The second point is the need for adaptable evaluation methods to ensure that safety evaluations remain realistic and tamper-resistant as AI models evolve. A key part of adaptability is using high-quality test data sets and avoiding the risk of overfitting. The risk of overfitting is when a model is too tuned to certain test scenarios, performing well on those tests but failing in new or real-world situations. Overfitting can lead to ratings that give a false sense of security because the model may appear safe but lacks robustness.
To address this, participants discussed the importance of “holding” elements of the evaluation test set private to prevent model developers from overtraining against the public test set. did. By using private test sets as a component of your assessment process, your assessment can better simulate real-world usage and identify vulnerabilities that traditional static benchmarks and leaderboards may miss.
Workshop participants agreed that responsibility for implementing AI assessments needs to be shared across the AI value chain. Recent guidance from the American AI Safety Institute emphasizes the importance of involving all stakeholders in managing misuse risks throughout the AI lifecycle. Although the EU’s draft General AI Code of Practice currently focuses on model providers, the broader regulatory landscape will continue to grow with recognition of the need for shared responsibility between all stakeholders involved in the development and deployment of AI. It shows that it is increasing. Taken together, these perspectives highlight the importance of multiple stakeholders working together to ensure general-purpose AI systems are managed safely and responsibly.
Learn more about the MLCommons AI Risk and Reliability Working Group, which is building benchmarks to assess the safety of AI.
Learn more about partnerships in the AI Open Foundation Model value chain