Responsibility and safety
Published August 2, 2024
Authors: Nahema Marchal and Rachel Xu
New research analyzes today’s misuse of multimodal generative AI to help build safer, more reliable technology
Generative artificial intelligence (AI) models that can generate images, text, audio, video, and more are enabling a new era of creativity and commercial opportunities. But as these capabilities grow, so does the potential for abuse, including manipulation, fraud, bullying, and harassment.
As part of our commitment to developing and using AI responsibly, we partnered with Jigsaw and Google.org to publish a new paper analyzing how generative AI technologies are being misused today. Teams across Google are using this and other research to develop better safeguards for generative AI technology, among other safety initiatives.
We collected and analyzed approximately 200 media reports of public incidents of misuse, published between January 2023 and March 2024. From these reports, we defined and categorized common tactics for misusing generative AI and found new patterns in how these technologies are being exploited or compromised.
By clarifying the current threats and tactics used across different types of generative AI outputs, our research can help shape AI governance and guide Google and other companies building AI technologies in developing more comprehensive safety evaluations and mitigation strategies.
Highlighting the main categories of abuse
Generative AI tools offer a unique and compelling way to enhance creativity, but the ability to create bespoke, realistic content can also be used inappropriately by malicious actors.
By analyzing media reports, we identified two main categories of generative AI misuse tactics: the exploitation of generative AI capabilities and the compromise of generative AI systems. Examples of exploited capabilities included creating realistic depictions of human likenesses to impersonate public figures; examples of compromised systems included "jailbreaking" to remove model safeguards and using adversarial inputs to cause malfunctions.
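To make the two-level structure of this taxonomy concrete, the sketch below shows one way incidents from media reports could be coded and tallied. It is a hypothetical illustration in Python; the category labels, tactic names, and records are paraphrased stand-ins, not the paper's actual codebook or dataset.

```python
# Hypothetical illustration of coding incidents against the two top-level
# categories described above. Names and records are made up for the example.
from collections import Counter
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    EXPLOITATION = "exploitation of generative AI capabilities"
    COMPROMISE = "compromise of generative AI systems"

@dataclass
class Incident:
    source: str        # e.g. a media report identifier
    tactic: str        # e.g. "impersonation", "jailbreak"
    category: Category

# Made-up records standing in for coded media reports.
incidents = [
    Incident("report-001", "impersonation", Category.EXPLOITATION),
    Incident("report-002", "synthetic persona", Category.EXPLOITATION),
    Incident("report-003", "jailbreak", Category.COMPROMISE),
    Incident("report-004", "adversarial input", Category.COMPROMISE),
]

# Tally how often each category and tactic appears in the coded sample.
by_category = Counter(i.category for i in incidents)
by_tactic = Counter(i.tactic for i in incidents)

for category, count in by_category.items():
    print(f"{category.value}: {count}")
print(by_tactic.most_common(3))
```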
In our dataset, the most prevalent cases of misuse involved easily accessible, consumer-level generative AI tools being exploited by malicious actors, often in ways that did not require advanced technical skills. For example, we examined a high-profile incident from February 2024, in which an international company reportedly lost HK$200 million (approximately US$26 million) after an employee was tricked into transferring money during an online meeting. In this case, every other "person" in the meeting, including the company's chief financial officer, was in fact a convincing, computer-generated imitation created by fraudsters.
Some of the most prominent tactics we observed, such as impersonation, scams, and synthetic personas, predate the invention of generative AI and have long been used to influence the information ecosystem and manipulate others. However, wider access to generative AI tools may change the costs and incentives behind information manipulation, giving these long-standing tactics new potency and potential, especially for those who previously lacked the technical sophistication to use them.
Identifying misuse strategies and combinations
Falsifying evidence and manipulating human likenesses underlie the most common tactics in real-world cases of misuse. In the period we analyzed, most cases of generative AI misuse were deployed in efforts to influence public opinion, enable scams or fraud, or generate profit.
By observing how bad actors combine generative AI misuse tactics in pursuit of different goals, we identified specific combinations of misuse, which we term strategies.
Even forms of generative AI misuse that are not overtly malicious still raise ethical concerns. For example, government officials suddenly speaking a variety of voter-friendly languages without disclosing that the speech is AI-generated can blur the line between authenticity and deception.
While this study provides new insight into emerging forms of misuse, it is worth noting that our dataset is a limited sample of media coverage. Media coverage may prioritize sensational incidents, which could skew the dataset toward particular types of misuse. Because generative AI systems are so new, detecting or reporting cases of misuse may also be more difficult for those involved. The dataset also does not directly compare the misuse of generative AI systems with traditional content creation and manipulation tactics, such as image editing or setting up "content farms" to produce large volumes of text, video, GIFs, and images. So far, anecdotal evidence suggests that traditional content manipulation tactics remain prevalent.
Staying ahead of potential misuse
Our paper highlights opportunities to design initiatives that protect the public, such as advancing broad generative AI literacy campaigns, developing better interventions to protect people from bad actors, and forewarning people so they can spot and counter the manipulative strategies used in generative AI misuse.
This research helps our teams better protect our products by informing the development of our safety initiatives. YouTube now requires creators to disclose when their content is meaningfully altered or synthetically generated and appears realistic. Similarly, we have updated our election advertising policies to require advertisers to disclose when their election ads include material that has been digitally altered or generated.
As we broaden our understanding of malicious uses of generative AI and make further technical advances, we know it is more important than ever that our work does not happen in silos. We recently joined the Coalition for Content Provenance and Authenticity (C2PA) as a steering committee member to help develop the technical standard and drive adoption of Content Credentials: tamper-resistant metadata that shows how content was made and edited over time.
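The value of such credentials rests on tamper evidence: if either the content or its recorded history changes, verification should fail. The sketch below illustrates that general idea in Python, using an HMAC as a stand-in for a real cryptographic signature; it is a simplified illustration of the concept, not the actual C2PA manifest format or any Content Credentials API.

```python
# Minimal, illustrative sketch of tamper-evident provenance metadata.
# NOT the C2PA/Content Credentials format: real Content Credentials use
# signed manifests with certificate chains. An HMAC stands in for a signature.
import hashlib
import hmac
import json

SIGNING_KEY = b"hypothetical-issuer-key"  # placeholder; real systems use PKI

def issue_credential(asset_bytes: bytes, edit_history: list[str]) -> dict:
    """Bind a provenance record to the exact bytes of an asset."""
    record = {
        "asset_sha256": hashlib.sha256(asset_bytes).hexdigest(),
        "edit_history": edit_history,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_credential(asset_bytes: bytes, record: dict) -> bool:
    """Return True only if neither the asset nor its history was altered."""
    claimed = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(expected, record["signature"])
        and hashlib.sha256(asset_bytes).hexdigest() == claimed["asset_sha256"]
    )

if __name__ == "__main__":
    image = b"...original image bytes..."
    cred = issue_credential(image, ["captured 2024-02-01", "cropped"])
    print(verify_credential(image, cred))              # True: intact
    print(verify_credential(image + b"tamper", cred))  # False: content changed
```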
In parallel, we are advancing our existing red-teaming efforts, including improving best practices for testing the safety of large language models (LLMs), and developing pioneering tools such as SynthID, which makes AI-generated content easier to identify and is being integrated into a growing range of products.
In recent years, Jigsaw has conducted research to understand the tools and tactics used by misinformation creators, developed prebunking videos to forewarn people of attempts to manipulate them, and shown that prebunking campaigns can improve misinformation resilience at scale. This work forms part of Jigsaw's broader portfolio of information interventions to help people protect themselves online.
By proactively addressing potential misuse, we can help minimize risks while fostering responsible and ethical use of generative AI. We hope these insights into the most common misuse tactics and strategies will help researchers, policymakers, and industry trust and safety teams build safer, more responsible technologies and develop better measures to combat misuse.
Acknowledgment
This research was a collaborative effort by Nahema Marchal, Rachel Xu, Rasmi Elasmar, Iason Gabriel, Beth Goldberg, and William Isaac, with feedback and advice provided by Mikel Rodriguez, Vijay Bolina, Alexios Mantzarlis, Seliem El-Sayed, Mevan Babakar, Matt Botvinick, Canfer Akbulut, Harry Law, Sebastian Krier, Ziad Reslan, Boxi Wu, Frankie Garcia, and Jenny Brennan.