If you’re building with AI or trying to defend against the less tasty aspects of technology, Meta has dropped a new Llama security tool.
The improved security tools of the Lama AI model arrive with Meta’s fresh resources designed to help cybersecurity teams leverage AI for their defense. It’s all part of their push to make it a little safer to develop and use AI for everyone involved.
Developers who work with the Llama family of models now have an upgraded kit. Get these latest llama protection tools directly from the meta’s own Llama Protections page, or find a place where many developers live, hugging their faces and Github.
The first is the Llama Guard 4. Think of it as an evolution of AI’s Meta’s customizable safety filters. The big news here is that since it is now multimodal, it is possible to understand and apply safety rules not only to text but also to images. That’s very important as AI applications become more visual. This new version is also burnt into Meta’s brand new Llama API, which is currently in limited preview.
Then there’s llamafirewall. This is a new piece of Meta’s puzzle, designed to function like a security control center for AI systems. This will help you work together to connect various safety models to other protection tools in the Meta. That job? Like clever “quick injection” attacks designed to trick AI, potentially dangerous code generation, or dangerous behavior of AI plugins to find and block the types of risks that will keep AI developers at night.
Meta also adjusted the llama’s quick guard. The Main Prompt Guard 2 (86m) model is excellent at sniffing these nasty jailbreak attempts and quick injections. Even more interesting, perhaps the introduction of a quick guard 22m.
Prompt Guard 2 22m is a much smaller, nippy version. Meta believes that it can reduce latency and calculate costs by up to 75% compared to larger models without sacrificing much power. For those who need a faster response or are on tight budgets, that’s a welcome addition.
But Meta is not just focused on AI builders. They are also looking at cyber defenders at the forefront of digital security. They have heard the call for better AI-powered tools to help fight cyberattacks and share some updates aimed at that.
Cybersec Eval 4 benchmark suite has been updated. This open source toolkit helps organizations understand how good AI systems are actually at security tasks. This latest version includes two new tools.
Cybersoc Eval: Built with the help of Cybersecurity expert Crowdstrike, this framework specifically measures how well AI works in a real security operations center (SOC) environment. It is designed to provide greater clarity in the effectiveness of AI in threat detection and response. The benchmark itself will be coming soon. AutopatchBench: This benchmark tests how good Llama and other AI are by automatically finding and pinning security holes in your code before bad guys can exploit them.
To help put these types of tools into the hands of those who need it, Meta is launching the Rama Defender Program. This appears to be about giving partner companies and developers special access to the combination of AI solutions. Some open source, some early access, perhaps their own — all are directed towards a variety of security challenges.
As part of this, Meta shares an internally used AI security tool: an automated, sensitive document classification tool. Automatically slaps security labels for documents within your organization. why? It is located in leaky areas to prevent sensitive information from coming out the door or accidentally being fed into the AI system (like RAG setups).
It also tackles the issue of fake audio generated by AI and is increasingly used in scams. Llama Generated Audio Detectors and Llama Audio Watermark Detectors are shared with partners to help them find voices generated by AI in potential phishing calls or fraud attempts. Companies like Zendesk, Bell Canada, and AT&T are already lined up to integrate these.
Finally, Meta peeked out at private processing, which is potentially huge for the privacy of its users. This is a new technology we are working on for WhatsApp. The idea is to help AI to summarise unread messages and draft replies, but meta and WhatsApp can’t read the content of those messages.
Meta is very open about the security side, publishing threat models and inviting security researchers to drill holes before the architecture is published. It’s a sign they know that their privacy aspects need to be right.
Overall, it is a broad announcement of AI security from META. They are clearly trying to put serious muscles in order to ensure the AI they build, while also providing better tools to safely build and effectively defend the wider tech community.
See: The Amazing Rise of AI-powered Scams: Microsoft reveals $4 billion in thwarting fraud
Want to learn more about AI and big data from industry leaders? Check out the AI & Big Data Expo in Amsterdam, California and London. The comprehensive event will be held in collaboration with other major events, including the Intelligent Automation Conference, Blockx, Digital Transformation Week, and Cyber Security & Cloud Expo.
Check out other upcoming Enterprise Technology events and webinars with TechForge here.