Threat Intelligence Report: NVIDIAScape Container Escape (CVE-2025-23266)

CISO Executive Summary #

Overview #

NVIDIAScape, tracked as CVE-2025-23266, is a critical vulnerability in the NVIDIA Container Toolkit discovered by researchers at Wiz and disclosed in July 2025. It allows a malicious container to break isolation and execute code as root on the host. Because the NVIDIA Container Toolkit is the backbone of GPU access for managed AI services on every major cloud, the flaw represents a systemic risk to the AI ecosystem. It carries a CVSS score of 9.0 and is resolved in Toolkit v1.17.8 and GPU Operator v25.3.1.

Impact #

A container escape on shared GPU infrastructure is among the most severe outcomes in multi-tenant cloud, because it can collapse the boundary between customers. An attacker who controls a container image, for example a model or notebook uploaded to a GPU-as-a-service platform, could escape to the host and reach other tenants’ data, models, and workloads. The breadth of affected platforms is what elevates this from a single-product bug to an ecosystem-level concern.

Mitigation #

Patch immediately: Upgrade NVIDIA Container Toolkit to v1.17.8 or later and GPU Operator to v25.3.1 or later.
Scrutinize untrusted container images: Treat customer- or third-party-supplied images and models as hostile by default on GPU hosts.
Defense in depth around the runtime: Apply least privilege, seccomp and AppArmor profiles, and node isolation so that a single compromised hook does not hand an attacker the entire environment.
Verify your provider: For managed AI/GPU services, confirm that the platform has remediated the flaw and ask how tenant isolation is enforced.

Engineering Breakdown #

CVE Details #

CVE ID: CVE-2025-23266
Severity: Critical
CVSS Score: 9.0
Affected: NVIDIA Container Toolkit up to and including 1.17.7; GPU Operator up to and including 25.3.0
Fixed: Toolkit v1.17.8; GPU Operator v25.3.1
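
The affected and fixed ranges above lend themselves to an automated inventory check. The following is a minimal sketch, assuming version strings have already been collected from your hosts by whatever inventory tooling you use; the function names and the sample strings are illustrative and do not come from NVIDIA or Wiz.

```python
import re

# Fixed releases as stated in this report.
FIXED_TOOLKIT = (1, 17, 8)
FIXED_GPU_OPERATOR = (25, 3, 1)


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted x.y.z version found in the text.

    Pre-release suffixes such as '-rc.1' are ignored, so treat borderline
    results with care.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version_text: str, fixed: tuple[int, ...]) -> bool:
    """Return True if the reported version is at or above the fixed release."""
    return parse_version(version_text) >= fixed


if __name__ == "__main__":
    # Hypothetical inventory entries: (component, reported version, fixed release).
    inventory = [
        ("container-toolkit", "v1.17.7", FIXED_TOOLKIT),
        ("gpu-operator", "v25.3.1", FIXED_GPU_OPERATOR),
    ]
    for name, reported, fixed in inventory:
        status = "patched" if is_patched(reported, fixed) else "VULNERABLE"
        print(f"{name} {reported}: {status}")
```

Comparing parsed integer tuples rather than raw strings avoids the classic mistake of ranking "1.17.10" below "1.17.8".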

Description #

The vulnerability is triggered during container setup, before the container’s own workload starts, in the way the toolkit initializes GPU containers. The nvidia-ctk createContainer OCI hook inherited environment variables from the container image and trusted them without filtering. By setting LD_PRELOAD to point at a library inside the container’s own filesystem, an attacker forced the privileged hook to load attacker-controlled code, executing it as root on the host and immediately escaping the container.
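
The general defense against this class of bug is to strip dynamic-loader variables from any untrusted environment before a privileged process is started with it. The following is a minimal sketch of that idea, not NVIDIA's actual fix; the function name and the choice to drop every `LD_`-prefixed variable are assumptions made for illustration.

```python
def sanitized_hook_env(container_env: dict[str, str]) -> dict[str, str]:
    """Return a copy of the environment without dynamic-loader variables.

    On Linux, variables such as LD_PRELOAD and LD_LIBRARY_PATH steer the
    dynamic loader, so this sketch conservatively drops everything that
    starts with 'LD_'. The input mapping is left unmodified.
    """
    return {
        name: value
        for name, value in container_env.items()
        if not name.startswith("LD_")
    }


if __name__ == "__main__":
    # Hypothetical environment taken from an untrusted image.
    untrusted = {
        "PATH": "/usr/bin:/bin",
        "LD_PRELOAD": "/hypothetical/lib.so",
        "NVIDIA_VISIBLE_DEVICES": "all",
    }
    print(sanitized_hook_env(untrusted))
```

A launcher written along these lines would pass the sanitized mapping, rather than the container's raw environment, as the `env` argument when it spawns the privileged helper (for example via `subprocess.run`).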

Technical Analysis #

Wiz described the exploit as requiring only a three-line Dockerfile. The createContainer hook runs with the host’s privileges but with the container’s root filesystem as its working directory, so an LD_PRELOAD path that resolves through that working directory loads the attacker’s malicious shared object straight from the image. The simplicity is the story: a trivially small, declarative payload in an image is enough to cross one of the most security-sensitive boundaries in cloud computing.
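
Because the payload is declared in the image's environment, it can be flagged before an image is ever scheduled on a GPU node. The following is a minimal detection sketch, assuming the image configuration is available as parsed JSON in the OCI image config layout, where environment entries live under `config.Env` as `NAME=value` strings. The function name and the sample configuration are hypothetical, and other tools may use different key names that need adapting.

```python
import json


def find_loader_env(image_config: dict) -> list[str]:
    """Return the image's Env entries whose variable name starts with 'LD_'."""
    env_entries = (image_config.get("config") or {}).get("Env") or []
    flagged = []
    for entry in env_entries:
        name, _, _value = entry.partition("=")
        if name.startswith("LD_"):
            flagged.append(entry)
    return flagged


if __name__ == "__main__":
    # Hypothetical image config for illustration only.
    sample = json.loads(
        '{"config": {"Env": ["PATH=/usr/bin", "LD_PRELOAD=/hypothetical/lib.so"]}}'
    )
    for entry in find_loader_env(sample):
        print(f"suspicious loader variable in image: {entry}")
```

A hit is not proof of malice, since some legitimate images set `LD_LIBRARY_PATH`, but on GPU hosts that accept third-party images it is a reasonable signal to hold an image for review.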

Why It Matters for AI Programs #

Nearly all managed AI training and inference runs in GPU-enabled containers orchestrated by Kubernetes and the NVIDIA GPU Operator. A toolkit-level escape therefore undermines the isolation that multi-tenant AI platforms depend on. It is a concrete reminder that the accelerator’s supporting software stack, not just the model or the GPU silicon, is part of the AI attack surface and belongs in your patch and supply-chain governance.

Stay Vigilant