# REPRO-2026-00119: PyTorch: weights_only Unpickler RCE via SETITEM Type Confusion

## Summary

Status: published
Severity: high
Type: security
Confidence: Unknown

## Identifiers

REPRO ID: REPRO-2026-00119
GHSA: GHSA-63cw-57p8-fm3p
CVE: CVE-2026-24747

## Package

Name: torch
Ecosystem: pip
Affected: <=2.9.1
Fixed: 2.10.0

## Root Cause

# RCA Report: CVE-2026-24747 - PyTorch weights_only Unpickler Memory Corruption

## Summary

CVE-2026-24747 is a high-severity vulnerability (CVSS 8.8) in PyTorch's `weights_only` unpickler that allows an attacker to craft a malicious checkpoint file (`.pth`) which, when loaded with `torch.load(..., weights_only=True)`, corrupts heap memory and can potentially lead to arbitrary code execution.

The vulnerability exists because the `SETITEM` and `SETITEMS` pickle opcodes in `torch/_weights_only_unpickler.py` perform **no type check** on the target object before calling `__setitem__`. This allows an attacker to invoke `Tensor.__setitem__()` through the pickle stream, writing arbitrary float values directly into tensor storage memory on the heap.
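
For orientation, `SETITEM` and `SETITEMS` are the opcodes the standard `pickle` module emits when it serializes ordinary dictionaries. The sketch below uses only the standard library (`pickle`, `pickletools`) to list the opcodes in a pickle stream without executing it. The helper name `opcode_names` is illustrative and not part of PyTorch or of the reproduction artifacts.

```python
import pickle
import pickletools


def opcode_names(payload: bytes) -> list[str]:
    """Return the opcode names of a pickle stream without executing it."""
    return [op.name for op, _arg, _pos in pickletools.genops(payload)]


# A one-entry dict is populated with a single SETITEM ...
single = opcode_names(pickle.dumps({"a": 1}, protocol=2))
# ... while a multi-entry dict uses MARK ... SETITEMS to set pairs in bulk.
multi = opcode_names(pickle.dumps({"a": 1, "b": 2}, protocol=2))

print("SETITEM" in single, "SETITEMS" in multi)
```

Because legitimate checkpoints use these opcodes for every dictionary, their presence alone does not indicate a malicious file; what matters is the type of the object they are applied to.
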
## Impact

- **Package/component affected:** PyTorch (`torch`), specifically `torch/_weights_only_unpickler.py`
- **Affected versions:** All PyTorch versions prior to 2.10.0 (confirmed on 2.9.1)
- **Patched version:** PyTorch 2.10.0
- **Risk level:** HIGH (CVSS 8.8; Network/Low complexity/No privileges/User interaction required)
- **Consequences:**
  - Heap memory corruption via controlled writes to tensor storage
  - The `weights_only=True` safety feature, designed to prevent pickle-based code execution, is bypassed
  - An attacker who distributes a malicious `.pth` model file can corrupt arbitrary heap memory in the victim's process
  - This memory corruption primitive can potentially be chained with heap layout techniques to achieve arbitrary code execution
  - Particularly dangerous in ML pipelines where model checkpoints are routinely downloaded from public repositories (Hugging Face, GitHub, model zoos)

## Root Cause

The root cause is the **absence of type checking** in the `SETITEM` and `SETITEMS` opcode handlers within the `Unpickler` class in `torch/_weights_only_unpickler.py`.

### Vulnerable Code (PyTorch 2.9.1)

In the `Unpickler.load()` method:

```python
# SETITEM handler (line ~440)
elif key[0] == SETITEM[0]:
    (v, k) = (self.stack.pop(), self.stack.pop())
    self.stack[-1][k] = v  # <-- NO TYPE CHECK!

# SETITEMS handler (line ~443)
elif key[0] == SETITEMS[0]:
    items = self.pop_mark()
    for i in range(0, len(items), 2):
        self.stack[-1][items[i]] = items[i + 1]  # <-- NO TYPE CHECK!
```

The code performs `self.stack[-1][k] = v` without verifying that `self.stack[-1]` is a dictionary type. In normal pickle usage, `SETITEM`/`SETITEMS` are used to populate dictionaries. However, the restricted unpickler allows construction of Tensor objects via `_rebuild_tensor_v2`, and if a Tensor ends up as the top-of-stack when SETITEM executes, it invokes `Tensor.__setitem__(key, value)`.
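
The dispatch problem can be illustrated without PyTorch. The following is a minimal sketch, not PyTorch's code: `FloatBuffer` is a hypothetical stand-in for a tensor whose `__setitem__` writes into a backing buffer, and `setitem` mimics only the one-line `SETITEM` handler, with an optional type check modelled on the fix described in this report.

```python
from array import array
from collections import OrderedDict
from pickle import UnpicklingError


class FloatBuffer:
    """Hypothetical stand-in for a tensor: item assignment writes to storage."""

    def __init__(self, size: int) -> None:
        self.storage = array("f", [0.0] * size)

    def __setitem__(self, index: int, value: float) -> None:
        self.storage[index] = value


def setitem(stack: list, check_type: bool) -> None:
    """Toy SETITEM: pop value and key, then assign into the new top of stack."""
    v, k = stack.pop(), stack.pop()
    target = stack[-1]
    if check_type and type(target) not in (dict, OrderedDict):
        raise UnpicklingError(
            f"Can only SETITEM on dict/OrderedDict, but got {type(target)}"
        )
    target[k] = v  # dispatches to type(target).__setitem__


# Unchecked: the assignment reaches FloatBuffer.__setitem__.
buf = FloatBuffer(4)
setitem([buf, 0, 1337.0], check_type=False)
print(buf.storage[0])

# Checked: the same stack is rejected, while a dict target still works.
try:
    setitem([FloatBuffer(4), 0, 1337.0], check_type=True)
    rejected = False
except UnpicklingError:
    rejected = True

d = {}
setitem([d, "w", 1.0], check_type=True)
print(rejected, d)
```

The sketch compares `type(target)` exactly rather than using `isinstance`, so a subclass that overrides `__setitem__` would also be rejected.
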

`Tensor.__setitem__` writes float values directly to the tensor's underlying storage buffer, which is heap-allocated. This gives the attacker a **controlled heap write primitive**: they can write arbitrary float values at specific indices within the tensor's storage region.

### Attack Flow

1. Attacker crafts a `.pth` (zip) file containing a pickle payload
2. The pickle uses `GLOBAL` + `REDUCE` to construct a Tensor via `_rebuild_tensor_v2` (this is allowed)
3. Before the Tensor is consumed by a dict SETITEM, the pickle inserts additional `SETITEMS` opcodes that target the Tensor on the stack
4. Each `SETITEMS` pair `(index, value)` calls `tensor[index] = value`, writing to heap memory
5. The victim loads this file with `torch.load("file.pth", weights_only=True)`. The `weights_only=True` flag is supposed to prevent code execution, but this bypass circumvents the protection

### Fix

The fix in PyTorch 2.10.0 adds type checking to `SETITEM`/`SETITEMS`, ensuring they can only operate on dictionary types (`dict`, `OrderedDict`), not on Tensor or other arbitrary objects.

- **CVE:** [CVE-2026-24747](https://nvd.nist.gov/vuln/detail/CVE-2026-24747)
- **Advisory:** [GHSA-63cw-57p8-fm3p](https://github.com/pytorch/pytorch/security/advisories/GHSA-63cw-57p8-fm3p)
- **Fix release:** [PyTorch v2.10.0](https://github.com/pytorch/pytorch/releases/tag/v2.10.0)

## Reproduction Steps

1. Run `repro/reproduction_steps.sh` which:
   - Installs PyTorch 2.9.1 (CPU, vulnerable version)
   - Verifies the vulnerable SETITEM handler has no type check
   - Crafts a malicious `.pth` checkpoint file with pickle bytecode that:
     - Constructs a Tensor via the allowed `_rebuild_tensor_v2` path
     - Uses `SETITEMS` opcode to write 10 attacker-controlled float values to the Tensor
   - Loads the malicious checkpoint with `torch.load(..., weights_only=True)`
   - Verifies that all 10 controlled values were written successfully
2. Expected output: `VULNERABILITY_CONFIRMED`, with all 10 magic values matching (1337.0, 31337.0, 42.0, 0xDEAD, 0xBEEF, 0xCAFE, 0xBABE, 0xFACE, 9999.99, 12345.0)

## Evidence

### Vulnerable Code Path

The `SETITEM`/`SETITEMS` handlers in the weights_only unpickler at `torch/_weights_only_unpickler.py` perform `self.stack[-1][k] = v` without any type check. This allows calling `Tensor.__setitem__()` through the pickle stream.

### Exploit Output

```
[+] torch.load succeeded!
[+] Result type:
[+] Keys: ['malicious_weights']
[+] Tensor shape: torch.Size([10]), dtype: torch.float32
[+] Tensor values: tensor([1.3370e+03, 3.1337e+04, 4.2000e+01, 5.7005e+04, 4.8879e+04, ...])
[*] Verifying attacker-controlled memory writes:
tensor[0] = 1337.0 (expected 1337.0) [MATCH]
tensor[1] = 31337.0 (expected 31337.0) [MATCH]
tensor[2] = 42.0 (expected 42.0) [MATCH]
...all 10 values MATCH...
[+] VULNERABILITY CONFIRMED: CVE-2026-24747
[+] SETITEMS opcode called __setitem__ on a Tensor object
[+] without any type check in the weights_only unpickler.
[+] Attacker wrote 10 controlled values to tensor memory.
```

### Environment

- PyTorch 2.9.1+cpu
- Python 3.12
- CPU-only (no CUDA required)
- Linux x86_64

## Recommendations / Next Steps

1. **Immediate fix:** Upgrade PyTorch to version 2.10.0 or later
2. **Fix approach:** Add type checking to `SETITEM`/`SETITEMS` handlers:

   ```python
   elif key[0] == SETITEM[0]:
       (v, k) = (self.stack.pop(), self.stack.pop())
       if type(self.stack[-1]) not in (dict, OrderedDict):
           raise UnpicklingError(
               f"Can only SETITEM on dict/OrderedDict, but got {type(self.stack[-1])}"
           )
       self.stack[-1][k] = v
   ```

3. **Defense in depth:** Organizations should validate the integrity (hash verification) of all `.pth` checkpoint files before loading
4. **Consider migration:** Use `safetensors` format for model distribution instead of pickle-based `.pth` files
5. **Variant analysis:** The `BUILD` opcode handler also has potential issues: while `OrderedDict.__dict__.update(state)` can't override `__setitem__` (a C-slot method), other BUILD targets or new allowlisted types could introduce similar bypass opportunities

## Additional Notes

- **Idempotency:** The reproduction script runs consistently on repeated executions. Confirmed with two consecutive successful runs.
- **Limitations:**
  - The exploit demonstrates the memory corruption primitive (controlled writes to tensor storage). Converting this to direct arbitrary code execution would require heap spraying techniques that are environment-dependent and non-deterministic.
  - The CVE itself classifies this as "can corrupt memory and **potentially** lead to arbitrary code execution"; the memory corruption primitive is the core vulnerability.
  - The `weights_only=True` parameter was specifically designed as a safety measure against pickle-based attacks, making this bypass particularly impactful from a trust boundary perspective.
- **Storage size mismatch:** The CVE also mentions "storage size mismatch between declared element count and actual data." This is validated at the storage loading level by PyTorch's zip reader, but the SETITEM bypass is independently exploitable.

## Reproduction Details

Reproduced: 2026-03-02T08:52:57.271Z
Duration: 2888 seconds
Tool calls: 178
Turns: Unknown
Handoffs: 3

## Quick Verification

Run one of these commands to verify locally:

    pruva-verify REPRO-2026-00119
    pruva-verify GHSA-63cw-57p8-fm3p
    pruva-verify CVE-2026-24747

Or open in GitHub Codespaces (zero-friction, auto-runs):

https://github.com/codespaces/new?ref=repro/REPRO-2026-00119&repo=N3mes1s/pruva-sandbox

Or download and run the script manually:

    curl -O https://api.pruva.dev/v1/reproductions/REPRO-2026-00119/artifacts/repro/reproduction_steps.sh
    chmod +x reproduction_steps.sh
    ./reproduction_steps.sh

WARNING: Run in a sandboxed environment. This exploits a real vulnerability.
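
The defense-in-depth recommendation above (hash verification of `.pth` files before loading) can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the function names `sha256_of` and `verify_checkpoint` are hypothetical, and the expected digest is assumed to come from a trusted channel such as the publisher's release notes.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large checkpoints are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path: Path, expected_sha256: str) -> None:
    """Raise ValueError unless the file matches the pinned digest."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"digest mismatch for {path}: got {actual}")
```

A caller would run `verify_checkpoint(path, pinned_digest)` before passing the same path to `torch.load`. A matching digest only establishes that the file is the one the publisher intended; it does not show that the file is benign, so it complements rather than replaces the upgrade to 2.10.0.
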
## References

- GitHub Advisory: https://github.com/advisories/GHSA-63cw-57p8-fm3p
- NVD: https://nvd.nist.gov/vuln/detail/CVE-2026-24747

## Artifacts

- repro/rca_report.md (analysis, 7962 bytes)
- repro/reproduction_steps.sh (reproduction_script, 9248 bytes)
- vuln_variant/rca_report.md (analysis, 10523 bytes)
- vuln_variant/reproduction_steps.sh (reproduction_script, 10335 bytes)
- coding/proposed_fix.diff (patch, 3724 bytes)
- bundle/AGENTS.repro.md (documentation, 1537 bytes)
- bundle/ticket.md (ticket, 3630 bytes)
- vuln_variant/patch_analysis.md (documentation, 5623 bytes)
- vuln_variant/create_exploit.py (script, 6639 bytes)
- logs/variant_run2.log (log, 5191 bytes)
- logs/variant_run1.log (log, 5191 bytes)
- logs/variant_vuln.log (log, 1808 bytes)
- logs/variant_fixed.log (log, 1821 bytes)
- coding/verify_fix.sh (other, 9364 bytes)
- coding/summary_report.md (documentation, 6486 bytes)

## API Access

- JSON: https://api.pruva.dev/v1/reproductions/REPRO-2026-00119
- Script: https://api.pruva.dev/v1/reproductions/REPRO-2026-00119/artifacts/repro/reproduction_steps.sh
- Web: https://pruva.dev/r/REPRO-2026-00119

## For AI Assistants

To help users with this reproduction:

1. The `pruva-verify` command is the simplest way to run the reproduction
2. The script downloads and runs automatically in an isolated temp directory
3. Always recommend running in a VM, container, or disposable environment
4. The root cause explains the technical vulnerability

---

Generated by Pruva | https://pruva.dev