REPRO-2026-00080 HIGH RCE
Verified
Docling-core YAML Deserialization RCE via FullLoader
docling-core (pip) Feb 13, 2026
What's the vulnerability?
A PyYAML-related Remote Code Execution (RCE) vulnerability is exposed in docling-core >=2.21.0, <2.48.4 when the application uses pyyaml < 5.4 and invokes DoclingDocument.load_from_yaml() with untrusted YAML data. The unsafe yaml.FullLoader allows attacker-controlled Python object construction, leading to arbitrary command execution during deserialization before any validation occurs.
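The affected combination described above can be checked mechanically. The following is a minimal sketch, not part of the advisory: the helper names `parse_version`, `is_affected`, and `check_installed` are hypothetical, and the version parsing is deliberately simple (numeric `major.minor.patch` only, no pre-release handling).

```python
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '2.48.3' into (2, 48, 3), padding short versions with zeros.

    Simplified parser (an assumption of this sketch): only the leading
    digits of each dot-separated part are used.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def is_affected(docling_core_version: str, pyyaml_version: str) -> bool:
    """True when both version ranges named in the advisory are matched."""
    docling = parse_version(docling_core_version)
    pyyaml = parse_version(pyyaml_version)
    docling_in_range = (2, 21, 0) <= docling < (2, 48, 4)
    pyyaml_in_range = pyyaml < (5, 4, 0)
    return docling_in_range and pyyaml_in_range


def check_installed():
    """Check the current environment; returns None if a package is missing."""
    try:
        docling = metadata.version("docling-core")
        pyyaml = metadata.version("PyYAML")
    except metadata.PackageNotFoundError:
        return None
    return is_affected(docling, pyyaml)
```

Calling `check_installed()` inside the application's virtual environment reports whether that environment falls in the affected range. A `False` result only reflects installed versions; it does not show that YAML from untrusted sources is handled safely elsewhere.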
Root Cause Analysis
## Summary

`docling-core` versions 2.21.0 to 2.48.3 call `yaml.load(..., Loader=yaml.FullLoader)` in `DoclingDocument.load_from_yaml`, which allows unsafe object construction when PyYAML < 5.4 is installed. With a crafted YAML payload, PyYAML FullLoader evaluates attacker-controlled Python objects (CVE-2020-14343), leading to command execution before the document validation occurs.

## Impact

- **Component:** `docling_core.types.doc.DoclingDocument.load_from_yaml`
- **Affected versions:** docling-core >= 2.21.0, < 2.48.4 when used with PyYAML < 5.4
- **Risk level:** High (arbitrary command execution when parsing untrusted YAML)
- **Consequence:** An attacker can execute OS commands during YAML deserialization even if the resulting object fails validation.

## Root Cause

`load_from_yaml` opens the provided YAML file and calls `yaml.load(f, Loader=yaml.FullLoader)`. In PyYAML 5.3.1, `FullLoader` still permits unsafe constructors such as `!!python/object/new` and `!!python/name`, which can be combined to invoke `eval` and execute OS commands (CVE-2020-14343). The deserialization executes before `DoclingDocument.model_validate` runs, so even if validation fails, the payload already executed. The fix in docling-core 2.48.4 switches to `yaml.SafeLoader`, which blocks these unsafe tags.

## Reproduction Steps

1. Run `repro/reproduction_steps.sh`.
2. The script creates a virtual environment, installs `docling-core==2.48.3` with `PyYAML==5.3.1`, writes a malicious YAML payload using `!!python/object/new`, then invokes `DoclingDocument.load_from_yaml`.
3. Evidence of reproduction is the creation of `logs/pwned.txt` containing the output of `id`.

## Evidence

- **Log/artifact:** `logs/pwned.txt`
- **Key output (from script):**
  - `VULNERABILITY CONFIRMED: marker file created at .../logs/pwned.txt`
  - Script prints a validation error after deserialization, demonstrating the payload executes before validation.
- **Environment:** Python 3.12 venv with docling-core 2.48.3 and PyYAML 5.3.1

## Recommendations / Next Steps

- Upgrade to docling-core 2.48.4 or later, which uses `yaml.SafeLoader`.
- If upgrading is not possible, explicitly use `yaml.safe_load` or `SafeLoader` when parsing untrusted YAML.
- Add regression tests that feed malicious YAML payloads into `load_from_yaml` to ensure unsafe tags are rejected.

## Additional Notes

- The reproduction script is idempotent and can be run multiple times; it overwrites the payload and marker file on each run.
- Even though the YAML fails `DoclingDocument` validation, the exploit triggers during deserialization, so validation alone is insufficient protection.
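The recommendation to parse untrusted YAML with `SafeLoader` and to add regression checks can be sketched as follows. This is an illustrative sketch that assumes PyYAML is installed; `load_untrusted_yaml` and `rejects_unsafe_tags` are hypothetical helper names, not docling-core APIs. The sample document uses a benign `!!python/name` tag as a stand-in for a malicious payload.

```python
import yaml

# Benign stand-in for a hostile document: the python/name tag asks the
# loader to resolve a Python object, which a safe loader must refuse.
UNSAFE_YAML = "name: !!python/name:os.getcwd\n"


def load_untrusted_yaml(text: str):
    """Parse YAML with SafeLoader so Python-specific tags are rejected."""
    return yaml.load(text, Loader=yaml.SafeLoader)


def rejects_unsafe_tags(text: str) -> bool:
    """Return True if parsing fails because of an unsupported tag."""
    try:
        load_untrusted_yaml(text)
    except yaml.constructor.ConstructorError:
        return True
    return False


if __name__ == "__main__":
    # Plain data still loads normally.
    print(load_untrusted_yaml("name: example\n"))
    # Python-specific tags are refused before any object is constructed.
    print(rejects_unsafe_tags(UNSAFE_YAML))
```

A regression test for `load_from_yaml` itself would follow the same shape: write a document containing a Python-specific tag to a temporary file, call the loader, and assert that an error is raised and no side effect occurred.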
One Command
Verify with pruva-verify
Run the Pruva CLI to automatically fetch and execute the reproduction script.
pruva-verify REPRO-2026-00080
or
pruva-verify GHSA-VQXF-V2GG-X3HC
or
pruva-verify CVE-2026-24009
Install:
curl -fsSL https://pruva.dev/install.sh | sh
Or Run Manually
1. Download the script
curl -O https://pruva.dev/api/v1/reproductions/REPRO-2026-00080/artifacts/reproduction_steps.sh
2. Make executable
chmod +x reproduction_steps.sh
3. Run the script
./reproduction_steps.sh
Run in a VM, container, or disposable environment. This exploits a real vulnerability.