# REPRO-2026-00208: Oj Ruby gem uninitialized stack memory leak via long JSON keys

## Summary

Status: published
Severity: medium
Type: security
Confidence: high

## Identifiers

REPRO ID: REPRO-2026-00208
CVE: CVE-2026-54500

## Package

Name: Unknown
Ecosystem: Unknown
Affected: Unknown
Fixed: Unknown

## Root Cause

# RCA Report: CVE-2026-54500

## Summary

Oj (Optimized JSON), a Ruby gem with a C extension, contains an uninitialized stack memory read in the `form_attr()` function of `ext/oj/intern.c`. When `Oj.load` parses a JSON object in `:object` mode whose key is 254 bytes or longer, the long-key code path allocates a heap buffer `b` and correctly fills it with the attribute name, but then passes the **uninitialized** 256-byte stack buffer `buf` (not `b`) to `rb_intern3()` before freeing `b`. Ruby therefore interns `len + 1` bytes of uninitialized stack memory (and, for keys ≥ 256 bytes, reads out of bounds past `buf`). The leaked bytes surface to the caller via the produced Symbol or via the `EncodingError` message raised when the stack garbage is not valid UTF-8, disclosing process stack contents. The fix is a one-token change: `rb_intern3(buf, ...)` → `rb_intern3(b, ...)`.

## Impact

- **Package/component:** `ohler55/oj`, C extension, `ext/oj/intern.c`, `form_attr()`
- **Affected versions:** Oj 0.0.1 – 3.17.2 (fixed in 3.17.3)
- **Risk level:** Medium
- **Consequences:** Information disclosure of process stack memory. An attacker who controls the JSON input (a key ≥ 254 bytes) can cause `Oj.load` to read and surface uninitialized stack bytes. The leak is observable through the `EncodingError` exception message (which embeds the invalid bytes) or through the produced Symbol object. The exact bytes and the message length vary between process invocations, which is consistent with the source being uninitialized (non-deterministic) memory.
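
The length thresholds stated above (long-key path at 254 bytes, out-of-bounds read at 256 bytes) can be sketched as a small Ruby model. This only illustrates the arithmetic, assuming the 256-byte `buf` and the `len + 1` read size reported in this document; `BUF_SIZE` and `classify_key` are hypothetical names and are not part of Oj.

```ruby
# Illustrative model only: names here are hypothetical, not Oj APIs.
BUF_SIZE = 256 # assumed size of the stack buffer `buf` in form_attr()

# Describe what the vulnerable form_attr() would do for a key of `len` bytes.
def classify_key(len)
  long_path  = len >= BUF_SIZE - 2 # keys of 254 bytes or more take the long-key path
  bytes_read = len + 1             # '@' prefix plus the key bytes
  {
    path: long_path ? :long : :short,
    bytes_read: bytes_read,
    uninitialized_read: long_path,                    # only the long path reads `buf` unfilled
    out_of_bounds: long_path && bytes_read > BUF_SIZE # read runs past the end of `buf`
  }
end

if __FILE__ == $PROGRAM_NAME
  [253, 254, 255, 256, 300].each do |len|
    puts "#{len}: #{classify_key(len)}"
  end
end
```

Under these assumptions a 254- or 255-byte key stays within `buf` but still reads uninitialized bytes, while a 300-byte key does both.
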
## Impact Parity

- **Disclosed/claimed maximum impact:** Uninitialized stack memory read / out-of-bounds read, leaking process stack contents via Symbol or EncodingError message.
- **Reproduced impact from this run:** Uninitialized stack memory read confirmed. Every vulnerable run raised an `EncodingError` whose message contained 1262–1423 bytes other than `A`, that is, bytes that did not come from the all-`A` input key and are attributed to leaked stack data. Message lengths varied across runs (1276–1432 bytes). The fixed version produced the correct, deterministic attribute name with zero leaked bytes.
- **Parity:** `full`. The disclosed information-disclosure symptom (uninitialized stack memory surfacing via the EncodingError message, with per-run variation) was reproduced exactly, and the negative control on the fixed commit confirmed the fix.
- **Not demonstrated:** No code execution was claimed or demonstrated; this is an information-disclosure / memory-read bug, not a code-execution vulnerability.

## Root Cause

In `ext/oj/intern.c`, `form_attr(const char *str, size_t len)` converts a JSON object key into a Ruby attribute ID (interned symbol). It declares a 256-byte stack buffer `buf` (uninitialized) and branches on key length:

```c
static VALUE form_attr(const char *str, size_t len) {
    char buf[256];  // UNINITIALIZED

    if (sizeof(buf) - 2 <= len) {  // long-key path: len >= 254
        char *b = OJ_R_ALLOC_N(char, len + 2);  // heap buffer
        ID    id;

        // ... b is filled correctly with '@' + key + '\0' ...
        id = rb_intern3(buf, len + 1, oj_utf8_encoding);  // BUG: reads `buf`, not `b`
        OJ_R_FREE(b);
        return id;
    }
    // short-key path: buf IS properly filled before use (correct)
    ...
    return (VALUE)rb_intern3(buf, len + 1, oj_utf8_encoding);
}
```

In the long-key path, `b` is the correctly populated heap buffer, but `rb_intern3` is called with `buf`, the uninitialized stack buffer. `rb_intern3` reads `len + 1` bytes from `buf`. When `len >= 256`, this also reads out of bounds past the 256-byte `buf`.
The bytes are interned as a symbol; if they are not valid UTF-8, Ruby raises an `EncodingError` whose message includes the offending bytes, leaking them to the caller. The same bug was fixed earlier in `ext/oj/usual.c`, but the copy in `intern.c` was missed.

**Call path:** `Oj.load(json, mode: :object)` → `object.c:oj_set_obj_ivar()` → `intern.c:oj_attr_intern()` → `cache.c:cache_intern()` → `intern.c:form_attr()`. Since `CACHE_MAX_KEY` is 35, keys ≥ 35 bytes bypass the cache and call `form_attr` directly every time, so the uninitialized read occurs on every invocation with a long key.

**Fix commit:** `bbde91a679728f94c4492ebc3683f4fa3309049f` ("Fix intern.c and fast.c (#1015)"), which changes `rb_intern3(buf, len + 1, oj_utf8_encoding)` to `rb_intern3(b, len + 1, oj_utf8_encoding)` in the long-key path of `form_attr()`.

## Reproduction Steps

1. **Reference:** `bundle/repro/reproduction_steps.sh` (self-contained, idempotent).
2. **What the script does:**
   - Installs Ruby + build tools, clones (or reuses) `ohler55/oj`.
   - Checks out the **vulnerable** commit `495cc38` (v3.17.2, parent of the fix), builds the C extension via `ruby extconf.rb && make`.
   - Runs `Oj.load('{"^o":"Oj::Bag","AAA...300...AAA":1}', mode: :object)` in 6 separate Ruby processes. The `^o:Oj::Bag` marker creates a non-Hash object so that `oj_set_obj_ivar` → `oj_attr_intern` → `form_attr` is invoked.
   - Checks out the **fixed** commit `bbde91a`, rebuilds, and runs the same probe 6 times as a negative control.
   - Compares results, writes `runtime_manifest.json`, and exits 0 if confirmed.
3. **Expected evidence:**
   - Vulnerable: all runs raise `EncodingError`; message lengths vary per run (1276–1432 bytes), with 1262–1423 non-`A` (leaked stack) bytes.
   - Fixed: all runs return an `Oj::Bag` with a single 301-byte instance variable `@AAA...` (0x40 + 300×0x41), deterministic across all runs.

## Evidence

- **Log:** `bundle/logs/reproduction_steps.log` (full build + probe transcript).
- **Vulnerable outcomes:** `bundle/logs/vuln_outcomes.txt`
- **Fixed outcomes:** `bundle/logs/fixed_outcomes.txt`
- **Message-length variation:** `bundle/logs/vuln_msg_lengths.txt`
- **Probe script:** `bundle/repro/probe.rb`
- **Runtime manifest:** `bundle/repro/runtime_manifest.json`

### Key excerpts (from the second verification run)

**Vulnerable (commit 495cc38, v3.17.2), all 6 runs leak:**

```
[vuln run 1] encoding_error MSG_LEN=1348 NON_A_BYTES=1339
[vuln run 2] encoding_error MSG_LEN=1349 NON_A_BYTES=1341
[vuln run 3] encoding_error MSG_LEN=1350 NON_A_BYTES=1343
[vuln run 4] encoding_error MSG_LEN=1276 NON_A_BYTES=1262
[vuln run 5] encoding_error MSG_LEN=1432 NON_A_BYTES=1423
[vuln run 6] encoding_error MSG_LEN=1368 NON_A_BYTES=1343
```

The `EncodingError` message begins `invalid symbol in encoding UTF-8 :"` followed by Ruby `\xNN` escapes of the leaked stack bytes (e.g. `\xB8\xFF`, `\xD8\xFF`, `\xC0\xFF`). These are pointers/binary data, not the 0x41 (`A`) input bytes. The message length varies across runs (1276–1432), which would not happen with deterministic, initialized data and confirms the source is uninitialized stack memory.

**Fixed (commit bbde91a), all 6 runs clean:**

```
[fixed run 1] parsed IVAR_LEN=301 CORRECT_ATTR=true FIRST_BYTES=40414141...
[fixed run 2] parsed IVAR_LEN=301 CORRECT_ATTR=true FIRST_BYTES=40414141...
... (identical for all 6 runs)
```

`FIRST_BYTES` = `40` (`@`) + `41` (`A`) repeated: the correct, deterministic attribute name. No `EncodingError`, no leaked bytes.

### Environment

- Ruby 3.3.8 (x86_64-linux-gnu), GCC 15.2.0, Ubuntu.
- Oj built from source at vulnerable commit `495cc38` and fixed commit `bbde91a`.

## Recommendations / Next Steps

- **Upgrade to Oj 3.17.3 or later**, which contains the one-token fix.
- **Audit `ext/oj/usual.c` and any other copies** of the `form_attr` pattern for the same `buf`/`b` confusion (the `intern.c` bug was itself a missed copy of a bug already fixed in `usual.c`).
- **Add a regression test** that parses a JSON object with a ≥ 254-byte key in `:object` mode and asserts the resulting attribute name matches the input.
- Consider compiling with `-ftrivial-auto-var-init=pattern` to make uninitialized reads more visible in CI, and enabling MSan/ASan in the test suite.

## Additional Notes

- **Idempotency:** The script was run twice consecutively; both runs exited 0 with `CONFIRMED=true`. The script cleans all build artifacts between vulnerable/fixed builds (`git clean -fdx ext/oj lib/oj`) and uses a manual `extconf.rb + make` flow (avoiding `rake compile`, which loads bundler and can interfere with the git checkout state).
- **Key-length boundary:** The bug triggers at `len >= 254` (`sizeof(buf) - 2 = 254`). At `len >= 256` the read also goes out of bounds past the 256-byte `buf`. The reproduction uses a 300-byte key to exercise both the uninitialized read and the OOB read.
- **Cache bypass:** Because `CACHE_MAX_KEY = 35`, the 300-byte key bypasses the attribute cache entirely, so `form_attr` is called fresh on every invocation, maximizing the observable per-run variation.

## Reproduction Details

Reproduced: 2026-07-02T19:44:32.031Z
Duration: 1241 seconds
Tool calls: 153
Turns: Unknown
Handoffs: 2

## Quick Verification

Run one of these commands to verify locally:

pruva-verify REPRO-2026-00208
pruva-verify CVE-2026-54500

Or open in GitHub Codespaces (zero-friction, auto-runs):

https://github.com/codespaces/new?ref=repro/REPRO-2026-00208&repo=N3mes1s/pruva-sandbox

Or download and run the script manually:

curl -O https://api.pruva.dev/v1/reproductions/REPRO-2026-00208/artifacts/bundle/repro/reproduction_steps.sh
chmod +x reproduction_steps.sh
./reproduction_steps.sh

WARNING: Run in a sandboxed environment. This exploits a real vulnerability.
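
For readers who want to inspect the input before running the script, the probe payload described in Reproduction Steps can be rebuilt with Ruby's standard library alone. This is a minimal sketch, not the contents of `bundle/repro/probe.rb`; `build_probe_json` and `expected_ivar_name` are hypothetical helper names, and the 300-byte all-`A` key is taken from the description above.

```ruby
require "json"

# Build the probe document: the "^o" marker names Oj::Bag, and the second key
# is the long attribute name (300 bytes of "A" by default).
# Helper names are hypothetical and exist only for this illustration.
def build_probe_json(key_len = 300)
  JSON.generate("^o" => "Oj::Bag", ("A" * key_len) => 1)
end

# Attribute name a fixed Oj build is expected to produce: "@" plus the key.
def expected_ivar_name(key_len = 300)
  "@" + ("A" * key_len)
end

if __FILE__ == $PROGRAM_NAME
  json = build_probe_json
  name = expected_ivar_name
  puts "payload bytes:       #{json.bytesize}"
  puts "expected ivar bytes: #{name.bytesize}"
  puts "first bytes (hex):   #{name.byteslice(0, 4).unpack1('H*')}"
end
```

The resulting string is what the report passes to `Oj.load(json, mode: :object)`; that call should only be made against a vulnerable build inside a disposable environment.
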
## References

- NVD: https://nvd.nist.gov/vuln/detail/CVE-2026-54500
- Source: ohler55/oj

## Artifacts

- bundle/repro/reproduction_steps.sh (reproduction_script, 13652 bytes)
- bundle/repro/rca_report.md (analysis, 8821 bytes)
- bundle/vuln_variant/reproduction_steps.sh (reproduction_script, 11122 bytes)
- bundle/vuln_variant/rca_report.md (analysis, 12476 bytes)
- bundle/ticket.md (ticket, 1027 bytes)
- bundle/ticket.json (other, 1441 bytes)
- bundle/repro/probe.rb (other, 1413 bytes)
- bundle/repro/runtime_manifest.json (other, 721 bytes)
- bundle/repro/validation_verdict.json (other, 878 bytes)
- bundle/logs/reproduction_steps.log (log, 17882 bytes)
- bundle/logs/vuln_outcomes.txt (other, 1002 bytes)
- bundle/logs/vuln_msg_lengths.txt (other, 30 bytes)
- bundle/logs/vuln_run1.bin (other, 1348 bytes)
- bundle/logs/vuln_run2.bin (other, 1349 bytes)
- bundle/logs/vuln_run3.bin (other, 1350 bytes)
- bundle/logs/vuln_run4.bin (other, 1276 bytes)
- bundle/logs/vuln_run5.bin (other, 1432 bytes)
- bundle/logs/vuln_run6.bin (other, 1368 bytes)
- bundle/logs/vuln_leak_count (other, 2 bytes)
- bundle/logs/vuln_err_count (other, 2 bytes)
- bundle/logs/vuln_correct_count (other, 2 bytes)
- bundle/logs/fixed_outcomes.txt (other, 774 bytes)
- bundle/logs/fixed_msg_lengths.txt (other, 0 bytes)
- bundle/logs/fixed_run1.bin (other, 301 bytes)
- bundle/logs/fixed_run2.bin (other, 301 bytes)
- bundle/logs/fixed_run3.bin (other, 301 bytes)
- bundle/logs/fixed_run4.bin (other, 301 bytes)
- bundle/logs/fixed_run5.bin (other, 301 bytes)
- bundle/logs/fixed_run6.bin (other, 301 bytes)
- bundle/logs/fixed_leak_count (other, 2 bytes)
- bundle/logs/fixed_err_count (other, 2 bytes)
- bundle/logs/fixed_correct_count (other, 2 bytes)
- bundle/logs/vuln_variant_repro.log (log, 13939 bytes)
- bundle/logs/vuln_variant_outcomes.txt (other, 9940 bytes)
- bundle/logs/fixed_variant_outcomes.txt (other, 9408 bytes)
- bundle/logs/vuln_variant/fixed_version.txt (other, 131 bytes)
- bundle/vuln_variant/probe_variant.rb (other, 5239 bytes)
- bundle/vuln_variant/runtime_manifest.json (other, 862 bytes)
- bundle/vuln_variant/patch_analysis.md (documentation, 6424 bytes)
- bundle/vuln_variant/variant_manifest.json (other, 4135 bytes)
- bundle/vuln_variant/validation_verdict.json (other, 3747 bytes)
- bundle/vuln_variant/source_identity.json (other, 1511 bytes)
- bundle/vuln_variant/root_cause_equivalence.json (other, 2434 bytes)

## API Access

- JSON: https://api.pruva.dev/v1/reproductions/REPRO-2026-00208
- Script: https://api.pruva.dev/v1/reproductions/REPRO-2026-00208/artifacts/bundle/repro/reproduction_steps.sh
- Web: https://pruva.dev/r/REPRO-2026-00208

## For AI Assistants

To help users with this reproduction:

1. The `pruva-verify` command is the simplest way to run the reproduction
2. The script downloads and runs automatically in an isolated temp directory
3. Always recommend running in a VM, container, or disposable environment
4. The root cause explains the technical vulnerability

---

Generated by Pruva | https://pruva.dev