# REPRO-2026-00099: Semantic Kernel: RCE via InMemoryVectorStore Filter

## Summary

- Status: published
- Severity: critical
- Type: security
- Confidence: Unknown

## Identifiers

- REPRO ID: REPRO-2026-00099
- GHSA: GHSA-xjw9-4gw8-4rqx
- CVE: CVE-2026-26030

## Package

- Name: semantic-kernel
- Ecosystem: pip
- Affected: < 1.39.4
- Fixed: 1.39.4

## Root Cause

# Root Cause Analysis Report

## GHSA-xjw9-4gw8-4rqx: Microsoft Semantic Kernel InMemoryVectorStore RCE

## Summary

The Microsoft Semantic Kernel Python SDK contains a critical Remote Code Execution (RCE) vulnerability in the filter functionality of its `InMemoryVectorStore`. Filter expressions can access Python's dunder (double underscore) attributes such as `__class__`, `__bases__`, `__subclasses__`, `__mro__`, `__dict__`, and `__getattribute__`, which lets an attacker escape the filter sandbox. Chained together, these attributes traverse Python's object hierarchy and expose sensitive classes and functions, potentially leading to arbitrary code execution.

## Impact

- **Package:** semantic-kernel (Python SDK)
- **Affected Versions:** < 1.39.4
- **Fixed Version:** 1.39.4
- **Risk Level:** CRITICAL (CVSS 10.0)

**Consequences:**

- An attacker can escape the filter sandbox and execute arbitrary Python code
- The vulnerability can be triggered through user-controlled filter expressions
- Complete system compromise is possible through Python's introspection capabilities
- The InMemoryVectorStore is not recommended for production use, but the vulnerability still affects any application that uses it for testing or development

## Root Cause

The vulnerability is in the `_parse_and_validate_filter` method in `python/semantic_kernel/connectors/in_memory.py`. The method uses an allowlist to validate AST (Abstract Syntax Tree) nodes and function calls in filter expressions, but it does not check which attribute names `ast.Attribute` nodes access.

### Technical Details:

1. The filter parser walks the AST and validates:
   - Node types against `allowed_filter_ast_nodes` (which includes `ast.Attribute`)
   - Name nodes against lambda parameter names
   - Function calls against `allowed_filter_functions`
2. **Missing validation:** The code does NOT check whether `ast.Attribute` nodes access dangerous dunder attributes such as:
   - `__class__`: the object's class
   - `__bases__`: the base classes of a class
   - `__subclasses__`: all subclasses of a class
   - `__mro__`: the Method Resolution Order (object hierarchy)
   - `__dict__`: the object's attributes
   - `__getattribute__`: the attribute retrieval method
3. **Exploitation chain:** An attacker can chain these attributes to traverse from a simple data object to the root `object` class, enumerate all loaded classes via `__subclasses__()`, and find classes that expose dangerous functionality (such as `warnings.catch_warnings`, which can be leveraged to reach arbitrary code execution).

### Fix:

The fix (PR #13505) adds a `blocked_filter_attributes` set containing dangerous attribute names and validates all `ast.Attribute` nodes against this blocklist:

```python
# For Attribute nodes, validate that dangerous dunder attributes are not accessed
if isinstance(node, ast.Attribute) and node.attr in self.blocked_filter_attributes:
    raise VectorStoreOperationException(
        f"Access to attribute '{node.attr}' is not allowed in filter expressions. "
        "This attribute could be used to escape the filter sandbox."
    )
```

## Reproduction Steps

The reproduction script is located at `repro/reproduction_steps.sh`.

### What the script does:

1. Creates a Python virtual environment
2. Installs the vulnerable semantic-kernel version 1.39.3
3. Creates an InMemoryVectorStore with a test collection
4. Tests various filter expressions that access dangerous dunder attributes:
   - `lambda x: x.__class__.__name__ == 'TestDataModel'`
   - `lambda x: x.__class__.__base__ is not None`
   - `lambda x: x.__class__.__mro__ is not None`
   - `lambda x: x.__dict__ is not None`
   - `lambda x: x.__getattribute__ is not None`
   - `lambda x: x.__class__.__bases__ is not None`

### Expected evidence of reproduction:

All tests pass against the vulnerable version, meaning each filter was accepted and evaluated. This demonstrates that dangerous dunder attributes can be accessed without restriction. The output shows:

```
[+] Test 1 PASSED: Filter with __class__ executed: True
[+] Test 2 PASSED: Filter with __base__ executed: True
[+] Test 3 PASSED: Filter with __mro__ executed: True
[+] Test 4 PASSED: Filter with __dict__ executed: True
[+] Test 5 PASSED: Filter with __getattribute__ executed: True
[+] Test 6 PASSED: Filter with __bases__ executed: True
[+] Test 7 PASSED: Filter with method access executed: True
```

Test 7 ("method access") appears in the output, but its filter expression is not listed in this report.

## Evidence

**Log location:** `$ROOT/logs/` (created by reproduction script)

**Key excerpts from reproduction:**

The script confirmed that in semantic-kernel 1.39.3 the following dangerous filter expressions execute successfully:

1. `lambda x: x.__class__.__name__ == 'TestDataModel'`: access to the `__class__` attribute
2. `lambda x: x.__class__.__base__ is not None`: access to the `__base__` attribute
3. `lambda x: x.__class__.__mro__ is not None`: access to the `__mro__` attribute
4. `lambda x: x.__dict__ is not None`: access to the `__dict__` attribute
5. `lambda x: x.__getattribute__ is not None`: access to the `__getattribute__` attribute
6. `lambda x: x.__class__.__bases__ is not None`: access to the `__bases__` attribute

**Environment details:**

- Python 3.11
- semantic-kernel 1.39.3 (vulnerable)
- pydantic (dependency)
- numpy (dependency)
- scipy (dependency)

## Recommendations / Next Steps

### Immediate Actions:

1. **Upgrade to semantic-kernel 1.39.4 or later.** This version contains the fix that blocks dangerous dunder attributes in filter expressions.
2. **Avoid using InMemoryVectorStore in production.** Microsoft already recommends against using InMemoryVectorStore for production scenarios. Use a proper vector database instead (Azure AI Search, Redis, PostgreSQL with pgvector, etc.).
3. **Review existing code.** Check whether any existing code uses string-based filters with the InMemoryVectorStore. If so, make sure the semantic-kernel version is upgraded.

### Testing Recommendations:

1. **Verify the fix.** After upgrading, test that filter expressions with blocked attributes raise `VectorStoreOperationException`:

   ```python
   # This should raise an exception after the fix
   filter_str = "lambda x: x.__class__.__name__ == 'TestDataModel'"
   ```

2. **Regression tests.** Ensure legitimate filter expressions still work:

   ```python
   # These should continue to work
   filter_str = "lambda x: x.content == 'test'"
   filter_str = "lambda x: x.id.startswith('prefix')"
   ```

### Long-term Recommendations:

1. **Input validation.** Never pass user-controlled or LLM-generated filter strings directly to the InMemoryVectorStore without strict validation.
2. **Security review.** Conduct security reviews of any code that uses dynamic filter expressions.

## Additional Notes

### Idempotency Confirmation:

The reproduction script is idempotent and passes two consecutive runs:

- First run: installs dependencies and confirms the vulnerability
- Second run: reuses the virtual environment and still confirms the vulnerability

### Edge Cases:

1. **Fixed version behavior.** In semantic-kernel 1.39.4+, attempting to use blocked attributes in filter expressions raises:

   ```
   VectorStoreOperationException: Access to attribute '__class__' is not allowed in filter expressions. This attribute could be used to escape the filter sandbox.
   ```

2. **Partial RCE chains.** The reproduction demonstrates access to dangerous attributes only; a full RCE chain would require additional steps to find and invoke a class that executes arbitrary code. The vulnerable version allows these chains; the fixed version blocks them at the attribute access level.

### References:

- GitHub Advisory: https://github.com/advisories/GHSA-xjw9-4gw8-4rqx
- CVE: CVE-2026-26030
- Fix PR: https://github.com/microsoft/semantic-kernel/pull/13505
- Release: https://github.com/microsoft/semantic-kernel/releases/tag/python-1.39.4

## Reproduction Details

- Reproduced: 2026-02-19T21:13:51.862Z
- Duration: 1513 seconds
- Tool calls: 114
- Turns: 89
- Handoffs: 2

## Quick Verification

Run one of these commands to verify locally:

- `pruva-verify REPRO-2026-00099`
- `pruva-verify GHSA-xjw9-4gw8-4rqx`
- `pruva-verify CVE-2026-26030`

Or open in GitHub Codespaces (zero-friction, auto-runs):

https://github.com/codespaces/new?ref=repro/REPRO-2026-00099&repo=N3mes1s/pruva-sandbox

Or download and run the script manually:

1. `curl -O https://api.pruva.dev/v1/reproductions/REPRO-2026-00099/artifacts/repro/reproduction_steps.sh`
2. `chmod +x reproduction_steps.sh`
3. `./reproduction_steps.sh`

WARNING: Run in a sandboxed environment. This exploits a real vulnerability.
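The attribute check described under Root Cause can also be illustrated without installing the package or running the exploit. The following is a minimal standalone sketch of that class of validation, not the semantic-kernel implementation: `validate_filter`, `FilterValidationError`, `BLOCKED_ATTRIBUTES`, and `ALLOWED_NODES` are hypothetical names chosen for illustration, and the blocklist simply mirrors the attribute names listed in this report.

```python
import ast

# Hypothetical blocklist for illustration; it mirrors the attribute names
# listed in this report, not the library's actual set.
BLOCKED_ATTRIBUTES = {
    "__class__", "__bases__", "__subclasses__",
    "__mro__", "__dict__", "__getattribute__",
}

# Hypothetical allowlist of AST node types, kept deliberately small.
ALLOWED_NODES = (
    ast.Expression, ast.Lambda, ast.arguments, ast.arg, ast.Name, ast.Load,
    ast.Attribute, ast.Compare, ast.Constant, ast.Eq, ast.NotEq,
    ast.Is, ast.IsNot, ast.BoolOp, ast.And, ast.Or, ast.Call,
)


class FilterValidationError(ValueError):
    """Raised when a filter expression fails validation (hypothetical name)."""


def validate_filter(filter_str: str) -> ast.Expression:
    """Parse a lambda filter string and reject disallowed nodes and attributes."""
    tree = ast.parse(filter_str, mode="eval")
    if not isinstance(tree.body, ast.Lambda):
        raise FilterValidationError("Filter must be a lambda expression.")
    for node in ast.walk(tree):
        # Allowlist check on node types.
        if not isinstance(node, ALLOWED_NODES):
            raise FilterValidationError(
                f"Node type {type(node).__name__} is not allowed."
            )
        # Blocklist check on attribute names: the step this report says
        # was missing in the vulnerable versions.
        if isinstance(node, ast.Attribute) and node.attr in BLOCKED_ATTRIBUTES:
            raise FilterValidationError(
                f"Access to attribute '{node.attr}' is not allowed."
            )
    return tree


if __name__ == "__main__":
    samples = (
        "lambda x: x.content == 'test'",
        "lambda x: x.__class__.__mro__ is not None",
    )
    for expr in samples:
        try:
            validate_filter(expr)
            print("accepted:", expr)
        except FilterValidationError as exc:
            print("rejected:", exc)
```

Removing the blocklist check from this sketch reproduces the shape of the flaw: `ast.Attribute` stays in the allowlist, so every dunder chain is accepted. A stricter variant could reject any attribute name that starts with an underscore; that is a design option, not something this report states the fix does.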
## References

- GitHub Advisory: https://github.com/advisories/GHSA-xjw9-4gw8-4rqx
- NVD: https://nvd.nist.gov/vuln/detail/CVE-2026-26030

## Artifacts

- repro/rca_report.md (analysis, 7835 bytes)
- repro/reproduction_steps.sh (reproduction_script, 6495 bytes)
- bundle/ticket.md (ticket, 1801 bytes)
- bundle/source.json (other, 6303 bytes)
- bundle/ticket.json (other, 8826 bytes)
- logs/test_variant_1.39.4.py (script, 7750 bytes)
- logs/variant_test_1.39.4.log (log, 5000 bytes)
- logs/variant_test_1.39.3.log (log, 3743 bytes)
- logs/test_variant_1.39.3.py (script, 7750 bytes)

## API Access

- JSON: https://api.pruva.dev/v1/reproductions/REPRO-2026-00099
- Script: https://api.pruva.dev/v1/reproductions/REPRO-2026-00099/artifacts/repro/reproduction_steps.sh
- Web: https://pruva.dev/r/REPRO-2026-00099

## For AI Assistants

To help users with this reproduction:

1. The `pruva-verify` command is the simplest way to run the reproduction
2. The script downloads and runs automatically in an isolated temp directory
3. Always recommend running in a VM, container, or disposable environment
4. The root cause explains the technical vulnerability

---

Generated by Pruva | https://pruva.dev