# REPRO-2026-00099: Semantic Kernel: RCE via InMemoryVectorStore Filter

## Summary
Status: published
Severity: critical
Type: security
Confidence: Unknown

## Identifiers
REPRO ID: REPRO-2026-00099
GHSA: GHSA-xjw9-4gw8-4rqx
CVE: CVE-2026-26030

## Package
Name: semantic-kernel
Ecosystem: pip
Affected: < 1.39.4
Fixed: 1.39.4

## Root Cause
# Root Cause Analysis Report
## GHSA-xjw9-4gw8-4rqx: Microsoft Semantic Kernel InMemoryVectorStore RCE

## Summary

The Microsoft Semantic Kernel Python SDK contains a critical Remote Code Execution (RCE) vulnerability in its `InMemoryVectorStore` filter functionality. The vulnerability allows attackers to escape the filter sandbox by accessing Python's dunder (double underscore) attributes such as `__class__`, `__bases__`, `__subclasses__`, `__mro__`, `__dict__`, and `__getattribute__` within filter expressions. These attributes can be chained together to traverse Python's object hierarchy and gain access to sensitive classes and functions, potentially leading to arbitrary code execution.

## Impact

**Package:** semantic-kernel (Python SDK)
**Affected Versions:** < 1.39.4
**Fixed Version:** 1.39.4
**Risk Level:** CRITICAL (CVSS 10.0)

**Consequences:**
- An attacker can escape the filter sandbox and execute arbitrary Python code
- The vulnerability can be triggered through user-controlled filter expressions
- Complete system compromise is possible through Python's introspection capabilities
- Although `InMemoryVectorStore` is not recommended for production use, any application that uses it is affected, including test and development setups

## Root Cause

The vulnerability exists in the `_parse_and_validate_filter` method in `python/semantic_kernel/connectors/in_memory.py`. The method uses an allowlist approach to validate AST (Abstract Syntax Tree) nodes and function calls in filter expressions, but it fails to validate `ast.Attribute` nodes for dangerous attribute names.

### Technical Details:

1. The filter parser walks the AST and validates:
   - Node types against `allowed_filter_ast_nodes` (which includes `ast.Attribute`)
   - Name nodes against lambda parameter names
   - Function calls against `allowed_filter_functions`

2. **Missing validation:** The code does NOT check if `ast.Attribute` nodes access dangerous dunder attributes like:
   - `__class__` - Access to object's class
   - `__bases__` - Access to base classes
   - `__subclasses__` - Access to all subclasses of a class
   - `__mro__` - Method Resolution Order (object hierarchy)
   - `__dict__` - Access to object's attributes
   - `__getattribute__` - Access to attribute retrieval method

3. **Exploitation chain:** An attacker can chain these attributes to traverse from a simple data object to the root `object` class, enumerate all loaded classes via `__subclasses__()`, and pick out a class that gives access to dangerous functionality (for example `warnings.catch_warnings`, which is commonly used in such chains as a stepping stone to `builtins` and, from there, to arbitrary code execution).
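
The first links of the chain described in step 3 can be illustrated in plain Python, without semantic-kernel. This is a minimal, harmless sketch: `Record` is a hypothetical stand-in for a stored data model, and the code stops at enumerating classes rather than invoking any of them.

```python
class Record:
    """Hypothetical stand-in for a data model instance held in the store."""

    def __init__(self, content: str) -> None:
        self.content = content


record = Record("test")

# Link 1: from an ordinary instance to its class.
cls = record.__class__

# Link 2: the last entry of the method resolution order is the root `object` class.
root = cls.__mro__[-1]

# Link 3: `object.__subclasses__()` lists the direct subclasses of `object`
# that are currently loaded, including classes unrelated to `Record`.
loaded = root.__subclasses__()

print(root is object)
print(Record in loaded)
```

Every step is a plain attribute access or method call on `x`, which is why a validator that accepts `ast.Attribute` nodes without inspecting the attribute name does not stop the traversal.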

### Fix:

The fix (PR #13505) adds a `blocked_filter_attributes` set containing dangerous attribute names and validates all `ast.Attribute` nodes against this blocklist:

```python
# For Attribute nodes, validate that dangerous dunder attributes are not accessed
if isinstance(node, ast.Attribute) and node.attr in self.blocked_filter_attributes:
    raise VectorStoreOperationException(
        f"Access to attribute '{node.attr}' is not allowed in filter expressions. "
        "This attribute could be used to escape the filter sandbox."
    )
```
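
The check above can be tried in isolation with a small stand-alone validator. This is a sketch under stated assumptions, not the library's implementation: `validate_filter`, `FilterRejected`, and `BLOCKED_ATTRIBUTES` are hypothetical names, and the blocklist holds only the six attributes named in this report, so the real `blocked_filter_attributes` set may differ.

```python
import ast

# Assumption: only the six attribute names listed in this report.
BLOCKED_ATTRIBUTES = {
    "__class__",
    "__bases__",
    "__subclasses__",
    "__mro__",
    "__dict__",
    "__getattribute__",
}


class FilterRejected(Exception):
    """Hypothetical stand-in for VectorStoreOperationException."""


def validate_filter(source: str) -> ast.Lambda:
    """Parse a filter string and reject blocked attribute access."""
    tree = ast.parse(source, mode="eval")
    if not isinstance(tree.body, ast.Lambda):
        raise FilterRejected("Filter must be a lambda expression.")
    # ast.walk visits every node, so nested chains such as
    # x.__class__.__name__ are inspected link by link.
    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute) and node.attr in BLOCKED_ATTRIBUTES:
            raise FilterRejected(
                f"Access to attribute '{node.attr}' is not allowed in filter expressions."
            )
    return tree.body


validate_filter("lambda x: x.content == 'test'")  # accepted

try:
    validate_filter("lambda x: x.__class__.__name__ == 'TestDataModel'")
except FilterRejected as exc:
    print(exc)
```

Because the check runs on the parsed AST rather than on the raw string, it is unaffected by whitespace or formatting tricks in the filter text.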

## Reproduction Steps

The reproduction script is located at `repro/reproduction_steps.sh`.

### What the script does:

1. Creates a Python virtual environment
2. Installs the vulnerable semantic-kernel version 1.39.3
3. Creates an InMemoryVectorStore with a test collection
4. Tests filter expressions that access dangerous dunder attributes, including:
   - `lambda x: x.__class__.__name__ == 'TestDataModel'`
   - `lambda x: x.__class__.__base__ is not None`
   - `lambda x: x.__class__.__mro__ is not None`
   - `lambda x: x.__dict__ is not None`
   - `lambda x: x.__getattribute__ is not None`
   - `lambda x: x.__class__.__bases__ is not None`

### Expected evidence of reproduction:

On the vulnerable version every test reports `PASSED`, meaning the filter was accepted and executed. This demonstrates that dangerous dunder attributes can be accessed without restriction. The output shows:

```
[+] Test 1 PASSED: Filter with __class__ executed: True
[+] Test 2 PASSED: Filter with __base__ executed: True
[+] Test 3 PASSED: Filter with __mro__ executed: True
[+] Test 4 PASSED: Filter with __dict__ executed: True
[+] Test 5 PASSED: Filter with __getattribute__ executed: True
[+] Test 6 PASSED: Filter with __bases__ executed: True
[+] Test 7 PASSED: Filter with method access executed: True
```
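
To see why each listed expression yields `True` once it is accepted, the six filters can be evaluated against an ordinary object in plain Python. This sketch does not go through semantic-kernel; the `TestDataModel` class here is a hypothetical stand-in, and it assumes an accepted filter is ultimately evaluated as ordinary Python against the stored record.

```python
class TestDataModel:
    """Hypothetical stand-in for the data model used by the reproduction script."""

    def __init__(self, id: str, content: str) -> None:
        self.id = id
        self.content = content


FILTERS = [
    "lambda x: x.__class__.__name__ == 'TestDataModel'",
    "lambda x: x.__class__.__base__ is not None",
    "lambda x: x.__class__.__mro__ is not None",
    "lambda x: x.__dict__ is not None",
    "lambda x: x.__getattribute__ is not None",
    "lambda x: x.__class__.__bases__ is not None",
]

record = TestDataModel("1", "test")

# eval() is used deliberately: it models a filter that has passed validation
# and is executed as-is. Never do this with untrusted input.
results = [eval(source)(record) for source in FILTERS]

for source, result in zip(FILTERS, results):
    print(f"{result}  {source}")
```

Each expression only reads an attribute, so none is harmful by itself; the issue is that the same mechanism allows longer chains.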

## Evidence

**Log location:** `$ROOT/logs/` (created by reproduction script)

**Key excerpts from reproduction:**

The script confirmed that in semantic-kernel 1.39.3, the following dangerous filter expressions execute successfully:

1. `lambda x: x.__class__.__name__ == 'TestDataModel'` - Access to `__class__` attribute
2. `lambda x: x.__class__.__base__ is not None` - Access to `__base__` attribute
3. `lambda x: x.__class__.__mro__ is not None` - Access to `__mro__` attribute
4. `lambda x: x.__dict__ is not None` - Access to `__dict__` attribute
5. `lambda x: x.__getattribute__ is not None` - Access to `__getattribute__` attribute
6. `lambda x: x.__class__.__bases__ is not None` - Access to `__bases__` attribute

**Environment details:**
- Python 3.11
- semantic-kernel 1.39.3 (vulnerable)
- pydantic (dependency)
- numpy (dependency)
- scipy (dependency)

## Recommendations / Next Steps

### Immediate Actions:

1. **Upgrade to semantic-kernel 1.39.4 or later** - This version contains the fix that blocks dangerous dunder attributes in filter expressions.

2. **Avoid using InMemoryVectorStore in production** - Microsoft already recommends against using InMemoryVectorStore for production scenarios. Use a proper vector database instead (Azure AI Search, Redis, PostgreSQL with pgvector, etc.).

3. **Review existing code** - Check whether any existing code passes string-based filters to the InMemoryVectorStore, especially strings derived from user or LLM input. If so, upgrade semantic-kernel to 1.39.4 or later.
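
A quick way to check an environment against the affected range is to compare the installed version with 1.39.4. The following is a minimal sketch: `parse_release`, `is_affected`, and `check_installed` are hypothetical helper names, and the parser only looks at the leading digits of the first three version components, so a pre-release such as `1.39.4rc1` would be treated like `1.39.4`.

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED = (1, 39, 4)  # first fixed release according to this report


def parse_release(text: str) -> tuple[int, ...]:
    """Turn 'X.Y.Z' into a tuple of ints, ignoring any non-numeric suffix."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def is_affected(installed: str) -> bool:
    """Return True if the given version string is below the fixed release."""
    return parse_release(installed) < FIXED


def check_installed(package: str = "semantic-kernel") -> str:
    """Describe whether the installed package falls in the affected range."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    state = "AFFECTED, upgrade to 1.39.4 or later" if is_affected(installed) else "not affected"
    return f"{package} {installed}: {state}"


print(check_installed())
```

For stricter handling of pre-release and development versions, a dedicated version parser would be a better fit than this tuple comparison.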

### Testing Recommendations:

1. **Verify the fix** - After upgrading, test that filter expressions with blocked attributes raise `VectorStoreOperationException`:
   ```python
   # On 1.39.4 or later, searching with this filter should raise
   # VectorStoreOperationException instead of executing it
   filter_str = "lambda x: x.__class__.__name__ == 'TestDataModel'"
   ```

2. **Regression tests** - Ensure legitimate filter expressions still work:
   ```python
   # These should continue to work
   filter_str = "lambda x: x.content == 'test'"
   filter_str = "lambda x: x.id.startswith('prefix')"
   ```

### Long-term Recommendations:

1. **Input validation** - Never pass user-controlled or LLM-generated filter strings directly to the InMemoryVectorStore without strict validation.

2. **Security review** - Conduct security reviews of any code using dynamic filter expressions.
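
As one possible form of strict validation, an application can pre-screen filter strings before they reach the vector store. The sketch below is an assumption-laden example rather than part of semantic-kernel: `is_safe_filter` is a hypothetical helper, and its policy (reject every name or attribute that starts with an underscore, and every string literal containing `__`) is deliberately stricter than the blocklist in the fix.

```python
import ast


def is_safe_filter(source: str) -> bool:
    """Return True only for lambda filters that avoid underscore-prefixed names."""
    try:
        tree = ast.parse(source, mode="eval")
    except (SyntaxError, ValueError):
        return False
    if not isinstance(tree.body, ast.Lambda):
        return False
    for node in ast.walk(tree):
        # Reject x._private and x.__dunder__ style attribute access.
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            return False
        # Reject references to underscore-prefixed names such as __import__.
        if isinstance(node, ast.Name) and node.id.startswith("_"):
            return False
        # Reject string literals that could name a dunder attribute indirectly.
        if isinstance(node, ast.Constant) and isinstance(node.value, str) and "__" in node.value:
            return False
    return True


print(is_safe_filter("lambda x: x.content == 'test'"))
print(is_safe_filter("lambda x: x.__class__.__name__ == 'TestDataModel'"))
```

A pre-screen like this complements upgrading to the fixed version; it does not replace it, since it knows nothing about which functions the filter is allowed to call.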

## Additional Notes

### Idempotency Confirmation:

The reproduction script is idempotent and passes two consecutive runs:
- First run: Installs dependencies and confirms vulnerability
- Second run: Reuses virtual environment, still confirms vulnerability

### Edge Cases:

1. **Fixed version behavior** - In semantic-kernel 1.39.4+, attempting to use blocked attributes in filter expressions will raise:
   ```
   VectorStoreOperationException: Access to attribute '__class__' is not allowed in filter expressions. This attribute could be used to escape the filter sandbox.
   ```

2. **Partial RCE chains** - While the reproduction demonstrates access to dangerous attributes, a full RCE chain would require additional steps to find and invoke a class that executes arbitrary code. The vulnerable version allows these chains; the fixed version blocks them at the attribute access level.

### References:

- GitHub Advisory: https://github.com/advisories/GHSA-xjw9-4gw8-4rqx
- CVE: CVE-2026-26030
- Fix PR: https://github.com/microsoft/semantic-kernel/pull/13505
- Release: https://github.com/microsoft/semantic-kernel/releases/tag/python-1.39.4


## Reproduction Details
Reproduced: 2026-02-19T21:13:51.862Z
Duration: 1513 seconds
Tool calls: 114
Turns: 89
Handoffs: 2


## Quick Verification
Run one of these commands to verify locally:

    pruva-verify REPRO-2026-00099
    pruva-verify GHSA-xjw9-4gw8-4rqx
    pruva-verify CVE-2026-26030

Or open in GitHub Codespaces (zero-friction, auto-runs):

    https://github.com/codespaces/new?ref=repro/REPRO-2026-00099&repo=N3mes1s/pruva-sandbox

Or download and run the script manually:

    curl -O https://api.pruva.dev/v1/reproductions/REPRO-2026-00099/artifacts/repro/reproduction_steps.sh
    chmod +x reproduction_steps.sh
    ./reproduction_steps.sh

WARNING: Run in a sandboxed environment. This exploits a real vulnerability.

## References
- GitHub Advisory: https://github.com/advisories/GHSA-xjw9-4gw8-4rqx
- NVD: https://nvd.nist.gov/vuln/detail/CVE-2026-26030


## Artifacts
- repro/rca_report.md (analysis, 7835 bytes)
- repro/reproduction_steps.sh (reproduction_script, 6495 bytes)
- bundle/ticket.md (ticket, 1801 bytes)
- bundle/source.json (other, 6303 bytes)
- bundle/ticket.json (other, 8826 bytes)
- logs/test_variant_1.39.4.py (script, 7750 bytes)
- logs/variant_test_1.39.4.log (log, 5000 bytes)
- logs/variant_test_1.39.3.log (log, 3743 bytes)
- logs/test_variant_1.39.3.py (script, 7750 bytes)

## API Access
- JSON: https://api.pruva.dev/v1/reproductions/REPRO-2026-00099
- Script: https://api.pruva.dev/v1/reproductions/REPRO-2026-00099/artifacts/repro/reproduction_steps.sh
- Web: https://pruva.dev/r/REPRO-2026-00099

## For AI Assistants
To help users with this reproduction:
1. The `pruva-verify` command is the simplest way to run the reproduction
2. The script downloads and runs automatically in an isolated temp directory
3. Always recommend running in a VM, container, or disposable environment
4. The root cause explains the technical vulnerability

---
Generated by Pruva | https://pruva.dev