What's the vulnerability?

A remote code execution (RCE) vulnerability has been identified in the Microsoft Semantic Kernel Python SDK, specifically within the `InMemoryVectorStore` filter functionality.

# Root Cause Analysis Report
## GHSA-xjw9-4gw8-4rqx: Microsoft Semantic Kernel InMemoryVectorStore RCE

## Summary

The Microsoft Semantic Kernel Python SDK contains a critical Remote Code Execution (RCE) vulnerability in its `InMemoryVectorStore` filter functionality. The vulnerability allows attackers to escape the filter sandbox by accessing Python's dunder (double underscore) attributes such as `__class__`, `__bases__`, `__subclasses__`, `__mro__`, `__dict__`, and `__getattribute__` within filter expressions. These attributes can be chained together to traverse Python's object hierarchy and gain access to sensitive classes and functions, potentially leading to arbitrary code execution.
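
To make the traversal concrete, the following minimal sketch shows, in plain Python and outside Semantic Kernel, how a few chained dunder attributes lead from an ordinary object to the root `object` class and its subclass list. `Record`, `walk_to_object`, and `count_reachable_classes` are hypothetical names introduced for illustration; the sketch only inspects objects and does not execute any attacker-controlled code.

```python
class Record:
    """Hypothetical stand-in for a stored data model instance."""

    def __init__(self, content: str) -> None:
        self.content = content


def walk_to_object(value: object) -> type:
    """Follow __class__ and __mro__ from any value up to the root `object` type."""
    # The last entry of a class's method resolution order is always `object`.
    return value.__class__.__mro__[-1]


def count_reachable_classes(value: object) -> int:
    """Count the direct subclasses of `object` reachable from an arbitrary value."""
    root = walk_to_object(value)
    return len(root.__subclasses__())


record = Record("hello")
print(walk_to_object(record) is object)      # the chain ends at the root class
print(count_reachable_classes(record) > 0)   # many loaded classes become visible
print("content" in record.__dict__)          # instance state is exposed via __dict__
```

Each step uses only attribute access and one zero-argument method call, which is why a filter language that permits unrestricted attribute access is enough to start such a chain.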

## Impact

- **Package:** semantic-kernel (Python SDK)
- **Affected Versions:** < 1.39.4
- **Fixed Version:** 1.39.4
- **Risk Level:** CRITICAL (CVSS 10.0)

**Consequences:**
- An attacker can escape the filter sandbox and execute arbitrary Python code
- The vulnerability can be triggered through user-controlled filter expressions
- Complete system compromise is possible through Python's introspection capabilities
- The InMemoryVectorStore is not recommended for production use, but this vulnerability affects any application using it for testing or development

## Root Cause

The vulnerability exists in the `_parse_and_validate_filter` method in `python/semantic_kernel/connectors/in_memory.py`. The method uses an allowlist approach to validate AST (Abstract Syntax Tree) nodes and function calls in filter expressions, but it fails to validate `ast.Attribute` nodes for dangerous attribute names.

### Technical Details:

1. The filter parser walks the AST and validates:
   - Node types against `allowed_filter_ast_nodes` (which includes `ast.Attribute`)
   - Name nodes against lambda parameter names
   - Function calls against `allowed_filter_functions`

2. **Missing validation:** The code does NOT check if `ast.Attribute` nodes access dangerous dunder attributes like:
   - `__class__` - Access to object's class
   - `__bases__` - Access to base classes
   - `__subclasses__` - Access to all subclasses of a class
   - `__mro__` - Method Resolution Order (object hierarchy)
   - `__dict__` - Access to object's attributes
   - `__getattribute__` - Access to attribute retrieval method

3. **Exploitation chain:** An attacker can chain these attributes to traverse from a simple data object to the root `object` class, enumerate all loaded classes via `__subclasses__()`, and locate a class that serves as a gadget for reaching dangerous functionality (for example `warnings.catch_warnings`, a commonly used gadget for reaching `__builtins__`).
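
The gap described above can be illustrated with a small, self-contained sketch. It is not the Semantic Kernel implementation: `ALLOWED_NODES` is an illustrative subset standing in for `allowed_filter_ast_nodes`, and `passes_node_type_check` and `attribute_names` are hypothetical helpers. The point is that a check on node *types* alone cannot tell `x.content` apart from `x.__class__`, because both are `ast.Attribute` nodes.

```python
import ast

# Illustrative allowlist only; the real `allowed_filter_ast_nodes` is an assumption here
# and is likely larger than this subset.
ALLOWED_NODES = (
    ast.Expression, ast.Lambda, ast.arguments, ast.arg,
    ast.Compare, ast.Eq, ast.Name, ast.Load, ast.Attribute, ast.Constant,
)


def passes_node_type_check(source: str) -> bool:
    """Return True if every AST node in the expression has an allowlisted type."""
    tree = ast.parse(source, mode="eval")
    return all(isinstance(node, ALLOWED_NODES) for node in ast.walk(tree))


def attribute_names(source: str) -> list[str]:
    """Return the sorted attribute names accessed anywhere in the expression."""
    tree = ast.parse(source, mode="eval")
    return sorted({n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)})


benign = "lambda x: x.content == 'test'"
hostile = "lambda x: x.__class__.__name__ == 'TestDataModel'"

# Both expressions consist of the same node types, so both pass a type-only check.
print(passes_node_type_check(benign), attribute_names(benign))
print(passes_node_type_check(hostile), attribute_names(hostile))
```

A type-only allowlist still rejects constructs such as calls to unlisted functions, but it needs an additional check on `node.attr` to stop dunder access.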

### Fix:

The fix (PR #13505) adds a `blocked_filter_attributes` set containing dangerous attribute names and validates all `ast.Attribute` nodes against this blocklist:

```python
# For Attribute nodes, validate that dangerous dunder attributes are not accessed
if isinstance(node, ast.Attribute) and node.attr in self.blocked_filter_attributes:
    raise VectorStoreOperationException(
        f"Access to attribute '{node.attr}' is not allowed in filter expressions. "
        "This attribute could be used to escape the filter sandbox."
    )
```
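
For readers who want to experiment with the idea outside the SDK, here is a standalone sketch of the same blocklist check. `BLOCKED_ATTRIBUTES` contains only the attribute names listed in this report and may be a subset of the set shipped in the fix; `FilterRejected` and `check_attributes` are hypothetical names standing in for `VectorStoreOperationException` and the method body.

```python
import ast

# Assumption: only the attribute names mentioned in this report; the shipped
# `blocked_filter_attributes` set may contain additional entries.
BLOCKED_ATTRIBUTES = frozenset({
    "__class__", "__bases__", "__subclasses__",
    "__mro__", "__dict__", "__getattribute__",
})


class FilterRejected(ValueError):
    """Hypothetical stand-in for VectorStoreOperationException."""


def check_attributes(source: str) -> None:
    """Raise FilterRejected if the expression accesses a blocklisted attribute."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute) and node.attr in BLOCKED_ATTRIBUTES:
            raise FilterRejected(
                f"Access to attribute '{node.attr}' is not allowed in filter expressions."
            )


check_attributes("lambda x: x.content == 'test'")  # accepted, returns None

try:
    check_attributes("lambda x: x.__class__.__name__ == 'TestDataModel'")
except FilterRejected as exc:
    print(exc)
```

Because the check runs on the parsed AST before anything is evaluated, a blocked expression is rejected without any part of it being executed.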

## Reproduction Steps

The reproduction script is located at `repro/reproduction_steps.sh`.

### What the script does:

1. Creates a Python virtual environment
2. Installs the vulnerable semantic-kernel version 1.39.3
3. Creates an InMemoryVectorStore with a test collection
4. Tests various filter expressions that access dangerous dunder attributes:
   - `lambda x: x.__class__.__name__ == 'TestDataModel'`
   - `lambda x: x.__class__.__base__ is not None`
   - `lambda x: x.__class__.__mro__ is not None`
   - `lambda x: x.__dict__ is not None`
   - `lambda x: x.__getattribute__ is not None`
   - `lambda x: x.__class__.__bases__ is not None`

### Expected evidence of reproduction:

Every test reports `PASSED` on the vulnerable version, meaning each filter expression was evaluated rather than rejected, so the dangerous dunder attributes are reachable without restriction. The output shows:

```
[+] Test 1 PASSED: Filter with __class__ executed: True
[+] Test 2 PASSED: Filter with __base__ executed: True
[+] Test 3 PASSED: Filter with __mro__ executed: True
[+] Test 4 PASSED: Filter with __dict__ executed: True
[+] Test 5 PASSED: Filter with __getattribute__ executed: True
[+] Test 6 PASSED: Filter with __bases__ executed: True
[+] Test 7 PASSED: Filter with method access executed: True
```

## Evidence

**Log location:** `$ROOT/logs/` (created by reproduction script)

**Key excerpts from reproduction:**

The script confirmed that in semantic-kernel 1.39.3, the following dangerous filter expressions execute successfully:

1. `lambda x: x.__class__.__name__ == 'TestDataModel'` - Access to `__class__` attribute
2. `lambda x: x.__class__.__base__ is not None` - Access to `__base__` attribute
3. `lambda x: x.__class__.__mro__ is not None` - Access to `__mro__` attribute
4. `lambda x: x.__dict__ is not None` - Access to `__dict__` attribute
5. `lambda x: x.__getattribute__ is not None` - Access to `__getattribute__` attribute
6. `lambda x: x.__class__.__bases__ is not None` - Access to `__bases__` attribute

**Environment details:**
- Python 3.11
- semantic-kernel 1.39.3 (vulnerable)
- pydantic (dependency)
- numpy (dependency)
- scipy (dependency)

## Recommendations / Next Steps

### Immediate Actions:

1. **Upgrade to semantic-kernel 1.39.4 or later** - This version contains the fix that blocks dangerous dunder attributes in filter expressions.

2. **Avoid using InMemoryVectorStore in production** - Microsoft already recommends against using InMemoryVectorStore for production scenarios. Use a proper vector database instead (Azure AI Search, Redis, PostgreSQL with pgvector, etc.).

3. **Review existing code** - Check if any existing code uses string-based filters with the InMemoryVectorStore. If so, ensure the semantic-kernel version is upgraded.

### Testing Recommendations:

1. **Verify the fix** - After upgrading, test that filter expressions with blocked attributes raise `VectorStoreOperationException`:
   ```python
   # This should raise an exception after the fix
   filter_str = "lambda x: x.__class__.__name__ == 'TestDataModel'"
   ```

2. **Regression tests** - Ensure legitimate filter expressions still work:
   ```python
   # These should continue to work
   filter_str = "lambda x: x.content == 'test'"
   filter_str = "lambda x: x.id.startswith('prefix')"
   ```
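
The two checks above can be combined into one small harness. The sketch below is an assumption-laden outline, not SDK code: `split_by_outcome` is a hypothetical helper, and `stand_in_apply_filter` is a placeholder that must be replaced with whatever call your application uses to run a string filter against the collection. In a real test you would catch `VectorStoreOperationException` specifically rather than `Exception`.

```python
from typing import Callable, Iterable

BLOCKED_EXAMPLES = ["lambda x: x.__class__.__name__ == 'TestDataModel'"]
ALLOWED_EXAMPLES = [
    "lambda x: x.content == 'test'",
    "lambda x: x.id.startswith('prefix')",
]


def split_by_outcome(
    apply_filter: Callable[[str], object], filters: Iterable[str]
) -> tuple[list[str], list[str]]:
    """Run each filter string and sort it into (accepted, rejected) lists."""
    accepted: list[str] = []
    rejected: list[str] = []
    for filter_str in filters:
        try:
            apply_filter(filter_str)
        except Exception:  # narrow this to the SDK's exception type in real tests
            rejected.append(filter_str)
        else:
            accepted.append(filter_str)
    return accepted, rejected


def stand_in_apply_filter(filter_str: str) -> bool:
    """Placeholder only: mimics a fixed version by rejecting `__class__` access."""
    if "__class__" in filter_str:
        raise RuntimeError("blocked attribute")
    return True


accepted, rejected = split_by_outcome(
    stand_in_apply_filter, BLOCKED_EXAMPLES + ALLOWED_EXAMPLES
)
print(accepted == ALLOWED_EXAMPLES)  # legitimate filters still work
print(rejected == BLOCKED_EXAMPLES)  # dunder access is refused
```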

### Long-term Recommendations:

1. **Input validation** - Never pass user-controlled or LLM-generated filter strings directly to the InMemoryVectorStore without strict validation.

2. **Security review** - Conduct security reviews of any code using dynamic filter expressions.
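
As one possible form of strict validation, an application can screen filter strings itself before forwarding them to the store. The sketch below is a conservative, application-side check under stated assumptions: `is_filter_safe_to_forward` is a hypothetical helper, and rejecting every underscore-prefixed attribute is a deliberately stricter policy than the blocklist in the fix. It complements upgrading and does not replace it.

```python
import ast


def is_filter_safe_to_forward(source: str) -> bool:
    """Return True only for a parseable lambda with no underscore-prefixed attribute access."""
    try:
        tree = ast.parse(source, mode="eval")
    except (SyntaxError, ValueError):
        # Unparseable input (including strings with null bytes) is refused outright.
        return False
    if not isinstance(tree.body, ast.Lambda):
        return False
    for node in ast.walk(tree):
        # Refuse private and dunder attributes alike, e.g. `_secret` or `__class__`.
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            return False
    return True


print(is_filter_safe_to_forward("lambda x: x.content == 'test'"))
print(is_filter_safe_to_forward("lambda x: x.__dict__ is not None"))
print(is_filter_safe_to_forward("__import__('os')"))
```

Refusing by default keeps the check simple to reason about; legitimate filters that need underscore-prefixed fields would have to be allowed explicitly.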

## Additional Notes

### Idempotency Confirmation:

The reproduction script is idempotent and passes two consecutive runs:
- First run: Installs dependencies and confirms vulnerability
- Second run: Reuses virtual environment, still confirms vulnerability

### Edge Cases:

1. **Fixed version behavior** - In semantic-kernel 1.39.4+, attempting to use blocked attributes in filter expressions will raise:
   ```
   VectorStoreOperationException: Access to attribute '__class__' is not allowed in filter expressions. This attribute could be used to escape the filter sandbox.
   ```

2. **Partial RCE chains** - While the reproduction demonstrates access to dangerous attributes, a full RCE chain would require additional steps to find and invoke a class that executes arbitrary code. The vulnerable version allows these chains; the fixed version blocks them at the attribute access level.

### References:

- GitHub Advisory: https://github.com/advisories/GHSA-xjw9-4gw8-4rqx
- CVE: CVE-2026-26030
- Fix PR: https://github.com/microsoft/semantic-kernel/pull/13505
- Release: https://github.com/microsoft/semantic-kernel/releases/tag/python-1.39.4

## One Command

Verify with `pruva-verify`: run the Pruva CLI to automatically fetch and execute the reproduction script, using any one of the following identifiers:

- `pruva-verify REPRO-2026-00099`
- `pruva-verify GHSA-xjw9-4gw8-4rqx`
- `pruva-verify CVE-2026-26030`

Install: `curl -fsSL https://pruva.dev/install.sh | sh`

## Or Run Manually

1. Download the script: `curl -O https://pruva.dev/api/v1/reproductions/REPRO-2026-00099/artifacts/reproduction_steps.sh`
2. Make it executable: `chmod +x reproduction_steps.sh`
3. Run the script: `./reproduction_steps.sh`

Run in a VM, container, or disposable environment. This exploits a real vulnerability.
