PYSEC-2026-373
LangChain serialization injection vulnerability enables secret extraction in dumps/loads APIs
Details
## Summary
A serialization injection vulnerability exists in LangChain's `dumps()` and `dumpd()` functions. The functions do not escape dictionaries with `'lc'` keys when serializing free-form dictionaries. The `'lc'` key is used internally by LangChain to mark serialized objects. When user-controlled data contains this key structure, it is treated as a legitimate LangChain object during deserialization rather than plain user data.
### Attack surface
The core vulnerability was in `dumps()` and `dumpd()`: these functions failed to escape user-controlled dictionaries containing `'lc'` keys. When this unescaped data was later deserialized via `load()` or `loads()`, the injected structures were treated as legitimate LangChain objects rather than plain user data.
This escaping bug enabled several attack vectors:
1. **Injection via user data**: Malicious LangChain object structures could be injected through user-controlled fields like `metadata`, `additional_kwargs`, or `response_metadata`.
2. **Class instantiation within trusted namespaces**: Injected manifests could instantiate any `Serializable` subclass, but only within the pre-approved trusted namespaces (`langchain_core`, `langchain`, `langchain_community`). This includes classes with side effects in `__init__` (network calls, file operations, etc.). Note that namespace validation was already enforced before this patch, so arbitrary classes outside these trusted namespaces could not be instantiated.
### Security hardening
This patch fixes the escaping bug in `dumps()` and `dumpd()` and introduces new restrictive defaults in `load()` and `loads()`: allowlist enforcement via `allowed_objects="core"` (restricted to [serialization mappings](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/load/mapping.py)), `secrets_from_env` changed from `True` to `False`, and default Jinja2 template blocking via `init_validator`. These are breaking changes for some use cases.
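To make the escaping idea concrete, here is a minimal, self-contained sketch of how a serializer could wrap user dictionaries that contain an `'lc'` key so that a loader restores them as plain data instead of interpreting them. The marker name `__escaped__` and both helper functions are hypothetical and chosen for illustration; this is not LangChain's actual implementation.

```python
from typing import Any

# Hypothetical marker key, chosen for this sketch only.
ESCAPE_KEY = "__escaped__"


def escape_user_data(value: Any) -> Any:
    """Wrap any plain dict that could be mistaken for a serialized object."""
    if isinstance(value, dict):
        escaped = {k: escape_user_data(v) for k, v in value.items()}
        # Wrap dicts carrying the reserved 'lc' key (or the marker itself,
        # so that the round trip stays unambiguous).
        if "lc" in value or ESCAPE_KEY in value:
            return {ESCAPE_KEY: escaped}
        return escaped
    if isinstance(value, list):
        return [escape_user_data(v) for v in value]
    return value


def unescape_user_data(value: Any) -> Any:
    """Undo escape_user_data, returning wrapped dicts as plain data."""
    if isinstance(value, dict):
        inner = value.get(ESCAPE_KEY)
        if set(value) == {ESCAPE_KEY} and isinstance(inner, dict):
            # Keys of the wrapped dict are restored verbatim, never interpreted.
            return {k: unescape_user_data(v) for k, v in inner.items()}
        return {k: unescape_user_data(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unescape_user_data(v) for v in value]
    return value


payload = {"user_data": {"lc": 1, "type": "secret", "id": ["OPENAI_API_KEY"]}}
escaped = escape_user_data(payload)
restored = unescape_user_data(escaped)
```

After escaping, the dictionary under `user_data` no longer exposes `'lc'` at its top level, so a loader that looks for that key would not treat it as an object manifest, and unescaping returns the original user data unchanged.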
## Who is affected?
Applications are vulnerable if they:
1. **Use `astream_events(version="v1")`**: The v1 implementation internally uses vulnerable serialization. Note: `astream_events(version="v2")` is not vulnerable.
2. **Use `Runnable.astream_log()`**: This method internally uses vulnerable serialization for streaming outputs.
3. **Call `dumps()` or `dumpd()` on untrusted data, then deserialize with `load()` or `loads()`**: Trusting your own serialization output makes you vulnerable if user-controlled data (e.g., from LLM responses, metadata fields, or user inputs) contains `'lc'` key structures.
4. **Deserialize untrusted data with `load()` or `loads()`**: Directly deserializing untrusted data that may contain injected `'lc'` structures.
5. **Use `RunnableWithMessageHistory`**: Internal serialization in message history handling.
6. **Use `InMemoryVectorStore.load()`** to deserialize untrusted documents.
7. Load untrusted generations from cache using **`langchain-community` caches**.
8. Load untrusted manifests from the LangChain Hub via **`hub.pull`**.
9. Use **`StringRunEvaluatorChain`** on untrusted runs.
10. Use **`create_lc_store`** or **`create_kv_docstore`** with untrusted documents.
11. Use **`MultiVectorRetriever`** with byte stores containing untrusted documents.
12. Use **`LangSmithRunChatLoader`** with runs containing untrusted messages.
The most common attack vector is through **LLM response fields** like `additional_kwargs` or `response_metadata`, which can be controlled via prompt injection and then serialized/deserialized in streaming operations.
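As a defensive illustration, the sketch below walks a nested structure and reports the paths of any dictionaries containing an `'lc'` key, which could help flag suspicious `additional_kwargs` or `response_metadata` values before they are serialized. The function name `find_lc_paths` and the sample metadata are hypothetical and not part of LangChain; treat this as a detection aid under those assumptions, not a substitute for upgrading.

```python
from typing import Any


def find_lc_paths(value: Any, path: str = "$") -> list[str]:
    """Return the paths of all dicts in `value` that contain an 'lc' key."""
    hits: list[str] = []
    if isinstance(value, dict):
        if "lc" in value:
            hits.append(path)
        for key, child in value.items():
            hits.extend(find_lc_paths(child, f"{path}.{key}"))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_lc_paths(child, f"{path}[{index}]"))
    return hits


# Example: metadata as it might arrive from a prompt-injected LLM response.
response_metadata = {
    "model": "example-model",
    "tool_calls": [{"args": {"lc": 1, "type": "secret", "id": ["OPENAI_API_KEY"]}}],
}
suspicious = find_lc_paths(response_metadata)
print(suspicious)  # ['$.tool_calls[0].args']
```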
## Impact
Attackers who control serialized data can extract environment variable secrets by injecting `{"lc": 1, "type": "secret", "id": ["ENV_VAR"]}`, which loads the named environment variable during deserialization when `secrets_from_env=True` (the old default). They can also inject constructor structures to instantiate any class within the trusted namespaces with attacker-controlled parameters, potentially triggering side effects such as network calls or file operations.
Key severity factors:
- Affects the serialization path: applications trusting their own serialization output are vulnerable
- Enables secret extraction when combined with `secrets_from_env=True` (the old default)
- LLM responses in `additional_kwargs` can be controlled via prompt injection
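The secret-extraction mechanism can be modeled without LangChain. The sketch below is a toy reviver written for illustration only: `toy_revive` is a hypothetical function that mimics the behavior described above (resolving a `"type": "secret"` structure from the environment when `secrets_from_env` is enabled) and is not LangChain's code. The environment variable name `DEMO_API_KEY` is likewise made up.

```python
import json
import os
from typing import Any


def toy_revive(value: Any, secrets_from_env: bool) -> Any:
    """Toy model of a reviver that resolves secret manifests."""
    if isinstance(value, dict):
        if value.get("lc") == 1 and value.get("type") == "secret":
            name = value["id"][0]
            # Only consult the environment when explicitly allowed.
            return os.environ.get(name) if secrets_from_env else None
        return {k: toy_revive(v, secrets_from_env) for k, v in value.items()}
    if isinstance(value, list):
        return [toy_revive(v, secrets_from_env) for v in value]
    return value


os.environ["DEMO_API_KEY"] = "sk-demo-not-a-real-key"

# json.dumps performs no escaping, which stands in for the vulnerable dumps().
wire = json.dumps({"user_data": {"lc": 1, "type": "secret", "id": ["DEMO_API_KEY"]}})

leaked = toy_revive(json.loads(wire), secrets_from_env=True)
safe = toy_revive(json.loads(wire), secrets_from_env=False)
print(leaked["user_data"])  # sk-demo-not-a-real-key
print(safe["user_data"])    # None
```

In this model the same payload is harmless once environment lookup is off, which is the motivation for changing the default.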
## Exploit example
```python
from langchain_core.load import dumps, loads
import os

# Attacker injects secret structure into user-controlled data
attacker_dict = {
    "user_data": {
        "lc": 1,
        "type": "secret",
        "id": ["OPENAI_API_KEY"],
    }
}

serialized = dumps(attacker_dict)  # Bug (unpatched versions): does NOT escape the 'lc' key

os.environ["OPENAI_API_KEY"] = "sk-secret-key-12345"
# loads() parses the JSON string produced by dumps()
deserialized = loads(serialized, secrets_from_env=True)

print(deserialized["user_data"])  # "sk-secret-key-12345" - SECRET LEAKED!
```
## Security hardening changes (breaking changes)
This patch introduces three breaking changes to `load()` and `loads()`:
1. **New `allowed_objects` parameter** (defaults to `'core'`): Enforces an allowlist of classes that can be deserialized. The `'all'` option corresponds to the list of objects [specified in `mapping.py`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/load/mapping.py), while the `'core'` option limits deserialization to objects within `langchain_core`. We recommend that users explicitly specify which objects they want to allow for serialization/deserialization.
2. **`secrets_from_env` default changed from `True` to `False`**: Disables automatic secret loading from the environment.
3. **New `init_validator` parameter** (defaults to `default_init_validator`): Blocks Jinja2 templates by default.
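To illustrate the allowlist idea behind `allowed_objects`, here is a minimal sketch that checks a constructor manifest's `id` path against an explicit set before anything is instantiated. The `check_allowed` helper, the `ALLOWED_IDS` set, the class paths, and the manifest shape are all assumptions made for illustration; the real enforcement lives inside `load()` and `loads()` and may differ.

```python
from typing import Any

# Hypothetical allowlist of fully qualified class paths.
ALLOWED_IDS = {("my_package", "models", "MyCustomClass")}


def check_allowed(manifest: dict[str, Any]) -> None:
    """Raise ValueError unless a constructor manifest names an allowlisted class."""
    if manifest.get("type") != "constructor":
        return
    class_path = tuple(manifest.get("id", ()))
    if class_path not in ALLOWED_IDS:
        raise ValueError(f"Deserialization of {'.'.join(class_path)} is not allowed")


ok_manifest = {
    "lc": 1,
    "type": "constructor",
    "id": ["my_package", "models", "MyCustomClass"],
    "kwargs": {"name": "example"},
}
bad_manifest = {
    "lc": 1,
    "type": "constructor",
    "id": ["some_namespace", "module", "ClassWithSideEffects"],  # made-up path
    "kwargs": {},
}

check_allowed(ok_manifest)  # passes silently
try:
    check_allowed(bad_manifest)
    rejected = False
except ValueError:
    rejected = True
```

Checking the class path before construction means a rejected manifest never reaches `__init__`, so any side effects in the constructor are avoided.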
## Migration guide
### No changes needed for most users
If you're deserializing standard LangChain types (messages, documents, prompts, trusted partner integrations like `ChatOpenAI`, `ChatAnthropic`, etc.), your code will work without changes:
```python
from langchain_core.load import load

# Uses default allowlist from serialization mappings
obj = load(serialized_data)
```
### For custom classes
If you're deserializing custom classes not in the serialization mappings, add them to the allowlist:
```python
from langchain_core.load import load
from my_package import MyCustomClass

# Specify the classes you need
obj = load(serialized_data, allowed_objects=[MyCustomClass])
```
### For Jinja2 templates
Jinja2 templates are now blocked by default because they can execute arbitrary code. If you need Jinja2 templates, pass `init_validator=None`:
```python
from langchain_core.load import load
from langchain_core.prompts import PromptTemplate

obj = load(
    serialized_data,
    allowed_objects=[PromptTemplate],
    init_validator=None,
)
```
> [!WARNING]
> Only disable `init_validator` if you trust the serialized data. Jinja2 templates can execute arbitrary Python code.
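The kind of check an `init_validator` performs can be sketched as a callable that inspects constructor kwargs before instantiation. The `(class_path, kwargs)` signature, the name `block_jinja2`, and the class path tuple below are assumptions made for illustration and are not confirmed by this advisory; consult the `langchain_core.load` documentation for the actual contract of `default_init_validator`.

```python
from typing import Any


def block_jinja2(class_path: tuple[str, ...], kwargs: dict[str, Any]) -> None:
    """Reject constructor kwargs that request Jinja2 template rendering."""
    if kwargs.get("template_format") == "jinja2":
        name = ".".join(class_path)
        raise ValueError(f"Jinja2 templates are blocked when loading {name}")


# Hypothetical class path used only for the error message.
prompt_path = ("prompts", "PromptTemplate")

# An f-string style template passes; a Jinja2 template is refused.
block_jinja2(prompt_path, {"template": "Hi {name}", "template_format": "f-string"})

try:
    block_jinja2(prompt_path, {"template": "{{ name }}", "template_format": "jinja2"})
    blocked = False
except ValueError:
    blocked = True
```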
### For secrets from environment
`secrets_from_env` now defaults to `False`. If you need to load secrets from environment variables:
```python
from langchain_core.load import load

obj = load(serialized_data, secrets_from_env=True)
```
## Credits
* Dumps bug was reported by @yardenporat
* Changes for security hardening due to findings from @0xn3va and @VladimirEliTokarev
References
- https://github.com/langchain-ai/langchain/security/advisories/GHSA-c67j-w6g6-q2cm [WEB]
- https://nvd.nist.gov/vuln/detail/CVE-2025-68664 [ADVISORY]
- https://github.com/langchain-ai/langchain/pull/34455 [WEB]
- https://github.com/langchain-ai/langchain/pull/34458 [WEB]
- https://github.com/langchain-ai/langchain/commit/5ec0fa69de31bbe3d76e4cf9cd65a6accb8466c8 [WEB]
- https://github.com/langchain-ai/langchain/commit/d9ec4c5cc78960abd37da79b0250f5642e6f0ce6 [WEB]
- https://github.com/langchain-ai/langchain [PACKAGE]
- https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D0.3.81 [WEB]
- https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D1.2.5 [WEB]
- https://pypi.org/project/langchain-core [PACKAGE]
- https://github.com/advisories/GHSA-c67j-w6g6-q2cm [ADVISORY]