GHSA-rf74-v2fm-23pw
Natural Language Toolkit (NLTK) has unbounded recursion in JSONTaggedDecoder.decode_obj() may cause DoS
Details
### Summary `JSONTaggedDecoder.decode_obj()` in `nltk/jsontags.py` calls itself recursively without any depth limit. A deeply nested JSON structure exceeding `sys.getrecursionlimit()` (default: 1000) will raise an unhandled `RecursionError`, crashing the Python process.
### Affected code File: `nltk/jsontags.py`, lines 47–52 ```python @classmethod def decode_obj(cls, obj): if isinstance(obj, dict): obj = {key: cls.decode_obj(val) for (key, val) in obj.items()} elif isinstance(obj, list): obj = list(cls.decode_obj(val) for val in obj) ```
### Proof of Concept ```python import sys, json from nltk.jsontags import JSONTaggedDecoder
depth = sys.getrecursionlimit() + 50 # e.g. 1050 payload = '{"x":' * depth + "null" + "}" * depth
# Raises RecursionError, crashing the process json.loads(payload, cls=JSONTaggedDecoder) ```
### Impact Any code path that passes externally-supplied JSON to `JSONTaggedDecoder` is vulnerable to denial of service. The severity depends on whether such a path exists in the calling code (e.g. `nltk/data.py`).
### Suggested Fix Add a depth parameter with a hard limit: ```python @classmethod def decode_obj(cls, obj, _depth=0): if _depth > 100: raise ValueError("JSON nesting too deep") if isinstance(obj, dict): obj = {key: cls.decode_obj(val, _depth + 1) for (key, val) in obj.items()} elif isinstance(obj, list): obj = list(cls.decode_obj(val, _depth + 1) for val in obj) ```
Are you affected?
Enter the version of the package you're using.
Affected packages
0 No fixed version published yet for nltk (pip). Pin to a known-safe version or switch to an alternative.