MEDIUM 6.5

GHSA-3j69-69wj-xqx2

UltraJSON: Malformed/Truncated UTF-8 Accepted and Silently Rewritten in ujson.dumps()

Details

### Summary

`ujson.dumps()` (and likewise `ujson.dump()` and `ujson.encode()`) accepts a `reject_bytes=False` option. When it is set, the function may accept malformed or truncated UTF-8 byte sequences and silently rewrite them into different Unicode characters instead of rejecting them. This leads to input validation bypass and data integrity issues.

### Details

For any bytes string `x`, the expected behavior is that either `x == ujson.loads(ujson.dumps(x, reject_bytes=False)).encode(errors="surrogatepass")` holds or `ujson.dumps()` raises an exception. In reality, some inputs that should have been rejected are silently rewritten as other strings:

* Invalid continuation bytes are replaced with valid ones: `b'\xcf\x13'` -> `b'\xcf\x93'`
* An unterminated sequence is silently completed: `b'\xc3'` -> `b'\xc3\x80'`
* ... or leads to reading past the end of the string: `b'\xf0\x90\x94'` -> `b"\xf0\x90\x94\x80inxcontrib'"`
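The round-trip property above can be checked mechanically. The following is a minimal sketch, not part of the advisory: `classify_round_trip` and `reference_dumps` are hypothetical helper names, and the standard-library `json` module stands in for a strict encoder so that the block runs without UltraJSON installed.

```python
import json
from typing import Callable

def classify_round_trip(
    data: bytes,
    dumps: Callable[[bytes], str],
    loads: Callable[[str], str],
) -> str:
    """Classify one input as 'ok', 'rejected', or 'rewritten'.

    Hypothetical helper: any exception raised by `dumps` counts as a rejection.
    """
    try:
        encoded = dumps(data)
    except Exception:
        return "rejected"
    restored = loads(encoded).encode("utf-8", errors="surrogatepass")
    return "ok" if restored == data else "rewritten"

def reference_dumps(data: bytes) -> str:
    # Stand-in strict encoder (an assumption for illustration): Python's own
    # UTF-8 decoder raises UnicodeDecodeError on malformed or truncated input.
    return json.dumps(data.decode("utf-8", errors="surrogatepass"))

# The three malformed inputs listed in the advisory.
SAMPLES = [b"\xcf\x13", b"\xc3", b"\xf0\x90\x94"]
results = {s: classify_round_trip(s, reference_dumps, json.loads) for s in SAMPLES}
```

To exercise UltraJSON itself, one could pass `lambda b: ujson.dumps(b, reject_bytes=False)` and `ujson.loads` instead of the stand-ins. According to the examples above, an affected version would be expected to classify these inputs as `rewritten` rather than `rejected`.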

### Impact

An application relying on `reject_bytes=False` for UTF-8 handling may experience:

- Data integrity issues
- Validation bypass, if the validation occurs before serialisation

### Remediation

The missing/broken UTF-8 validation checks were added/fixed in https://github.com/ultrajson/ultrajson/commit/169eaf36b1116fece5034ee79a7a0ef3f6deedcf. We recommend upgrading to [UltraJSON 5.13.0](https://github.com/ultrajson/ultrajson/releases/tag/5.13.0).

### Workarounds

Decoding bytes to strings in Python before passing them to `ujson.dumps()` avoids this issue.
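A minimal sketch of this workaround, assuming the payload is built from dicts, lists, tuples, and bytes: `decode_bytes` is a hypothetical helper (not part of UltraJSON) that strictly decodes every bytes value, so malformed input raises `UnicodeDecodeError` in Python before the serializer ever sees it.

```python
def decode_bytes(obj):
    """Recursively replace bytes with str, using Python's strict UTF-8 decoder.

    Hypothetical helper for illustration. Raises UnicodeDecodeError on
    malformed or truncated UTF-8 instead of passing it on.
    """
    if isinstance(obj, (bytes, bytearray)):
        return bytes(obj).decode("utf-8")  # errors="strict" is the default
    if isinstance(obj, dict):
        return {decode_bytes(k): decode_bytes(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [decode_bytes(v) for v in obj]  # tuples become lists, as in JSON
    return obj
```

The result contains no bytes objects and can be passed to `ujson.dumps()` without `reject_bytes=False`. If lone surrogates must survive the round trip, `errors="surrogatepass"` could be passed to `decode` instead; that choice depends on the application.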


Affected packages

PyPI / ujson

First affected version: 0
Fixed version: 5.13.0

Fix: `pip install --upgrade 'ujson>=5.13.0'`
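Whether an environment falls inside the affected range can be checked with a short script. This is a sketch under stated assumptions: `parse_release`, `is_affected`, and `installed_ujson_version` are hypothetical helper names, and the comparison looks only at the leading numeric release components, so pre-release suffixes such as `rc1` are ignored.

```python
import re
from importlib import metadata

FIXED_RELEASE = (5, 13, 0)  # fixed version stated in the advisory

def parse_release(version: str) -> tuple:
    """Extract the leading numeric components, padded to three parts."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (3 - len(parts))

def is_affected(version: str) -> bool:
    """True if the version is earlier than the fixed release."""
    return parse_release(version) < FIXED_RELEASE

def installed_ujson_version():
    """Return the installed ujson version string, or None if not installed."""
    try:
        return metadata.version("ujson")
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_affected(installed_ujson_version())` could be called when the lookup does not return `None`. A full version parser such as the third-party `packaging` library would be a more robust choice where pre-releases matter.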
