MEDIUM

GHSA-7h4p-rffg-7823

vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels

Details

## Summary

Every temperature validation gate relies on ordering comparisons (`<`, `>`). Under Python's IEEE 754 float semantics, any ordering comparison involving `NaN` evaluates to `False`, and positive `Infinity` fails both the `< _MAX_TEMP` and the `< 0.0` test, so neither guard fires. Both values therefore pass every check and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. Note: `-Infinity` is correctly caught by the `< 0.0` check.

## Root Cause

`sampling_params.py:384`:

```python
if 0 < self.temperature < _MAX_TEMP:  # NaN → False; +Inf → False
    ...  # branch body not shown in this advisory
```

`sampling_params.py:462`:

```python
if self.temperature < 0.0:  # NaN → False; +Inf → False
    raise VLLMValidationError(...)
```

No `math.isnan()` or `math.isinf()` check exists anywhere in `sampling_params.py`.

Python semantics (verified): `float('nan') < 0.0` → `False`, `float('inf') < 0.0` → `False`.
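The behavior above can be reproduced outside vLLM with a minimal sketch. The function `passes_guards` and the constant `MAX_TEMP` are hypothetical stand-ins for illustration: the value `1e-2` is a placeholder, not a confirmed value of `_MAX_TEMP`, and the two expressions only mirror the shape of the checks quoted in this advisory.

```python
import math

# Placeholder threshold, assumed for illustration only; the advisory
# does not state the real value of _MAX_TEMP.
MAX_TEMP = 1e-2


def passes_guards(temperature: float) -> bool:
    """Return True when neither comparison-based guard fires."""
    # Shape of the first quoted check: only true for small positive values.
    first_guard = 0 < temperature < MAX_TEMP
    # Shape of the second quoted check: only true for negative values.
    second_guard = temperature < 0.0
    return not first_guard and not second_guard


for value in (float("nan"), float("inf"), float("-inf"), 0.7):
    # NaN and +Inf slip through; -Inf is caught by the second guard.
    print(f"{value!r}: passes={passes_guards(value)}, "
          f"finite={math.isfinite(value)}")
```

In this sketch `NaN` and `+Infinity` pass exactly like a legitimate value such as `0.7`, while `math.isfinite` distinguishes them.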

## Impact

A single request with a `NaN` or `Infinity` temperature can crash the inference worker when the GPU kernel executes with NaN/Inf softmax input, degrading service for all concurrent users.

## Remediation

Add a `math.isfinite(self.temperature)` check in `_verify_args()`, placed before the comparison-based guards, and reject non-finite float values with a 400 error.
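One way the finiteness check could look is sketched below. This is not the merged fix: `verify_temperature` and `TemperatureValidationError` are hypothetical names standing in for `_verify_args()` and the project's own error type, and mapping the exception to an HTTP 400 response is assumed to happen in the serving layer.

```python
import math


class TemperatureValidationError(ValueError):
    """Hypothetical stand-in for the project's validation error type."""


def verify_temperature(temperature: float) -> None:
    """Reject non-finite and negative temperatures.

    The finiteness check runs first because ordering comparisons
    cannot detect NaN or +Inf.
    """
    if not math.isfinite(temperature):
        raise TemperatureValidationError(
            f"temperature must be a finite number, got {temperature!r}"
        )
    if temperature < 0.0:
        raise TemperatureValidationError(
            f"temperature must be non-negative, got {temperature!r}"
        )
```

Running the finiteness test before any `<` or `>` comparison means `NaN`, `+Infinity`, and `-Infinity` are all rejected by the same branch, rather than relying on which side of a comparison each value happens to fall.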

## Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116

Affected packages

PyPI / vllm

Introduced in: 0

No fixed version published yet for vllm (pip). Pin to a known-safe version or switch to an alternative.
