MEDIUM 5.0

GHSA-g23j-2vwm-5c25

local-deep-research has an SSRF bypass in `safe_get`

Details

### Summary

The URL validation logic in local-deep-research has a logic flaw that lets attackers bypass the check, leading to SSRF attacks.

### Details

The project uses `validate_url` to validate input URLs. It extracts the host portion of the URL with `urlparse` and runs security checks on that host to prevent SSRF attacks.

However, `urlparse` and the library that actually sends the request parse URLs differently. For example, `safe_get` first calls `validate_url` to perform the SSRF check and then calls `requests.get` to send the actual request.

The core issue: urlparse() and requests disagree on which host a URL like `http://127.0.0.1:6666\@1.1.1.1` points to:

- urlparse() treats \ as a regular character and @ as the userinfo-host delimiter, so it extracts hostname as `1.1.1.1` (public)
- requests treats \ as a path character, connecting to `127.0.0.1` (internal)
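The disagreement can be reproduced without the project code. The following is a minimal sketch, not code from local-deep-research: the helper names `validator_view` and `client_view` are illustrative, and `urllib3` (the parser `requests` relies on) is treated as optional so the script still runs where it is not installed.

```python
from urllib.parse import urlparse

try:
    from urllib3.util import parse_url  # parser used internally by requests
except ImportError:  # keep the sketch runnable without urllib3
    parse_url = None

# One literal backslash before '@' (escaped here for the Python string).
POC_URL = "http://127.0.0.1:6666\\@1.1.1.1"


def validator_view(url: str):
    """Host as seen by a urlparse-based validator (illustrative helper)."""
    return urlparse(url).hostname


def client_view(url: str):
    """Host as seen by urllib3, or None when urllib3 is unavailable."""
    return parse_url(url).host if parse_url is not None else None


if __name__ == "__main__":
    print("urlparse host:", validator_view(POC_URL))
    print("urllib3 host: ", client_view(POC_URL))
```

On the affected setup described in this advisory, the first line reports the public address while the second reports the internal one. The exact urllib3 result may depend on the installed version.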

Below is test code I wrote based on the project code.

```
#!/usr/bin/env python3
"""Standalone demo: import project via absolute path and call safe_get."""

from __future__ import annotations

import importlib.util
import enum
import sys
import types
from pathlib import Path

# Hardcoded absolute path to the project's "src" directory.
SRC_ROOT = Path(
    r"d:\BaiduNetdiskDownload\local-deep-research-main\local-deep-research-main\src"
)

# Python 3.10 compatibility:
# project constants import StrEnum (available in Python 3.11+).
if not hasattr(enum, "StrEnum"):
    class _CompatStrEnum(str, enum.Enum):
        pass

    enum.StrEnum = _CompatStrEnum  # type: ignore[attr-defined]


def _load_safe_get():
    """Load safe_get directly from file, bypassing package __init__ imports."""
    ldr_pkg_name = "local_deep_research"
    security_pkg_name = "local_deep_research.security"

    # Build lightweight package modules so relative imports in safe_requests.py
    # resolve without executing package __init__.py files.
    if ldr_pkg_name not in sys.modules:
        ldr_pkg = types.ModuleType(ldr_pkg_name)
        ldr_pkg.__path__ = [str(SRC_ROOT / "local_deep_research")]  # type: ignore[attr-defined]
        sys.modules[ldr_pkg_name] = ldr_pkg

    if security_pkg_name not in sys.modules:
        security_pkg = types.ModuleType(security_pkg_name)
        security_pkg.__path__ = [str(SRC_ROOT / "local_deep_research" / "security")]  # type: ignore[attr-defined]
        sys.modules[security_pkg_name] = security_pkg

    module_name = "local_deep_research.security.safe_requests"
    module_path = SRC_ROOT / "local_deep_research" / "security" / "safe_requests.py"

    spec = importlib.util.spec_from_file_location(module_name, module_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load module from {module_path}")

    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module.safe_get


safe_get = _load_safe_get()


def main() -> None:
    # Hardcoded URL for demonstration.
    url = "http://127.0.0.1:6666"
    # Raw string keeps the backslash literal.
    # url = r"http://127.0.0.1:6666\@1.1.1.1"

    safe_get(url, timeout=15)


if __name__ == "__main__":
    main()
```

When an attacker uses `http://127.0.0.1:6666/`, the existing detection logic recognizes it as an internal network address and blocks it.

However, when an attacker uses `http://127.0.0.1:6666\@1.1.1.1`, the detection logic resolves the host to `1.1.1.1`, a public IP address, so the URL passes validation. When the request is actually sent, `requests.get` connects to `http://127.0.0.1:6666`, which bypasses the check and achieves SSRF.

### PoC

```
http://127.0.0.1:6666\@1.1.1.1
```

### Impact

SSRF

---

## Maintainer note (2026-05-15)

Thanks @Fushuling and @RacerZ-fighting for the detailed report. The remediation spans four PRs, all merged to `main` and shipped in **v1.6.10**:

**#3873** (merged 2026-05-08): the load-bearing fix for the parser-differential bypass.

- New `RFC_FORBIDDEN_URL_CHARS_RE` in `security/ssrf_validator.py` rejects URLs containing backslash, ASCII control bytes, or whitespace. RFC 3986 forbids these, and their presence signals a parser-differential attempt.
- Host extraction switched from `urllib.parse.urlparse(url).hostname` to `urllib3.util.parse_url(url).host`. `urllib3` is the parser `requests` uses internally, so the validator and the HTTP client now agree on the destination by construction, closing the `\@` divergence that drove the PoC.
- Same two-layer defence applied to `NotificationURLValidator.validate_service_url`.
- 53 new tests across `test_ssrf_validator.py`, `test_notification_validator.py`, `test_safe_requests.py`, and `test_ssrf_redirect_bypass.py`, including the advisory PoC `http://127.0.0.1:6666\@1.1.1.1` and the post-prepare canonical form `http://127.0.0.1:6666/%5C@1.1.1.1`.
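The two-layer idea can be sketched as follows. This is an illustration under stated assumptions, not the code merged in #3873: the pattern `FORBIDDEN_CHARS_RE`, the helpers `is_blocked_ip` and `is_url_allowed`, and the stdlib-based `stdlib_host` extractor are all hypothetical names. In the project, the host comes from `urllib3.util.parse_url(url).host`; the extractor is passed in as a parameter here so the sketch runs with the standard library alone.

```python
import ipaddress
import re
from typing import Callable, Optional
from urllib.parse import urlsplit

# Layer 1 (hypothetical pattern): backslash, ASCII control bytes including
# DEL, and space. Tab, CR, and LF fall inside the \x00-\x20 range.
FORBIDDEN_CHARS_RE = re.compile(r"[\\\x00-\x20\x7f]")


def is_blocked_ip(host: str) -> bool:
    """Return True for loopback, private, or link-local IP literals."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A real validator would resolve the hostname
        # and check every resulting address; that step is omitted here.
        return False
    return addr.is_loopback or addr.is_private or addr.is_link_local


def is_url_allowed(url: str,
                   extract_host: Callable[[str], Optional[str]]) -> bool:
    """Reject forbidden characters first, then check the extracted host."""
    if FORBIDDEN_CHARS_RE.search(url):
        return False
    host = extract_host(url)
    if not host:
        return False
    return not is_blocked_ip(host)


def stdlib_host(url: str) -> Optional[str]:
    """Stand-in extractor for this sketch only (assumption, see lead-in)."""
    return urlsplit(url).hostname


if __name__ == "__main__":
    for candidate in (
        "http://127.0.0.1:6666\\@1.1.1.1",    # advisory PoC
        "http://127.0.0.1:6666/%5C@1.1.1.1",  # post-prepare canonical form
        "http://1.1.1.1/",
    ):
        print(candidate, is_url_allowed(candidate, stdlib_host))
```

The first layer stops the PoC before any parsing happens. The second layer catches the canonical form, which contains no backslash but whose host is the loopback address.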

**#3882** (merged 2026-05-08) — hardens the metadata-IP block and redacts userinfo from log output so rejected URLs don't leak credentials to logs.

**#3889** (merged 2026-05-09) — locks in real-world URL fixtures and behavior invariants from #3873/#3882 as regression tests.

**#3932** (merged 2026-05-10) — blocks IPv6 transition prefixes (`2002::/16` 6to4, `64:ff9b::/96` NAT64, `2001::/32` Teredo, `100::/64` discard) so private IPv4 destinations cannot be reached via an IPv6-wrapped form. NAT64 has an operator opt-in (`LDR_SECURITY_ALLOW_NAT64=true`) for IPv6-only deployments, but cloud metadata IPs remain blocked regardless.
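To illustrate why these prefixes matter, here is a minimal sketch using only Python's `ipaddress` module. It is not the code from #3932: the names `TRANSITION_PREFIXES`, `in_transition_prefix`, and `nat64_embedded_ipv4` are hypothetical, and the NAT64 opt-in and metadata-IP handling are left out.

```python
import ipaddress

# Prefixes named in the maintainer note above.
TRANSITION_PREFIXES = [
    ipaddress.ip_network("2002::/16"),     # 6to4
    ipaddress.ip_network("64:ff9b::/96"),  # NAT64
    ipaddress.ip_network("2001::/32"),     # Teredo
    ipaddress.ip_network("100::/64"),      # discard
]


def in_transition_prefix(host: str) -> bool:
    """Return True if host is an IPv6 literal inside a listed prefix."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal
    if addr.version != 6:
        return False
    return any(addr in net for net in TRANSITION_PREFIXES)


def nat64_embedded_ipv4(host: str) -> ipaddress.IPv4Address:
    """Return the IPv4 address carried in the low 32 bits of a NAT64 address."""
    addr = ipaddress.IPv6Address(host)
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


if __name__ == "__main__":
    wrapped = "64:ff9b::7f00:1"  # NAT64 form carrying 127.0.0.1
    print(in_transition_prefix(wrapped), nat64_embedded_ipv4(wrapped))
```

An address such as `64:ff9b::7f00:1` is not itself a loopback or private IPv6 address, so a check that only looks at the outer IPv6 form would pass it even though it carries `127.0.0.1`.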

### Affected versions

- **The specific parser-differential bypass** described above exists from **v1.3.0** (when `validate_url` was first introduced) through **v1.6.9**. The validator used `urlparse(url).hostname` for that entire span.
- **Versions before v1.3.0** had no SSRF validator at all: requests went directly to `requests.get()` without any host check. Those versions are vulnerable to SSRF via this URL and any other internal address; the parser-differential trick is unnecessary.

In both cases the remediation is the same: **upgrade to v1.6.10 or later.**

Affected packages

PyPI / local-deep-research

Introduced in: 0
Fixed in: 1.6.10

Fix: `pip install --upgrade 'local-deep-research>=1.6.10'`
