diff --git a/advisories/github-reviewed/2026/04/GHSA-mw35-8rx3-xf9r/GHSA-mw35-8rx3-xf9r.json b/advisories/github-reviewed/2026/04/GHSA-mw35-8rx3-xf9r/GHSA-mw35-8rx3-xf9r.json
new file mode 100644
index 0000000000000..87a568d06bc80
--- /dev/null
+++ b/advisories/github-reviewed/2026/04/GHSA-mw35-8rx3-xf9r/GHSA-mw35-8rx3-xf9r.json
@@ -0,0 +1,58 @@
+{
+  "schema_version": "1.4.0",
+  "id": "GHSA-mw35-8rx3-xf9r",
+  "modified": "2026-04-24T16:15:01Z",
+  "published": "2026-04-24T16:15:00Z",
+  "aliases": [
+    "CVE-2026-41486"
+  ],
+  "summary": "Ray: Remote Code Execution via Parquet Arrow Extension Type Deserialization",
+  "details": "# Remote Code Execution via Parquet Arrow Extension Type Deserialization\n\n## Summary\n\nRay Data registers custom Arrow extension types (`ray.data.arrow_tensor`, `ray.data.arrow_tensor_v2`, `ray.data.arrow_variable_shaped_tensor`) globally in PyArrow. When PyArrow reads a Parquet file containing one of these extension types, it calls `__arrow_ext_deserialize__` on the field's metadata bytes. Ray's implementation passes these bytes directly to `cloudpickle.loads()`, achieving arbitrary code execution during schema parsing, before any row data is read.\n\nIn May 2024, Ray fixed a related vulnerability in `PyExtensionType`-based extension types ([issue #41314](https://github.com/ray-project/ray/issues/41314), [PR #45084](https://github.com/ray-project/ray/pull/45084)). In July 2025, [PR #54831](https://github.com/ray-project/ray/pull/54831) introduced `cloudpickle.loads()` into the replacement extension types' deserialization path, reintroducing the same class of vulnerability.\n\n## Impact\n\n- **Affected versions**: Ray 2.49.0 through 2.54.0 (latest release as of March 2026). The vulnerable `_deserialize_with_fallback` function with `cloudpickle.loads()` was introduced in commit `f6d21db1a4` ([PR #54831](https://github.com/ray-project/ray/pull/54831), July 2025), first released in Ray 2.49.0.\n- **Affected configurations**: Any process that uses Ray Data and reads Parquet files. The extension types are registered globally in PyArrow, so all Parquet reads in the process are affected, including `ray.data.read_parquet()`, `pyarrow.parquet.read_table()`, `pandas.read_parquet()`, etc.\n- **Attacker prerequisites**: The attacker must place a crafted Parquet file where a Ray Data pipeline reads it. No authentication or cluster access is required. The Parquet file must contain a column with a `ray.data.arrow_tensor` (or v2, or variable-shaped) extension type name, which makes this a targeted attack against Ray Data users.\n- **CIA impact**: Arbitrary command execution as the Ray worker process user, resulting in full server compromise.\n- **Severity**: Critical\n\n",
+  "severity": [
+    {
+      "type": "CVSS_V4",
+      "score": "CVSS:4.0/AV:N/AC:L/AT:P/PR:N/UI:A/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H"
+    }
+  ],
+  "affected": [
+    {
+      "package": {
+        "ecosystem": "PyPI",
+        "name": "ray"
+      },
+      "ranges": [
+        {
+          "type": "ECOSYSTEM",
+          "events": [
+            {
+              "introduced": "2.49.0"
+            },
+            {
+              "fixed": "2.55.0"
+            }
+          ]
+        }
+      ]
+    }
+  ],
+  "references": [
+    {
+      "type": "WEB",
+      "url": "https://github.com/ray-project/ray/security/advisories/GHSA-mw35-8rx3-xf9r"
+    },
+    {
+      "type": "PACKAGE",
+      "url": "https://github.com/ray-project/ray"
+    }
+  ],
+  "database_specific": {
+    "cwe_ids": [
+      "CWE-502",
+      "CWE-94"
+    ],
+    "severity": "HIGH",
+    "github_reviewed": true,
+    "github_reviewed_at": "2026-04-24T16:15:00Z",
+    "nvd_published_at": null
+  }
+}
\ No newline at end of file