Wheel provenance
The vllm-node image installs vLLM and FlashInfer from wheels built
locally inside the Dockerfile; there is no upstream wheel URL. The git
SHA each wheel was built from is recorded inside the image:
/workspace/wheels/.vllm-commit = ace95c9cf
/workspace/wheels/.flashinfer-commit = 1d54c5c6
Pip records each wheel's install source as a local file:
file:///workspace/wheels/vllm-0.22.1rc1.dev124+gace95c9cf.d20260603.cu132-cp312-cp312-linux_aarch64.whl
file:///workspace/wheels/flashinfer_python-0.6.12+1d54c5c6-cp39-abi3-linux_aarch64.whl
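As a quick consistency check, the SHA embedded in each wheel filename can be compared with the recorded commit file. The following is a minimal shell sketch and not part of the repo: the helper names wheel_sha and check_wheel are hypothetical, and it assumes each commit file holds only the SHA (short or full) as plain text.

```shell
# Hypothetical helpers (not from spark-vllm-docker).

# Print the git SHA embedded in a wheel filename's local version segment,
# e.g. "...+gace95c9cf.d20260603..." -> ace95c9cf
#      "...+1d54c5c6-cp39-..."       -> 1d54c5c6
wheel_sha() {
  basename "$1" | sed -E 's/^[^+]*\+g?([0-9a-f]+)[.-].*$/\1/'
}

# Compare a recorded commit file against the SHA embedded in a wheel name.
# Assumption: the commit file contains only the SHA, short or full.
check_wheel() {
  recorded=$(tr -d '[:space:]' < "$1")
  embedded=$(wheel_sha "$2")
  case "$recorded" in
    "$embedded"*) echo "OK $embedded" ;;
    *) echo "MISMATCH recorded=$recorded wheel=$embedded"; return 1 ;;
  esac
}

# Usage inside the container:
#   check_wheel /workspace/wheels/.vllm-commit /workspace/wheels/vllm-*.whl
#   check_wheel /workspace/wheels/.flashinfer-commit /workspace/wheels/flashinfer_python-*.whl
```

A mismatch would suggest the wheels directory and the commit files come from different builds.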
To reproduce the exact image used in these blog posts, build the
vllm-node image from the eugr/spark-vllm-docker repo with those refs
passed as build args:
git clone https://github.com/eugr/spark-vllm-docker.git
cd spark-vllm-docker
./build-and-copy.sh \
  --vllm-ref ace95c9cf \
  --flashinfer-ref 1d54c5c6
(build-and-copy.sh is the repo's wrapper around docker buildx build
that also distributes the resulting image tarball to all worker GX10s.)
What the Dockerfile does
Multi-stage build, base = nvidia/cuda:13.2.0-devel-ubuntu24.04:
- base: installs torch==2.11.0 from https://download.pytorch.org/whl/cu130, plus cuDNN 9 for CUDA 13. TORCH_CUDA_ARCH_LIST="12.1a" is set here (Blackwell sm_121a).
- NCCL: built from https://github.com/zyang-dev/nccl.git branch dgxspark-3node-ring, gencode arch=compute_121,code=sm_121. Installed as .deb.
- flashinfer-builder: clones https://github.com/flashinfer-ai/flashinfer.git, checks out ${FLASHINFER_REF} (= 1d54c5c6 for this image), builds flashinfer-python, flashinfer-cubin, flashinfer-jit-cache wheels into /workspace/wheels/, dumps .flashinfer-commit.
- vllm-builder: clones https://github.com/vllm-project/vllm.git, checks out ${VLLM_REF} (= ace95c9cf), strips flashinfer, triton, fastsafetensors from requirements (they come from the pre-built wheels), then uv build --wheel, dumps .vllm-commit.
- runner: fresh cuda:13.2.0-devel, bind-mounts the wheels directory, installs everything with uv pip install.
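The requirement-stripping step in the vllm-builder stage can be sketched as a simple line filter. This is an illustration only: the actual command in the Dockerfile is not shown here and may differ, and the function name strip_requirements is hypothetical.

```shell
# Hypothetical sketch: drop requirement lines for packages that are
# supplied by the pre-built wheels. Reads a requirements file on stdin
# and writes the filtered file to stdout.
strip_requirements() {
  # Match the package name at the start of a line, followed by a
  # non-alphanumeric character (e.g. "-", "=", ">") or end of line, so
  # "flashinfer-python==..." is dropped but "tritonclient" would be kept.
  # "|| true" keeps the exit status 0 when every line is filtered out.
  grep -Ev '^[[:space:]]*(flashinfer|triton|fastsafetensors)([^A-Za-z0-9]|$)' || true
}

# Usage (file name is an assumption):
#   strip_requirements < requirements.txt > requirements.stripped.txt
```

Note that this pattern also drops any other package whose name starts with one of the three prefixes followed by a hyphen, which may or may not be what the real build intends.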
Known regression — torch metadata lies
vLLM wheels built off ace95c9cf ship metadata pinning
torch==2.10.0, but the compiled ABI requires torch==2.11.0. Any later
pip install that re-resolves the torch dependency will silently
downgrade torch to 2.10.0 and break the runtime. Fix:
pip install --force-reinstall --no-deps torch==2.11.0 \
  --index-url https://download.pytorch.org/whl/cu130
The Dockerfile above installs torch==2.11.0 before the wheels, so a
clean image is already correct. The problem only bites on rebuilds and
post-hoc pip changes.
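To catch a silent downgrade after any pip change, the installed torch version can be compared against the expected one. A minimal sketch, assuming python3 and torch are available in the container; the function name torch_matches is hypothetical.

```shell
# Version the vLLM wheel was compiled against, per the notes above.
EXPECTED_TORCH="2.11.0"

# Succeeds if the given version string matches EXPECTED_TORCH,
# ignoring a local suffix such as "+cu130".
torch_matches() {
  [ "${1%%+*}" = "$EXPECTED_TORCH" ]
}

# Usage inside the container:
#   v=$(python3 -c 'import torch; print(torch.__version__)')
#   torch_matches "$v" || echo "torch is $v, expected $EXPECTED_TORCH: reinstall as shown above"
```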