Release Artifacts

Dynamo Release Artifacts

This document provides a comprehensive inventory of all Dynamo release artifacts including container images, Python wheels, Helm charts, and Rust crates.

See also: Support Matrix for hardware and platform compatibility | Feature Matrix for backend feature support

Release history in this document begins at v0.6.0.

Current Release: Dynamo v0.8.1

Patch Release: v0.8.1.post1 (Jan 23, 2026)

v0.8.1.post1 is a patch release for the PyPI wheels and the TRT-LLM container only; it has no GitHub release. All other artifacts remain at v0.8.1.

| Artifact | Version | Change | Link |
|---|---|---|---|
| ai-dynamo | 0.8.1.post1 | Updated TRT-LLM to v1.2.0rc6.post2 | PyPI |
| ai-dynamo-runtime | 0.8.1.post1 | Updated TRT-LLM to v1.2.0rc6.post2 | PyPI |
| tensorrtllm-runtime | 0.8.1.post1 | TRT-LLM v1.2.0rc6.post2 | NGC |
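
Because only some artifacts moved to the patch version, scripts that pin versions may want a single lookup. The following is a minimal sketch; the function name `dynamo_current_version` is hypothetical, and the version strings are taken from this document. Note that the fallback branch returns 0.8.1 for any name it does not recognize, including misspelled ones.

```shell
# Map an artifact name to its current version, per the patch notes above.
# Hypothetical helper; not part of any Dynamo tooling.
dynamo_current_version() {
  case "$1" in
    ai-dynamo|ai-dynamo-runtime|tensorrtllm-runtime)
      echo "0.8.1.post1" ;;  # artifacts covered by the patch release
    *)
      echo "0.8.1" ;;        # all other artifacts remain at v0.8.1
  esac
}

dynamo_current_version tensorrtllm-runtime   # prints 0.8.1.post1
dynamo_current_version vllm-runtime          # prints 0.8.1
```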

Container Images

| Image:Tag | Description | Backend | CUDA | Arch | NGC | Notes |
|---|---|---|---|---|---|---|
| vllm-runtime:0.8.1 | Runtime container for vLLM backend | vLLM v0.12.0 | v12.9 | AMD64/ARM64 | link | |
| vllm-runtime:0.8.1-cuda13 | Runtime container for vLLM backend (CUDA 13) | vLLM v0.12.0 | v13.0 | AMD64/ARM64* | | Fails to launch |
| sglang-runtime:0.8.1 | Runtime container for SGLang backend | SGLang v0.5.6.post2 | v12.9 | AMD64/ARM64 | link | |
| sglang-runtime:0.8.1-cuda13 | Runtime container for SGLang backend (CUDA 13) | SGLang v0.5.6.post2 | v13.0 | AMD64/ARM64* | link | Experimental |
| tensorrtllm-runtime:0.8.1 | Runtime container for TensorRT-LLM backend | TRT-LLM v1.2.0rc6.post1 | v13.0 | AMD64/ARM64 | link | |
| dynamo-frontend:0.8.1 | API gateway with Endpoint Prediction Protocol (EPP) | | | AMD64/ARM64 | link | |
| kubernetes-operator:0.8.1 | Kubernetes operator for Dynamo deployments | | | AMD64/ARM64 | link | |

* Multimodal inference on CUDA 13 images: works on AMD64 for all backends; works on ARM64 only for TensorRT-LLM (vllm-runtime:*-cuda13 and sglang-runtime:*-cuda13 do not support multimodality on ARM64).
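
The footnote above can be expressed as a small check for deployment scripts. This is a sketch only: the function name and the short backend labels (`vllm`, `sglang`, `trtllm`) and architecture labels (`amd64`, `arm64`) are assumptions chosen for illustration, not identifiers used by Dynamo.

```shell
# Succeeds (exit status 0) when multimodal inference is expected to work
# on a CUDA 13 image, following the footnote above.
# Args: $1 = backend (vllm|sglang|trtllm), $2 = arch (amd64|arm64)
cuda13_multimodal_supported() {
  case "$2" in
    amd64) return 0 ;;              # works for all backends on AMD64
    arm64) [ "$1" = "trtllm" ] ;;   # on ARM64, TensorRT-LLM only
    *)     return 1 ;;              # unknown architecture: assume unsupported
  esac
}

if cuda13_multimodal_supported sglang arm64; then
  echo "supported"
else
  echo "not supported"              # this branch runs for sglang on arm64
fi
```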

Python Wheels

We recommend using the TensorRT-LLM NGC container instead of the ai-dynamo[trtllm] wheel. See the NGC container collection for supported images.

| Package | Description | Python | Platform | PyPI |
|---|---|---|---|---|
| ai-dynamo==0.8.1 | Main package with backend integrations (vLLM, SGLang, TRT-LLM) | 3.10-3.12 | Linux (glibc v2.28+) | link |
| ai-dynamo-runtime==0.8.1 | Core Python bindings for Dynamo runtime | 3.10-3.12 | Linux (glibc v2.28+) | link |
| kvbm==0.8.1 | KV Block Manager for disaggregated KV cache | 3.12 | Linux (glibc v2.28+) | link |
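
Before installing a wheel, it can help to confirm that the interpreter version is in range. The sketch below assumes the supported range for ai-dynamo and ai-dynamo-runtime is Python 3.10 through 3.12; the function name is hypothetical, and the check expects a numeric `major.minor` string (a non-numeric minor component makes the test fail with an error message from the shell).

```shell
# Succeeds when a Python version string such as "3.11" or "3.11.4"
# falls within 3.10 through 3.12 (assumed range for ai-dynamo wheels).
python_supported_for_ai_dynamo() {
  major="${1%%.*}"      # text before the first dot
  rest="${1#*.}"        # text after the first dot
  minor="${rest%%.*}"   # second component only
  [ "$major" = "3" ] && [ "$minor" -ge 10 ] && [ "$minor" -le 12 ]
}

python_supported_for_ai_dynamo 3.11 && echo "3.11 ok"        # prints 3.11 ok
python_supported_for_ai_dynamo 3.9  || echo "3.9 too old"    # prints 3.9 too old
```

To check the active interpreter, the version string could be obtained with `python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])'` and passed as the argument.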

Helm Charts

| Chart | Description | NGC |
|---|---|---|
| dynamo-crds-0.8.1 | Custom Resource Definitions for Dynamo Kubernetes resources | link |
| dynamo-platform-0.8.1 | Platform services (etcd, NATS) for Dynamo cluster | link |
| dynamo-graph-0.8.1 | Deployment graph controller for Dynamo workloads | link |

Rust Crates

| Crate | Description | MSRV (Rust) | crates.io |
|---|---|---|---|
| dynamo-runtime@0.8.1 | Core distributed runtime library | v1.82 | link |
| dynamo-llm@0.8.1 | LLM inference engine | v1.82 | link |
| dynamo-async-openai@0.8.1 | Async OpenAI-compatible API client | v1.82 | link |
| dynamo-parsers@0.8.1 | Protocol parsers (SSE, JSON streaming) | v1.82 | link |
| dynamo-memory@0.8.1 | Memory management utilities | v1.82 | link |
| dynamo-config@0.8.1 | Configuration management | v1.82 | link |

Quick Install Commands

Container Images (NGC)

For detailed run instructions, see the Container README or backend-specific guides: vLLM | SGLang | TensorRT-LLM

# Runtime containers
docker pull nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.8.1
docker pull nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.8.1
docker pull nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1.post1

# CUDA 13 variants (experimental)
# vLLM CUDA 13 image fails to launch (known issue)
# docker pull nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.8.1-cuda13
docker pull nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.8.1-cuda13

# Infrastructure containers
docker pull nvcr.io/nvidia/ai-dynamo/dynamo-frontend:0.8.1
docker pull nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.8.1

Python Wheels (PyPI)

For detailed installation instructions, see the Local Quick Start in the README.

# Install Dynamo with a specific backend (recommended)
uv pip install "ai-dynamo[vllm]==0.8.1.post1"
uv pip install "ai-dynamo[sglang]==0.8.1.post1"
# TensorRT-LLM requires the NVIDIA PyPI index and pip
pip install --pre --extra-index-url https://pypi.nvidia.com "ai-dynamo[trtllm]==0.8.1.post1"

# Install Dynamo core only
uv pip install ai-dynamo==0.8.1.post1

# Install standalone KVBM (Python 3.12 only)
uv pip install kvbm==0.8.1

Helm Charts (NGC)

For Kubernetes deployment instructions, see the Kubernetes Installation Guide.

helm install dynamo-crds oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds --version 0.8.1
helm install dynamo-platform oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform --version 0.8.1
helm install dynamo-graph oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-graph --version 0.8.1

Rust Crates (crates.io)

For API documentation, see each crate on docs.rs. To build Dynamo from source, see Building from Source.

cargo add dynamo-runtime@0.8.1
cargo add dynamo-llm@0.8.1
cargo add dynamo-async-openai@0.8.1
cargo add dynamo-parsers@0.8.1
cargo add dynamo-memory@0.8.1
cargo add dynamo-config@0.8.1

CUDA and Driver Requirements

For detailed CUDA toolkit versions and minimum driver requirements for each container image, see the Support Matrix.

Known Issues

For a complete list of known issues, refer to the release notes for each patch.

Known Artifact Issues

| Version | Artifact | Issue | Status |
|---|---|---|---|
| v0.8.1 | vllm-runtime:0.8.1-cuda13 | Container fails to launch. | Known issue |
| v0.8.1 | sglang-runtime:0.8.1-cuda13, vllm-runtime:0.8.1-cuda13 | Multimodality not expected to work on ARM64. Works on AMD64. | Known limitation |
| v0.8.0 | sglang-runtime:0.8.0-cuda13 | CuDNN installation issue caused PyTorch v2.9.1 compatibility problems with nn.Conv3d, resulting in performance degradation and excessive memory usage in multimodal workloads. | Fixed in v0.8.1 (#5461) |

Release History

  • v0.8.1.post1 Patch: Updated TRT-LLM to v1.2.0rc6.post2 (PyPI wheels and TRT-LLM container only)
  • Standalone Frontend Container: dynamo-frontend added in v0.8.0
  • CUDA 13 Runtimes: Experimental CUDA 13 runtime for vLLM and SGLang in v0.8.0
  • New Rust Crates: dynamo-memory and dynamo-config added in v0.8.0

GitHub Releases

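| Version | Release Date | GitHub | Docs |
|---|---|---|---|
| v0.8.1 | Jan 23, 2026 | Release | Docs |
| v0.8.0 | Jan 15, 2026 | Release | Docs |
| v0.7.1 | Dec 15, 2025 | Release | GitHub |
| v0.7.0 | Nov 26, 2025 | Release | GitHub |
| v0.6.1 | Nov 6, 2025 | Release | GitHub |
| v0.6.0 | Oct 28, 2025 | Release | GitHub |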

Container Images

NGC Collection: ai-dynamo

To access a specific version, append ?version=TAG to the container URL: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/{container}?version={tag}
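
As an illustration of the URL pattern above, the following sketch fills in the `{container}` and `{tag}` placeholders. The function name `ngc_container_url` is hypothetical; only the URL template comes from this document.

```shell
# Build the NGC catalog URL for a given container and tag.
# Args: $1 = container name, $2 = tag
ngc_container_url() {
  printf 'https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/%s?version=%s\n' "$1" "$2"
}

ngc_container_url vllm-runtime 0.8.1
# prints https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime?version=0.8.1
```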

vllm-runtime

| Image:Tag | vLLM | Arch | CUDA | Notes |
|---|---|---|---|---|
| vllm-runtime:0.8.1 | v0.12.0 | AMD64/ARM64 | v12.9 | |
| vllm-runtime:0.8.0 | v0.12.0 | AMD64/ARM64 | v12.9 | |
| vllm-runtime:0.8.0-cuda13 | v0.12.0 | AMD64/ARM64 | v13.0 | Experimental |
| vllm-runtime:0.7.0.post2 | v0.11.2 | AMD64/ARM64 | v12.8 | Patch |
| vllm-runtime:0.7.1 | v0.11.0 | AMD64/ARM64 | v12.8 | |
| vllm-runtime:0.7.0.post1 | v0.11.0 | AMD64/ARM64 | v12.8 | Patch |
| vllm-runtime:0.7.0 | v0.11.0 | AMD64/ARM64 | v12.8 | |
| vllm-runtime:0.6.1.post1 | v0.11.0 | AMD64/ARM64 | v12.8 | Patch |
| vllm-runtime:0.6.1 | v0.11.0 | AMD64/ARM64 | v12.8 | |
| vllm-runtime:0.6.0 | v0.11.0 | AMD64 | v12.8 | |

sglang-runtime

| Image:Tag | SGLang | Arch | CUDA | Notes |
|---|---|---|---|---|
| sglang-runtime:0.8.1 | v0.5.6.post2 | AMD64/ARM64 | v12.9 | |
| sglang-runtime:0.8.1-cuda13 | v0.5.6.post2 | AMD64/ARM64 | v13.0 | Experimental |
| sglang-runtime:0.8.0 | v0.5.6.post2 | AMD64/ARM64 | v12.9 | |
| sglang-runtime:0.8.0-cuda13 | v0.5.6.post2 | AMD64/ARM64 | v13.0 | Experimental |
| sglang-runtime:0.7.1 | v0.5.4.post3 | AMD64/ARM64 | v12.9 | |
| sglang-runtime:0.7.0.post1 | v0.5.4.post3 | AMD64/ARM64 | v12.9 | Patch |
| sglang-runtime:0.7.0 | v0.5.4.post3 | AMD64/ARM64 | v12.9 | |
| sglang-runtime:0.6.1.post1 | v0.5.3.post2 | AMD64/ARM64 | v12.9 | Patch |
| sglang-runtime:0.6.1 | v0.5.3.post2 | AMD64/ARM64 | v12.9 | |
| sglang-runtime:0.6.0 | v0.5.3.post2 | AMD64 | v12.8 | |

tensorrtllm-runtime

| Image:Tag | TRT-LLM | Arch | CUDA | Notes |
|---|---|---|---|---|
| tensorrtllm-runtime:0.8.1.post1 | v1.2.0rc6.post2 | AMD64/ARM64 | v13.0 | Patch |
| tensorrtllm-runtime:0.8.1 | v1.2.0rc6.post1 | AMD64/ARM64 | v13.0 | |
| tensorrtllm-runtime:0.8.0 | v1.2.0rc6.post1 | AMD64/ARM64 | v13.0 | |
| tensorrtllm-runtime:0.7.0.post2 | v1.2.0rc2 | AMD64/ARM64 | v13.0 | Patch |
| tensorrtllm-runtime:0.7.1 | v1.2.0rc3 | AMD64/ARM64 | v13.0 | |
| tensorrtllm-runtime:0.7.0.post1 | v1.2.0rc3 | AMD64/ARM64 | v13.0 | Patch |
| tensorrtllm-runtime:0.7.0 | v1.2.0rc2 | AMD64/ARM64 | v13.0 | |
| tensorrtllm-runtime:0.6.1-cuda13 | v1.2.0rc1 | AMD64/ARM64 | v13.0 | Experimental |
| tensorrtllm-runtime:0.6.1.post1 | v1.1.0rc5 | AMD64/ARM64 | v12.9 | Patch |
| tensorrtllm-runtime:0.6.1 | v1.1.0rc5 | AMD64/ARM64 | v12.9 | |
| tensorrtllm-runtime:0.6.0 | v1.1.0rc5 | AMD64/ARM64 | v12.9 | |

dynamo-frontend

| Image:Tag | Arch | Notes |
|---|---|---|
| dynamo-frontend:0.8.1 | AMD64/ARM64 | |
| dynamo-frontend:0.8.0 | AMD64/ARM64 | Initial |

kubernetes-operator

| Image:Tag | Arch | Notes |
|---|---|---|
| kubernetes-operator:0.8.1 | AMD64/ARM64 | |
| kubernetes-operator:0.8.0 | AMD64/ARM64 | |
| kubernetes-operator:0.7.1 | AMD64/ARM64 | |
| kubernetes-operator:0.7.0.post1 | AMD64/ARM64 | Patch |
| kubernetes-operator:0.7.0 | AMD64/ARM64 | |
| kubernetes-operator:0.6.1 | AMD64/ARM64 | |
| kubernetes-operator:0.6.0 | AMD64/ARM64 | |

Python Wheels

PyPI: ai-dynamo | ai-dynamo-runtime | kvbm

To access a specific version: https://pypi.org/project/{package}/{version}/

ai-dynamo (wheel)

| Package | Python | Platform | Notes |
|---|---|---|---|
| ai-dynamo==0.8.1.post1 | 3.10-3.12 | Linux (glibc v2.28+) | TRT-LLM v1.2.0rc6.post2 |
| ai-dynamo==0.8.1 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo==0.8.0 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo==0.7.1 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo==0.7.0 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo==0.6.1 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo==0.6.0 | 3.10-3.12 | Linux (glibc v2.28+) | |

ai-dynamo-runtime (wheel)

| Package | Python | Platform | Notes |
|---|---|---|---|
| ai-dynamo-runtime==0.8.1.post1 | 3.10-3.12 | Linux (glibc v2.28+) | TRT-LLM v1.2.0rc6.post2 |
| ai-dynamo-runtime==0.8.1 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo-runtime==0.8.0 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo-runtime==0.7.1 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo-runtime==0.7.0 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo-runtime==0.6.1 | 3.10-3.12 | Linux (glibc v2.28+) | |
| ai-dynamo-runtime==0.6.0 | 3.10-3.12 | Linux (glibc v2.28+) | |

kvbm (wheel)

| Package | Python | Platform | Notes |
|---|---|---|---|
| kvbm==0.8.1 | 3.12 | Linux (glibc v2.28+) | |
| kvbm==0.8.0 | 3.12 | Linux (glibc v2.28+) | |
| kvbm==0.7.1 | 3.12 | Linux (glibc v2.28+) | |
| kvbm==0.7.0 | 3.12 | Linux (glibc v2.28+) | Initial |

Helm Charts

NGC Helm Registry: ai-dynamo

Direct download: https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/{chart}-{version}.tgz
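
The direct-download pattern above can be expanded for each chart in a short loop. This is a sketch: the function name `helm_chart_url` is hypothetical, and the script only prints URLs. Whether fetching them requires NGC authentication is not covered here.

```shell
# Build the direct-download URL for a chart archive.
# Args: $1 = chart name, $2 = chart version
helm_chart_url() {
  printf 'https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/%s-%s.tgz\n' "$1" "$2"
}

# Print the v0.8.1 archive URL for each chart listed in this document.
for chart in dynamo-crds dynamo-platform dynamo-graph; do
  helm_chart_url "$chart" 0.8.1
done
```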

dynamo-crds (Helm chart)

| Chart | Notes |
|---|---|
| dynamo-crds-0.8.1 | |
| dynamo-crds-0.8.0 | |
| dynamo-crds-0.7.1 | |
| dynamo-crds-0.7.0 | |
| dynamo-crds-0.6.1 | |
| dynamo-crds-0.6.0 | |

dynamo-platform (Helm chart)

| Chart | Notes |
|---|---|
| dynamo-platform-0.8.1 | |
| dynamo-platform-0.8.0 | |
| dynamo-platform-0.7.1 | |
| dynamo-platform-0.7.0 | |
| dynamo-platform-0.6.1 | |
| dynamo-platform-0.6.0 | |

dynamo-graph (Helm chart)

| Chart | Notes |
|---|---|
| dynamo-graph-0.8.1 | |
| dynamo-graph-0.8.0 | |
| dynamo-graph-0.7.1 | |
| dynamo-graph-0.7.0 | |
| dynamo-graph-0.6.1 | |
| dynamo-graph-0.6.0 | |

Rust Crates

crates.io: dynamo-runtime | dynamo-llm | dynamo-async-openai | dynamo-parsers | dynamo-memory | dynamo-config

To access a specific version: https://crates.io/crates/{crate}/{version}

dynamo-runtime (crate)

| Crate | MSRV (Rust) | Notes |
|---|---|---|
| dynamo-runtime@0.8.1 | v1.82 | |
| dynamo-runtime@0.8.0 | v1.82 | |
| dynamo-runtime@0.7.1 | v1.82 | |
| dynamo-runtime@0.7.0 | v1.82 | |
| dynamo-runtime@0.6.1 | v1.82 | |
| dynamo-runtime@0.6.0 | v1.82 | |

dynamo-llm (crate)

| Crate | MSRV (Rust) | Notes |
|---|---|---|
| dynamo-llm@0.8.1 | v1.82 | |
| dynamo-llm@0.8.0 | v1.82 | |
| dynamo-llm@0.7.1 | v1.82 | |
| dynamo-llm@0.7.0 | v1.82 | |
| dynamo-llm@0.6.1 | v1.82 | |
| dynamo-llm@0.6.0 | v1.82 | |

dynamo-async-openai (crate)

| Crate | MSRV (Rust) | Notes |
|---|---|---|
| dynamo-async-openai@0.8.1 | v1.82 | |
| dynamo-async-openai@0.8.0 | v1.82 | |
| dynamo-async-openai@0.7.1 | v1.82 | |
| dynamo-async-openai@0.7.0 | v1.82 | |
| dynamo-async-openai@0.6.1 | v1.82 | |
| dynamo-async-openai@0.6.0 | v1.82 | |

dynamo-parsers (crate)

| Crate | MSRV (Rust) | Notes |
|---|---|---|
| dynamo-parsers@0.8.1 | v1.82 | |
| dynamo-parsers@0.8.0 | v1.82 | |
| dynamo-parsers@0.7.1 | v1.82 | |
| dynamo-parsers@0.7.0 | v1.82 | |
| dynamo-parsers@0.6.1 | v1.82 | |
| dynamo-parsers@0.6.0 | v1.82 | |

dynamo-memory (crate)

| Crate | MSRV (Rust) | Notes |
|---|---|---|
| dynamo-memory@0.8.1 | v1.82 | |
| dynamo-memory@0.8.0 | v1.82 | Initial |

dynamo-config (crate)

| Crate | MSRV (Rust) | Notes |
|---|---|---|
| dynamo-config@0.8.1 | v1.82 | |
| dynamo-config@0.8.0 | v1.82 | Initial |