Quickstart
This guide covers running Dynamo using the CLI on your local machine or VM.
Looking to deploy on Kubernetes instead? See the Kubernetes Installation Guide and Kubernetes Quickstart for cluster deployments.
Install Dynamo
Option A: Containers (Recommended)
Containers have all dependencies pre-installed. No setup required.
To run the frontend and worker in the same container, either:
- Run processes in the background with `&` (see the Run Dynamo section below), or
- Open a second terminal and use `docker exec -it <container_id> bash` (sketched below)
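For example (the image name below is a placeholder; substitute the image and tag for your backend from Release Artifacts):

```bash
# Terminal 1: start an interactive shell in the Dynamo container.
# <dynamo-image> is a placeholder; use the image/tag for your backend from Release Artifacts.
docker run --gpus all -it --rm <dynamo-image> bash

# Terminal 2 (alternative to backgrounding with &): open a second shell in the running container.
docker exec -it <container_id> bash
```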
See Release Artifacts for available versions and backend guides for run instructions: SGLang | TensorRT-LLM | vLLM
Option B: Install from PyPI
Install system dependencies and the Dynamo wheel for your chosen backend:
SGLang
For CUDA 13 (B300/GB300), the container is recommended. See SGLang install docs for details.
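Otherwise, a minimal install sketch, assuming the wheel is published on PyPI as ai-dynamo with an [sglang] extra (verify the exact package name and version against Release Artifacts):

```bash
# Create an isolated environment and install Dynamo with the SGLang extra.
# "ai-dynamo[sglang]" is an assumed package/extra name; confirm against Release Artifacts.
uv venv && source .venv/bin/activate
uv pip install "ai-dynamo[sglang]"
```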
TensorRT-LLM
TensorRT-LLM requires pip due to a transitive Git URL dependency that `uv` doesn't resolve. We recommend using the TensorRT-LLM container for broader compatibility. See the TRT-LLM backend guide for details.
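If you do install the wheel rather than using the container, a sketch with plain pip (the ai-dynamo[trtllm] package/extra name is an assumption; confirm it and any version pins against Release Artifacts and the TRT-LLM backend guide):

```bash
# TensorRT-LLM must be installed with pip (not uv) because of the Git URL dependency noted above.
# "ai-dynamo[trtllm]" is an assumed package/extra name; verify before use.
python3 -m venv venv && source venv/bin/activate
pip install "ai-dynamo[trtllm]"
```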
vLLM
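A minimal install sketch, again assuming an ai-dynamo wheel with a [vllm] extra (check Release Artifacts for the supported name and version):

```bash
# Install Dynamo with the vLLM extra into a fresh uv-managed environment.
# "ai-dynamo[vllm]" is an assumed package/extra name; confirm against Release Artifacts.
uv venv && source .venv/bin/activate
uv pip install "ai-dynamo[vllm]"
```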
Run Dynamo
(Optional) Before running Dynamo, verify your system configuration:
python3 deploy/sanity_check.py
Start the frontend, then start a worker for your chosen backend.
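For example, a foreground start of the frontend might look like this (the `--store-kv file` flag is taken from the background example below; interpreting it as keeping registration state in a local file is an assumption based on the flag name):

```bash
# Start the Dynamo frontend in the foreground.
# --store-kv file is the same flag used in the background example below;
# treating it as a local, dependency-free store is an assumption.
python3 -m dynamo.frontend --store-kv file
```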
To run in a single terminal (useful in containers), append `> logfile.log 2>&1 &` to each command so it runs in the background. Example: `python3 -m dynamo.frontend --store-kv file > dynamo.frontend.log 2>&1 &`
In another terminal (or same terminal if using background mode), start a worker:
SGLang
TensorRT-LLM
vLLM
For dependency-free local development, disable KV event publishing (avoids NATS):
- vLLM: Add `--kv-events-config '{"enable_kv_cache_events": false}'` (see the sketch after this list)
- SGLang: No flag needed (KV events disabled by default)
- TensorRT-LLM: No flag needed (KV events disabled by default)
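For example, a vLLM worker run without NATS might look like the following sketch. The `python3 -m dynamo.vllm` module path and `--model` flag are assumptions modeled on the frontend invocation above; the KV-events flag is the one listed above, and `<model-name>` is a placeholder:

```bash
# Start a vLLM worker in the background with KV event publishing disabled (no NATS needed).
# dynamo.vllm and --model are assumed names; <model-name> is a placeholder.
python3 -m dynamo.vllm --model <model-name> \
  --kv-events-config '{"enable_kv_cache_events": false}' \
  > dynamo.vllm.log 2>&1 &
```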
TensorRT-LLM only: The warning `Cannot connect to ModelExpress server/transport error. Using direct download.` is expected and can be safely ignored.