Managing Models with DynamoModel
Overview
DynamoModel is a Kubernetes Custom Resource that represents a machine learning model deployed on Dynamo. It enables you to:
- Deploy LoRA adapters on top of running base models
- Track model endpoints and their readiness across your cluster
- Manage model lifecycle declaratively with Kubernetes
DynamoModel works alongside DynamoGraphDeployment (DGD) or DynamoComponentDeployment (DCD) resources. While DGD/DCD deploy the inference infrastructure (pods, services), DynamoModel handles model-specific operations like loading LoRA adapters.
Quick Start
Prerequisites
Before creating a DynamoModel, you need:
- A running `DynamoGraphDeployment` or `DynamoComponentDeployment`
- Components configured with `modelRef` pointing to your base model
- Pods that are ready and serving your base model
For complete setup including DGD configuration, see Integration with DynamoGraphDeployment.
Deploy a LoRA Adapter
1. Create your DynamoModel:
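A minimal manifest might look like the following. The `apiVersion` and all field names other than `baseModelName` are illustrative assumptions; check the DynamoModel API Reference for the authoritative schema.

```yaml
# Illustrative DynamoModel manifest (apiVersion and field names are assumptions)
apiVersion: nvidia.com/v1alpha1            # assumed group/version
kind: DynamoModel
metadata:
  name: my-lora
  namespace: dynamo                        # same namespace as your DGD/DCD
spec:
  modelType: lora                          # deploy a LoRA adapter on a running base model
  baseModelName: Qwen/Qwen3-0.6B           # must match modelRef.name in your DGD/DCD
  sourceUri: hf://my-org/my-lora-adapter   # hypothetical adapter location
```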
2. Apply and verify:
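Assuming the manifest above is saved as `my-lora.yaml` and the CRD is queryable as `dynamomodel`:

```bash
kubectl apply -f my-lora.yaml
kubectl get dynamomodel my-lora
```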
Expected output:
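The exact printer columns depend on the CRD definition, but output along these lines (values illustrative) indicates success:

```
NAME      READY   AGE
my-lora   True    30s
```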
That's it! The operator automatically discovers endpoints and loads the LoRA.
For detailed status monitoring, see Monitoring & Operations.
Understanding DynamoModel
Model Types
DynamoModel supports three model types:
Most users will use the `lora` type to deploy fine-tuned models on top of their base model deployments.
How It Works
When you create a DynamoModel, the operator:
- Discovers endpoints: Finds all pods running your `baseModelName` (by matching `modelRef.name` in DGD/DCD)
- Creates service: Automatically creates a Kubernetes Service to track these pods
- Loads LoRA: Calls the LoRA load API on each endpoint (for the `lora` type)
- Updates status: Reports which endpoints are ready
Key linkage:
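The two resources are tied together by a single name match. The fragments below are a sketch; the surrounding schemas are abbreviated and field nesting is assumed:

```yaml
# DGD/DCD side (fragment): the component declares which base model it serves
modelRef:
  name: Qwen/Qwen3-0.6B

# DynamoModel side (fragment): baseModelName selects those components
spec:
  modelType: lora
  baseModelName: Qwen/Qwen3-0.6B   # must match modelRef.name exactly
```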
Configuration Overview
DynamoModel requires just a few key fields to deploy a model or adapter:
Example minimal LoRA configuration:
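A minimal `spec` might look like this (field names other than `baseModelName` are assumptions; see the API Reference):

```yaml
spec:
  modelType: lora
  baseModelName: Qwen/Qwen3-0.6B
  sourceUri: hf://my-org/my-lora-adapter   # hypothetical adapter location
```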
For complete field specifications, validation rules, and all options, see the DynamoModel API Reference.
Status Summary
The status shows discovered endpoints and their readiness:
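A fully ready status might look like the following sketch, built from the fields described below; addresses and pod names are illustrative:

```yaml
status:
  totalEndpoints: 2
  readyEndpoints: 2
  endpoints:
    - address: 10.0.1.23:8000   # illustrative
      podName: worker-0
      ready: true
    - address: 10.0.1.24:8000
      podName: worker-1
      ready: true
  conditions:
    - type: EndpointsReady
      status: "True"
    - type: ServicesFound
      status: "True"
```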
Key status fields:
- `totalEndpoints` / `readyEndpoints`: Counts of discovered vs ready endpoints
- `endpoints[]`: List with addresses, pod names, and ready status
- `conditions`: Standard Kubernetes conditions (`EndpointsReady`, `ServicesFound`)
For detailed status usage, see the Monitoring & Operations section below.
Common Use Cases
Use Case 1: S3-Hosted LoRA Adapter
Deploy a LoRA adapter stored in an S3 bucket.
Prerequisites:
- S3 bucket accessible from your pods (IAM role or credentials)
- Base model `meta-llama/Llama-3.3-70B-Instruct` running via DGD/DCD
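A manifest for this case might look like the sketch below (apiVersion, field names, and bucket path are assumptions):

```yaml
apiVersion: nvidia.com/v1alpha1
kind: DynamoModel
metadata:
  name: llama-s3-lora
spec:
  modelType: lora
  baseModelName: meta-llama/Llama-3.3-70B-Instruct   # must match modelRef.name
  sourceUri: s3://my-bucket/loras/llama-customer-support   # hypothetical S3 path
```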
Verification:
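A quick check of loading progress (the status field paths are assumed from the fields documented above):

```bash
kubectl get dynamomodel llama-s3-lora \
  -o jsonpath='{.status.readyEndpoints}/{.status.totalEndpoints}'
```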
Use Case 2: HuggingFace-Hosted LoRA
Deploy a LoRA adapter from HuggingFace Hub.
Prerequisites:
- HuggingFace Hub accessible from your pods
- If private repo: HF token configured as secret and mounted in pods
- Base model `Qwen/Qwen3-0.6B` running via DGD/DCD
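The manifest is the same shape as the S3 case, with an `hf://` source (field names and repo are illustrative assumptions):

```yaml
apiVersion: nvidia.com/v1alpha1
kind: DynamoModel
metadata:
  name: qwen-hf-lora
spec:
  modelType: lora
  baseModelName: Qwen/Qwen3-0.6B
  sourceUri: hf://my-org/qwen-lora-adapter   # hypothetical HF Hub repo
```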
With HuggingFace token:
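For private repos, one common pattern is exposing the token to the worker pods as an environment variable. The secret name, key, and where this fragment sits in the DGD pod template are assumptions; `HF_TOKEN` is the variable the Hugging Face libraries conventionally read:

```yaml
# Fragment of the worker pod spec in your DGD/DCD (surrounding schema abbreviated)
env:
  - name: HF_TOKEN
    valueFrom:
      secretKeyRef:
        name: hf-token-secret   # hypothetical secret name
        key: token
```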
Use Case 3: Multiple LoRAs on Same Base Model
Deploy multiple LoRA adapters on the same base model deployment.
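Each adapter is its own DynamoModel pointing at the same `baseModelName`. Names, sources, and field names other than `baseModelName` are illustrative assumptions:

```yaml
apiVersion: nvidia.com/v1alpha1
kind: DynamoModel
metadata:
  name: lora-summarize
spec:
  modelType: lora
  baseModelName: Qwen/Qwen3-0.6B
  sourceUri: hf://my-org/qwen-lora-summarize
---
apiVersion: nvidia.com/v1alpha1
kind: DynamoModel
metadata:
  name: lora-classify
spec:
  modelType: lora
  baseModelName: Qwen/Qwen3-0.6B
  sourceUri: hf://my-org/qwen-lora-classify
```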
Both LoRAs will be loaded on all pods serving `Qwen/Qwen3-0.6B`. Your application can then route requests to the appropriate adapter.
Monitoring & Operations
Checking Status
Quick status check:
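Assuming the plural resource name is `dynamomodels`:

```bash
kubectl get dynamomodels
```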
Example output:
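The printer columns depend on the CRD definition; output shaped roughly like this (values illustrative):

```
NAME      READY   AGE
my-lora   True    5m
```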
Detailed status:
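For conditions, per-endpoint detail, and recent events (resource name illustrative):

```bash
kubectl describe dynamomodel my-lora
# or dump the full object, including the status section:
kubectl get dynamomodel my-lora -o yaml
```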
Example output:
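Trimmed to the status section; addresses and pod names are illustrative, and this sketch shows a partially ready model:

```yaml
status:
  totalEndpoints: 2
  readyEndpoints: 1
  endpoints:
    - address: 10.0.1.23:8000
      podName: worker-0
      ready: true
    - address: 10.0.1.24:8000
      podName: worker-1
      ready: false
  conditions:
    - type: EndpointsReady
      status: "False"
      reason: NotReady
```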
Understanding Readiness
An endpoint is ready when:
- The pod is running and healthy
- The LoRA load API call succeeded
Condition states:
- `EndpointsReady=True`: All endpoints are ready (full availability)
- `EndpointsReady=False, Reason=NotReady`: Not all endpoints are ready (check the message for counts)
- `EndpointsReady=False, Reason=NoEndpoints`: No endpoints found
When `readyEndpoints` < `totalEndpoints`, the operator automatically retries loading every 30 seconds.
Viewing Endpoints
Get endpoint addresses:
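Using JSONPath over the status fields described earlier (field paths assumed):

```bash
kubectl get dynamomodel my-lora -o jsonpath='{.status.endpoints[*].address}'
```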
Output:
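Illustrative addresses, space-separated by the JSONPath expansion:

```
10.0.1.23:8000 10.0.1.24:8000
```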
Get endpoint pod names:
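Same pattern, selecting the pod-name field (path assumed):

```bash
kubectl get dynamomodel my-lora -o jsonpath='{.status.endpoints[*].podName}'
```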
Check readiness of each endpoint:
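A JSONPath range prints one endpoint per line with its ready flag (field paths assumed):

```bash
kubectl get dynamomodel my-lora \
  -o jsonpath='{range .status.endpoints[*]}{.podName}{"\t"}{.ready}{"\n"}{end}'
```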
Output:
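Illustrative values:

```
worker-0	true
worker-1	true
```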
Updating a Model
To update a LoRA (e.g., deploy a new version):
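One way is patching the source to the new adapter location; the `sourceUri` field name and repo are assumptions:

```bash
kubectl patch dynamomodel my-lora --type merge \
  -p '{"spec":{"sourceUri":"hf://my-org/my-lora-adapter-v2"}}'
```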
The operator will detect the change and reload the LoRA on all endpoints.
Deleting a Model
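Deletion is a standard `kubectl delete` (resource name illustrative):

```bash
kubectl delete dynamomodel my-lora
```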
For LoRA models, the operator will:
- Unload the LoRA from all endpoints
- Clean up associated resources
- Remove the DynamoModel CR
The base model deployment (DGD/DCD) continues running normally.
Troubleshooting
No Endpoints Found
Symptom:
Common Causes:
- Base model deployment not running
  Solution: Deploy your DGD/DCD first and wait for pods to be ready.
- `baseModelName` mismatch
  Solution: Ensure `baseModelName` in the DynamoModel exactly matches `modelRef.name` in the DGD.
- Pods not ready
  Solution: Wait for pods to reach the `Running` and `Ready` state.
- Wrong namespace
  Solution: Ensure the DynamoModel is in the same namespace as your DGD/DCD.
LoRA Load Failures
Symptom:
Common Causes:
- Source URI not accessible
  Solution:
  - For S3: Verify bucket permissions, IAM role, and credentials
  - For HuggingFace: Verify the token is valid and the repo exists and is accessible
- Invalid LoRA format
  Solution: Ensure your LoRA weights are in the format expected by your backend framework (vLLM, SGLang, etc.)
- Endpoint API errors
  Solution: Check the backend framework's logs in the worker pods.
- Out of memory
  Solution: LoRA adapters require additional memory. Increase memory limits in your DGD.
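For the endpoint API error case above, inspecting a worker's logs might look like this (pod name is a placeholder):

```bash
kubectl logs <worker-pod-name>
```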
Status Shows Not Ready
Symptom: Some endpoints remain not ready for extended periods.
Diagnosis:
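A useful first pass (resource name illustrative; status field paths assumed from the fields described earlier):

```bash
# overall readiness counts
kubectl get dynamomodel my-lora \
  -o jsonpath='{.status.readyEndpoints}/{.status.totalEndpoints}'
# per-endpoint detail, condition messages, and events
kubectl describe dynamomodel my-lora
```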
Common Causes:
- Network issues: Pod can't reach S3/HuggingFace
- Resource constraints: Pod is OOMing or being throttled
- API endpoint not responding: Backend framework isn't serving the LoRA API
When to wait vs investigate:
- Wait: If `readyEndpoints` is increasing over time (LoRAs loading progressively)
- Investigate: If stuck at the same `readyEndpoints` count for more than 5 minutes
Viewing Events and Logs
Check events:
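Events appear in `kubectl describe` output, or can be filtered directly (resource name illustrative):

```bash
kubectl get events --field-selector involvedObject.name=my-lora
```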
View operator logs:
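The operator's namespace and deployment name depend on how it was installed; the names below are assumptions for a typical install:

```bash
kubectl logs -n dynamo-system deployment/dynamo-operator
```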
Common events and messages:
Integration with DynamoGraphDeployment
This section shows the complete end-to-end workflow for deploying base models and LoRA adapters together.
DynamoModel and DynamoGraphDeployment work together to provide complete model deployment:
- DGD: Deploys the infrastructure (pods, services, resources)
- DynamoModel: Manages model-specific operations (LoRA loading)
Linking Models to Components
The connection is established through the modelRef field in your DGD:
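A DGD fragment might look like the sketch below; the service key and exact nesting are assumptions, and the surrounding schema is abbreviated:

```yaml
spec:
  services:
    VllmWorker:
      modelRef:
        name: Qwen/Qwen3-0.6B   # DynamoModel.spec.baseModelName must match this
```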
Complete example:
Deployment Workflow
Recommended order:
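As a shell sketch (file names and the worker label selector are yours):

```bash
kubectl apply -f my-dgd.yaml          # 1. deploy the base model infrastructure
kubectl wait --for=condition=Ready pod -l <your-worker-selector> --timeout=10m   # 2. wait for workers
kubectl apply -f my-lora.yaml         # 3. create the DynamoModel
kubectl get dynamomodel my-lora -w    # 4. watch status until ready
```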
What happens behind the scenes:
The operator automatically handles all service discovery - you don't configure services, labels, or selectors manually.
API Reference
For complete field specifications, validation rules, and detailed type definitions, see:
Summary
DynamoModel provides declarative model management for Dynamo deployments:
- ✅ Simple: 2-step deployment of LoRA adapters
- ✅ Automatic: Endpoint discovery and loading handled by the operator
- ✅ Observable: Rich status reporting and conditions
- ✅ Integrated: Works seamlessly with DynamoGraphDeployment
Next Steps:
- Try the Quick Start example
- Explore Common Use Cases
- Check the API Reference for advanced configuration