# Metrics Developer Guide

This guide explains how to create and use custom metrics in Dynamo components using the Dynamo metrics API.

## Metrics Exposure

All metrics created via the Dynamo metrics API are automatically exposed on the `/metrics` HTTP endpoint in Prometheus Exposition Format text when the following environment variable is set:

- `DYN_SYSTEM_PORT=` - Port for the metrics endpoint (set to positive value to enable, default: `-1` disabled)

Example:

```bash
DYN_SYSTEM_PORT=8081 python -m dynamo.vllm --model
```

Prometheus Exposition Format text metrics will be available at: `http://localhost:8081/metrics`

## Metric Name Constants

The [prometheus_names.rs](https://github.com/ai-dynamo/dynamo/tree/main/lib/runtime/src/metrics/prometheus_names.rs) module provides centralized metric name constants and sanitization functions to ensure consistency across all Dynamo components.

---

## Metrics API in Rust

The metrics API is accessible through the `.metrics()` method on runtime, namespace, component, and endpoint objects. See [Runtime Hierarchy](/dynamo/v-0-9-0/user-guides/observability-local/metrics#runtime-hierarchy) for details on the hierarchical structure.

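
Metric names passed to the API need to be valid Prometheus names, which is the kind of concern the sanitization functions mentioned under Metric Name Constants address. As a rough, self-contained illustration, the sketch below coerces an arbitrary string into a name matching the Prometheus pattern `[a-zA-Z_:][a-zA-Z0-9_:]*`. The function `sanitize_metric_name` is hypothetical and written only for this guide; it is not the implementation in `prometheus_names.rs`, whose exact rules may differ.

```rust
/// Hypothetical helper (not part of Dynamo): coerce an arbitrary string into
/// a valid Prometheus metric name, i.e. one matching `[a-zA-Z_:][a-zA-Z0-9_:]*`.
fn sanitize_metric_name(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len() + 1);

    // Replace every character outside the allowed set with an underscore.
    for ch in raw.chars() {
        if ch.is_ascii_alphanumeric() || ch == '_' || ch == ':' {
            out.push(ch);
        } else {
            out.push('_');
        }
    }

    // A metric name may not be empty and may not start with a digit,
    // so prefix an underscore in either case.
    if out.chars().next().map_or(true, |c| c.is_ascii_digit()) {
        out.insert(0, '_');
    }

    out
}
```

With this sketch, `sanitize_metric_name("requests-total")` yields `requests_total` and `sanitize_metric_name("2xx.responses")` yields `_2xx_responses`. Colons are kept because the Prometheus pattern permits them, although Prometheus conventionally reserves them for recording rules.
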
### Available Methods

- `.metrics().create_counter()`: Create a counter metric
- `.metrics().create_gauge()`: Create a gauge metric
- `.metrics().create_histogram()`: Create a histogram metric
- `.metrics().create_countervec()`: Create a counter with labels
- `.metrics().create_gaugevec()`: Create a gauge with labels
- `.metrics().create_histogramvec()`: Create a histogram with labels

### Creating Metrics

```rust
use dynamo_runtime::DistributedRuntime;

let runtime = DistributedRuntime::new()?;
let endpoint = runtime.namespace("my_namespace").component("my_component").endpoint("my_endpoint");

// Simple metrics (the trailing &[] means "no constant labels")
let requests_total = endpoint.metrics().create_counter(
    "requests_total",
    "Total requests",
    &[]
)?;

let active_connections = endpoint.metrics().create_gauge(
    "active_connections",
    "Active connections",
    &[]
)?;

let latency = endpoint.metrics().create_histogram(
    "latency_seconds",
    "Request latency",
    &[],
    Some(vec![0.001, 0.01, 0.1, 1.0, 10.0])
)?;
```

### Using Metrics

```rust
// Counters
requests_total.inc();

// Gauges
active_connections.set(42.0);
active_connections.inc();
active_connections.dec();

// Histograms
latency.observe(0.023); // 23ms
```

### Vector Metrics with Labels

```rust
// Create vector metrics with label names
let requests_by_model = endpoint.metrics().create_countervec(
    "requests_by_model",
    "Requests by model",
    &["model_type", "model_size"],
    &[]
)?;

let memory_by_gpu = endpoint.metrics().create_gaugevec(
    "gpu_memory_bytes",
    "GPU memory by device",
    &["gpu_id", "memory_type"],
    &[]
)?;

// Use with specific label values (order must match the label names above)
requests_by_model.with_label_values(&["llama", "7b"]).inc();
memory_by_gpu.with_label_values(&["0", "allocated"]).set(8192.0);
```

### Advanced Features

**Custom histogram buckets:**

```rust
let latency = endpoint.metrics().create_histogram(
    "latency_seconds",
    "Request latency",
    &[],
    Some(vec![0.001, 0.01, 0.1, 1.0, 10.0])
)?;
```

**Constant labels:**

```rust
let counter = endpoint.metrics().create_counter(
    "requests_total",
    "Total requests",
    &[("region", "us-west"), ("env", "prod")]
)?;
```

---

## Metrics API in Python

Python components can create and manage Prometheus metrics using the same metrics API through Python bindings.

### Available Methods

- `endpoint.metrics.create_counter()` / `create_intcounter()`: Create a counter metric
- `endpoint.metrics.create_gauge()` / `create_intgauge()`: Create a gauge metric
- `endpoint.metrics.create_histogram()`: Create a histogram metric
- `endpoint.metrics.create_countervec()` / `create_intcountervec()`: Create a counter with labels
- `endpoint.metrics.create_gaugevec()` / `create_intgaugevec()`: Create a gauge with labels
- `endpoint.metrics.create_histogramvec()`: Create a histogram with labels

All metrics are imported from `dynamo.prometheus_metrics`.

### Creating Metrics

```python
from dynamo.runtime import DistributedRuntime

drt = DistributedRuntime()
endpoint = drt.namespace("my_namespace").component("my_component").endpoint("my_endpoint")

# Simple metrics
requests_total = endpoint.metrics.create_intcounter(
    "requests_total",
    "Total requests"
)

active_connections = endpoint.metrics.create_intgauge(
    "active_connections",
    "Active connections"
)

latency = endpoint.metrics.create_histogram(
    "latency_seconds",
    "Request latency",
    buckets=[0.001, 0.01, 0.1, 1.0, 10.0]
)
```

### Using Metrics

```python
# Counters
requests_total.inc()
requests_total.inc_by(5)

# Gauges
active_connections.set(42)
active_connections.inc()
active_connections.dec()

# Histograms
latency.observe(0.023)  # 23ms
```

### Vector Metrics with Labels

```python
# Create vector metrics with label names
requests_by_model = endpoint.metrics.create_intcountervec(
    "requests_by_model",
    "Requests by model",
    ["model_type", "model_size"]
)

memory_by_gpu = endpoint.metrics.create_intgaugevec(
    "gpu_memory_bytes",
    "GPU memory by device",
    ["gpu_id", "memory_type"]
)

# Use with specific label values
requests_by_model.inc({"model_type": "llama", "model_size": "7b"})
memory_by_gpu.set(8192, {"gpu_id": "0", "memory_type": "allocated"})
```

### Advanced Features

**Constant labels:**

```python
counter = endpoint.metrics.create_intcounter(
    "requests_total",
    "Total requests",
    [("region", "us-west"), ("env", "prod")]
)
```

**Metric introspection:**

```python
print(counter.name())                    # "my_namespace_my_component_my_endpoint_requests_total"
print(counter.const_labels())            # {"dynamo_namespace": "my_namespace", ...}
print(memory_by_gpu.variable_labels())   # ["gpu_id", "memory_type"]
```

**Update patterns:**

Background thread updates:

```python
import threading
import time

def update_loop():
    # compute_current_connections() stands in for application-specific logic
    while True:
        active_connections.set(compute_current_connections())
        time.sleep(2)

# daemon=True so the updater does not keep the process alive at shutdown
threading.Thread(target=update_loop, daemon=True).start()
```

Callback-based updates (called before each `/metrics` scrape):

```python
def update_metrics():
    active_connections.set(compute_current_connections())

endpoint.metrics.register_callback(update_metrics)
```

### Examples

Example scripts: [lib/bindings/python/examples/metrics/](https://github.com/ai-dynamo/dynamo/tree/main/lib/bindings/python/examples/metrics/)

```bash
cd ~/dynamo/lib/bindings/python/examples/metrics
DYN_SYSTEM_PORT=8081 ./server_with_loop.py
DYN_SYSTEM_PORT=8081 ./server_with_callback.py
```

---

## Related Documentation

- [Metrics Overview](/dynamo/v-0-9-0/user-guides/observability-local/metrics)
- [Prometheus and Grafana Setup](/dynamo/v-0-9-0/user-guides/observability-local/prometheus-grafana-setup)
- [Distributed Runtime Architecture](/dynamo/v-0-9-0/design-docs/distributed-runtime)
- [Python Metrics Examples](https://github.com/ai-dynamo/dynamo/tree/main/lib/bindings/python/examples/metrics/)