---
title: KV Cache Transfer in Disaggregated Serving
---

In disaggregated serving architectures, the KV cache must be transferred between prefill and decode workers. TensorRT-LLM supports two methods for this transfer:

## Default Method: UCX

By default, TensorRT-LLM uses UCX (Unified Communication X) for KV cache transfer between prefill and decode workers. UCX provides high-performance communication optimized for GPU-to-GPU transfers.

## Beta Method: NIXL

TensorRT-LLM also supports using **NIXL** (NVIDIA Inference Xfer Library) for KV cache transfer. [NIXL](https://github.com/ai-dynamo/nixl) is NVIDIA's high-performance communication library designed for efficient data transfer in distributed GPU environments.

**Note:** NIXL support in TensorRT-LLM is currently in beta and may have some sharp edges.

## Using NIXL for KV Cache Transfer

**Note:** The NIXL version shipped with the current Dynamo release is not supported by tensorrt-llm<=1.2.0rc2. To use the NIXL backend for KV cache transfer, build the container image with tensorrt-llm>=1.2.0rc3.

To enable NIXL for KV cache transfer in disaggregated serving:

1. **Build the container with NIXL support (tensorrt-llm==1.2.0rc3):**

   ```bash
   ./container/build.sh --framework trtllm \
     --tensorrtllm-pip-wheel tensorrt-llm==1.2.0rc3
   ```

2. **Run the containerized environment:**

   See the [run container](/dynamo/v-0-7-0/components/backends/tensor-rt-llm#run-container) section to learn how to start the container image built in the previous step.

   Within the container, unset the `TRTLLM_USE_UCX_KVCACHE` variable so NIXL is used instead of UCX (a pre-flight sketch for verifying this appears after the steps below):

   ```bash
   unset TRTLLM_USE_UCX_KVCACHE
   ```

3. **Start the disaggregated service:**

   See [disaggregated serving](/dynamo/v-0-7-0/components/backends/tensor-rt-llm#disaggregated-serving) for how to start the deployment.

4. **Send a request:**

   See the [client](/dynamo/v-0-7-0/components/backends/tensor-rt-llm#client) section to learn how to send a request to the deployment.

**Important:** Ensure that ETCD and NATS services are running before starting the service.
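
Before starting the deployment, it can help to confirm inside the container that the NIXL prerequisites from steps 1 and 2 are met. The sketch below is a minimal, hypothetical pre-flight check and not part of the official workflow; it assumes the container's Python environment exposes `tensorrt_llm.__version__`.

```bash
# Hypothetical pre-flight check (not part of the official workflow).
# Confirm the installed tensorrt-llm version meets the NIXL requirement
# (>= 1.2.0rc3).
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"

# Confirm the UCX override is not set, so NIXL is used for KV cache transfer.
unset TRTLLM_USE_UCX_KVCACHE
if [ -z "${TRTLLM_USE_UCX_KVCACHE:-}" ]; then
  echo "TRTLLM_USE_UCX_KVCACHE is unset; NIXL will be used"
fi
```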
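
A quick way to verify that ETCD and NATS are running is to probe their default endpoints. The sketch below assumes etcd listens on its default client port (2379) and NATS on its default port (4222), and that `curl` and `nc` are available; adjust hosts and ports to match your deployment.

```bash
# Hypothetical service checks, assuming default ports on localhost.
# etcd exposes a /health endpoint on its client port (2379 by default).
curl -fsS http://localhost:2379/health && echo "etcd is healthy"

# NATS listens on 4222 by default; a TCP probe confirms reachability.
nc -z localhost 4222 && echo "NATS is reachable"
```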