From 242c93f74494e32569cc836d5075a88ccddc53a7 Mon Sep 17 00:00:00 2001
From: R0CKSTAR
Date: Wed, 25 Mar 2026 19:54:36 +0800
Subject: [PATCH] [Docs] Adds vllm-musa to custom_op.md (#37840)

Signed-off-by: Xiaodong Ye
---
 docs/design/custom_op.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/design/custom_op.md b/docs/design/custom_op.md
index 17a571591..4aefeb558 100644
--- a/docs/design/custom_op.md
+++ b/docs/design/custom_op.md
@@ -266,7 +266,7 @@ Currently, thanks to [vLLM's hardware-plugin mechanism](./plugin_system.md), the
 
 - **Official device plugins:** [vllm-ascend](https://github.com/vllm-project/vllm-ascend) (for Huawei Ascend NPU), [vllm-spyre](https://github.com/vllm-project/vllm-spyre) (for Spyre), [vllm-gaudi](https://github.com/vllm-project/vllm-gaudi) (for Intel Gaudi), [vllm-neuron](https://github.com/vllm-project/vllm-neuron) (for AWS Neuron), [vllm-meta](https://github.com/vllm-project/vllm-metal) (for Apple Silicon), etc.
 
-- **Non-official device plugins:** [vllm-metax](https://github.com/MetaX-MACA/vLLM-metax) (for MetaX GPU), [vllm-kunlun](https://github.com/baidu/vLLM-Kunlun) (for Baidu Kunlun XPU), etc.
+- **Non-official device plugins:** [vllm-metax](https://github.com/MetaX-MACA/vLLM-metax) (for MetaX GPU), [vllm-kunlun](https://github.com/baidu/vLLM-Kunlun) (for Baidu Kunlun XPU), [vllm-musa](https://github.com/MooreThreads/vllm-musa) (for Moore Threads GPU), etc.
 
 In this case, `CustomOp` can enable these hardware manufacturers to seamlessly replace vLLM's operations with their deep-optimized kernels for specific devices at runtime, by just registering an OOT `CustomOp` and implementing the `forward_oot()` method.
 
@@ -289,7 +289,7 @@ Taking `MMEncoderAttention` as an example:
 
     def __init__(...):
         super().__init__(...)
-
+
     def forward_oot(...):
         # Call optimized device-specific kernels.
         ...
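The docs text touched by this patch describes how an out-of-tree (OOT) plugin overrides a registered `CustomOp` by implementing `forward_oot()`. The sketch below is a minimal, self-contained illustration of that registry-and-dispatch pattern, not vLLM's actual `CustomOp` implementation; the `PluginMMEncoderAttention` class, the doubling behavior, and the registry shape are all hypothetical stand-ins.

```python
class CustomOp:
    """Minimal sketch of a registered-op base class (not vLLM's real code)."""

    # Maps op name -> op class; a plugin re-registers under the same
    # name to replace the in-tree implementation at runtime.
    registry = {}

    @classmethod
    def register(cls, name):
        def decorator(op_cls):
            cls.registry[name] = op_cls
            return op_cls
        return decorator

    def forward(self, x):
        # Dispatch: prefer the plugin's forward_oot() if the concrete
        # class overrides it, otherwise fall back to forward_native().
        if type(self).forward_oot is not CustomOp.forward_oot:
            return self.forward_oot(x)
        return self.forward_native(x)

    def forward_native(self, x):
        raise NotImplementedError

    def forward_oot(self, x):
        raise NotImplementedError


@CustomOp.register("mm_encoder_attention")
class MMEncoderAttention(CustomOp):
    def forward_native(self, x):
        # Portable reference implementation (toy: double each element).
        return [v * 2 for v in x]


# A hardware plugin overrides the op by re-registering the same name
# and implementing forward_oot() (hypothetical plugin class).
@CustomOp.register("mm_encoder_attention")
class PluginMMEncoderAttention(MMEncoderAttention):
    def forward_oot(self, x):
        # A real plugin would call a device-optimized kernel here.
        return [v * 2 for v in x]


op = CustomOp.registry["mm_encoder_attention"]()
print(op.forward([1, 2, 3]))  # -> [2, 4, 6], via forward_oot()
```

Because the plugin subclasses the in-tree op, code that looks the op up by name transparently gets the device-optimized path, while `forward_native()` remains available as a fallback.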