[Docs] Adds vllm-musa to custom_op.md (#37840)

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
This commit is contained in:
R0CKSTAR
2026-03-25 19:54:36 +08:00
committed by GitHub
parent a889b7f584
commit 242c93f744


@@ -266,7 +266,7 @@ Currently, thanks to [vLLM's hardware-plugin mechanism](./plugin_system.md), the
- **Official device plugins:** [vllm-ascend](https://github.com/vllm-project/vllm-ascend) (for Huawei Ascend NPU), [vllm-spyre](https://github.com/vllm-project/vllm-spyre) (for Spyre), [vllm-gaudi](https://github.com/vllm-project/vllm-gaudi) (for Intel Gaudi), [vllm-neuron](https://github.com/vllm-project/vllm-neuron) (for AWS Neuron), [vllm-metal](https://github.com/vllm-project/vllm-metal) (for Apple Silicon), etc.
- - **Non-official device plugins:** [vllm-metax](https://github.com/MetaX-MACA/vLLM-metax) (for MetaX GPU), [vllm-kunlun](https://github.com/baidu/vLLM-Kunlun) (for Baidu Kunlun XPU), etc.
+ - **Non-official device plugins:** [vllm-metax](https://github.com/MetaX-MACA/vLLM-metax) (for MetaX GPU), [vllm-kunlun](https://github.com/baidu/vLLM-Kunlun) (for Baidu Kunlun XPU), [vllm-musa](https://github.com/MooreThreads/vllm-musa) (for Moore Threads GPU), etc.
In this case, `CustomOp` enables these hardware vendors to seamlessly replace vLLM's operations with their deeply optimized, device-specific kernels at runtime, simply by registering an out-of-tree (OOT) `CustomOp` and implementing its `forward_oot()` method.
@@ -289,7 +289,7 @@ Taking `MMEncoderAttention` as an example:
def __init__(...):
    super().__init__(...)

def forward_oot(...):
    # Call optimized device-specific kernels.
    ...
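The OOT dispatch described above can be sketched in plain Python. This is a hypothetical, self-contained illustration of the pattern, not vLLM's actual implementation; `register_oot`, `forward_native`, `_oot_registry`, and `MusaMMEncoderAttention` are made-up stand-ins for the real plugin machinery.

```python
class CustomOp:
    """Sketch of a base op whose forward() dispatches to a platform impl."""

    # Hypothetical registry mapping op names to out-of-tree overrides.
    _oot_registry: dict[str, type] = {}

    @classmethod
    def register_oot(cls, name: str):
        # Decorator a hardware plugin would use to override a native op.
        def wrap(op_cls):
            cls._oot_registry[name] = op_cls
            return op_cls
        return wrap

    def forward(self, *args, **kwargs):
        # Prefer the plugin's forward_oot() when an override is registered.
        override = self._oot_registry.get(type(self).__name__)
        if override is not None:
            return override().forward_oot(*args, **kwargs)
        return self.forward_native(*args, **kwargs)

    def forward_native(self, *args, **kwargs):
        return "native kernel"


class MMEncoderAttention(CustomOp):
    """The op from the doc's example; uses the native kernel by default."""


@CustomOp.register_oot("MMEncoderAttention")
class MusaMMEncoderAttention(MMEncoderAttention):
    """Hypothetical plugin override for a Moore Threads GPU."""

    def forward_oot(self, *args, **kwargs):
        # Call optimized device-specific kernels here.
        return "musa kernel"


out = MMEncoderAttention().forward()  # -> "musa kernel"
```

Once the plugin registers its override, callers keep invoking `forward()` unchanged; the base class routes to `forward_oot()` at runtime, which is the seamless replacement the paragraph above describes.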