vllm.compilation.matcher_utils
QUANT_OPS (module-attribute)
 QUANT_OPS: dict[QuantKey, OpOverload] = {
    kFp8StaticTensorSym: default,
    kFp8DynamicTensorSym: default,
    kFp8DynamicTokenSym: default,
}
MatcherCustomOp
  Bases: ABC
Source code in vllm/compilation/matcher_utils.py
    __call__
    __init__
    __init__(enabled: bool)
    empty
    empty_f32
    forward_custom (abstractmethod)
    forward_native (abstractmethod)
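The member list above (an `enabled` flag, two abstract `forward_*` methods, and `__call__`) resembles a common pattern in which a base class chooses between two interchangeable implementations. Below is a minimal sketch of that pattern, assuming `__call__` dispatches on `enabled`; the actual dispatch logic is not shown on this page. The names `SketchMatcherOp` and `SketchScale` are hypothetical, and the example uses plain Python lists rather than tensors.

```python
from abc import ABC, abstractmethod


class SketchMatcherOp(ABC):
    """Hypothetical stand-in for a matcher op; not vLLM's implementation."""

    def __init__(self, enabled: bool):
        # Assumption: `enabled` selects the custom path over the native one.
        self.enabled = enabled

    @abstractmethod
    def forward_custom(self, *args):
        """Path that would call a fused or custom kernel."""

    @abstractmethod
    def forward_native(self, *args):
        """Path written with elementary, framework-level operations."""

    def __call__(self, *args):
        # Assumed dispatch: pick one implementation based on the flag.
        if self.enabled:
            return self.forward_custom(*args)
        return self.forward_native(*args)


class SketchScale(SketchMatcherOp):
    """Hypothetical subclass: multiplies every element by a factor."""

    def forward_custom(self, values, factor):
        # Stands in for a single fused call.
        return [v * factor for v in values]

    def forward_native(self, values, factor):
        # Same result, built from an explicit loop of elementary steps.
        out = []
        for v in values:
            out.append(v * factor)
        return out
```

In this sketch, `SketchScale(enabled=True)` and `SketchScale(enabled=False)` return the same values for the same inputs, and instantiating `SketchMatcherOp` directly raises `TypeError` because its abstract methods are not implemented.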
MatcherFusedAddRMSNorm
Bases: MatcherCustomOp
MatcherQuantFP8
Bases: MatcherCustomOp
    __init__
    forward_custom
    forward_native
    inputs
    make_scale
    make_scale(input: Tensor)
MatcherRMSNorm
Bases: MatcherCustomOp
MatcherSiluAndMul
Bases: MatcherCustomOp