vllm.v1.attention.backends.short_conv_attn

ShortConvAttentionBackend

Bases: AttentionBackend
Source code in vllm/v1/attention/backends/short_conv_attn.py
ShortConvAttentionMetadata (dataclass)
token_chunk_offset_ptr (class attribute / instance attribute)
 token_chunk_offset_ptr: Tensor | None = None
__init__
 __init__(
    num_prefills: int,
    num_prefill_tokens: int,
    num_decodes: int,
    num_decode_tokens: int,
    query_start_loc: Tensor,
    state_indices_tensor: Tensor,
    has_initial_states_p: Tensor | None,
    nums_dict: dict | None = None,
    batch_ptr: Tensor | None = None,
    token_chunk_offset_ptr: Tensor | None = None,
) -> None
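The constructor above is mostly batch bookkeeping: per-request token counts split between prefills and decodes, cumulative token offsets, and optional pointers used by the convolution kernels. A minimal sketch of how such a metadata object is populated is shown below; it uses a torch-free stand-in class (plain lists instead of `torch.Tensor`) so it runs standalone, and the example batch values are illustrative assumptions, not taken from vLLM.

```python
from dataclasses import dataclass

# Stand-in for torch.Tensor so this sketch runs without torch;
# in vLLM these fields are real torch.Tensor objects on the GPU.
Tensor = list


@dataclass
class ShortConvAttentionMetadataSketch:
    # Mirror of the documented __init__ signature (hypothetical stand-in).
    num_prefills: int
    num_prefill_tokens: int
    num_decodes: int
    num_decode_tokens: int
    query_start_loc: Tensor
    state_indices_tensor: Tensor
    has_initial_states_p: "Tensor | None"
    nums_dict: "dict | None" = None
    batch_ptr: "Tensor | None" = None
    token_chunk_offset_ptr: "Tensor | None" = None


# Illustrative batch: one prefill of 5 tokens plus two single-token decodes.
# query_start_loc holds cumulative token offsets per request, so request i
# spans tokens query_start_loc[i]:query_start_loc[i + 1].
meta = ShortConvAttentionMetadataSketch(
    num_prefills=1,
    num_prefill_tokens=5,
    num_decodes=2,
    num_decode_tokens=2,
    query_start_loc=[0, 5, 6, 7],
    state_indices_tensor=[0, 1, 2],  # slot in the conv-state cache per request
    has_initial_states_p=[False],    # per-prefill: resuming from cached state?
)
```

Note how the invariants hold by construction: the last entry of `query_start_loc` equals the total token count, and the number of offsets is one more than the number of requests.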
ShortConvAttentionMetadataBuilder

Bases: BaseMambaAttentionMetadataBuilder[ShortConvAttentionMetadata]
build
 build(
    common_prefix_len: int,
    common_attn_metadata: CommonAttentionMetadata,
    fast_build: bool = False,
) -> ShortConvAttentionMetadata
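The core bookkeeping a builder like this performs is classifying each request in the batch as a decode (exactly one new token) or a prefill (more than one), then counting tokens on each side. The helper below is a hypothetical, list-based sketch of that split, not vLLM's actual implementation, which operates on torch tensors and typically groups decode requests before prefills in the reordered batch.

```python
def split_decodes_and_prefills(
    query_start_loc: list[int],
) -> tuple[int, int, int, int]:
    """Classify requests as decodes (1 new token) or prefills (>1 token).

    query_start_loc holds cumulative token offsets, so request i contributes
    query_start_loc[i + 1] - query_start_loc[i] tokens to the batch.
    Returns (num_prefills, num_prefill_tokens, num_decodes, num_decode_tokens).
    """
    num_prefills = num_prefill_tokens = 0
    num_decodes = num_decode_tokens = 0
    for i in range(len(query_start_loc) - 1):
        n_tokens = query_start_loc[i + 1] - query_start_loc[i]
        if n_tokens == 1:
            num_decodes += 1
            num_decode_tokens += 1
        else:
            num_prefills += 1
            num_prefill_tokens += n_tokens
    return num_prefills, num_prefill_tokens, num_decodes, num_decode_tokens


# Two single-token decodes followed by one 5-token prefill:
print(split_decodes_and_prefills([0, 1, 2, 7]))  # (1, 5, 2, 2)
```

The four counts returned here correspond directly to the first four fields of `ShortConvAttentionMetadata` documented above.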