vllm.v1.sample.logits_processor.builtin
LogitBiasLogitsProcessor
Bases: LogitsProcessor
Source code in vllm/v1/sample/logits_processor/builtin.py
logits_slice  instance-attribute
__init__
_device_tensor
apply
is_argmax_invariant
is_argmax_invariant() -> bool
Logit bias can rebalance token probabilities and change the outcome of argmax in greedy sampling.
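The effect described above can be illustrated with a minimal, framework-free sketch. It is not the vLLM implementation: `apply_logit_bias` and `argmax` are hypothetical helpers written for this example, and plain Python lists stand in for the logits tensor.

```python
def apply_logit_bias(logits: list[float], bias: dict[int, float]) -> list[float]:
    """Return a copy of `logits` with bias[token_id] added to each listed token.

    Hypothetical helper for illustration; not part of vLLM.
    """
    biased = list(logits)
    for token_id, delta in bias.items():
        biased[token_id] += delta
    return biased


def argmax(values: list[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


logits = [2.0, 1.5, 0.1]

# Without bias, greedy sampling picks token 0.
greedy_before = argmax(logits)

# A +1.0 bias on token 1 lifts it from 1.5 to 2.5, above token 0.
biased = apply_logit_bias(logits, {1: 1.0})
greedy_after = argmax(biased)

print(greedy_before, greedy_after)
```

Because a bias of this kind can move a different token to the top, a processor that applies it cannot be skipped under greedy sampling.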
update_state
update_state(batch_update: BatchUpdate | None)
MinPLogitsProcessor
Bases: LogitsProcessor
min_p_cpu_tensor  instance-attribute
min_p_cpu_tensor = zeros(
    (max_num_reqs,),
    dtype=float32,
    device="cpu",
    pin_memory=is_pin_memory,
)
min_p_device  instance-attribute
__init__
__init__(
    vllm_config: VllmConfig,
    device: device,
    is_pin_memory: bool,
)
apply
get_min_p_by_index
update_state
update_state(batch_update: BatchUpdate | None)
MinTokensLogitsProcessor
Bases: LogitsProcessor
logits_slice  instance-attribute
__init__
__init__(
    vllm_config: VllmConfig,
    device: device,
    is_pin_memory: bool,
)
_device_tensor
add_request  staticmethod
add_request(
    params: SamplingParams,
    _: list[int] | None,
    output_tok_ids: list[int],
) -> tuple[int, Sequence[int], set[int]] | None
apply
is_argmax_invariant
is_argmax_invariant() -> bool
By censoring stop tokens, min-tokens can change the outcome of the argmax operation in greedy sampling.
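A small sketch of why this holds, assuming the processor suppresses stop tokens by setting their logits to negative infinity until the request has produced its minimum number of output tokens. `mask_stop_tokens` and `argmax` are hypothetical helpers for illustration, not vLLM functions, and plain lists stand in for tensors.

```python
import math


def mask_stop_tokens(
    logits: list[float],
    stop_token_ids: set[int],
    num_output_tokens: int,
    min_tokens: int,
) -> list[float]:
    """Return a copy of `logits` with stop tokens set to -inf while the
    request is still below `min_tokens` output tokens.

    Hypothetical helper for illustration; not part of vLLM.
    """
    masked = list(logits)
    if num_output_tokens < min_tokens:
        for token_id in stop_token_ids:
            masked[token_id] = -math.inf
    return masked


def argmax(values: list[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


logits = [0.2, 3.0, 1.0]   # token 1 is the stop token and has the top logit
stop_token_ids = {1}

# Below the minimum: the stop token is censored, so greedy picks token 2.
early = argmax(mask_stop_tokens(logits, stop_token_ids, num_output_tokens=0, min_tokens=2))

# Minimum reached: logits pass through unchanged, so greedy picks token 1.
late = argmax(mask_stop_tokens(logits, stop_token_ids, num_output_tokens=2, min_tokens=2))

print(early, late)
```

The greedy choice differs only while the mask is active, which is the situation the docstring describes.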
update_state
update_state(batch_update: BatchUpdate | None)
process_dict_updates
 process_dict_updates(
    req_entries: dict[int, T],
    batch_update: BatchUpdate | None,
    new_state: Callable[
        [SamplingParams, list[int] | None, list[int]],
        T | None,
    ],
) -> bool
Utility function to update dict state for sparse LogitsProcessors.
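The general idea of keeping sparse per-request state in a dict keyed by batch index can be sketched as follows. This is a simplified illustration under stated assumptions, not the vLLM function: `ToyBatchUpdate` is a hypothetical stand-in with only `added` and `removed` fields, the `new_state` callback takes a plain dict instead of `SamplingParams` and token lists, and any other bookkeeping the real utility performs is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


@dataclass
class ToyBatchUpdate:
    """Hypothetical stand-in for a batch update; not vLLM's BatchUpdate."""
    added: list[tuple[int, dict]] = field(default_factory=list)  # (batch index, params)
    removed: list[int] = field(default_factory=list)             # batch indices


def toy_process_dict_updates(
    req_entries: dict[int, T],
    batch_update: Optional[ToyBatchUpdate],
    new_state: Callable[[dict], Optional[T]],
) -> bool:
    """Apply a batch update to `req_entries`; return True if anything changed."""
    if batch_update is None:
        return False

    updated = False
    for index, params in batch_update.added:
        state = new_state(params)
        if state is not None:
            # The request uses this processor: store its state.
            req_entries[index] = state
            updated = True
        elif req_entries.pop(index, None) is not None:
            # The slot was reused by a request that does not need state.
            updated = True

    for index in batch_update.removed:
        if req_entries.pop(index, None) is not None:
            updated = True

    return updated


# Only request 0 sets "min_p", so only index 0 gets an entry.
entries: dict[int, float] = {}
first = toy_process_dict_updates(
    entries,
    ToyBatchUpdate(added=[(0, {"min_p": 0.1}), (1, {})]),
    lambda params: params.get("min_p"),
)
snapshot = dict(entries)

# Removing request 0 empties the dict again.
second = toy_process_dict_updates(
    entries, ToyBatchUpdate(removed=[0]), lambda params: params.get("min_p")
)

# No batch update means no change.
third = toy_process_dict_updates(entries, None, lambda params: params.get("min_p"))
```

The boolean return value lets a caller rebuild any derived data only when the dict actually changed.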