vllm.beam_search ¶
   BeamSearchInstance ¶
   beams  instance-attribute  ¶
 beams: list[BeamSearchSequence] = [
    BeamSearchSequence(
        tokens=prompt_tokens,
        logprobs=[] if logprobs is None else list(logprobs),
        lora_request=lora_request,
        **kwargs,
    )
]
  __init__ ¶
 __init__(
    prompt_tokens: list[int],
    lora_request: LoRARequest | None = None,
    logprobs: list[dict[int, Logprob]] | None = None,
    **kwargs,
)
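A minimal usage sketch (the prompt token IDs are placeholders): a new instance starts with a single beam that wraps the prompt tokens.

from vllm.beam_search import BeamSearchInstance

# Hypothetical prompt token IDs; in practice these come from the tokenizer.
prompt_tokens = [1, 3087, 2159]

instance = BeamSearchInstance(prompt_tokens)

print(len(instance.beams))       # 1
print(instance.beams[0].tokens)  # [1, 3087, 2159]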
   BeamSearchOutput  dataclass  ¶
 The output of beam search. It contains the list of the best beam search sequences. The length of the list is equal to the beam width.
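A brief consumption sketch, assuming the dataclass exposes the ranked beams as a sequences field holding one BeamSearchSequence per beam (documented below); the tokens and scores are made up.

from vllm.beam_search import BeamSearchOutput, BeamSearchSequence

output = BeamSearchOutput(
    sequences=[
        BeamSearchSequence(tokens=[1, 5, 9], logprobs=[], cum_logprob=-1.2),
        BeamSearchSequence(tokens=[1, 5, 7], logprobs=[], cum_logprob=-1.9),
    ]
)

for rank, seq in enumerate(output.sequences):
    print(rank, seq.cum_logprob, seq.tokens)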
    BeamSearchSequence  dataclass  ¶
A sequence for beam search. It keeps track of the tokens and the cumulative log probability of the sequence. The text field is optional and is only filled when the sequence is about to be returned to the user.
   mm_processor_kwargs  class-attribute instance-attribute  ¶
 mm_processor_kwargs: dict[str, Any] | None = None
    multi_modal_data  class-attribute instance-attribute  ¶
 multi_modal_data: MultiModalDataDict | None = None
  __init__ ¶
 __init__(
    tokens: list[int],
    logprobs: list[dict[int, Logprob]],
    lora_request: LoRARequest | None = None,
    cum_logprob: float = 0.0,
    text: str | None = None,
    finish_reason: str | None = None,
    stop_reason: int | str | None = None,
    multi_modal_data: MultiModalDataDict | None = None,
    mm_processor_kwargs: dict[str, Any] | None = None,
) -> None
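To illustrate how the fields fit together, here is a hedged sketch of extending one beam with a newly sampled token. The token id and log probability are placeholders, and the copy-and-append pattern is only illustrative, not the engine's actual expansion loop.

from vllm.beam_search import BeamSearchSequence

# An existing beam: two token ids, cumulative log probability -0.75 (made up).
beam = BeamSearchSequence(tokens=[1, 42], logprobs=[], cum_logprob=-0.75)

# Hypothetical candidate continuation: token 99 with log probability -0.25.
new_token, new_logprob = 99, -0.25

# A longer candidate sequence; per-token Logprob dicts are omitted for brevity.
extended = BeamSearchSequence(
    tokens=beam.tokens + [new_token],
    logprobs=list(beam.logprobs),
    cum_logprob=beam.cum_logprob + new_logprob,
)

print(extended.tokens, extended.cum_logprob)  # [1, 42, 99] -1.0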
  create_sort_beams_key_function ¶
Create a key function that ranks beams by their length-penalized beam search score (see get_beam_search_score below).
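A hedged usage sketch, assuming the factory takes the EOS token id and the length penalty and returns a key callable suitable for sorted(); the beams below are placeholders.

from vllm.beam_search import (
    BeamSearchSequence,
    create_sort_beams_key_function,
)

# Hypothetical EOS token id 2 and the default length penalty.
sort_key = create_sort_beams_key_function(eos_token_id=2, length_penalty=1.0)

beams = [
    BeamSearchSequence(tokens=[1, 5, 2], logprobs=[], cum_logprob=-2.0),
    BeamSearchSequence(tokens=[1, 7, 9, 2], logprobs=[], cum_logprob=-2.4),
]

# Highest length-penalized score first.
best_first = sorted(beams, key=sort_key, reverse=True)
print([b.tokens for b in best_first])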
     get_beam_search_score ¶
 get_beam_search_score(
    tokens: list[int],
    cumulative_logprob: float,
    eos_token_id: int,
    length_penalty: float = 1.0,
) -> float
Calculate the beam search score with length penalty.
Adapted from
https://github.com/huggingface/transformers/blob/ccb92be23def445f2afdea94c31286f84b89eb5b/src/transformers/generation/beam_search.py#L938
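As a hedged illustration of the scoring rule (following the Hugging Face code linked above, not a verbatim copy of the vLLM source): the cumulative log probability is divided by the sequence length, with a trailing EOS token excluded, raised to length_penalty. The helper name below is made up.

def beam_score_sketch(
    tokens: list[int],
    cumulative_logprob: float,
    eos_token_id: int,
    length_penalty: float = 1.0,
) -> float:
    # A trailing EOS token does not count toward the penalized length.
    seq_len = len(tokens)
    if tokens and tokens[-1] == eos_token_id:
        seq_len -= 1
    return cumulative_logprob / (seq_len ** length_penalty)

# With negative log probabilities, length_penalty > 1.0 favors longer
# sequences and length_penalty < 1.0 favors shorter ones.
print(beam_score_sketch([1, 5, 9, 2], -3.0, eos_token_id=2))                      # -1.0
print(beam_score_sketch([1, 5, 9, 2], -3.0, eos_token_id=2, length_penalty=2.0))  # ~ -0.33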