vllm.model_executor.model_loader.utils ¶
 Utilities for selecting and loading models.
  _MODEL_ARCH_BY_HASH  module-attribute  ¶
  Caches the outputs of _get_model_architecture.
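A minimal sketch of this kind of hash-keyed cache is shown below. It does not reproduce vLLM's code: the names `_ARCH_CACHE`, `cached_architecture`, and `DummyModel`, and the choice of fields that feed the hash, are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical stand-in for a module-level cache:
# hash of some config fields -> (model class, architecture name).
_ARCH_CACHE: dict[int, tuple[type, str]] = {}


def cached_architecture(
    key_fields: tuple,
    resolve: Callable[[], tuple[type, str]],
) -> tuple[type, str]:
    """Return the cached result for key_fields, resolving it on first use."""
    key = hash(key_fields)
    if key not in _ARCH_CACHE:
        _ARCH_CACHE[key] = resolve()
    return _ARCH_CACHE[key]


class DummyModel:
    """Placeholder for a real model class."""


calls: list[int] = []


def slow_resolve() -> tuple[type, str]:
    # Records each invocation so the caching effect is observable.
    calls.append(1)
    return DummyModel, "DummyModelForCausalLM"


first = cached_architecture(("some/model", "main"), slow_resolve)
second = cached_architecture(("some/model", "main"), slow_resolve)
```

In this sketch the second lookup reuses the stored tuple, so the resolver runs only once for a given key.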
  ParamMapping  dataclass  ¶
 A class to handle parameter mapping for model weight loading. It creates a bidirectional mapping between packed parameters and their constituent parts.
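The bidirectional mapping can be illustrated with a small standalone sketch. It assumes that the inverse mapping stores, for each constituent name, the packed name and the constituent's index within it, which is what the `dict[str, tuple[str, int]]` annotation suggests. The class name `ParamMappingSketch` and the example parameter names are hypothetical, and the real class may handle edge cases differently.

```python
from dataclasses import dataclass, field


@dataclass
class ParamMappingSketch:
    # Packed parameter name -> ordered list of its constituent names.
    packed_mapping: dict[str, list[str]]
    # Constituent name -> (packed name, index within the packed parameter).
    inverse_packed_mapping: dict[str, tuple[str, int]] = field(
        default_factory=dict
    )

    def __post_init__(self) -> None:
        # Derive the inverse direction from the forward mapping.
        for packed_name, sub_names in self.packed_mapping.items():
            for index, sub_name in enumerate(sub_names):
                self.inverse_packed_mapping[sub_name] = (packed_name, index)


mapping = ParamMappingSketch(
    packed_mapping={
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
)
packed_name, shard_index = mapping.inverse_packed_mapping["k_proj"]
```

The sketch uses `field(default_factory=dict)` so that each instance gets its own inverse dictionary rather than sharing one mutable default.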
Source code in vllm/model_executor/model_loader/utils.py
   inverse_packed_mapping  class-attribute instance-attribute  ¶
    __init__ ¶
 __init__(
    packed_mapping: dict[str, list[str]],
    inverse_packed_mapping: dict[
        str, tuple[str, int]
    ] = dict(),
) -> None
  __post_init__ ¶
   get_sub_modules ¶
     _get_model_architecture ¶
 _get_model_architecture(
    model_config: ModelConfig,
) -> tuple[type[Module], str]
   configure_quant_config ¶
 configure_quant_config(
    quant_config: QuantizationConfig,
    model_class: type[Module],
)
Pass packed_modules_mapping by reference to quant_config so that quant_config can properly match fused modules.
Note that model attributes are passed by reference to quant_config, which allows them to be updated by model_class.__new__ (e.g. chatglm, qwen).
Once the SupportsQuant mixin has been added to all models, this function can be removed.
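The effect of passing the mapping by reference can be sketched in plain Python. `QuantConfigSketch`, `ModelSketch`, and `share_packed_mapping` are hypothetical stand-ins and not the vLLM classes or the actual function body. The point is only that both sides hold the same dictionary object, so later updates on the model side are visible to the quantization config.

```python
class QuantConfigSketch:
    """Hypothetical stand-in for a quantization config."""

    def __init__(self) -> None:
        self.packed_modules_mapping: dict[str, list[str]] = {}


class ModelSketch:
    """Hypothetical stand-in for a model class with fused modules."""

    packed_modules_mapping: dict[str, list[str]] = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    }


def share_packed_mapping(quant_config: QuantConfigSketch, model_class: type) -> None:
    # Assign the same dict object (no copy), so both sides stay in sync.
    mapping = getattr(model_class, "packed_modules_mapping", None)
    if mapping is not None:
        quant_config.packed_modules_mapping = mapping


quant_config = QuantConfigSketch()
share_packed_mapping(quant_config, ModelSketch)

# A later update on the model side is visible through quant_config as well.
ModelSketch.packed_modules_mapping["gate_up_proj"] = ["gate_proj", "up_proj"]
```

Copying the dictionary instead would freeze the mapping at the time of the call, and updates made afterwards by the model class would be missed.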
   device_loading_context ¶
   get_architecture_class_name ¶
 get_architecture_class_name(
    model_config: ModelConfig,
) -> str
  get_model_architecture ¶
 get_model_architecture(
    model_config: ModelConfig,
) -> tuple[type[Module], str]
   get_model_cls ¶
 get_model_cls(model_config: ModelConfig) -> type[Module]
  initialize_model ¶
 initialize_model(
    vllm_config: VllmConfig,
    *,
    prefix: str = "",
    model_class: type[Module] | None = None,
    model_config: ModelConfig | None = None,
) -> Module
Initialize a model with the given configurations.
   process_weights_after_loading ¶
 process_weights_after_loading(
    model: Module,
    model_config: ModelConfig,
    target_device: device,
) -> None