vllm.entrypoints.pooling.embed.protocol ¶
EmbeddingRequest module-attribute ¶
EmbeddingRequest: TypeAlias = (
    EmbeddingCompletionRequest | EmbeddingChatRequest
)
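EmbeddingRequest is the union accepted by the embeddings endpoint; its two members differ in how input is supplied (completion-style prompts vs. chat messages, per the mixins below). A minimal sketch of how such a union discriminates on payload shape, using stand-in pydantic models rather than the real classes (the field names here are assumptions based on the OpenAI-compatible API, not taken from this module):

from pydantic import BaseModel, TypeAdapter

class CompletionShape(BaseModel):  # stand-in for EmbeddingCompletionRequest
    model: str
    input: str | list[str]

class ChatShape(BaseModel):  # stand-in for EmbeddingChatRequest
    model: str
    messages: list[dict]

# pydantic matches the payload against the union members, so the shape of
# the body selects the request type, mirroring how the alias is consumed.
adapter = TypeAdapter(CompletionShape | ChatShape)
req = adapter.validate_python({"model": "m", "input": "hello world"})
assert isinstance(req, CompletionShape)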
EmbeddingBytesResponse ¶
Bases: OpenAIBaseModel
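The fields of EmbeddingBytesResponse are not shown in this extract. As a hedged sketch only: if a bytes payload carries a packed little-endian float32 vector (the layout used by base64 embedding encodings in the OpenAI-compatible API), it can be decoded as below; the helper name and the byte layout are assumptions, not guarantees made by this class.

import numpy as np

def decode_embedding_bytes(payload: bytes) -> np.ndarray:
    # Assumption: packed little-endian float32 values, one per dimension.
    return np.frombuffer(payload, dtype="<f4")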
EmbeddingChatRequest ¶
Bases: PoolingBasicRequestMixin, ChatRequestMixin, EmbedRequestMixin
mm_processor_kwargs class-attribute instance-attribute ¶
mm_processor_kwargs: dict[str, Any] | None = Field(
    default=None,
    description="Additional kwargs to pass to the HF processor.",
)
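Because mm_processor_kwargs is forwarded to the model's Hugging Face processor, the set of valid keys depends on that processor. A hedged example payload; the model id and the num_crops kwarg are illustrative only:

payload = {
    "model": "example/multimodal-embedder",  # hypothetical model id
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat.png"}},
            {"type": "text", "text": "Represent this image."},
        ],
    }],
    "mm_processor_kwargs": {"num_crops": 4},  # processor-specific kwarg
}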
build_tok_params ¶
build_tok_params(
    model_config: ModelConfig,
) -> TokenizeParams
EmbeddingCompletionRequest ¶
Bases: PoolingBasicRequestMixin, CompletionRequestMixin, EmbedRequestMixin
build_tok_params ¶
build_tok_params(
    model_config: ModelConfig,
) -> TokenizeParams
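Both request classes expose the same build_tok_params hook, so a handler can derive tokenization settings uniformly from either member of the union. A usage sketch, assuming a ModelConfig obtained from an initialized engine; the attributes of the returned TokenizeParams are not documented in this extract:

def prepare(request: EmbeddingRequest, model_config: ModelConfig):
    # Works for both union members, since each defines build_tok_params.
    tok_params: TokenizeParams = request.build_tok_params(model_config)
    return tok_params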
EmbeddingResponse ¶
Bases: OpenAIBaseModel
created class-attribute instance-attribute ¶
id class-attribute instance-attribute ¶
id: str = Field(
    default_factory=lambda: f"embd-{random_uuid()}"
)
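The default factory stamps each response with a fresh id of the form embd-<hex>. The same pattern using only the standard library, with random_uuid as a stand-in for vLLM's helper of that name:

import uuid

def random_uuid() -> str:  # stand-in for vLLM's random_uuid helper
    return uuid.uuid4().hex

response_id = f"embd-{random_uuid()}"  # e.g. "embd-3f9c0a..."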
EmbeddingResponseData ¶
Bases: OpenAIBaseModel
_get_max_total_output_tokens ¶
_get_max_total_output_tokens(
    model_config: ModelConfig,
) -> tuple[int | None, int]
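Only the annotation of this private helper is shown here, so the meaning of the pair is an assumption. A consumption sketch that treats the first element as an optional cap (None meaning no limit) and the second as a required integer:

max_total, required = _get_max_total_output_tokens(model_config)
# Assumption from the annotation alone: the first element may be None,
# so guard before comparing against it.
if max_total is not None and required > max_total:
    raise ValueError("requested output tokens exceed the configured cap")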