vllm.v1.pool.metadata
PoolingCursor dataclass
Source code in vllm/v1/pool/metadata.py
__getitem__
__getitem__(indices: slice)
__init__
__init__(
    index: list[int],
    first_token_indices_gpu: Tensor,
    last_token_indices_gpu: Tensor,
    prompt_lens_cpu: Tensor,
    seq_lens_cpu: Tensor,
    num_scheduled_tokens_cpu: Tensor,
) -> None
is_finished
PoolingMetadata dataclass
Tensors for pooling.
__getitem__
__getitem__(indices: slice)
__init__
__init__(
    prompt_lens: Tensor,
    prompt_token_ids: Tensor | None,
    pooling_params: list[PoolingParams],
    pooling_states: list[PoolingStates],
    pooling_cursor: PoolingCursor | None = None,
) -> None
__post_init__
build_pooling_cursor
get_prompt_token_ids
build_pooling_cursor
build_pooling_cursor(
    num_scheduled_tokens: list[int],
    seq_lens_cpu: Tensor,
    prompt_lens: Tensor,
    device: device,
)
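The field names first_token_indices_gpu and last_token_indices_gpu suggest that the cursor records where each request's tokens begin and end in the flattened batch. The following is a minimal sketch of that idea, not vLLM's implementation. It assumes the tokens of consecutive requests are packed contiguously, and it uses plain Python lists instead of tensors. The helper name token_spans is hypothetical.

```python
from itertools import accumulate


def token_spans(num_scheduled_tokens: list[int]) -> tuple[list[int], list[int]]:
    """Return (first, last) flat token indices for each request.

    Hypothetical helper: assumes request i's tokens occupy one contiguous
    run in the flattened batch, directly after request i-1's tokens.
    """
    # Running totals with a leading 0: offsets[i] is where request i starts.
    offsets = [0, *accumulate(num_scheduled_tokens)]
    first = offsets[:-1]
    # The last token of request i sits just before the start of request i+1.
    last = [end - 1 for end in offsets[1:]]
    return first, last


first, last = token_spans([3, 1, 4])
print(first)  # [0, 3, 4]
print(last)   # [2, 3, 7]
```

Under this assumption, a request with zero scheduled tokens would yield a last index smaller than its first index, so a caller would need to handle that case separately.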