vllm.transformers_utils.configs.funaudiochat
FunAudioChatAudioEncoderConfig
Bases: PretrainedConfig
Source code in vllm/transformers_utils/configs/funaudiochat.py
enable_audio_invert_tower instance-attribute
__init__
__init__(
_attn_implementation: str | None = None,
num_mel_bins: int = 128,
encoder_layers: int = 32,
encoder_attention_heads: int = 20,
encoder_ffn_dim: int = 5120,
d_model: int = 1280,
dropout: float = 0.0,
attention_dropout: float = 0.0,
activation_function: str = "gelu",
activation_dropout: float = 0.0,
scale_embedding: bool = False,
initializer_range: float = 0.02,
max_source_positions: int = 1500,
n_window: int = 100,
output_dim: int = 3584,
bos_token_id: int | None = None,
codebook_size: int | None = None,
continuous_features_mode: str = "replace",
crq_transformer_config: dict | None = None,
eos_token_id: int | None = None,
group_size: int = 5,
enable_audio_invert_tower: bool = True,
pad_token_id: int | None = None,
**kwargs,
) -> None
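The defaults in the signature above imply a few derived quantities, such as the per-head attention width. The following is a minimal sketch that uses a plain dict of the documented defaults rather than the real class, so it runs without vLLM installed. The names `encoder_defaults` and `head_dim` are introduced here for illustration only, and the even split of `d_model` across heads is an assumption based on the usual multi-head attention convention, not something this page states.

```python
# Documented defaults copied from the __init__ signature above.
# This dict is an illustrative stand-in, not the real config class.
encoder_defaults = {
    "num_mel_bins": 128,
    "encoder_layers": 32,
    "encoder_attention_heads": 20,
    "encoder_ffn_dim": 5120,
    "d_model": 1280,
    "max_source_positions": 1500,
    "n_window": 100,
    "output_dim": 3584,
    "group_size": 5,
}


def head_dim(cfg: dict) -> int:
    """Per-head width, assuming d_model is split evenly across heads."""
    d_model = cfg["d_model"]
    heads = cfg["encoder_attention_heads"]
    if d_model % heads != 0:
        raise ValueError("d_model must be divisible by encoder_attention_heads")
    return d_model // heads


# Ratio of feed-forward width to model width under the defaults.
ffn_ratio = encoder_defaults["encoder_ffn_dim"] // encoder_defaults["d_model"]

print(head_dim(encoder_defaults))  # 64
print(ffn_ratio)                   # 4
```

With vLLM installed, any keyword shown in the signature can be overridden when constructing `FunAudioChatAudioEncoderConfig`, for example `FunAudioChatAudioEncoderConfig(encoder_layers=16)`; the remaining parameters keep their listed defaults.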
FunAudioChatConfig
Bases: PretrainedConfig
attribute_map class-attribute instance-attribute
hidden_size instance-attribute
__init__
__init__(
audio_config: PretrainedConfig | dict | None = None,
text_config: PretrainedConfig | dict | None = None,
audio_token_index: int = 151646,
ignore_index: int = -100,
hidden_size: int | None = None,
**kwargs,
) -> None
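The signature above accepts `audio_config` and `text_config` as a config object, a dict, or `None`. This page does not show how those three forms are handled internally. The sketch below illustrates one common pattern for such arguments in composite configs, using a hypothetical stand-in class (`SubConfig`) and a hypothetical helper (`normalize_sub_config`); neither is part of vLLM, and the real class may behave differently.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Union


@dataclass
class SubConfig:
    """Hypothetical stand-in for a PretrainedConfig sub-config."""

    values: dict = field(default_factory=dict)


def normalize_sub_config(value: Union[SubConfig, dict, None]) -> SubConfig:
    """Accept an object, a dict, or None and always return a SubConfig.

    Assumed behavior for illustration only:
    - None    -> a SubConfig holding no overrides (defaults apply)
    - dict    -> a SubConfig built from a copy of the dict
    - object  -> returned unchanged
    """
    if value is None:
        return SubConfig()
    if isinstance(value, dict):
        # Copy so later edits to the caller's dict do not leak in.
        return SubConfig(dict(value))
    if isinstance(value, SubConfig):
        return value
    raise TypeError(f"unsupported sub-config type: {type(value).__name__}")


# Documented defaults of the top-level config, kept alongside for reference.
top_level_defaults: dict = {
    "audio_token_index": 151646,
    "ignore_index": -100,
    "hidden_size": None,
}

audio = normalize_sub_config({"d_model": 1280})
text = normalize_sub_config(None)
print(audio.values)  # {'d_model': 1280}
print(text.values)   # {}
```

Normalizing at construction time means the rest of the code only ever sees one type, which is why this pattern is common; whether `FunAudioChatConfig` follows it exactly should be checked against the source file.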