transformer_lens.model_bridge.supported_architectures.mamba module

Architecture adapter for HF’s MambaForCausalLM (Mamba-1).

class transformer_lens.model_bridge.supported_architectures.mamba.MambaArchitectureAdapter(cfg: Any)

Bases: ArchitectureAdapter

Wraps HF’s MambaForCausalLM. No attention, no positional embeddings.

SSM config fields (state_size, conv_kernel, expand, time_step_rank, intermediate_size) are propagated from the HF config via _HF_PASSTHROUGH_ATTRS in sources/transformers.py.
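The passthrough can be pictured as a simple attribute copy. The sketch below is illustrative only: `SSM_PASSTHROUGH_ATTRS`, `propagate_ssm_fields`, and the numeric values are hypothetical stand-ins, not the actual contents of `_HF_PASSTHROUGH_ATTRS` in sources/transformers.py.

```python
from types import SimpleNamespace

# Hypothetical passthrough list mirroring the Mamba-1 entries described above;
# the real list lives in sources/transformers.py as _HF_PASSTHROUGH_ATTRS.
SSM_PASSTHROUGH_ATTRS = (
    "state_size", "conv_kernel", "expand", "time_step_rank", "intermediate_size",
)

def propagate_ssm_fields(hf_config, bridge_cfg):
    """Copy any SSM fields present on the HF config onto the bridge config."""
    for attr in SSM_PASSTHROUGH_ATTRS:
        if hasattr(hf_config, attr):
            setattr(bridge_cfg, attr, getattr(hf_config, attr))
    return bridge_cfg

# Illustrative values loosely modeled on a small Mamba-1 config.
hf_cfg = SimpleNamespace(state_size=16, conv_kernel=4, expand=2,
                         time_step_rank=48, intermediate_size=1536)
cfg = propagate_ssm_fields(hf_cfg, SimpleNamespace())
print(cfg.state_size, cfg.conv_kernel)  # 16 4
```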

applicable_phases: list[int] = []
component_mapping: ComponentMapping | None
create_stateful_cache(hf_model: Any, batch_size: int, device: Any, dtype: dtype) → Any

Build a MambaCache for the stateful generation loop.

uses_split_attention: bool
weight_processing_conversions: Dict[str, ParamProcessingConversion | str] | None
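The cache returned by create_stateful_cache holds the per-layer recurrent state that Mamba-1 carries between generation steps. The stand-in class below is a shapes-only sketch (no tensors, no transformers dependency) of what such a cache tracks; `ToyMambaCache` and its field names are hypothetical, not HF's actual MambaCache API.

```python
from dataclasses import dataclass, field

@dataclass
class ToyMambaCache:
    """Illustrative stand-in for a Mamba-1 generation cache.

    Per layer, Mamba-1 keeps two recurrent states across steps:
    - a conv state of shape (batch, intermediate_size, conv_kernel)
    - an SSM state of shape (batch, intermediate_size, state_size)
    """
    num_layers: int
    batch_size: int
    intermediate_size: int
    conv_kernel: int
    state_size: int
    conv_shapes: list = field(init=False)
    ssm_shapes: list = field(init=False)

    def __post_init__(self):
        # One state per layer; only the shapes are modeled here.
        self.conv_shapes = [
            (self.batch_size, self.intermediate_size, self.conv_kernel)
        ] * self.num_layers
        self.ssm_shapes = [
            (self.batch_size, self.intermediate_size, self.state_size)
        ] * self.num_layers

# Illustrative sizes loosely modeled on a small Mamba-1 config.
cache = ToyMambaCache(num_layers=24, batch_size=2,
                      intermediate_size=1536, conv_kernel=4, state_size=16)
print(cache.conv_shapes[0])  # (2, 1536, 4)
```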