transformer_lens.model_bridge.supported_architectures.mamba module¶
Architecture adapter for HF’s MambaForCausalLM (Mamba-1).
- class transformer_lens.model_bridge.supported_architectures.mamba.MambaArchitectureAdapter(cfg: Any)¶
Bases: ArchitectureAdapter
Wraps HF’s MambaForCausalLM. No attention, no positional embeddings.
SSM config fields (state_size, conv_kernel, expand, time_step_rank, intermediate_size) are propagated from the HF config via _HF_PASSTHROUGH_ATTRS in sources/transformers.py.
- applicable_phases: list[int] = []¶
- component_mapping: ComponentMapping | None¶
- create_stateful_cache(hf_model: Any, batch_size: int, device: Any, dtype: dtype) Any¶
Build a MambaCache for the stateful generation loop.
- uses_split_attention: bool¶
- weight_processing_conversions: Dict[str, ParamProcessingConversion | str] | None¶
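To illustrate why create_stateful_cache exists, the sketch below shows the role a Mamba cache plays in a generation loop. The class and field names here (SketchMambaCache, ssm_states) are illustrative stand-ins, not the real transformers MambaCache API: the point is that Mamba carries fixed-size recurrent SSM state per layer, updated in place each step, rather than a KV cache that grows with sequence length.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a Mamba cache (assumed names/shapes, for
# illustration only). Each layer keeps one fixed-size state vector, so
# memory use is constant across the whole generation loop.
@dataclass
class SketchMambaCache:
    batch_size: int
    state_size: int
    n_layers: int
    ssm_states: list = field(default_factory=list)

    def __post_init__(self):
        # One recurrent state per layer; size never depends on sequence length.
        self.ssm_states = [[0.0] * self.state_size for _ in range(self.n_layers)]

def generation_loop(cache: SketchMambaCache, tokens: list) -> int:
    steps = 0
    for _ in tokens:
        # A real step would run the model forward with this cache and update
        # its per-layer states in place; here we only count the steps.
        steps += 1
    return steps

cache = SketchMambaCache(batch_size=1, state_size=16, n_layers=2)
print(generation_loop(cache, [101, 7, 42]))  # 3
print(len(cache.ssm_states[0]))  # 16
```

In the real adapter, create_stateful_cache builds this object once (sized from the HF config fields listed above) and the generation loop threads it through every forward call.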