transformer_lens.model_bridge.supported_architectures.mamba module¶
Architecture adapter for HF’s MambaForCausalLM (Mamba-1).
- class transformer_lens.model_bridge.supported_architectures.mamba.MambaArchitectureAdapter(cfg: Any)¶
Bases: ArchitectureAdapter
Wraps HF’s MambaForCausalLM. No attention, no positional embeddings.
SSM config fields (state_size, conv_kernel, expand, time_step_rank, intermediate_size) are propagated from the HF config via _HF_PASSTHROUGH_ATTRS in sources/transformers.py.
- applicable_phases: list[int] = []¶
- component_mapping: ComponentMapping | None¶
- create_stateful_cache(hf_model: Any, batch_size: int, device: Any, dtype: dtype) Any¶
Build a MambaCache for the stateful generation loop.
- uses_split_attention: bool¶
- weight_processing_conversions: Dict[str, ParamProcessingConversion | str] | None¶
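To illustrate why create_stateful_cache exists, the sketch below shows the role a Mamba cache plays in a generation loop. The class and field names here (SketchMambaCache, ssm_states) are illustrative stand-ins, not the real transformers MambaCache API: the point is that Mamba carries fixed-size recurrent SSM state per layer, updated in place each step, rather than a KV cache that grows with sequence length.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a Mamba cache (assumed names/shapes, for
# illustration only). Each layer keeps one fixed-size state vector, so
# memory use is constant across the whole generation loop.
@dataclass
class SketchMambaCache:
    batch_size: int
    state_size: int
    n_layers: int
    ssm_states: list = field(default_factory=list)

    def __post_init__(self):
        # One recurrent state per layer; size never depends on sequence length.
        self.ssm_states = [[0.0] * self.state_size for _ in range(self.n_layers)]

def generation_loop(cache: SketchMambaCache, tokens: list) -> int:
    steps = 0
    for _ in tokens:
        # A real step would run the model forward with this cache and update
        # its per-layer states in place; here we only count the steps.
        steps += 1
    return steps

cache = SketchMambaCache(batch_size=1, state_size=16, n_layers=2)
print(generation_loop(cache, [101, 7, 42]))  # 3
print(len(cache.ssm_states[0]))  # 16
```

In the real adapter, create_stateful_cache builds this object once (sized from the HF config fields listed above) and the generation loop threads it through every forward call.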