transformer_lens.model_bridge.generalized_components.altup_block module
Block bridge for AltUp (Alternating Updates) decoder layers.
- class transformer_lens.model_bridge.generalized_components.altup_block.AltUpBlockBridge(name: str, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = None, hook_alias_overrides: Dict[str, str] | None = None)
Bases: GeneralizedComponent
Block bridge for a decoder layer that operates on a stacked AltUp residual.
Direct GeneralizedComponent subclass (not BlockBridge) because the layer’s residual is a stacked [num_altup_inputs, batch, seq, d_model] tensor, not a single stream. hook_in/hook_out carry the full stack; hook_resid_pre/hook_resid_post expose the active stream (altup_active_idx) as a conventional [batch, seq, d_model] residual and are patchable (written back into the stack).
- forward(*args: Any, **kwargs: Any) → Any
Delegate to the HF layer, hooking the AltUp stack and the active residual stream.
- hook_aliases: Dict[str, str | List[str]] = {'hook_attn_out': 'self_attn.hook_out', 'hook_mlp_out': 'mlp.hook_out'}
- is_list_item: bool = True
- real_components: Dict[str, tuple]
- training: bool
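The "expose the active stream, then write it back" behaviour described for hook_resid_pre/hook_resid_post can be illustrated with a minimal, library-free sketch. This is not the bridge's implementation: nested lists stand in for tensors, and `run_with_active_stream_hook`, `zero_ablate`, and the type aliases are hypothetical names introduced only for illustration.

```python
from typing import Callable, List

# Hypothetical stand-ins: nested lists play the role of tensors.
Stream = List[List[List[float]]]  # [batch, seq, d_model]
Stack = List[Stream]              # [num_altup_inputs, batch, seq, d_model]


def run_with_active_stream_hook(
    stack: Stack,
    active_idx: int,
    hook: Callable[[Stream], Stream],
) -> Stack:
    """Expose stack[active_idx] to `hook` and write the result back."""
    active = stack[active_idx]       # conventional [batch, seq, d_model] residual
    patched = hook(active)           # the hook may return a modified residual
    new_stack = list(stack)          # shallow copy: the other streams are reused as-is
    new_stack[active_idx] = patched  # write the patched stream back into the stack
    return new_stack


def zero_ablate(stream: Stream) -> Stream:
    """Example hook: replace every value in the active stream with 0.0."""
    return [[[0.0 for _ in token] for token in seq] for seq in stream]


# Two AltUp streams, batch=1, seq=2, d_model=3.
stack = [
    [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]],
    [[[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]],
]
patched_stack = run_with_active_stream_hook(stack, active_idx=0, hook=zero_ablate)
```

Only the stream at `active_idx` is changed; the remaining streams pass through untouched, which is why a hook on the full stack (hook_in/hook_out) and a hook on the active residual see different shapes.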