transformer_lens.model_bridge.generalized_components.altup_block module

Block bridge for AltUp (Alternating Updates) decoder layers.

class transformer_lens.model_bridge.generalized_components.altup_block.AltUpBlockBridge(name: str, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = None, hook_alias_overrides: Dict[str, str] | None = None)

Bases: GeneralizedComponent

Block bridge for a decoder layer that operates on a stacked AltUp residual.

This class subclasses GeneralizedComponent directly rather than BlockBridge because the layer’s residual is a stacked [num_altup_inputs, batch, seq, d_model] tensor, not a single stream. hook_in and hook_out carry the full stack. hook_resid_pre and hook_resid_post expose the active stream (the one at index altup_active_idx) as a conventional [batch, seq, d_model] residual; both are patchable, and a patched value is written back into the stack.
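The slice-and-write-back behaviour described above can be illustrated with a small dependency-free sketch. It uses nested lists in place of tensors, and the names patch_active_stream and zero_ablate are hypothetical helpers invented for this illustration; it is not the library's implementation, only one way the described semantics could look.

```python
from typing import Callable, List

# Nested lists stand in for tensors (an assumption made to keep the sketch dependency-free).
Stream = List[List[List[float]]]  # [batch][seq][d_model]
Stack = List[Stream]              # [num_altup_inputs][batch][seq][d_model]


def patch_active_stream(
    stack: Stack, active_idx: int, hook_fn: Callable[[Stream], Stream]
) -> Stack:
    """Hypothetical helper: run hook_fn on the active stream and write the result back."""
    active = stack[active_idx]       # conventional [batch, seq, d_model] residual
    patched = hook_fn(active)        # a hook may return a modified residual
    new_stack = list(stack)          # shallow copy; the other streams are reused unchanged
    new_stack[active_idx] = patched  # write the patched residual back into the stack
    return new_stack


def zero_ablate(stream: Stream) -> Stream:
    """Example hook: replace every value in the residual with zero."""
    return [[[0.0 for _ in token] for token in seq] for seq in stream]


# Two AltUp streams with batch=1, seq=2, d_model=3.
stack: Stack = [
    [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]],
    [[[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]],
]
patched_stack = patch_active_stream(stack, active_idx=0, hook_fn=zero_ablate)
```

In this sketch only the stream at active_idx changes; the remaining streams pass through untouched, which mirrors the idea that hook_resid_pre and hook_resid_post act on a single stream while hook_in and hook_out see the whole stack.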

forward(*args: Any, **kwargs: Any) → Any

Delegate to the HF layer, hooking the AltUp stack and the active residual stream.

hook_aliases: Dict[str, str | List[str]] = {'hook_attn_out': 'self_attn.hook_out', 'hook_mlp_out': 'mlp.hook_out'}
is_list_item: bool = True
real_components: Dict[str, tuple]
training: bool
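As a rough illustration of what the hook_aliases mapping expresses, the sketch below resolves an alias to the object at its dotted target path. The resolve_alias function and the SimpleNamespace stand-in for the block are hypothetical, and treating a list value as "use the first target that resolves" is an assumption of this sketch, not documented behaviour.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Union

# Alias table copied from the class attribute documented above.
hook_aliases: Dict[str, Union[str, List[str]]] = {
    "hook_attn_out": "self_attn.hook_out",
    "hook_mlp_out": "mlp.hook_out",
}


def resolve_alias(root: Any, alias: str) -> Any:
    """Hypothetical helper: follow an alias's dotted target path starting at root."""
    targets = hook_aliases[alias]
    # Assumption: a list value means "try each target in order".
    candidates = [targets] if isinstance(targets, str) else targets
    for path in candidates:
        obj = root
        try:
            for part in path.split("."):
                obj = getattr(obj, part)
        except AttributeError:
            continue  # this target does not exist on root; try the next one
        return obj
    raise AttributeError(f"no target of alias {alias!r} resolves")


# Stand-in for a block whose submodules each expose a hook_out attribute.
block = SimpleNamespace(
    self_attn=SimpleNamespace(hook_out="attn hook point"),
    mlp=SimpleNamespace(hook_out="mlp hook point"),
)
resolved = resolve_alias(block, "hook_attn_out")
```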