transformer_lens.model_bridge.generalized_components.mlp module

MLP bridge component.

This module contains the bridge component for MLP layers.

class transformer_lens.model_bridge.generalized_components.mlp.MLPBridge(name: str | None, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = {}, optional: bool = False)

Bases: GeneralizedComponent

Bridge component for MLP layers.

This component wraps an MLP layer from a remote model and provides a consistent interface for accessing its weights and performing MLP operations.

__init__(name: str | None, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = {}, optional: bool = False)

Initialize the MLP bridge.

Parameters:
  • name – The name of the component in the model (None if no container exists)

  • config – Optional configuration (unused for MLPBridge)

  • submodules – Dictionary of submodules to register (e.g., gate_proj, up_proj, down_proj)

  • optional – If True, setup skips this bridge when the wrapped component is absent from the model (e.g., in hybrid architectures).

forward(*args, **kwargs) → Tensor

Forward pass through the MLP bridge.

Parameters:
  • *args – Positional arguments for the original component

  • **kwargs – Keyword arguments for the original component

Returns:

Output hidden states
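The delegation that forward performs can be pictured with a small, dependency-free sketch. This is not the MLPBridge implementation: ToyBridge, HookPoint, and double are hypothetical stand-ins, and the placement of hook_in and hook_out around the wrapped call is an assumption based on the hook names that appear in hook_aliases.

```python
from typing import Any, Callable, List


class HookPoint:
    """Hypothetical stand-in for a hook point: runs registered functions in order."""

    def __init__(self) -> None:
        self.fns: List[Callable[[Any], Any]] = []

    def __call__(self, value: Any) -> Any:
        for fn in self.fns:
            result = fn(value)
            # A hook that returns None only observes; the value passes through unchanged.
            if result is not None:
                value = result
        return value


class ToyBridge:
    """Illustrative wrapper around an existing callable (not the real MLPBridge)."""

    def __init__(self, original: Callable[..., Any]) -> None:
        self.original = original
        self.hook_in = HookPoint()
        self.hook_out = HookPoint()

    def forward(self, *args: Any, **kwargs: Any) -> Any:
        # Assumption: the first positional argument carries the hidden states.
        if args:
            args = (self.hook_in(args[0]),) + args[1:]
        output = self.original(*args, **kwargs)
        return self.hook_out(output)


def double(x: List[float]) -> List[float]:
    """Toy replacement for the wrapped MLP layer."""
    return [2 * v for v in x]


bridge = ToyBridge(double)
seen: List[Any] = []
bridge.hook_in.fns.append(seen.append)  # list.append returns None, so this only records
bridge.hook_out.fns.append(lambda out: [v + 1 for v in out])  # edits the output
result = bridge.forward([1, 2, 3])
```

In this sketch the input hook records what the wrapped callable receives, while the output hook rewrites what it returns; all other arguments are passed through untouched.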

hook_aliases: Dict[str, str | List[str]] = {'hook_post': 'out.hook_in', 'hook_pre': 'in.hook_out'}
property_aliases: Dict[str, str] = {'W_gate': 'gate.weight', 'W_in': 'in.weight', 'W_out': 'out.weight', 'b_gate': 'gate.bias', 'b_in': 'in.bias', 'b_out': 'out.bias'}
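Each alias value above is a dotted attribute path relative to the bridge (for example, W_in points at in.weight). The sketch below shows one way such a path could be followed; resolve_alias and toy_mlp are hypothetical names for illustration, and the library's actual lookup mechanism may differ. Plain lists stand in for weight tensors.

```python
from functools import reduce
from types import SimpleNamespace

# Copied from the documented MLPBridge.property_aliases value.
PROPERTY_ALIASES = {
    "W_gate": "gate.weight",
    "W_in": "in.weight",
    "W_out": "out.weight",
    "b_gate": "gate.bias",
    "b_in": "in.bias",
    "b_out": "out.bias",
}


def resolve_alias(root, alias, aliases=PROPERTY_ALIASES):
    """Hypothetical helper: follow the dotted path that `alias` maps to."""
    path = aliases[alias]
    # getattr is needed because "in" is a Python keyword, so `root.in` is a syntax error.
    return reduce(getattr, path.split("."), root)


# Stand-in for a bridged, non-gated MLP with "in" and "out" submodules only.
toy_mlp = SimpleNamespace()
setattr(toy_mlp, "in", SimpleNamespace(weight=[[1.0, 2.0]], bias=[0.0]))
setattr(toy_mlp, "out", SimpleNamespace(weight=[[3.0], [4.0]], bias=[0.5]))

w_in = resolve_alias(toy_mlp, "W_in")
b_out = resolve_alias(toy_mlp, "b_out")
```

Because this toy object has no gate submodule, resolve_alias(toy_mlp, "W_gate") raises AttributeError in the sketch; how the real bridge treats gate aliases on non-gated MLPs is not stated in this reference.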
real_components: Dict[str, tuple]
training: bool