transformer_lens.model_bridge.supported_architectures.neo module

Neo architecture adapter.

class transformer_lens.model_bridge.supported_architectures.neo.NeoArchitectureAdapter(cfg: Any)

Bases: ArchitectureAdapter

Architecture adapter for GPT-Neo models.

__init__(cfg: Any) → None

Initialize the Neo architecture adapter.

class transformer_lens.model_bridge.supported_architectures.neo.NeoLinearTransposeConversion(rearrange_pattern: str | None = None, **axes_lengths)

Bases: BaseTensorConversion

Transpose Linear weights to Conv1D format and rearrange for GPT-Neo.

GPT-Neo uses standard PyTorch Linear layers, whose weights are shaped [out_features, in_features]. This conversion transposes them to the Conv1D layout [in_features, out_features] and then, if a pattern is given, applies an einops rearrangement, typically to split the attention heads into their own axis.

__init__(rearrange_pattern: str | None = None, **axes_lengths)

Initialize the conversion.

Parameters:
  • rearrange_pattern – Optional einops pattern for rearrangement after transpose

  • **axes_lengths – Additional axes lengths for einops (e.g., n=n_heads)

handle_conversion(input_value: Tensor, *full_context) → Tensor

Transpose from Linear to Conv1D format and optionally rearrange.

revert(input_value: Tensor, *full_context) → Tensor

Revert rearrangement and transpose back to Linear format.