transformer_lens.model_bridge.supported_architectures.neo module
Neo architecture adapter.
- class transformer_lens.model_bridge.supported_architectures.neo.NeoArchitectureAdapter(cfg: Any)
Bases: ArchitectureAdapter
Architecture adapter for Neo models.
- __init__(cfg: Any) -> None
Initialize the Neo architecture adapter.
- class transformer_lens.model_bridge.supported_architectures.neo.NeoLinearTransposeConversion(rearrange_pattern: str | None = None, **axes_lengths)
Bases: BaseTensorConversion
Transpose Linear weights to Conv1D format and rearrange for GPT-Neo.
GPT-Neo uses standard PyTorch Linear layers with weights shaped [out_features, in_features]. This conversion transposes them to Conv1D format [in_features, out_features] and then applies einops rearrangement for attention heads.
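The two steps described above can be sketched in plain PyTorch. This is an illustration only, not the adapter's actual code: the sizes are made up, and the head-splitting layout `d_model (n h) -> n d_model h` is an assumed example of a rearrange pattern, written here with `reshape` and `permute` instead of einops.

```python
import torch

# Hypothetical sizes, chosen so every axis has a distinct length.
n_heads, d_head, d_model = 4, 8, 16

# A Linear-style weight: [out_features, in_features] = [n_heads * d_head, d_model].
w_linear = torch.arange(n_heads * d_head * d_model, dtype=torch.float32)
w_linear = w_linear.reshape(n_heads * d_head, d_model)

# Step 1: transpose to Conv1D layout [in_features, out_features].
w_conv1d = w_linear.T  # [d_model, n_heads * d_head]

# Step 2: split the output axis into heads.
# Assumed pattern, equivalent to einops "d_model (n h) -> n d_model h".
w_heads = w_conv1d.reshape(d_model, n_heads, d_head).permute(1, 0, 2)

print(w_heads.shape)  # [n_heads, d_model, d_head]
```

With this layout, `w_heads[n, m, h]` holds the same value as `w_linear[n * d_head + h, m]`, so each head gets a contiguous block of the original output rows.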
- __init__(rearrange_pattern: str | None = None, **axes_lengths)
Initialize the conversion.
- Parameters:
rearrange_pattern – Optional einops pattern for rearrangement after transpose
**axes_lengths – Additional axes lengths for einops (e.g., n=n_heads)
- handle_conversion(input_value: Tensor, *full_context) -> Tensor
Transpose from Linear to Conv1D format and optionally rearrange.
- revert(input_value: Tensor, *full_context) -> Tensor
Revert rearrangement and transpose back to Linear format.
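To show how `handle_conversion` and `revert` relate, the following minimal sketch implements a forward and inverse pair and checks that they round-trip. The function names `to_heads` and `from_heads` are hypothetical stand-ins, not part of the library, and the per-head layout `[n_heads, d_model, d_head]` is an assumption made for the example.

```python
import torch


def to_heads(w_linear: torch.Tensor, n_heads: int) -> torch.Tensor:
    """Hypothetical forward step: Linear [out, in] -> [n_heads, d_model, d_head]."""
    out_features, d_model = w_linear.shape
    d_head = out_features // n_heads
    w_conv1d = w_linear.T  # transpose to [in, out]
    return w_conv1d.reshape(d_model, n_heads, d_head).permute(1, 0, 2)


def from_heads(w_heads: torch.Tensor) -> torch.Tensor:
    """Hypothetical inverse: undo the head split, then transpose back to [out, in]."""
    n_heads, d_model, d_head = w_heads.shape
    w_conv1d = w_heads.permute(1, 0, 2).reshape(d_model, n_heads * d_head)
    return w_conv1d.T


w = torch.randn(32, 16)  # [n_heads * d_head, d_model] with 4 heads of size 8
restored = from_heads(to_heads(w, n_heads=4))
print(torch.equal(restored, w))
```

The inverse applies the same two operations in the opposite order (undo the rearrangement first, then the transpose), which mirrors the order given in the `revert` description.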