transformer_lens.model_bridge.supported_architectures.gemma1 module¶
Gemma1 architecture adapter.
- class transformer_lens.model_bridge.supported_architectures.gemma1.Gemma1ArchitectureAdapter(cfg: Any)¶
Bases: ArchitectureAdapter
Architecture adapter for Gemma1 models.
- __init__(cfg: Any) → None¶
Initialize the Gemma1 architecture adapter.
- setup_component_testing(hf_model: Any, bridge_model: Any = None) → None¶
Set up rotary embedding references for Gemma1 component testing.
Gemma1 uses RoPE (Rotary Position Embeddings). We set the rotary_emb reference on all attention bridge instances for component testing.
- Parameters:
hf_model – The HuggingFace Gemma1 model instance
bridge_model – The TransformerBridge model (if provided, rotary_emb is set on its actual attention instances)
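The wiring described above can be sketched as follows. This is an illustrative stand-in, not the library's actual implementation: the class names and the `rotary_emb` attribute placement are assumptions made for the example.

```python
# Sketch (assumed names, not TransformerLens internals): share the
# HuggingFace model's rotary embedding module with every attention
# bridge so component tests reproduce RoPE identically.
class DummyRotaryEmbedding:
    """Stand-in for the HF Gemma rotary embedding module."""

class DummyAttentionBridge:
    """Stand-in for an attention bridge awaiting a rotary_emb reference."""
    def __init__(self):
        self.rotary_emb = None  # filled in during setup

def setup_rotary_references(hf_rotary_emb, attention_bridges):
    """Point every attention bridge at the shared rotary embedding."""
    for attn in attention_bridges:
        attn.rotary_emb = hf_rotary_emb

rotary = DummyRotaryEmbedding()
bridges = [DummyAttentionBridge() for _ in range(4)]
setup_rotary_references(rotary, bridges)
assert all(b.rotary_emb is rotary for b in bridges)
```

Sharing one reference (rather than copying) keeps the bridges in sync with the HuggingFace module's buffers during testing.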
- setup_hook_compatibility(bridge: Any) → None¶
Setup hook compatibility for Gemma1 models.
Gemma1 scales embeddings by sqrt(d_model) in its forward pass, but the HuggingFace embed_tokens layer doesn’t include this scaling. We need to apply it to hook_embed to match HookedTransformer behavior.
- Parameters:
bridge – The TransformerBridge instance
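The scaling fix can be illustrated with a minimal sketch. The hook function's shape is an assumption for the example; only the sqrt(d_model) factor comes from the description above.

```python
import math

def scaled_embed_hook(embed_output, d_model):
    """Apply Gemma's sqrt(d_model) embedding scale so the hooked value
    matches HookedTransformer's hook_embed (illustrative signature)."""
    scale = math.sqrt(d_model)
    return [x * scale for x in embed_output]

d_model = 4
raw = [1.0, -0.5, 2.0, 0.0]          # unscaled embed_tokens output
scaled = scaled_embed_hook(raw, d_model)
# each element is multiplied by sqrt(4) == 2
assert scaled == [2.0, -1.0, 4.0, 0.0]
```

Applying the factor inside the hook, rather than mutating the embedding weights, leaves the underlying HuggingFace `embed_tokens` layer untouched.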