transformer_lens.model_bridge.supported_architectures.llama module

Llama architecture adapter.

class transformer_lens.model_bridge.supported_architectures.llama.LlamaArchitectureAdapter(cfg: Any)

Bases: ArchitectureAdapter

Architecture adapter for Llama models.

Optional parameters (may be absent from the state_dict):

Llama models do not have biases on their attention and MLP projections:

  • blocks.{i}.attn.b_Q - No bias on query projection

  • blocks.{i}.attn.b_K - No bias on key projection

  • blocks.{i}.attn.b_V - No bias on value projection

  • blocks.{i}.attn.b_O - No bias on output projection

  • blocks.{i}.mlp.b_in - No bias on MLP input (up_proj)

  • blocks.{i}.mlp.b_gate - No bias on MLP gate projection

  • blocks.{i}.mlp.b_out - No bias on MLP output (down_proj)

  • blocks.{i}.ln1.b - RMSNorm has no bias

  • blocks.{i}.ln2.b - RMSNorm has no bias

  • ln_final.b - RMSNorm has no bias

Weight processing must handle these missing biases gracefully using ProcessWeights._safe_get_tensor() or by checking for None values.
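For illustration, a minimal sketch of that pattern, assuming a plain dict-like state_dict; the helper name, the example shapes, and the ProcessWeights behaviour it stands in for are assumptions, not the actual transformer_lens implementation:

    import torch

    def _bias_or_zeros(state_dict, key, shape, dtype=torch.float32):
        # Hypothetical stand-in for ProcessWeights._safe_get_tensor():
        # return the stored bias, or zeros when the checkpoint omits it.
        tensor = state_dict.get(key)
        return tensor if tensor is not None else torch.zeros(shape, dtype=dtype)

    # Llama checkpoints ship W_Q but no b_Q, so the lookup falls back to zeros.
    state_dict = {"blocks.0.attn.W_Q": torch.randn(32, 4096, 128)}
    b_Q = _bias_or_zeros(state_dict, "blocks.0.attn.b_Q", (32, 128))
    assert torch.all(b_Q == 0)

Substituting zeros keeps downstream matrix arithmetic uniform across architectures that do and do not carry biases.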

__init__(cfg: Any) → None

Initialize the Llama architecture adapter.

setup_component_testing(hf_model: Any, bridge_model: Any = None) → None

Set up rotary embedding references for Llama component testing.

Llama uses RoPE (Rotary Position Embedding). This method sets the rotary_emb reference on all attention bridge instances for component testing; a conceptual sketch follows the parameter list.

Parameters:
  • hf_model – The HuggingFace Llama model instance

  • bridge_model – The TransformerBridge model; if provided, rotary_emb is set on its actual attention bridge instances
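As a conceptual sketch of this wiring, assuming the HuggingFace Llama model exposes a single shared rotary_emb module under hf_model.model and the bridge exposes per-block attn components (both attribute paths are assumptions about the respective layouts):

    def _attach_rotary_for_testing(hf_model, bridge_model=None):
        # Share the HF rotary embedding module with every attention bridge
        # so component tests compute RoPE exactly as the source model does.
        rotary_emb = hf_model.model.rotary_emb  # assumed HF attribute path
        if bridge_model is not None:
            for block in bridge_model.blocks:  # assumed bridge layout
                block.attn.rotary_emb = rotary_emb  # a reference, not a copy

Sharing one module rather than copying weights keeps the bridge and the source model in sync on the rotary frequency tables.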