transformer_lens.model_bridge.supported_architectures.phi3 module
Phi-3 architecture adapter.
- class transformer_lens.model_bridge.supported_architectures.phi3.Phi3ArchitectureAdapter(cfg: Any)
Bases: ArchitectureAdapter
Architecture adapter for Phi-3 models.
- __init__(cfg: Any) → None
Initialize the Phi-3 architecture adapter.
- Parameters:
cfg – The configuration object.
- prepare_loading(model_name: str, model_kwargs: dict) → None
Patch cached Phi-3 remote code for transformers v5 compatibility.
- preprocess_weights(state_dict: dict[str, Tensor]) → dict[str, Tensor]
Fold layer norms into joint QKV/gate_up projections.
Standard fold_ln can’t handle joint projections (shape mismatch on round-trip), so we scale the full joint weights directly.
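The idea of scaling a joint weight directly can be illustrated with a minimal, dependency-free sketch. It is not the adapter's actual code. It assumes the norm is an RMSNorm with a scale vector only (no bias) and that the joint projection stores its weight as [out_features][in_features], so folding amounts to multiplying each input column by the matching norm scale and then treating the norm scale as all ones. The helper names (`rms_norm`, `linear`, `fold_norm_into_joint`) and the toy sizes are hypothetical.

```python
import math


def rms_norm(x, gamma, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then scale elementwise by gamma."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for g, v in zip(gamma, x)]


def linear(weight, x):
    """y = W @ x, with W stored as [out_features][in_features] (assumed layout)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def fold_norm_into_joint(weight, gamma):
    """Scale every input column of the full joint weight by the norm scale gamma."""
    return [[w * g for w, g in zip(row, gamma)] for row in weight]


# Toy sizes (assumption): d_model = 3, joint QKV has 6 output rows (2 each for Q, K, V).
gamma = [0.5, 2.0, -1.5]
joint_qkv = [[0.1 * (i + 1) * (j + 1) for j in range(3)] for i in range(6)]
x = [1.0, -2.0, 3.0]

# Original path: norm with learned scale, then the joint projection.
before = linear(joint_qkv, rms_norm(x, gamma))

# Folded path: scale is absorbed into the weight, norm scale becomes all ones.
folded = fold_norm_into_joint(joint_qkv, gamma)
after = linear(folded, rms_norm(x, [1.0] * len(gamma)))

max_diff = max(abs(a - b) for a, b in zip(before, after))
print(f"max abs difference after folding: {max_diff:.2e}")
```

Because the scaling is applied to the whole joint matrix, the weight never has to be split into Q, K, and V and re-concatenated, which is the round-trip the note above describes as the source of the shape mismatch.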
- setup_component_testing(hf_model: Any, bridge_model: Any = None) → None
Set up rotary embedding references for Phi-3 component testing.
- Parameters:
hf_model – The HuggingFace Phi-3 model instance.
bridge_model – The TransformerBridge model (if available).