transformer_lens.model_bridge.supported_architectures.phi3 module

Phi-3 architecture adapter.

class transformer_lens.model_bridge.supported_architectures.phi3.Phi3ArchitectureAdapter(cfg: Any)

Bases: ArchitectureAdapter

Architecture adapter for Phi-3 models.

__init__(cfg: Any) → None

Initialize the Phi-3 architecture adapter.

Parameters:

cfg – The configuration object.

prepare_loading(model_name: str, model_kwargs: dict) → None

Patch cached Phi-3 remote code for transformers v5 compatibility.

preprocess_weights(state_dict: dict[str, Tensor]) → dict[str, Tensor]

Fold layer norms into joint QKV/gate_up projections.

The standard fold_ln cannot handle joint projections (it hits a shape mismatch on the round-trip), so this method scales the full joint weights directly.
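
The following is a minimal sketch of the idea, not the adapter's actual code. It assumes the layer norm is an RMSNorm with a per-feature gain and that the joint weight is laid out as [out_features, hidden]; under those assumptions, folding amounts to multiplying each input column of the joint weight by the gain. The helper names (rms_normalize, matvec, fold_gain_into_joint) and the toy values are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def rms_normalize(x, eps=1e-6):
    """RMSNorm without its learned gain: scale x to unit root-mean-square."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def matvec(w, x):
    """Multiply a [out, in] weight matrix by an input vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def fold_gain_into_joint(w_joint, gain):
    """Scale every input column of the joint weight by the norm gain."""
    return [[wi * g for wi, g in zip(row, gain)] for row in w_joint]


# Toy joint QKV weight: three stacked one-row blocks (Q, K, V), hidden size 2.
w_qkv = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
gain = [0.5, 2.0]  # hypothetical layer norm gain
x = [1.0, -3.0]    # hypothetical residual stream input

# Reference path: apply the gain after normalizing, then project.
normed = rms_normalize(x)
reference = matvec(w_qkv, [g * v for g, v in zip(gain, normed)])

# Folded path: the gain lives in the weight, so the norm needs no gain.
w_folded = fold_gain_into_joint(w_qkv, gain)
folded = matvec(w_folded, normed)

# Both paths agree up to floating-point rounding.
assert all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(reference, folded))
```

Because the scaling acts on the shared input dimension, the stacked blocks (Q, K, V, or gate and up) never have to be split apart and re-joined in this sketch.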

setup_component_testing(hf_model: Any, bridge_model: Any = None) → None

Set up rotary embedding references for Phi-3 component testing.

Parameters:
  • hf_model – The HuggingFace Phi-3 model instance

  • bridge_model – The TransformerBridge model (if available)