transformer_lens.model_bridge.supported_architectures.phimoe module
PhiMoE architecture adapter.
- class transformer_lens.model_bridge.supported_architectures.phimoe.PhiMoEArchitectureAdapter(cfg: Any)
Bases: ArchitectureAdapter
Architecture adapter for Microsoft PhiMoE models.
PhiMoE is a Phi-style decoder with LayerNorm, split Q/K/V attention, and a sparse MoE block. This adapter targets the native Transformers implementation (trust_remote_code=False); the archived remote implementation is not compatible with modern Transformers generation/cache semantics.
- __init__(cfg: Any) → None
Initialize the PhiMoE architecture adapter.
- prepare_loading(model_name: str, model_kwargs: dict) → None
Force eager attention for consistent hookable generation.
- prepare_model(hf_model: Any) → None
Force eager attention on the loaded HF model.
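
The two hooks above are documented only by their one-line summaries, so the following is a minimal sketch of what "forcing eager attention" could look like, not the adapter's actual code. It assumes that the load-time hook sets the Transformers `attn_implementation` keyword in `model_kwargs` and that the post-load hook updates the `_attn_implementation` field on the model's config. The helper names `force_eager_in_kwargs` and `force_eager_on_model` are hypothetical, and a `SimpleNamespace` stands in for a real Hugging Face model so the example runs without downloading weights.

```python
from types import SimpleNamespace
from typing import Any


def force_eager_in_kwargs(model_kwargs: dict) -> None:
    """Hypothetical helper: request eager attention at load time (mutates in place)."""
    # Assumption: the kwargs are later forwarded to `from_pretrained`,
    # which accepts `attn_implementation`.
    model_kwargs["attn_implementation"] = "eager"


def force_eager_on_model(hf_model: Any) -> None:
    """Hypothetical helper: switch an already-loaded model to eager attention."""
    config = getattr(hf_model, "config", None)
    if config is not None:
        # Assumption: the config records the choice in `_attn_implementation`.
        config._attn_implementation = "eager"


# Load-time step: any other kwargs are left untouched.
kwargs = {"torch_dtype": "float32", "attn_implementation": "sdpa"}
force_eager_in_kwargs(kwargs)

# Post-load step, using a stand-in object instead of a real model.
fake_model = SimpleNamespace(config=SimpleNamespace(_attn_implementation="sdpa"))
force_eager_on_model(fake_model)
```

Both helpers return None and work by mutation, which matches the `None` return annotations on `prepare_loading` and `prepare_model`. Whether the real adapter touches the same fields should be checked against its source.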