transformer_lens.model_bridge.supported_architectures.phimoe module

PhiMoE architecture adapter.

class transformer_lens.model_bridge.supported_architectures.phimoe.PhiMoEArchitectureAdapter(cfg: Any)

Bases: ArchitectureAdapter

Architecture adapter for Microsoft PhiMoE models.

PhiMoE is a Phi-style decoder with LayerNorm, separate Q/K/V attention projections, and a sparse mixture-of-experts (MoE) block. This adapter targets the native Transformers implementation (trust_remote_code=False); the archived remote-code implementation is not compatible with the generation and cache semantics of current Transformers releases.
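
For readers unfamiliar with the term, a sparse MoE block sends each token to only a few experts rather than all of them. The sketch below shows generic top-k routing in plain Python. It is an illustration only: the function name top_k_route and the sample logits are hypothetical, and it does not reproduce the routing rule used by PhiMoE or any code in this adapter.

```python
import math
from typing import List, Tuple


def top_k_route(router_logits: List[float], k: int = 2) -> List[Tuple[int, float]]:
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Hypothetical helper for illustration; not part of TransformerLens.
    """
    # Softmax over all experts, shifted by the max for numerical stability.
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts; the others are never evaluated.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize so the kept weights sum to 1.
    kept = sum(probs[i] for i in chosen)
    return [(i, probs[i] / kept) for i in chosen]


# Four made-up router scores for a single token.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
```

Each token's output would then be the weighted sum of the chosen experts' outputs, which is why only a fraction of the block's parameters is active per token.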

__init__(cfg: Any) → None

Initialize the PhiMoE architecture adapter.

prepare_loading(model_name: str, model_kwargs: dict) → None

Force eager attention for consistent hookable generation.
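
Because the method returns None, it presumably works by mutating model_kwargs in place before the model is loaded. A minimal sketch of that pattern follows, assuming the hook sets the Transformers from_pretrained keyword attn_implementation to "eager". The function prepare_loading_sketch is a hypothetical stand-in, not the adapter's actual code, and the model name is only an example.

```python
from typing import Any, Dict


def prepare_loading_sketch(model_name: str, model_kwargs: Dict[str, Any]) -> None:
    """Hypothetical stand-in for the adapter hook: edit the kwargs in place."""
    # Assumption: "attn_implementation" is the keyword that Transformers'
    # from_pretrained uses to pick the attention backend, and "eager" is the
    # plain PyTorch path that materializes attention weights.
    model_kwargs["attn_implementation"] = "eager"


# Example kwargs a caller might have prepared; values are illustrative.
kwargs = {"torch_dtype": "bfloat16", "attn_implementation": "sdpa"}
prepare_loading_sketch("microsoft/Phi-3.5-MoE-instruct", kwargs)
```

Overwriting any caller-supplied backend, rather than only filling in a missing value, matches the word "force" in the docstring. Fused backends typically do not expose intermediate attention weights, which is a plausible reason an interpretability tool would insist on the eager path.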

prepare_model(hf_model: Any) → None

Force eager attention on the loaded HF model.
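
The post-load step can be pictured as a small in-place update of the model's configuration. The sketch below assumes the setting lives on hf_model.config._attn_implementation, which is where Transformers stores the selected backend internally. That attribute is private and may change between releases. prepare_model_sketch and the stand-in model object are hypothetical and do not reproduce the adapter's implementation.

```python
from types import SimpleNamespace
from typing import Any


def prepare_model_sketch(hf_model: Any) -> None:
    """Hypothetical stand-in: switch an already loaded model to eager attention."""
    config = getattr(hf_model, "config", None)
    if config is not None:
        # Assumption: the attention backend is read from this private field.
        config._attn_implementation = "eager"


# A lightweight stand-in for a loaded Hugging Face model, for illustration only.
fake_model = SimpleNamespace(config=SimpleNamespace(_attn_implementation="sdpa"))
prepare_model_sketch(fake_model)
```

Applying the setting again after loading guards against cases where the model was constructed elsewhere and handed to the adapter with a different backend already selected.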