transformer_lens.model_bridge.supported_architectures.granite module
Granite architecture adapter.
Base adapter for the IBM Granite model family. Provides shared config setup and helper methods used by GraniteMoe and GraniteMoeHybrid variants.
- class transformer_lens.model_bridge.supported_architectures.granite.GraniteArchitectureAdapter(cfg: Any)
Bases: ArchitectureAdapter
Architecture adapter for IBM Granite models (dense).
Granite is a Llama-like architecture with RMSNorm, rotary position embeddings (RoPE), grouped-query attention (GQA), and a gated MLP (SiLU activation). Granite-specific scaling multipliers are not reimplemented by the adapter; they are handled by the Hugging Face (HF) model's native forward pass.
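The two bias-free building blocks named above can be sketched in plain Python. This is a minimal illustration of generic RMSNorm and a SiLU-gated MLP, not the adapter's or the HF model's code. The function names, the weight names (`W_gate`, `W_in`, `W_out`), and the row-major weight layout are assumptions chosen for readability, and the Granite-specific scaling multipliers are deliberately left out.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root-mean-square, then apply a learned scale.

    Unlike LayerNorm there is no mean subtraction and no bias term.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector; no bias is added."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def gated_mlp(x, W_gate, W_in, W_out):
    """Gated MLP: W_out @ (silu(W_gate @ x) * (W_in @ x)), all without biases."""
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_in, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(W_out, hidden)
```

With a unit scale and `eps=0.0`, the output of `rms_norm` has a mean square of exactly 1, which is a quick way to sanity-check the normalization.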
Optional Parameters (may not exist in state_dict):
Granite models do NOT have biases on attention projections, MLP projections, or normalization layers, so the following keys may be absent:
blocks.{i}.attn.b_Q/b_K/b_V/b_O - No bias on attention projections
blocks.{i}.mlp.b_in/b_gate/b_out - No bias on MLP projections
blocks.{i}.ln1.b, blocks.{i}.ln2.b, ln_final.b - RMSNorm has no bias
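As a rough illustration of the key patterns listed above, the sketch below expands them for a given layer count and reports which ones are absent from a state dict. Both helper names (`optional_bias_keys`, `missing_optional_keys`) are hypothetical and not part of transformer_lens. The state dict is treated as any mapping keyed by parameter name.

```python
def optional_bias_keys(n_layers):
    """Expand the optional bias key patterns for a model with n_layers blocks.

    Hypothetical helper: the patterns come from the list above, the function does not
    exist in transformer_lens.
    """
    keys = []
    for i in range(n_layers):
        keys += [f"blocks.{i}.attn.b_{p}" for p in ("Q", "K", "V", "O")]
        keys += [f"blocks.{i}.mlp.b_{p}" for p in ("in", "gate", "out")]
        keys += [f"blocks.{i}.ln1.b", f"blocks.{i}.ln2.b"]
    keys.append("ln_final.b")
    return keys


def missing_optional_keys(state_dict, n_layers):
    """Return the optional bias keys that are not present in state_dict."""
    return [k for k in optional_bias_keys(n_layers) if k not in state_dict]
```

For a Granite checkpoint, one would expect every key from `optional_bias_keys` to appear in the result of `missing_optional_keys`, since none of these biases exist.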
- __init__(cfg: Any) → None
Initialize the Granite architecture adapter.
- setup_component_testing(hf_model: Any, bridge_model: Any = None) → None
Set up rotary embedding references for Granite component testing.
- Parameters:
hf_model – The HuggingFace Granite model instance
bridge_model – The TransformerBridge model (if available)