transformer_lens.model_bridge.supported_architectures.granite module

Granite architecture adapter.

Base adapter for the IBM Granite model family. Provides shared config setup and helper methods used by GraniteMoe and GraniteMoeHybrid variants.

class transformer_lens.model_bridge.supported_architectures.granite.GraniteArchitectureAdapter(cfg: Any)

Bases: ArchitectureAdapter

Architecture adapter for IBM Granite models (dense).

Granite is a Llama-like architecture with RMSNorm, rotary position embeddings (RoPE), grouped-query attention (GQA), and a gated MLP with SiLU activation. The adapter does not reimplement the Granite-specific scaling multipliers; they are applied by the HF model’s native forward pass.
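As a rough illustration of two of the building blocks named above, the following sketch implements RMSNorm and a SiLU-gated MLP in plain Python. It is not the adapter's or the HF model's code: the function names, the row-major weight layout, and the `eps` default are assumptions made for this example, and the Granite-specific scaling multipliers are deliberately left out.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root-mean-square of x, then scale.

    There is no mean-centering and no bias term.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def gated_mlp(x, W_gate, W_in, W_out):
    """Gated MLP without biases: W_out @ (silu(W_gate @ x) * (W_in @ x)).

    The weight orientation (rows = output features) is an assumption
    of this sketch, not the layout used by TransformerLens.
    """
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_in, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(W_out, hidden)
```

Because neither function takes a bias argument, the sketch also shows why the bias parameters of these components have nothing to map to in a Granite checkpoint.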

Optional Parameters (may not exist in state_dict):

Granite models have no biases on the attention or MLP projections, and RMSNorm has no bias term, so the following parameters may be absent:

blocks.{i}.attn.b_Q/b_K/b_V/b_O - No bias on attention projections
blocks.{i}.mlp.b_in/b_gate/b_out - No bias on MLP projections
blocks.{i}.ln1.b, blocks.{i}.ln2.b, ln_final.b - RMSNorm has no bias
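To make the key patterns above concrete, here is a minimal sketch that expands them for a given number of layers and reports which ones a state dict lacks. Both helper names (`granite_optional_bias_keys`, `missing_optional_keys`) are hypothetical and are not part of TransformerLens; the sketch only assumes the state dict behaves like a mapping keyed by parameter name.

```python
def granite_optional_bias_keys(n_layers):
    """Expand the optional bias key patterns for n_layers blocks.

    Hypothetical helper for illustration only.
    """
    keys = []
    for i in range(n_layers):
        # Attention projections carry no bias in Granite.
        keys += [f"blocks.{i}.attn.{b}" for b in ("b_Q", "b_K", "b_V", "b_O")]
        # Neither do the gated-MLP projections.
        keys += [f"blocks.{i}.mlp.{b}" for b in ("b_in", "b_gate", "b_out")]
        # RMSNorm has a scale weight but no bias.
        keys += [f"blocks.{i}.ln1.b", f"blocks.{i}.ln2.b"]
    keys.append("ln_final.b")
    return keys


def missing_optional_keys(state_dict, n_layers):
    """Return the optional keys that are absent from state_dict."""
    return [k for k in granite_optional_bias_keys(n_layers) if k not in state_dict]
```

A check like this can help distinguish expected gaps (the keys listed above) from missing parameters that are a real problem.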

__init__(cfg: Any) → None

Initialize the Granite architecture adapter.

setup_component_testing(hf_model: Any, bridge_model: Any = None) → None

Set up rotary embedding references for Granite component testing.

Parameters:

hf_model – The HuggingFace Granite model instance
bridge_model – The TransformerBridge model (if available)