transformer_lens.model_bridge.supported_architectures.granite module

Granite architecture adapter.

Base adapter for the IBM Granite model family. Provides shared config setup and helper methods used by GraniteMoe and GraniteMoeHybrid variants.

class transformer_lens.model_bridge.supported_architectures.granite.GraniteArchitectureAdapter(cfg: Any)

Bases: ArchitectureAdapter

Architecture adapter for dense IBM Granite models.

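Granite is a Llama-like architecture with RMSNorm, rotary position embeddings (RoPE), grouped-query attention (GQA), and a gated MLP with SiLU activation. Granite-specific scaling multipliers (embedding_multiplier, attention_multiplier, residual_multiplier, and logits_scaling in the HF config) are handled by the HF model’s native forward pass.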
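To make two of the building blocks named above concrete, here is a minimal pure-Python sketch of RMSNorm and a SiLU-gated MLP. It is an illustration only, not the adapter's or the HF model's code: the function names, the row-major weight layout (one row per output dimension), and the eps default are assumptions chosen for readability, and the Granite scaling multipliers are deliberately left out.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square; no mean subtraction, no bias."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for w, v in zip(weight, x)]


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def gated_mlp(x, w_gate, w_in, w_out):
    """Gated MLP without biases: w_out @ (silu(w_gate @ x) * (w_in @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_in, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_out, hidden)
```

Neither function takes a bias argument, which mirrors the bias-free parameters described in this section.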

Optional Parameters (may not exist in state_dict):

Granite models do NOT have biases on attention projections, MLP projections, or RMSNorm layers, so the following parameters may be absent:

  • blocks.{i}.attn.b_Q/b_K/b_V/b_O - No bias on attention projections

  • blocks.{i}.mlp.b_in/b_gate/b_out - No bias on MLP projections

  • blocks.{i}.ln1.b, blocks.{i}.ln2.b, ln_final.b - RMSNorm has no bias
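The key patterns listed above can be expanded and checked against a state_dict with a small helper. This is a hedged sketch: expected_missing_biases and split_optional are hypothetical names introduced here and are not part of the adapter, and the state_dict is treated as an ordinary mapping from key to tensor.

```python
def expected_missing_biases(n_layers):
    """Expand the optional bias key patterns for a model with n_layers blocks."""
    keys = []
    for i in range(n_layers):
        keys += [f"blocks.{i}.attn.b_{p}" for p in ("Q", "K", "V", "O")]
        keys += [f"blocks.{i}.mlp.b_{p}" for p in ("in", "gate", "out")]
        keys += [f"blocks.{i}.ln1.b", f"blocks.{i}.ln2.b"]
    keys.append("ln_final.b")
    return keys


def split_optional(state_dict, n_layers):
    """Return (present, absent) lists of the optional bias keys."""
    optional = expected_missing_biases(n_layers)
    present = [k for k in optional if k in state_dict]
    absent = [k for k in optional if k not in state_dict]
    return present, absent


# Usage with a placeholder state_dict that holds no bias entries.
present, absent = split_optional({"blocks.0.attn.W_Q": None}, n_layers=1)
```

For a bias-free Granite checkpoint, every optional key would be expected to land in the absent list.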

__init__(cfg: Any) → None

Initialize the Granite architecture adapter.

setup_component_testing(hf_model: Any, bridge_model: Any = None) → None

Set up rotary embedding references for Granite component testing.

Parameters:
  • hf_model – The HuggingFace Granite model instance

  • bridge_model – The TransformerBridge model (if available)
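For context on what a rotary embedding computes, the sketch below applies RoPE to a single head vector in pure Python, using the split-half ("rotate_half") pairing found in Llama-style HF implementations. It does not show how setup_component_testing wires up its references; the function names and the base of 10000.0 are assumptions for illustration.

```python
import math


def rotary_angles(position, head_dim, base=10000.0):
    """Per-dimension rotation angles; the half-length list is repeated to cover head_dim."""
    half = head_dim // 2
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(half)]
    angles = [position * f for f in inv_freq]
    return angles + angles


def rotate_half(x):
    """Map (x1, x2) to (-x2, x1), where x1 and x2 are the two halves of x."""
    half = len(x) // 2
    return [-v for v in x[half:]] + x[:half]


def apply_rope(x, position, base=10000.0):
    """Rotate each (x[i], x[i + half]) pair by a position-dependent angle."""
    angles = rotary_angles(position, len(x), base)
    rotated = rotate_half(x)
    return [v * math.cos(a) + r * math.sin(a) for v, r, a in zip(x, rotated, angles)]
```

Because each pair of dimensions undergoes a plain 2D rotation, the vector's norm is unchanged and position 0 leaves the input as it is.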