Migrating to TransformerLens 3¶
TransformerLens 3 introduces TransformerBridge, a new way of loading and instrumenting models that replaces HookedTransformer.from_pretrained as the recommended path for new code. Existing HookedTransformer code continues to run through a compatibility layer, but adopting the bridge unlocks broader architecture support and puts you on the supported path going forward.
This page explains the differences and gives side-by-side migration recipes for the most common patterns.
Why the change?¶
HookedTransformer was a single unified implementation that every supported architecture had to be mapped into. That was beautiful in theory — interpretability code written once worked everywhere — but in practice it meant that adding a new architecture required reimplementing its forward pass inside TransformerLens, and any divergence from the HuggingFace version was a latent source of bugs.
TransformerBridge flips the arrangement. Instead of reimplementing models, it keeps the native HuggingFace implementation and wraps it behind a consistent interface through an architecture adapter. The adapter knows how the HF module graph maps onto a small set of generalized components (embedding, attention, MLP, normalization, blocks) and registers uniform hook points over them. The result is the same familiar TransformerLens experience — hooks, caches, patching — but applied to the real HF model, and extended to 50+ architectures out of the box.
Loading a model¶
The loading API changes shape, but the mental model is the same: give it a HuggingFace model id, get back an object with the TransformerLens surface.
# Before — TransformerLens 2.x
from transformer_lens import HookedTransformer
model = HookedTransformer.from_pretrained("gpt2", device="cpu")
# After — TransformerLens 3.x
from transformer_lens.model_bridge import TransformerBridge
bridge = TransformerBridge.boot_transformers("gpt2", device="cpu")
Parameters that carried over¶
device, dtype, and tokenizer all work the same way.
Parameters that moved¶
Weight-processing flags (fold_ln, center_writing_weights, center_unembed, fold_value_biases, refactor_factored_attn_matrices) no longer live on the load call. They moved to enable_compatibility_mode() — see the next section.
Parameters that were removed¶
n_devices, move_to_device, first_n_layers, and n_ctx are not part of boot_transformers. If you relied on any of these, file an issue describing your use case — the right pattern for multi-GPU loads under the bridge is still being worked out.
Parameters that are new¶
load_weights: bool = True— set toFalseto construct the bridge with just the config (useful for shape-checking without paying the weight-load cost).trust_remote_code: bool = False— pass through to HuggingFace for models that ship custom modeling code.hf_config_overrides: dict | None = None— override specific fields of the HF config before the model is constructed.hf_model/model_class— advanced: pass in a pre-loaded HF model or a specific model class.
Weight processing is now opt-in¶
This is the biggest behavioral change. HookedTransformer.from_pretrained applied fold_ln, center_writing_weights, and center_unembed by default. The bridge does not apply any of these on load — the raw HF weights are preserved.
If your existing code depends on folded/centered weights (e.g. for direct logit attribution, or any analysis that reasons about activations in the post-processed coordinate system), call enable_compatibility_mode after booting:
# Before
model = HookedTransformer.from_pretrained("gpt2") # fold_ln=True, center_*=True by default
# After
bridge = TransformerBridge.boot_transformers("gpt2")
bridge.enable_compatibility_mode() # applies fold_ln, center_writing_weights, center_unembed, fold_value_biases
enable_compatibility_mode defaults to the same processing HookedTransformer used to do. You can opt out of individual steps, or disable all processing with no_processing=True:
bridge.enable_compatibility_mode(
fold_ln=True,
center_writing_weights=True,
center_unembed=True,
fold_value_biases=True,
refactor_factored_attn_matrices=False, # same default as before
)
If you want no processing at all — the bridge’s native default — you can skip enable_compatibility_mode entirely, or call it with no_processing=True if you still want the hook/component compatibility layer without the weight transforms.
Hook names¶
The canonical hook names on the bridge use a uniform hook_in / hook_out convention. The old TransformerLens names are preserved through an alias layer, so existing code keeps working without changes:
# Both of these return the same tensor on a bridge
cache["blocks.0.hook_resid_pre"] # legacy alias — still works
cache["blocks.0.hook_in"] # canonical name — preferred for new code
For the full mapping of legacy → canonical names and the expected tensor shape at each hook point, see the Model Structure page.
APIs that are unchanged¶
These work identically on TransformerBridge and need no migration:
to_tokens,to_stringgeneraterun_with_hooksrun_with_cache__call__/forwardcfg.*— the bridge exposes a.cfgwith the same fields (n_layers,n_heads,d_model,d_vocab,n_ctx, …)W_Q,W_K,W_V,W_O,b_Q,b_K,b_V,b_O— attention weights are exposed with the same[n_heads, d_model, d_head]shape conventions
If your code only touches these APIs, the migration is genuinely just the loading call and (optionally) enable_compatibility_mode.
Model name aliases are deprecated¶
HookedTransformer.from_pretrained accepted a lot of short aliases ("llama-7b-hf", "gpt-neo-125M", etc.) that mapped to specific HuggingFace paths. The bridge accepts the official HuggingFace names directly, and emits a deprecation warning when you pass a legacy alias. The aliases will be removed in the next major version.
# Legacy (deprecated, still works with a warning)
TransformerBridge.boot_transformers("gpt2")
# Preferred
TransformerBridge.boot_transformers("openai-community/gpt2")
Check the TransformerBridge Models page for the canonical model ids.
Full before-and-after example¶
A typical HookedTransformer notebook setup:
from transformer_lens import HookedTransformer
model = HookedTransformer.from_pretrained(
"gpt2",
device="cuda",
dtype=torch.float32,
)
logits, cache = model.run_with_cache("The quick brown fox")
resid_pre = cache["blocks.0.hook_resid_pre"]
pattern = cache["blocks.0.attn.hook_pattern"]
The bridge equivalent:
from transformer_lens.model_bridge import TransformerBridge
bridge = TransformerBridge.boot_transformers(
"openai-community/gpt2",
device="cuda",
dtype=torch.float32,
)
bridge.enable_compatibility_mode() # match HookedTransformer's default weight processing
logits, cache = bridge.run_with_cache("The quick brown fox")
resid_pre = cache["blocks.0.hook_in"] # or "blocks.0.hook_resid_pre" via alias
pattern = cache["blocks.0.attn.hook_pattern"]
The cache, hook, and config APIs are the same. The only lines that had to change are the import, the load call, and — if you want the old weight-processing behavior — one extra call to enable_compatibility_mode.
When to stay on HookedTransformer¶
If your code runs unchanged on TransformerLens 3 via the compatibility layer and you don’t need architectures beyond what HookedTransformer already supported, there is no hard deadline to migrate. But new architectures, weight-processing controls, and hook refinements are landing on the bridge side — new work should target the bridge, and migrating existing projects is the long-term supported path.