TransformerLens Documentation

Gallery

Research done involving TransformerLens:

  • Progress Measures for Grokking via Mechanistic Interpretability (ICLR Spotlight, 2023) by Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt

  • Finding Neurons in a Haystack: Case Studies with Sparse Probing by Wes Gurnee, Neel Nanda, Matthew Pauly, Katherine Harvey, Dmitrii Troitskii, Dimitris Bertsimas

  • Towards Automated Circuit Discovery for Mechanistic Interpretability by Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso

  • Actually, Othello-GPT Has A Linear Emergent World Representation by Neel Nanda

  • A circuit for Python docstrings in a 4-layer attention-only transformer by Stefan Heimersheim and Jett Janiak

  • A Toy Model of Universality (ICML, 2023) by Bilal Chughtai, Lawrence Chan, Neel Nanda

  • N2G: A Scalable Approach for Quantifying Interpretable Neuron Representations in Large Language Models (2023, ICLR Workshop RTML) by Alex Foote, Neel Nanda, Esben Kran, Ioannis Konstas, Fazl Barez

  • Eliciting Latent Predictions from Transformers with the Tuned Lens by Nora Belrose, Zach Furman, Logan Smith, Danny Halawi, Igor Ostrovsky, Lev McKinney, Stella Biderman, Jacob Steinhardt

User contributed examples of the library being used in action:

  • Induction Heads Phase Change Replication: A partial replication of In-Context Learning and Induction Heads, by Connor Kissane

  • Decision Transformer Interpretability: A set of scripts for training decision transformers that use TransformerLens to view intermediate activations and to perform attribution and ablations. A write-up of the initial work is also available.

Copyright © 2023, Neel Nanda