transformer_lens.model_bridge.sources package¶
Submodules¶
Module contents¶
Sources module.
This module provides functionality to load and convert models from HuggingFace to TransformerLens format.
- transformer_lens.model_bridge.sources.boot(model_name: str, hf_config_overrides: dict | None = None, device: str | device | None = None, dtype: dtype = torch.float32, tokenizer: PreTrainedTokenizerBase | None = None, load_weights: bool = True, trust_remote_code: bool = False, model_class: Any | None = None, hf_model: Any | None = None, n_ctx: int | None = None, device_map: str | dict[str, str | int] | None = None, n_devices: int | None = None, max_memory: dict[str | int, str] | None = None) TransformerBridge¶
Boot a model from HuggingFace.
- Parameters:
model_name – The name of the model to load.
hf_config_overrides – Optional overrides applied to the HuggingFace config before model load.
device – The device to use. If None, will be determined automatically. Mutually exclusive with device_map.
dtype – The dtype to use for the model.
tokenizer – Optional pre-initialized tokenizer to use; if not provided, one will be created.
load_weights – If False, load the model without weights (on the meta device) for config inspection only.
trust_remote_code – Whether to allow executing custom code from the model repository when loading; passed through to HuggingFace.
model_class – Optional HuggingFace model class to use instead of the default auto-detected class. When the class name matches a key in SUPPORTED_ARCHITECTURES, the corresponding adapter is selected automatically (e.g., BertForNextSentencePrediction).
hf_model – Optional pre-loaded HuggingFace model to use instead of loading one. Useful for models loaded with custom configurations (e.g., quantization via BitsAndBytesConfig). When provided, load_weights is ignored.
device_map – HuggingFace-style device map ("auto", "balanced", dict, etc.) for multi-GPU inference. Passed straight to from_pretrained. Mutually exclusive with device.
n_devices – Convenience: split the model across this many CUDA devices (translated to a max_memory dict internally). Requires CUDA with at least this many visible devices.
max_memory – Optional per-device memory budget for HF’s dispatcher.
n_ctx – Optional context length override. The bridge normally uses the model’s documented max context from the HF config. Setting this writes to whichever HF field the model uses (n_positions / max_position_embeddings / etc.), so callers don’t need to know the field name. If larger than the model’s default, a warning is emitted — quality may degrade past the trained length for rotary models.
- Returns:
The bridge to the loaded model.
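The n_ctx override described above writes to whichever context-length field the model’s HF config happens to use. A minimal sketch of that field-dispatch idea, assuming a plain dict config and a hypothetical helper name (the real bridge logic may differ):

```python
# Hypothetical sketch: write an n_ctx override to whichever
# context-length field a HuggingFace-style config defines.
# Field names are taken from the docstring above; the helper
# name and dict-based config are illustrative assumptions.
CTX_FIELDS = ("n_positions", "max_position_embeddings", "n_ctx")

def apply_n_ctx_override(config: dict, n_ctx: int) -> dict:
    """Set the first recognized context-length field to n_ctx."""
    for field in CTX_FIELDS:
        if field in config:
            config[field] = n_ctx
            break
    return config

gpt2_like = {"n_positions": 1024, "n_layer": 12}
apply_n_ctx_override(gpt2_like, 2048)
print(gpt2_like["n_positions"])  # 2048
```

Because the caller passes only n_ctx, no knowledge of the model-specific field name is needed; as noted above, overriding past the trained length triggers a warning for rotary models.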
- transformer_lens.model_bridge.sources.check_model_support(model_id: str) dict¶
Check if a model is supported and get detailed support info.
This function provides detailed information about a model’s compatibility with TransformerLens, including architecture type and verification status.
- Parameters:
model_id – The HuggingFace model ID to check (e.g., “gpt2”)
- Returns:
is_supported: bool - Whether the model is supported
architecture_id: str | None - The architecture type if supported
verified: bool - Whether the model has been verified to work
suggestion: str | None - Suggested alternative if not supported
- Return type:
Dictionary with support information
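The returned dictionary can be consumed directly with the four documented keys. A short handling sketch, where the dict below is a stand-in with plausible values rather than actual output of check_model_support:

```python
# Stand-in support-info dict using the documented keys; values are
# illustrative, not real check_model_support output.
info = {
    "is_supported": True,
    "architecture_id": "GPT2LMHeadModel",
    "verified": True,
    "suggestion": None,
}

if info["is_supported"]:
    label = "verified" if info["verified"] else "unverified"
    message = f"{info['architecture_id']} ({label})"
else:
    # suggestion may name an alternative model when unsupported
    message = f"Not supported; try: {info['suggestion']}"

print(message)  # GPT2LMHeadModel (verified)
```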
Example
>>> from transformer_lens.model_bridge.sources.transformers import check_model_support
>>> info = check_model_support("openai-community/gpt2")
>>> info["is_supported"]
True
- transformer_lens.model_bridge.sources.list_supported_models(architecture: str | None = None, verified_only: bool = False) list[str]¶
List all models supported by TransformerLens.
This function provides convenient access to the model registry API for discovering which HuggingFace models can be loaded.
- Parameters:
architecture – Filter by architecture ID (e.g., “GPT2LMHeadModel”). If None, returns all supported models.
verified_only – If True, only return models that have been verified to work with TransformerLens.
- Returns:
List of model IDs (e.g., [“gpt2”, “gpt2-medium”, …])
Example
>>> from transformer_lens.model_bridge.sources.transformers import list_supported_models
>>> models = list_supported_models()
>>> gpt2_models = list_supported_models(architecture="GPT2LMHeadModel")
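A common pattern is to combine the two discovery functions: enumerate candidate models, then narrow to verified ones. The sketch below uses stand-in registry data rather than real calls, since actual registry contents vary by version:

```python
# Stand-in data; in practice these dicts would come from calling
# check_model_support() on each ID returned by list_supported_models().
support_info = {
    "gpt2": {"is_supported": True, "verified": True},
    "gpt2-medium": {"is_supported": True, "verified": False},
}

# Keep only models flagged as verified to work with TransformerLens.
verified = [m for m, info in support_info.items() if info["verified"]]
print(verified)  # ['gpt2']
```

Note that passing verified_only=True to list_supported_models performs this filtering server-side in one call.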