transformer_lens.tools.model_registry package

Module contents

Model Registry tools for TransformerLens.

This package provides tools for discovering and documenting HuggingFace models that are compatible with TransformerLens.

Main modules:
  • api: Public API for programmatic access to model registry data

  • schemas: Data classes for model entries, architecture gaps, etc.

  • verification: Verification tracking for model compatibility

  • exceptions: Custom exceptions for the model registry

Example usage:
>>> from transformer_lens.tools.model_registry import api
>>> api.is_model_supported("openai-community/gpt2")
True
>>> models = api.get_architecture_models("GPT2LMHeadModel")
class transformer_lens.tools.model_registry.ArchitectureAnalysis(architecture_id: str, total_models: int, total_downloads: int, priority_score: float, top_models: list[str] = <factory>)

Bases: object

Analysis result for prioritizing architecture support.

architecture_id

The architecture identifier

Type:

str

total_models

Total models using this architecture

Type:

int

total_downloads

Sum of downloads across all models

Type:

int

priority_score

Computed priority score for implementation

Type:

float

top_models

Most popular models for this architecture

Type:

list[str]

architecture_id: str
priority_score: float
to_dict() dict

Convert to a JSON-serializable dictionary.

top_models: list[str]
total_downloads: int
total_models: int
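
The `<factory>` default shown for `top_models` in the signature above is how a dataclass field with a `default_factory` is typically rendered. The sketch below uses a local stand-in class that mirrors the documented fields; it is not the library's source, and its `to_dict` is assumed to return a plain field-name-to-value mapping. The architecture names and model ID are hypothetical.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ArchitectureAnalysisSketch:
    """Local stand-in mirroring the documented fields; not the library class."""

    architecture_id: str
    total_models: int
    total_downloads: int
    priority_score: float
    # A default_factory gives every instance its own fresh list.
    top_models: list[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        # Assumption: the documented to_dict() yields a plain dict of the fields.
        return asdict(self)


a = ArchitectureAnalysisSketch("ExampleArchA", 10, 1000, 0.5)
b = ArchitectureAnalysisSketch("ExampleArchB", 3, 200, 0.1)

# Mutating one instance's list must not leak into the other.
a.top_models.append("example-org/example-model")

analysis_dict = a.to_dict()
```

Because the default is produced by a factory rather than a shared literal, appending to `a.top_models` leaves `b.top_models` empty.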
class transformer_lens.tools.model_registry.ArchitectureGap(architecture_id: str, total_models: int, sample_models: list[str] = <factory>, total_downloads: int = 0, min_param_count: int | None = None, relevancy_score: float | None = None)

Bases: object

An unsupported architecture with model count and relevancy metrics.

architecture_id

The architecture type not supported by TransformerLens

Type:

str

total_models

Number of models on HuggingFace using this architecture

Type:

int

sample_models

Top models by downloads for this architecture (up to 10)

Type:

list[str]

total_downloads

Aggregate download count across all models of this architecture

Type:

int

min_param_count

Parameter count of the smallest model (None if unknown)

Type:

int | None

relevancy_score

Composite relevancy score (0-100), or None if not computed

Type:

float | None

architecture_id: str
classmethod from_dict(data: dict) ArchitectureGap

Create from a dictionary.

min_param_count: int | None = None
relevancy_score: float | None = None
sample_models: list[str]
to_dict() dict

Convert to a JSON-serializable dictionary.

total_downloads: int = 0
total_models: int
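
As an illustration of the `from_dict` / `to_dict` pair, the following sketch round-trips a stand-in dataclass through JSON. It assumes the dictionary keys match the documented field names and that missing optional keys fall back to the documented defaults; the real on-disk schema may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ArchitectureGapSketch:
    """Local stand-in mirroring the documented fields; not the library class."""

    architecture_id: str
    total_models: int
    sample_models: list[str] = field(default_factory=list)
    total_downloads: int = 0
    min_param_count: Optional[int] = None
    relevancy_score: Optional[float] = None

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "ArchitectureGapSketch":
        # Assumption: keys equal field names; optional keys may be absent.
        return cls(
            architecture_id=data["architecture_id"],
            total_models=data["total_models"],
            sample_models=list(data.get("sample_models", [])),
            total_downloads=data.get("total_downloads", 0),
            min_param_count=data.get("min_param_count"),
            relevancy_score=data.get("relevancy_score"),
        )


gap = ArchitectureGapSketch("ExampleArch", total_models=42, total_downloads=9000)

# Serialize to JSON text and back, then rebuild the object.
restored = ArchitectureGapSketch.from_dict(json.loads(json.dumps(gap.to_dict())))
minimal = ArchitectureGapSketch.from_dict({"architecture_id": "X", "total_models": 1})
```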
class transformer_lens.tools.model_registry.ArchitectureGapsReport(generated_at: date, gaps: list[ArchitectureGap], scan_info: ScanInfo | None = None, total_unsupported_architectures: int = 0, total_unsupported_models: int = 0)

Bases: object

Report containing unsupported architectures.

generated_at

Date when this report was generated

Type:

datetime.date

scan_info

Metadata about the scraping run

Type:

transformer_lens.tools.model_registry.schemas.ScanInfo | None

total_unsupported_architectures

Number of unsupported architectures

Type:

int

total_unsupported_models

Total models across all unsupported architectures

Type:

int

gaps

List of architecture gaps sorted by model count

Type:

list[transformer_lens.tools.model_registry.schemas.ArchitectureGap]

classmethod from_dict(data: dict) ArchitectureGapsReport

Create from a dictionary.

gaps: list[ArchitectureGap]
generated_at: date
scan_info: ScanInfo | None = None
to_dict() dict

Convert to a JSON-serializable dictionary.

total_unsupported_architectures: int = 0
total_unsupported_models: int = 0
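
A report of this shape could be assembled roughly as follows. This is a sketch over plain dictionaries, not the library's implementation: it assumes "sorted by model count" means largest first, and that `generated_at` is stored as an ISO 8601 string, since `datetime.date` objects are not directly JSON-serializable. The architecture names are hypothetical.

```python
import json
from datetime import date

gaps = [
    {"architecture_id": "ExampleArchA", "total_models": 12},
    {"architecture_id": "ExampleArchB", "total_models": 340},
    {"architecture_id": "ExampleArchC", "total_models": 57},
]

# Assumption: descending order by model count.
gaps_sorted = sorted(gaps, key=lambda g: g["total_models"], reverse=True)

report = {
    # Store the date as text so json.dumps accepts it.
    "generated_at": date(2024, 1, 15).isoformat(),
    "scan_info": None,
    "total_unsupported_architectures": len(gaps_sorted),
    "total_unsupported_models": sum(g["total_models"] for g in gaps_sorted),
    "gaps": gaps_sorted,
}

payload = json.dumps(report)

# Reading the report back: parse the ISO string into a date again.
restored_date = date.fromisoformat(json.loads(payload)["generated_at"])
```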
exception transformer_lens.tools.model_registry.ArchitectureNotSupportedError(architecture_id: str, model_count: int | None = None)

Bases: ModelRegistryError

Raised when an architecture is not supported by TransformerLens.

architecture_id

The architecture that is not supported

model_count

Number of models using this architecture (if known)

class transformer_lens.tools.model_registry.ArchitectureStats(architecture_id: str, is_supported: bool, model_count: int, verified_count: int = 0, example_models: list[str] = <factory>)

Bases: object

Statistics about a single architecture, whether it is supported or a gap.

architecture_id

The architecture identifier

Type:

str

is_supported

Whether TransformerLens supports this architecture

Type:

bool

model_count

Number of models using this architecture

Type:

int

verified_count

Number of verified models (if supported)

Type:

int

example_models

Sample model IDs for this architecture

Type:

list[str]

architecture_id: str
example_models: list[str]
is_supported: bool
model_count: int
to_dict() dict

Convert to a JSON-serializable dictionary.

verified_count: int = 0
exception transformer_lens.tools.model_registry.DataNotLoadedError(data_type: str, path: str | None = None)

Bases: ModelRegistryError

Raised when registry data has not been loaded or is unavailable.

data_type

Type of data that was not loaded (e.g., “supported_models”)

path

Optional path where data was expected

exception transformer_lens.tools.model_registry.DataValidationError(file_path: str, errors: list[str])

Bases: ModelRegistryError

Raised when registry data fails validation.

file_path

Path to the file that failed validation

errors

List of validation error messages

class transformer_lens.tools.model_registry.ModelEntry(architecture_id: str, model_id: str, status: int = 0, verified_date: date | None = None, metadata: ModelMetadata | None = None, note: str | None = None, phase1_score: float | None = None, phase2_score: float | None = None, phase3_score: float | None = None)

Bases: object

A single model entry in the supported models list.

architecture_id

The architecture type (e.g., “GPT2LMHeadModel”)

Type:

str

model_id

The HuggingFace model ID (e.g., “gpt2”, “openai-community/gpt2”)

Type:

str

status

Verification status (0=unverified, 1=verified, 2=skipped, 3=failed)

Type:

int

verified_date

Date when verification was performed

Type:

datetime.date | None

metadata

Optional metadata from HuggingFace

Type:

transformer_lens.tools.model_registry.schemas.ModelMetadata | None

note

Optional note (skip/fail reason, e.g. “Estimated 48 GB exceeds 16 GB limit”)

Type:

str | None

phase1_score

Benchmark Phase 1 score (HF vs Bridge), 0-100 or None

Type:

float | None

phase2_score

Benchmark Phase 2 score (Bridge vs HT unprocessed), 0-100 or None

Type:

float | None

phase3_score

Benchmark Phase 3 score (Bridge vs HT processed), 0-100 or None

Type:

float | None

architecture_id: str
classmethod from_dict(data: dict) ModelEntry

Create from a dictionary.

metadata: ModelMetadata | None = None
model_id: str
note: str | None = None
phase1_score: float | None = None
phase2_score: float | None = None
phase3_score: float | None = None
status: int = 0
to_dict() dict

Convert to a JSON-serializable dictionary.

verified_date: date | None = None
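
The integer `status` codes documented above (0=unverified, 1=verified, 2=skipped, 3=failed) can be made more readable at call sites with an enum. The `VerificationStatus` enum and `describe_status` helper below are hypothetical names introduced for illustration; nothing in this page indicates the package exports them.

```python
from enum import IntEnum


class VerificationStatus(IntEnum):
    """Hypothetical names for the documented integer status codes."""

    UNVERIFIED = 0
    VERIFIED = 1
    SKIPPED = 2
    FAILED = 3


def describe_status(status: int) -> str:
    """Map a documented status code to a lowercase label."""
    # Raises ValueError for codes outside 0-3.
    return VerificationStatus(status).name.lower()


labels = [describe_status(code) for code in (0, 1, 2, 3)]
```

An `IntEnum` compares equal to its integer value, so `VerificationStatus.VERIFIED == 1` holds and the enum can be compared directly against a stored `status` field.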
class transformer_lens.tools.model_registry.ModelMetadata(downloads: int = 0, likes: int = 0, last_modified: datetime.datetime | None = None, tags: list[str] = <factory>, parameter_count: int | None = None)

Bases: object

Metadata for a model from HuggingFace.

downloads

Total download count for the model

Type:

int

likes

Number of likes/stars on HuggingFace

Type:

int

last_modified

When the model was last updated

Type:

datetime.datetime | None

tags

List of tags associated with the model

Type:

list[str]

parameter_count

Estimated number of parameters (if available)

Type:

int | None

downloads: int = 0
classmethod from_dict(data: dict) ModelMetadata

Create from a dictionary.

last_modified: datetime | None = None
likes: int = 0
parameter_count: int | None = None
tags: list[str]
to_dict() dict

Convert to a JSON-serializable dictionary.

exception transformer_lens.tools.model_registry.ModelNotFoundError(model_id: str, suggestion: str | None = None)

Bases: ModelRegistryError

Raised when a requested model ID is not found in the registry.

model_id

The model ID that was not found

suggestion

Optional suggested alternative model

exception transformer_lens.tools.model_registry.ModelRegistryError

Bases: Exception

Base exception for all model registry errors.
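
Since every exception on this page derives from `ModelRegistryError`, callers can catch the base class to handle any registry failure. The sketch below defines local stand-ins with the documented constructor signatures so it runs without the package installed; the message text and the `lookup` helper are assumptions made for illustration.

```python
from typing import Optional


class ModelRegistryError(Exception):
    """Local stand-in for the documented base exception."""


class ModelNotFoundError(ModelRegistryError):
    def __init__(self, model_id: str, suggestion: Optional[str] = None):
        self.model_id = model_id
        self.suggestion = suggestion
        # Assumption: the real message format is not documented here.
        super().__init__(f"model not found: {model_id}")


class ArchitectureNotSupportedError(ModelRegistryError):
    def __init__(self, architecture_id: str, model_count: Optional[int] = None):
        self.architecture_id = architecture_id
        self.model_count = model_count
        super().__init__(f"architecture not supported: {architecture_id}")


def lookup(model_id: str, known: set[str]) -> str:
    """Hypothetical helper that raises when the ID is unknown."""
    if model_id not in known:
        raise ModelNotFoundError(model_id, suggestion="openai-community/gpt2")
    return model_id


caught = None
try:
    lookup("not-a-real/model", {"openai-community/gpt2"})
except ModelRegistryError as exc:  # catches any subclass
    caught = exc
```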

class transformer_lens.tools.model_registry.ScanInfo(total_scanned: int, task_filter: str, scan_duration_seconds: float | None = None)

Bases: object

Metadata about a scraping run.

total_scanned

Total number of models scanned in this run

Type:

int

task_filter

HuggingFace task filter used (e.g., “text-generation”)

Type:

str

scan_duration_seconds

How long the scan took in seconds (if available)

Type:

float | None

classmethod from_dict(data: dict) ScanInfo

Create from a dictionary.

scan_duration_seconds: float | None = None
task_filter: str
to_dict() dict

Convert to a JSON-serializable dictionary.

total_scanned: int
class transformer_lens.tools.model_registry.SupportedModelsReport(generated_at: date, total_models: int, models: list[ModelEntry], scan_info: ScanInfo | None = None, total_architectures: int = 0, total_verified: int = 0)

Bases: object

Report containing all supported models.

generated_at

Date when this report was generated

Type:

datetime.date

scan_info

Metadata about the scraping run

Type:

transformer_lens.tools.model_registry.schemas.ScanInfo | None

total_architectures

Number of unique supported architectures

Type:

int

total_models

Total number of supported models

Type:

int

total_verified

Number of models that have been verified

Type:

int

models

List of all model entries

Type:

list[transformer_lens.tools.model_registry.schemas.ModelEntry]

classmethod from_dict(data: dict) SupportedModelsReport

Create from a dictionary.

generated_at: date
models: list[ModelEntry]
scan_info: ScanInfo | None = None
to_dict() dict

Convert to a JSON-serializable dictionary.

total_architectures: int = 0
total_models: int
total_verified: int = 0
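
The summary fields follow from the model list according to their descriptions above. The sketch below derives them from plain dictionaries rather than real `ModelEntry` objects; it assumes `status == 1` marks a verified model, as documented for `ModelEntry.status`. The `example-org` model IDs and `ExampleArch` are hypothetical.

```python
models = [
    {"architecture_id": "GPT2LMHeadModel", "model_id": "openai-community/gpt2", "status": 1},
    {"architecture_id": "GPT2LMHeadModel", "model_id": "example-org/gpt2-variant", "status": 0},
    {"architecture_id": "ExampleArch", "model_id": "example-org/example-model", "status": 3},
]

# Total number of supported models.
total_models = len(models)

# Number of unique architectures across all entries.
total_architectures = len({m["architecture_id"] for m in models})

# Number of entries whose status code is 1 (verified).
total_verified = sum(1 for m in models if m["status"] == 1)
```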
class transformer_lens.tools.model_registry.VerificationHistory(records: list[transformer_lens.tools.model_registry.verification.VerificationRecord] = <factory>, last_updated: datetime.datetime | None = None)

Bases: object

History of all model verifications.

records

List of all verification records

Type:

list[transformer_lens.tools.model_registry.verification.VerificationRecord]

last_updated

When this history was last updated

Type:

datetime.datetime | None

add_record(record: VerificationRecord) None

Add a new verification record.

Parameters:

record – The verification record to add

classmethod from_dict(data: dict) VerificationHistory

Create from a dictionary.

get_record(model_id: str) VerificationRecord | None

Get the most recent valid verification record for a model.

Parameters:

model_id – The model ID to look up

Returns:

The verification record, or None if not found or invalidated

invalidate(model_id: str, reason: str) bool

Invalidate the most recent verification for a model.

Parameters:
  • model_id – The model ID to invalidate

  • reason – Reason for invalidation

Returns:

True if a record was invalidated, False if not found

is_verified(model_id: str) bool

Check if a model has a valid verification.

Parameters:

model_id – The model ID to check

Returns:

True if the model has a valid (non-invalidated) verification

last_updated: datetime | None = None
records: list[VerificationRecord]
to_dict() dict

Convert to a JSON-serializable dictionary.
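
The interplay of `add_record`, `get_record`, `invalidate`, and `is_verified` can be sketched with stand-in classes. This is one plausible reading of the descriptions above, not the library's code: it assumes records are appended in chronological order and that `get_record` returns the newest record that has not been invalidated. The real implementation may instead inspect only the single most recent record.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class RecordSketch:
    """Stand-in with a subset of the documented VerificationRecord fields."""

    model_id: str
    verified_date: date
    invalidated: bool = False
    invalidation_reason: Optional[str] = None


@dataclass
class HistorySketch:
    """Stand-in for VerificationHistory; not the library class."""

    records: list[RecordSketch] = field(default_factory=list)

    def add_record(self, record: RecordSketch) -> None:
        self.records.append(record)

    def get_record(self, model_id: str) -> Optional[RecordSketch]:
        # Assumption: later list entries are more recent, so scan newest-first.
        for record in reversed(self.records):
            if record.model_id == model_id and not record.invalidated:
                return record
        return None

    def invalidate(self, model_id: str, reason: str) -> bool:
        record = self.get_record(model_id)
        if record is None:
            return False
        record.invalidated = True
        record.invalidation_reason = reason
        return True

    def is_verified(self, model_id: str) -> bool:
        return self.get_record(model_id) is not None


history = HistorySketch()
history.add_record(RecordSketch("openai-community/gpt2", date(2024, 1, 15)))

verified_before = history.is_verified("openai-community/gpt2")
invalidated = history.invalidate("openai-community/gpt2", "example reason")
verified_after = history.is_verified("openai-community/gpt2")
missing = history.invalidate("not-a-real/model", "example reason")
```

Invalidation keeps the record in `records` and only flags it, which preserves the audit trail while making `is_verified` return False.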

class transformer_lens.tools.model_registry.VerificationRecord(model_id: str, verified_date: date, architecture_id: str = 'Unknown', verified_by: str | None = None, transformerlens_version: str | None = None, notes: str | None = None, invalidated: bool = False, invalidation_reason: str | None = None)

Bases: object

A record of a model verification.

model_id

The HuggingFace model ID that was verified

Type:

str

architecture_id

The architecture type of the model

Type:

str

verified_date

Date when verification was performed

Type:

datetime.date

verified_by

Who performed the verification (user, CI, etc.)

Type:

str | None

transformerlens_version

Version of TransformerLens used

Type:

str | None

notes

Optional notes about the verification

Type:

str | None

invalidated

Whether this verification has been invalidated

Type:

bool

invalidation_reason

Reason for invalidation if applicable

Type:

str | None

architecture_id: str = 'Unknown'
classmethod from_dict(data: dict) VerificationRecord

Create from a dictionary.

invalidated: bool = False
invalidation_reason: str | None = None
model_id: str
notes: str | None = None
to_dict() dict

Convert to a JSON-serializable dictionary.

transformerlens_version: str | None = None
verified_by: str | None = None
verified_date: date