transformer_lens.tools.model_registry package

Module contents

Model Registry tools for TransformerLens.

This package provides tools for discovering and documenting HuggingFace models that are compatible with TransformerLens.

Main modules:

api: Public API for programmatic access to model registry data
schemas: Data classes for model entries, architecture gaps, etc.
verification: Verification tracking for model compatibility
exceptions: Custom exceptions for the model registry

Example usage:

>>> from transformer_lens.tools.model_registry import api  
>>> api.is_model_supported("openai-community/gpt2")  
True
>>> models = api.get_architecture_models("GPT2LMHeadModel")  

class transformer_lens.tools.model_registry.ArchitectureAnalysis(architecture_id: str, total_models: int, total_downloads: int, priority_score: float, top_models: list[str] = <factory>)

Bases: object

Analysis result for prioritizing architecture support.

architecture_id

The architecture identifier

Type: str

total_models

Total models using this architecture

Type: int

total_downloads

Sum of downloads across all models

Type: int

priority_score

Computed priority score for implementation

Type: float

top_models

Most popular models for this architecture

Type: list[str]

architecture_id: str

priority_score: float

to_dict() → dict

Convert to a JSON-serializable dictionary.

top_models: list[str]

total_downloads: int

total_models: int

class transformer_lens.tools.model_registry.ArchitectureGap(architecture_id: str, total_models: int, sample_models: list[str] = <factory>, total_downloads: int = 0, min_param_count: int | None = None, relevancy_score: float | None = None)

Bases: object

An unsupported architecture with model count and relevancy metrics.

architecture_id

The architecture type not supported by TransformerLens

Type: str

total_models

Number of models on HuggingFace using this architecture

Type: int

sample_models

Top models by downloads for this architecture (up to 10)

Type: list[str]

total_downloads

Aggregate download count across all models of this architecture

Type: int

min_param_count

Parameter count of the smallest model (None if unknown)

Type: int | None

relevancy_score

Composite relevancy score (0-100), or None if not computed

Type: float | None

architecture_id: str

classmethod from_dict(data: dict) → ArchitectureGap

Create from a dictionary.

min_param_count: int | None = None

relevancy_score: float | None = None

sample_models: list[str]

to_dict() → dict

Convert to a JSON-serializable dictionary.

total_downloads: int = 0

total_models: int
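
The to_dict() and from_dict() pair suggests a plain dictionary round trip. The following is a minimal sketch only: GapSketch is a hypothetical stand-in that mirrors the documented fields, and its key names and handling of missing keys are assumptions, not the behavior of the real class.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class GapSketch:
    """Hypothetical stand-in mirroring the documented ArchitectureGap fields."""

    architecture_id: str
    total_models: int
    sample_models: list[str] = field(default_factory=list)
    total_downloads: int = 0
    min_param_count: int | None = None
    relevancy_score: float | None = None

    def to_dict(self) -> dict:
        # Assumption: a field-for-field mapping; the real key names may differ.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "GapSketch":
        # Assumption: keys missing from `data` fall back to the dataclass defaults.
        return cls(**data)


gap = GapSketch("ExampleForCausalLM", total_models=12, sample_models=["example-org/example-1b"])

# Every field is a primitive, a list of strings, or None, so json can encode it directly.
payload = json.dumps(gap.to_dict())
restored = GapSketch.from_dict(json.loads(payload))
```

In this sketch the restored object compares equal to the original, which is the property a round trip is meant to preserve.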

class transformer_lens.tools.model_registry.ArchitectureGapsReport(generated_at: date, gaps: list[ArchitectureGap], scan_info: ScanInfo | None = None, total_unsupported_architectures: int = 0, total_unsupported_models: int = 0)

Bases: object

Report containing unsupported architectures.

generated_at

Date when this report was generated

Type: datetime.date

scan_info

Metadata about the scraping run

Type: transformer_lens.tools.model_registry.schemas.ScanInfo | None

total_unsupported_architectures

Number of unsupported architectures

Type: int

total_unsupported_models

Total models across all unsupported architectures

Type: int

gaps

List of architecture gaps sorted by model count

Type: list[transformer_lens.tools.model_registry.schemas.ArchitectureGap]

classmethod from_dict(data: dict) → ArchitectureGapsReport

Create from a dictionary.

gaps: list[ArchitectureGap]

generated_at: date

scan_info: ScanInfo | None = None

to_dict() → dict

Convert to a JSON-serializable dictionary.

total_unsupported_architectures: int = 0

total_unsupported_models: int = 0

exception transformer_lens.tools.model_registry.ArchitectureNotSupportedError(architecture_id: str, model_count: int | None = None)

Bases: ModelRegistryError

Raised when an architecture is not supported by TransformerLens.

architecture_id

The architecture that is not supported

model_count

Number of models using this architecture (if known)

class transformer_lens.tools.model_registry.ArchitectureStats(architecture_id: str, is_supported: bool, model_count: int, verified_count: int = 0, example_models: list[str] = <factory>)

Bases: object

Statistics about an architecture including supported and gap info.

architecture_id

The architecture identifier

Type: str

is_supported

Whether TransformerLens supports this architecture

Type: bool

model_count

Number of models using this architecture

Type: int

verified_count

Number of verified models (if supported)

Type: int

example_models

Sample model IDs for this architecture

Type: list[str]

architecture_id: str

example_models: list[str]

is_supported: bool

model_count: int

to_dict() → dict

Convert to a JSON-serializable dictionary.

verified_count: int = 0

exception transformer_lens.tools.model_registry.DataNotLoadedError(data_type: str, path: str | None = None)

Bases: ModelRegistryError

Raised when registry data has not been loaded or is unavailable.

data_type

Type of data that was not loaded (e.g., “supported_models”)

path

Optional path where data was expected

exception transformer_lens.tools.model_registry.DataValidationError(file_path: str, errors: list[str])

Bases: ModelRegistryError

Raised when registry data fails validation.

file_path

Path to the file that failed validation

errors

List of validation error messages

class transformer_lens.tools.model_registry.ModelEntry(architecture_id: str, model_id: str, status: int = 0, verified_date: date | None = None, metadata: ModelMetadata | None = None, note: str | None = None, phase1_score: float | None = None, phase2_score: float | None = None, phase3_score: float | None = None)

Bases: object

A single model entry in the supported models list.

architecture_id

The architecture type (e.g., “GPT2LMHeadModel”)

Type: str

model_id

The HuggingFace model ID (e.g., “gpt2”, “openai-community/gpt2”)

Type: str

status

Verification status (0=unverified, 1=verified, 2=skipped, 3=failed)

Type: int

verified_date

Date when verification was performed

Type: datetime.date | None

metadata

Optional metadata from HuggingFace

Type: transformer_lens.tools.model_registry.schemas.ModelMetadata | None

note

Optional note (skip/fail reason, e.g. “Estimated 48 GB exceeds 16 GB limit”)

Type: str | None

phase1_score

Benchmark Phase 1 score (HF vs Bridge), 0-100 or None

Type: float | None

phase2_score

Benchmark Phase 2 score (Bridge vs HT unprocessed), 0-100 or None

Type: float | None

phase3_score

Benchmark Phase 3 score (Bridge vs HT processed), 0-100 or None

Type: float | None

architecture_id: str

classmethod from_dict(data: dict) → ModelEntry

Create from a dictionary.

metadata: ModelMetadata | None = None

model_id: str

note: str | None = None

phase1_score: float | None = None

phase2_score: float | None = None

phase3_score: float | None = None

status: int = 0

to_dict() → dict

Convert to a JSON-serializable dictionary.

verified_date: date | None = None
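
Because status is stored as a bare integer, a small helper can make the documented codes easier to read in logs or reports. This is a sketch under stated assumptions: StatusSketch and describe are hypothetical names, not part of the package. Only the integer-to-meaning mapping comes from the attribute description above.

```python
from __future__ import annotations

from enum import IntEnum


class StatusSketch(IntEnum):
    """Hypothetical names for the documented integer status codes."""

    UNVERIFIED = 0
    VERIFIED = 1
    SKIPPED = 2
    FAILED = 3


def describe(status: int, note: str | None = None) -> str:
    # Map the raw integer to a lowercase label and append the note when one is given.
    label = StatusSketch(status).name.lower()
    return f"{label} ({note})" if note else label


verified_label = describe(1)
skipped_label = describe(2, "Estimated 48 GB exceeds 16 GB limit")
```

An unknown integer raises ValueError in this sketch, which surfaces unexpected status codes early instead of silently mislabeling them.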

class transformer_lens.tools.model_registry.ModelMetadata(downloads: int = 0, likes: int = 0, last_modified: datetime.datetime | None = None, tags: list[str] = <factory>, parameter_count: int | None = None)

Bases: object

Metadata for a model from HuggingFace.

downloads

Total download count for the model

Type: int

likes

Number of likes/stars on HuggingFace

Type: int

last_modified

When the model was last updated

Type: datetime.datetime | None

tags

List of tags associated with the model

Type: list[str]

parameter_count

Estimated number of parameters (if available)

Type: int | None

downloads: int = 0

classmethod from_dict(data: dict) → ModelMetadata

Create from a dictionary.

last_modified: datetime | None = None

likes: int = 0

parameter_count: int | None = None

tags: list[str]

to_dict() → dict

Convert to a JSON-serializable dictionary.
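
A datetime value such as last_modified cannot be passed to json.dumps directly, so a JSON-serializable dictionary has to encode it somehow. The sketch below assumes ISO 8601 strings; the format the real to_dict() uses is not documented here, and the dictionary keys are illustrative only.

```python
import json
from datetime import datetime, timezone

last_modified = datetime(2024, 1, 15, 12, 30, tzinfo=timezone.utc)

# Assumption: timestamps are stored as ISO 8601 strings; keys mirror the documented field names.
record = {
    "downloads": 1000,
    "likes": 5,
    "last_modified": last_modified.isoformat(),
    "tags": ["text-generation"],
    "parameter_count": None,
}

encoded = json.dumps(record)
decoded = json.loads(encoded)

# Parsing the string back recovers the original timezone-aware timestamp.
restored = datetime.fromisoformat(decoded["last_modified"])
```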

exception transformer_lens.tools.model_registry.ModelNotFoundError(model_id: str, suggestion: str | None = None)

Bases: ModelRegistryError

Raised when a requested model ID is not found in the registry.

model_id

The model ID that was not found

suggestion

Optional suggested alternative model

exception transformer_lens.tools.model_registry.ModelRegistryError

Bases: Exception

Base exception for all model registry errors.
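
Since every registry exception derives from this base class, a caller can handle all registry failures with one except clause and still inspect subclass attributes. The sketch below uses hypothetical stand-in classes that copy only the documented constructor signature of ModelNotFoundError; the message wording and the lookup function are assumptions.

```python
from __future__ import annotations


class RegistryErrorSketch(Exception):
    """Stand-in for ModelRegistryError."""


class ModelNotFoundSketch(RegistryErrorSketch):
    """Stand-in for ModelNotFoundError with the documented attributes."""

    def __init__(self, model_id: str, suggestion: str | None = None):
        self.model_id = model_id
        self.suggestion = suggestion
        # Assumption: the real message wording is not documented here.
        super().__init__(f"model not found: {model_id}")


def lookup(model_id: str) -> str:
    # Hypothetical lookup that always fails, to exercise the handler below.
    raise ModelNotFoundSketch(model_id, suggestion="openai-community/gpt2")


caught = None
hint = None
try:
    lookup("gpt-2")
except RegistryErrorSketch as err:
    # Catching the base class also catches the more specific subclass.
    caught = err
    hint = getattr(err, "suggestion", None)
```

Using getattr for the suggestion keeps the handler valid for subclasses that do not define that attribute.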

class transformer_lens.tools.model_registry.ScanInfo(total_scanned: int, task_filter: str, scan_duration_seconds: float | None = None)

Bases: object

Metadata about a scraping run.

total_scanned

Total number of models scanned in this run

Type: int

task_filter

HuggingFace task filter used (e.g., “text-generation”)

Type: str

scan_duration_seconds

How long the scan took in seconds (if available)

Type: float | None

classmethod from_dict(data: dict) → ScanInfo

Create from a dictionary.

scan_duration_seconds: float | None = None

task_filter: str

to_dict() → dict

Convert to a JSON-serializable dictionary.

total_scanned: int

class transformer_lens.tools.model_registry.SupportedModelsReport(generated_at: date, total_models: int, models: list[ModelEntry], scan_info: ScanInfo | None = None, total_architectures: int = 0, total_verified: int = 0)

Bases: object

Report containing all supported models.

generated_at

Date when this report was generated

Type: datetime.date

scan_info

Metadata about the scraping run

Type: transformer_lens.tools.model_registry.schemas.ScanInfo | None

total_architectures

Number of unique supported architectures

Type: int

total_models

Total number of supported models

Type: int

total_verified

Number of models that have been verified

Type: int

models

List of all model entries

Type: list[transformer_lens.tools.model_registry.schemas.ModelEntry]

classmethod from_dict(data: dict) → SupportedModelsReport

Create from a dictionary.

generated_at: date

models: list[ModelEntry]

scan_info: ScanInfo | None = None

to_dict() → dict

Convert to a JSON-serializable dictionary.

total_architectures: int = 0

total_models: int

total_verified: int = 0
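
The three summary counts can be read as aggregates over the models list. As a rough illustration, the sketch below derives them from plain tuples standing in for ModelEntry objects. The model IDs other than openai-community/gpt2 are made up, and whether the real report computes these fields this way or receives them precomputed is an assumption.

```python
# Hypothetical (architecture_id, model_id, status) rows standing in for ModelEntry objects.
entries = [
    ("GPT2LMHeadModel", "openai-community/gpt2", 1),
    ("GPT2LMHeadModel", "example-org/gpt2-variant", 0),
    ("ExampleForCausalLM", "example-org/example-1b", 3),
]

total_models = len(entries)

# A set comprehension keeps each architecture once, giving the unique count.
total_architectures = len({arch for arch, _, _ in entries})

# Status 1 is documented as "verified" on ModelEntry.
total_verified = sum(1 for _, _, status in entries if status == 1)
```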

class transformer_lens.tools.model_registry.VerificationHistory(records: list[transformer_lens.tools.model_registry.verification.VerificationRecord] = <factory>, last_updated: datetime.datetime | None = None)

Bases: object

History of all model verifications.

records

List of all verification records

Type: list[transformer_lens.tools.model_registry.verification.VerificationRecord]

last_updated

When this history was last updated

Type: datetime.datetime | None

add_record(record: VerificationRecord) → None

Add a new verification record.

Parameters: record – The verification record to add

classmethod from_dict(data: dict) → VerificationHistory

Create from a dictionary.

get_record(model_id: str) → VerificationRecord | None

Get the most recent valid verification record for a model.

Parameters: model_id – The model ID to look up
Returns: The verification record, or None if not found or invalidated

invalidate(model_id: str, reason: str) → bool

Invalidate the most recent verification for a model.

Parameters:

model_id – The model ID to invalidate
reason – Reason for invalidation

Returns:

True if a record was invalidated, False if not found

is_verified(model_id: str) → bool

Check if a model has a valid verification.

Parameters: model_id – The model ID to check
Returns: True if the model has a valid (non-invalidated) verification

last_updated: datetime | None = None

records: list[VerificationRecord]

to_dict() → dict

Convert to a JSON-serializable dictionary.
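
The interaction between add_record, get_record, invalidate, and is_verified is easier to follow with a small model of it. The sketch below uses hypothetical stand-in classes with a subset of the documented fields. Treating "most recent" as the last record appended, and returning None when that record is invalidated, are assumptions about behavior the descriptions above leave open.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class RecordSketch:
    """Stand-in with a subset of the documented VerificationRecord fields."""

    model_id: str
    verified_date: date
    invalidated: bool = False
    invalidation_reason: str | None = None


@dataclass
class HistorySketch:
    """Stand-in for VerificationHistory; the lookup order is an assumption."""

    records: list[RecordSketch] = field(default_factory=list)

    def add_record(self, record: RecordSketch) -> None:
        self.records.append(record)

    def get_record(self, model_id: str) -> RecordSketch | None:
        # Assumption: "most recent" means the last matching record appended.
        for record in reversed(self.records):
            if record.model_id == model_id:
                return None if record.invalidated else record
        return None

    def invalidate(self, model_id: str, reason: str) -> bool:
        record = self.get_record(model_id)
        if record is None:
            return False
        record.invalidated = True
        record.invalidation_reason = reason
        return True

    def is_verified(self, model_id: str) -> bool:
        return self.get_record(model_id) is not None


history = HistorySketch()
history.add_record(RecordSketch("openai-community/gpt2", date(2024, 1, 15)))

before = history.is_verified("openai-community/gpt2")
changed = history.invalidate("openai-community/gpt2", "weights updated upstream")
after = history.is_verified("openai-community/gpt2")
```

In this model an invalidated record stays in the list with its reason attached, so the history remains auditable while the model no longer counts as verified.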

class transformer_lens.tools.model_registry.VerificationRecord(model_id: str, verified_date: date, architecture_id: str = 'Unknown', verified_by: str | None = None, transformerlens_version: str | None = None, notes: str | None = None, invalidated: bool = False, invalidation_reason: str | None = None)

Bases: object

A record of a model verification.

model_id

The HuggingFace model ID that was verified

Type: str

architecture_id

The architecture type of the model

Type: str

verified_date

Date when verification was performed

Type: datetime.date

verified_by

Who performed the verification (user, CI, etc.)

Type: str | None

transformerlens_version

Version of TransformerLens used

Type: str | None

notes

Optional notes about the verification

Type: str | None

invalidated

Whether this verification has been invalidated

Type: bool

invalidation_reason

Reason for invalidation if applicable

Type: str | None

architecture_id: str = 'Unknown'

classmethod from_dict(data: dict) → VerificationRecord

Create from a dictionary.

invalidated: bool = False

invalidation_reason: str | None = None

model_id: str

notes: str | None = None

to_dict() → dict

Convert to a JSON-serializable dictionary.

transformerlens_version: str | None = None

verified_by: str | None = None

verified_date: date