transformer_lens.tools.model_registry.schemas module

Data schemas for the model registry.

This module defines the dataclasses used throughout the model registry for representing supported models, architecture gaps, and related metadata.

class transformer_lens.tools.model_registry.schemas.ArchitectureAnalysis(architecture_id: str, total_models: int, total_downloads: int, priority_score: float, top_models: list[str] = <factory>)

Bases: object

Analysis result for prioritizing architecture support.

architecture_id

The architecture identifier

Type: str

total_models

Total models using this architecture

Type: int

total_downloads

Sum of downloads across all models

Type: int

priority_score

Computed priority score for implementation

Type: float

top_models

Most popular models for this architecture

Type: list[str]

architecture_id: str

priority_score: float

to_dict() → dict: Convert to a JSON-serializable dictionary.

top_models: list[str]

total_downloads: int

total_models: int
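
The fields above suggest how analysis results might be ranked when deciding which architecture to implement next. The following is a minimal sketch, not the library's code: `AnalysisSketch` is a hypothetical stand-in that mirrors the documented fields, the architecture names and numbers are made up, and it assumes that a higher priority_score means higher priority.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class AnalysisSketch:
    """Hypothetical stand-in mirroring the documented ArchitectureAnalysis fields."""

    architecture_id: str
    total_models: int
    total_downloads: int
    priority_score: float
    top_models: list[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        # All fields are str/int/float/list, so asdict() is already JSON-safe.
        return asdict(self)


# Made-up values for illustration only.
analyses = [
    AnalysisSketch("ArchA", total_models=120, total_downloads=5_000, priority_score=31.5),
    AnalysisSketch("ArchB", total_models=40, total_downloads=90_000, priority_score=72.0),
]

# Assumption: a higher priority_score should be handled first.
ranked = sorted(analyses, key=lambda a: a.priority_score, reverse=True)
ranked_ids = [a.architecture_id for a in ranked]
```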

class transformer_lens.tools.model_registry.schemas.ArchitectureGap(architecture_id: str, total_models: int, sample_models: list[str] = <factory>, total_downloads: int = 0, min_param_count: int | None = None, relevancy_score: float | None = None)

Bases: object

An unsupported architecture with model count and relevancy metrics.

architecture_id

The architecture type not supported by TransformerLens

Type: str

total_models

Number of models on HuggingFace using this architecture

Type: int

sample_models

Top models by downloads for this architecture (up to 10)

Type: list[str]

total_downloads

Aggregate download count across all models of this architecture

Type: int

min_param_count

Parameter count of the smallest model (None if unknown)

Type: int | None

relevancy_score

Composite relevancy score (0-100), or None if not computed

Type: float | None

architecture_id: str

classmethod from_dict(data: dict) → ArchitectureGap: Create from a dictionary.

min_param_count: int | None = None

relevancy_score: float | None = None

sample_models: list[str]

to_dict() → dict: Convert to a JSON-serializable dictionary.

total_downloads: int = 0

total_models: int
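
To illustrate how a to_dict() / from_dict() pair like the one documented above can round-trip through JSON, here is a minimal sketch. `GapSketch` is a hypothetical stand-in that copies the documented field names and defaults; the real methods may differ, and the fallback to defaults for missing keys is an assumption.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class GapSketch:
    """Hypothetical stand-in mirroring the documented ArchitectureGap fields."""

    architecture_id: str
    total_models: int
    sample_models: list[str] = field(default_factory=list)
    total_downloads: int = 0
    min_param_count: Optional[int] = None
    relevancy_score: Optional[float] = None

    def to_dict(self) -> dict:
        # Only str/int/float/list/None values appear, so the dict is JSON-safe.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "GapSketch":
        # Assumption: missing optional keys fall back to the dataclass defaults.
        return cls(
            architecture_id=data["architecture_id"],
            total_models=data["total_models"],
            sample_models=list(data.get("sample_models", [])),
            total_downloads=data.get("total_downloads", 0),
            min_param_count=data.get("min_param_count"),
            relevancy_score=data.get("relevancy_score"),
        )


# "ExampleForCausalLM" is a made-up architecture name.
gap = GapSketch(architecture_id="ExampleForCausalLM", total_models=42)
restored = GapSketch.from_dict(json.loads(json.dumps(gap.to_dict())))
```

Because sample_models uses a default factory, each instance gets its own empty list rather than sharing one mutable default.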

class transformer_lens.tools.model_registry.schemas.ArchitectureGapsReport(generated_at: date, gaps: list[ArchitectureGap], scan_info: ScanInfo | None = None, total_unsupported_architectures: int = 0, total_unsupported_models: int = 0)

Bases: object

Report containing unsupported architectures.

generated_at

Date when this report was generated

Type: datetime.date

scan_info

Metadata about the scraping run

Type: transformer_lens.tools.model_registry.schemas.ScanInfo | None

total_unsupported_architectures

Number of unsupported architectures

Type: int

total_unsupported_models

Total models across all unsupported architectures

Type: int

gaps

List of architecture gaps sorted by model count

Type: list[transformer_lens.tools.model_registry.schemas.ArchitectureGap]

classmethod from_dict(data: dict) → ArchitectureGapsReport: Create from a dictionary.

gaps: list[ArchitectureGap]

generated_at: date

scan_info: ScanInfo | None = None

to_dict() → dict: Convert to a JSON-serializable dictionary.

total_unsupported_architectures: int = 0

total_unsupported_models: int = 0
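
A report like this holds a datetime.date and a nested list, neither of which json.dumps accepts as-is when the list contains dataclass instances. The sketch below shows one way such a report could be flattened. `GapRow` and `GapsReportSketch` are hypothetical stand-ins, and three choices are assumptions rather than documented behavior: the date is written as an ISO 8601 string, the gaps are sorted in descending order of model count, and the totals are derived from the gaps instead of being stored.

```python
import json
from dataclasses import dataclass, field
from datetime import date


@dataclass
class GapRow:
    """Hypothetical, trimmed-down gap entry."""

    architecture_id: str
    total_models: int


@dataclass
class GapsReportSketch:
    """Hypothetical stand-in for a gaps report."""

    generated_at: date
    gaps: list[GapRow] = field(default_factory=list)

    def to_dict(self) -> dict:
        # Assumption: "sorted by model count" means largest first.
        ordered = sorted(self.gaps, key=lambda g: g.total_models, reverse=True)
        return {
            # Assumption: dates are serialized as ISO 8601 strings.
            "generated_at": self.generated_at.isoformat(),
            "total_unsupported_architectures": len(ordered),
            "total_unsupported_models": sum(g.total_models for g in ordered),
            "gaps": [
                {"architecture_id": g.architecture_id, "total_models": g.total_models}
                for g in ordered
            ],
        }


# Made-up architecture names and counts.
report = GapsReportSketch(
    generated_at=date(2024, 1, 15),
    gaps=[GapRow("ArchSmall", 3), GapRow("ArchLarge", 250)],
)
payload = report.to_dict()
encoded = json.dumps(payload)  # succeeds because every value is JSON-native
```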

class transformer_lens.tools.model_registry.schemas.ArchitectureStats(architecture_id: str, is_supported: bool, model_count: int, verified_count: int = 0, example_models: list[str] = <factory>)

Bases: object

Statistics about an architecture including supported and gap info.

architecture_id

The architecture identifier

Type: str

is_supported

Whether TransformerLens supports this architecture

Type: bool

model_count

Number of models using this architecture

Type: int

verified_count

Number of verified models (if supported)

Type: int

example_models

Sample model IDs for this architecture

Type: list[str]

architecture_id: str

example_models: list[str]

is_supported: bool

model_count: int

to_dict() → dict: Convert to a JSON-serializable dictionary.

verified_count: int = 0

class transformer_lens.tools.model_registry.schemas.ModelEntry(architecture_id: str, model_id: str, status: int = 0, verified_date: date | None = None, metadata: ModelMetadata | None = None, note: str | None = None, phase1_score: float | None = None, phase2_score: float | None = None, phase3_score: float | None = None)

Bases: object

A single model entry in the supported models list.

architecture_id

The architecture type (e.g., “GPT2LMHeadModel”)

Type: str

model_id

The HuggingFace model ID (e.g., “gpt2”, “openai-community/gpt2”)

Type: str

status

Verification status (0=unverified, 1=verified, 2=skipped, 3=failed)

Type: int

verified_date

Date when verification was performed

Type: datetime.date | None

metadata

Optional metadata from HuggingFace

Type: transformer_lens.tools.model_registry.schemas.ModelMetadata | None

note

Optional note (skip/fail reason, e.g. “Estimated 48 GB exceeds 16 GB limit”)

Type: str | None

phase1_score

Benchmark Phase 1 score (HuggingFace vs Bridge), 0-100 or None

Type: float | None

phase2_score

Benchmark Phase 2 score (Bridge vs HookedTransformer (HT), unprocessed), 0-100 or None

Type: float | None

phase3_score

Benchmark Phase 3 score (Bridge vs HookedTransformer (HT), processed), 0-100 or None

Type: float | None

architecture_id: str

classmethod from_dict(data: dict) → ModelEntry: Create from a dictionary.

metadata: ModelMetadata | None = None

model_id: str

note: str | None = None

phase1_score: float | None = None

phase2_score: float | None = None

phase3_score: float | None = None

status: int = 0

to_dict() → dict: Convert to a JSON-serializable dictionary.

verified_date: date | None = None
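
The status field is a bare integer, so calling code may find it easier to read through named constants. The sketch below maps the four documented codes onto an IntEnum. The enum `VerificationStatus` and the helper `describe` are hypothetical names introduced here for illustration; nothing above indicates that the module exports them.

```python
from enum import IntEnum
from typing import Optional


class VerificationStatus(IntEnum):
    """Hypothetical names for the documented status codes."""

    UNVERIFIED = 0
    VERIFIED = 1
    SKIPPED = 2
    FAILED = 3


def describe(status: int, note: Optional[str] = None) -> str:
    """Render a status code, plus its optional note, as readable text."""
    # VerificationStatus(status) raises ValueError for codes outside 0-3.
    label = VerificationStatus(status).name.lower()
    return f"{label}: {note}" if note else label


skipped = describe(2, "Estimated 48 GB exceeds 16 GB limit")
default = describe(0)
```

Because IntEnum members compare equal to plain integers, a value such as `VerificationStatus.VERIFIED` can be compared directly against a stored status of 1.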

class transformer_lens.tools.model_registry.schemas.ModelMetadata(downloads: int = 0, likes: int = 0, last_modified: datetime | None = None, tags: list[str] = <factory>, parameter_count: int | None = None)

Bases: object

Metadata for a model from HuggingFace.

downloads

Total download count for the model

Type: int

likes

Number of likes/stars on HuggingFace

Type: int

last_modified

When the model was last updated

Type: datetime.datetime | None

tags

List of tags associated with the model

Type: list[str]

parameter_count

Estimated number of parameters (if available)

Type: int | None

downloads: int = 0

classmethod from_dict(data: dict) → ModelMetadata: Create from a dictionary.

last_modified: datetime | None = None

likes: int = 0

parameter_count: int | None = None

tags: list[str]

to_dict() → dict: Convert to a JSON-serializable dictionary.
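
Since last_modified is a datetime.datetime, a JSON-serializable dictionary has to encode it somehow. The sketch below shows one plausible approach using the standard library's isoformat() and fromisoformat(). `MetadataSketch` is a hypothetical stand-in with the documented fields, and the ISO 8601 encoding is an assumption about the format, not something the text above confirms.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class MetadataSketch:
    """Hypothetical stand-in mirroring the documented ModelMetadata fields."""

    downloads: int = 0
    likes: int = 0
    last_modified: Optional[datetime] = None
    tags: list[str] = field(default_factory=list)
    parameter_count: Optional[int] = None

    def to_dict(self) -> dict:
        return {
            "downloads": self.downloads,
            "likes": self.likes,
            # Assumption: timestamps are stored as ISO 8601 strings.
            "last_modified": self.last_modified.isoformat() if self.last_modified else None,
            "tags": list(self.tags),
            "parameter_count": self.parameter_count,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "MetadataSketch":
        raw = data.get("last_modified")
        return cls(
            downloads=data.get("downloads", 0),
            likes=data.get("likes", 0),
            last_modified=datetime.fromisoformat(raw) if raw else None,
            tags=list(data.get("tags", [])),
            parameter_count=data.get("parameter_count"),
        )


meta = MetadataSketch(
    downloads=10,
    last_modified=datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc),
    tags=["text-generation"],
)
as_dict = meta.to_dict()
round_tripped = MetadataSketch.from_dict(as_dict)
```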

class transformer_lens.tools.model_registry.schemas.ScanInfo(total_scanned: int, task_filter: str, scan_duration_seconds: float | None = None)

Bases: object

Metadata about a scraping run.

total_scanned

Total number of models scanned in this run

Type: int

task_filter

HuggingFace task filter used (e.g., “text-generation”)

Type: str

scan_duration_seconds

How long the scan took in seconds (if available)

Type: float | None

classmethod from_dict(data: dict) → ScanInfo: Create from a dictionary.

scan_duration_seconds: float | None = None

task_filter: str

to_dict() → dict: Convert to a JSON-serializable dictionary.

total_scanned: int

class transformer_lens.tools.model_registry.schemas.SupportedModelsReport(generated_at: date, total_models: int, models: list[ModelEntry], scan_info: ScanInfo | None = None, total_architectures: int = 0, total_verified: int = 0)

Bases: object

Report containing all supported models.

generated_at

Date when this report was generated

Type: datetime.date

scan_info

Metadata about the scraping run

Type: transformer_lens.tools.model_registry.schemas.ScanInfo | None

total_architectures

Number of unique supported architectures

Type: int

total_models

Total number of supported models

Type: int

total_verified

Number of models that have been verified

Type: int

models

List of all model entries

Type: list[transformer_lens.tools.model_registry.schemas.ModelEntry]

classmethod from_dict(data: dict) → SupportedModelsReport: Create from a dictionary.

generated_at: date

models: list[ModelEntry]

scan_info: ScanInfo | None = None

to_dict() → dict: Convert to a JSON-serializable dictionary.

total_architectures: int = 0

total_models: int

total_verified: int = 0
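
The three summary counters can be cross-checked against the models list. The sketch below is one way to recompute them from plain dictionaries shaped like serialized model entries. `summarize` is a hypothetical helper, the second architecture and model ID are made up, and the assumption that the counters are derived exactly this way is mine rather than the documentation's.

```python
def summarize(models: list[dict]) -> dict:
    """Recompute the report counters from a list of entry dictionaries."""
    return {
        "total_models": len(models),
        # Unique architecture identifiers across all entries.
        "total_architectures": len({m["architecture_id"] for m in models}),
        # Status 1 is documented as "verified"; a missing status counts as 0.
        "total_verified": sum(1 for m in models if m.get("status", 0) == 1),
    }


entries = [
    {"architecture_id": "GPT2LMHeadModel", "model_id": "gpt2", "status": 1},
    {"architecture_id": "GPT2LMHeadModel", "model_id": "openai-community/gpt2"},
    # Made-up architecture and model ID for illustration.
    {"architecture_id": "ExampleForCausalLM", "model_id": "example-org/example-model", "status": 2},
]
counts = summarize(entries)
```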