transformer_lens.benchmarks.component_outputs module¶
Comprehensive component benchmarking utility for TransformerBridge.
This module provides utilities to benchmark all standard components in a TransformerBridge model against their HuggingFace equivalents, ensuring output parity.
- class transformer_lens.benchmarks.component_outputs.BenchmarkReport(model_name: str, total_components: int, passed_components: int, failed_components: int, component_results: ~typing.List[~transformer_lens.benchmarks.component_outputs.ComponentTestResult] = <factory>)¶
Bases: object
Complete benchmark report for all components.
- component_results: List[ComponentTestResult]¶
- failed_components: int¶
- get_component_type_summary() Dict[str, Dict[str, int]]¶
Get a summary of results grouped by component type.
- Returns:
Dictionary mapping component types to their pass/fail counts
- get_failure_by_severity() Dict[str, List[ComponentTestResult]]¶
Group failures by severity level.
- Returns:
Dictionary mapping severity levels to lists of failed components
- model_name: str¶
- property pass_rate: float¶
Calculate the pass rate as a percentage.
- passed_components: int¶
- print_detailed_analysis() None¶
Print detailed analysis of benchmark results.
- print_summary(verbose: bool = False) None¶
Print a summary of the benchmark results.
- Parameters:
verbose – If True, print details for all components. If False, only print failures.
- total_components: int¶
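The report's aggregation helpers can be approximated with a short standard-library sketch. The class and field names below mirror the documented dataclass fields, but the bodies are inferred from the documented return types, not taken from the actual source:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Minimal stand-in for ComponentTestResult with only the fields the
# summary logic needs; the real class carries diff statistics as well.
@dataclass
class ComponentTestResult:
    component_path: str
    component_type: str
    passed: bool

@dataclass
class BenchmarkReport:
    model_name: str
    total_components: int
    passed_components: int
    failed_components: int
    component_results: List[ComponentTestResult] = field(default_factory=list)

    @property
    def pass_rate(self) -> float:
        # Pass rate as a percentage of all tested components.
        if self.total_components == 0:
            return 0.0
        return 100.0 * self.passed_components / self.total_components

    def get_component_type_summary(self) -> Dict[str, Dict[str, int]]:
        # Group pass/fail counts by component type.
        summary: Dict[str, Dict[str, int]] = {}
        for r in self.component_results:
            counts = summary.setdefault(r.component_type, {"passed": 0, "failed": 0})
            counts["passed" if r.passed else "failed"] += 1
        return summary

results = [
    ComponentTestResult("blocks.0.attn", "attention", True),
    ComponentTestResult("blocks.0.mlp", "mlp", False),
]
report = BenchmarkReport("gpt2", 2, 1, 1, results)
print(report.pass_rate)                      # 50.0
print(report.get_component_type_summary())
```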
- class transformer_lens.benchmarks.component_outputs.ComponentBenchmarker(bridge_model: Module, hf_model: Module, adapter: ArchitectureAdapter, cfg: TransformerBridgeConfig, atol: float = 0.0001, rtol: float = 0.0001)¶
Bases: object
Benchmarking utility for testing TransformerBridge components against HuggingFace.
- __init__(bridge_model: Module, hf_model: Module, adapter: ArchitectureAdapter, cfg: TransformerBridgeConfig, atol: float = 0.0001, rtol: float = 0.0001)¶
Initialize the component benchmarker.
- Parameters:
bridge_model – The TransformerBridge model
hf_model – The HuggingFace model
adapter – The architecture adapter for mapping components
cfg – The model configuration
atol – Absolute tolerance for comparing outputs
rtol – Relative tolerance for comparing outputs
- benchmark_all_components(test_inputs: Dict[str, Tensor] | None = None, skip_components: List[str] | None = None) BenchmarkReport¶
Benchmark all components in the model.
- Parameters:
test_inputs – Optional dictionary of pre-generated test inputs. If None, default inputs are generated.
skip_components – Optional list of component paths to skip
- Returns:
BenchmarkReport with results for all tested components
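The atol/rtol parameters suggest an elementwise tolerance check in the style of `torch.allclose`. A dependency-free sketch of that convention, including how max/mean diffs might be collected (the actual comparison code is not shown in this reference, so treat this as an assumption):

```python
# Sketch of an atol/rtol tolerance check in the torch.allclose style:
# an element passes when |bridge - hf| <= atol + rtol * |hf|.
def outputs_match(bridge_out, hf_out, atol=1e-4, rtol=1e-4):
    diffs = [abs(b - h) for b, h in zip(bridge_out, hf_out)]
    max_diff = max(diffs) if diffs else 0.0
    mean_diff = sum(diffs) / len(diffs) if diffs else 0.0
    passed = all(d <= atol + rtol * abs(h) for d, h in zip(diffs, hf_out))
    return passed, max_diff, mean_diff

passed, max_diff, mean_diff = outputs_match([1.0, 2.00005], [1.0, 2.0])
print(passed)  # True: 5e-5 is within atol + rtol * 2.0 = 3e-4
```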
- class transformer_lens.benchmarks.component_outputs.ComponentTestResult(component_path: str, component_type: str, passed: bool, max_diff: float, mean_diff: float, output_shape: Tuple[int, ...], error_message: str | None = None, percentile_diffs: Dict[str, float] | None = None)¶
Bases: object
Result of testing a single component.
- component_path: str¶
- component_type: str¶
- error_message: str | None = None¶
- get_failure_severity() str¶
Categorize the severity of a failure.
- Returns:
Severity level: “critical”, “high”, “medium”, “low”, or “pass”
- Return type:
str
- max_diff: float¶
- mean_diff: float¶
- output_shape: Tuple[int, ...]¶
- passed: bool¶
- percentile_diffs: Dict[str, float] | None = None¶
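`get_failure_severity()` presumably buckets the result's diff statistics into the documented levels. A sketch with illustrative thresholds (the real cutoffs are not documented here and are an assumption):

```python
def failure_severity(passed: bool, max_diff: float) -> str:
    # Illustrative thresholds only; the actual cutoffs used by
    # ComponentTestResult.get_failure_severity are not documented here.
    if passed:
        return "pass"
    if max_diff > 1.0:
        return "critical"
    if max_diff > 1e-2:
        return "high"
    if max_diff > 1e-3:
        return "medium"
    return "low"

print(failure_severity(False, 0.5))  # high
```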
- transformer_lens.benchmarks.component_outputs.benchmark_model(model_name: str, device: str = 'cpu', atol: float = 0.0001, rtol: float = 0.0001, skip_components: List[str] | None = None, verbose: bool = False) BenchmarkReport¶
Benchmark all components in a model.
- Parameters:
model_name – Name of the HuggingFace model to benchmark
device – Device to run on
atol – Absolute tolerance for comparisons
rtol – Relative tolerance for comparisons
skip_components – Optional list of component paths to skip
verbose – If True, print detailed results for all components
- Returns:
BenchmarkReport with results for all components
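A hedged usage sketch for the top-level entry point. The model name `"gpt2"` is an illustrative assumption, and the call requires `transformer_lens` (plus its HuggingFace dependencies) to be installed, so the import is guarded:

```python
# Illustrative usage of benchmark_model; "gpt2" is an assumed model name.
try:
    from transformer_lens.benchmarks.component_outputs import benchmark_model
except ImportError:
    benchmark_model = None  # library not available in this environment

if benchmark_model is not None:
    report = benchmark_model("gpt2", device="cpu", atol=1e-4, rtol=1e-4)
    print(f"{report.passed_components}/{report.total_components} components "
          f"passed ({report.pass_rate:.1f}%)")
```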