transformer_lens.benchmarks.hook_registration module

Hook registration and behavior benchmarks for TransformerBridge.

transformer_lens.benchmarks.hook_registration.benchmark_critical_forward_hooks(bridge: TransformerBridge, test_text: str, reference_model: HookedTransformer | None = None, tolerance: float = 0.02) → BenchmarkResult

Benchmark critical forward hooks commonly used in interpretability research.

Parameters:
  • bridge – TransformerBridge model to test

  • test_text – Input text for testing

  • reference_model – Optional HookedTransformer reference model

  • tolerance – Tolerance for activation comparison

Returns:

BenchmarkResult with critical hook comparison details
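As a rough illustration of what a comparison under `tolerance` could look like, the sketch below checks two activations (flattened to plain lists) against an absolute-difference bound. The helper name and the list representation are illustrative assumptions, not the library's internals.

```python
def activations_match(bridge_act, reference_act, tolerance=0.02):
    """Hypothetical sketch: True when the two activation vectors agree
    elementwise within `tolerance` (maximum absolute difference)."""
    if len(bridge_act) != len(reference_act):
        return False
    max_diff = max(abs(a - b) for a, b in zip(bridge_act, reference_act))
    return max_diff <= tolerance

print(activations_match([0.10, 0.20], [0.11, 0.19]))  # True: drift of 0.01 is within 0.02
print(activations_match([0.10, 0.20], [0.20, 0.20]))  # False: drift of 0.10 exceeds 0.02
```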

transformer_lens.benchmarks.hook_registration.benchmark_forward_hooks(bridge: TransformerBridge, test_text: str, reference_model: HookedTransformer | None = None, tolerance: float = 0.5, prepend_bos: bool | None = None) → BenchmarkResult

Benchmark all forward hooks for activation matching.

Parameters:
  • bridge – TransformerBridge model to test

  • test_text – Input text for testing

  • reference_model – Optional HookedTransformer for comparison

  • tolerance – Tolerance for activation matching (fraction of mismatches allowed)

  • prepend_bos – Whether to prepend BOS token. If None, uses model default.

Returns:

BenchmarkResult with hook activation comparison details
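Note that `tolerance` here is a fraction of mismatches allowed across hooks, not a per-element bound. A minimal sketch of that pass/fail rule, with an assumed per-hook boolean list standing in for the real comparison results:

```python
def within_mismatch_budget(comparisons, tolerance=0.5):
    """Hypothetical sketch: `comparisons` holds one boolean per hook
    (True = activations matched). Pass when the fraction of mismatching
    hooks stays within the `tolerance` budget."""
    if not comparisons:
        return True
    mismatch_fraction = comparisons.count(False) / len(comparisons)
    return mismatch_fraction <= tolerance

# 1 mismatch out of 4 hooks is a 25% mismatch rate, within the 50% default
print(within_mismatch_budget([True, True, False, True]))  # True
print(within_mismatch_budget([False, False, False, True]))  # False: 75% > 50%
```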

transformer_lens.benchmarks.hook_registration.benchmark_gated_hooks_fire(bridge: TransformerBridge, test_text: str = 'The quick brown fox', prepend_bos: bool | None = None) → BenchmarkResult

Verify each cfg-gated attention hook fires when its flag is enabled.

Hooks such as hook_result, hook_q_input, and hook_attn_in exist unconditionally on the attention bridge but are only populated when the corresponding config flag is set (keeping the default-path cost at zero). This benchmark toggles each flag in turn, runs a short forward pass, and asserts that at least one layer's matching hook actually captured an activation.

use_attn_in and use_split_qkv_input are mutually exclusive, so each flag is tested in its own forward pass. Plain AttentionBridge (non-PEA/JPEA) adapters raise NotImplementedError from the flag's setter; this is recorded as skipped rather than failed, since the applicability gate is intentional.

Parameters:
  • bridge – TransformerBridge model to test

  • test_text – Input text for testing

  • prepend_bos – Whether to prepend BOS token. If None, uses model default.

Returns:

BenchmarkResult recording, for each flag, whether its hooks fired or were skipped
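The toggle-and-record control flow described above can be sketched with a toy config object; the class, method names, and flag-rejection rule below are illustrative assumptions, not TransformerBridge internals.

```python
class ToyAttentionConfig:
    """Hypothetical stand-in for cfg flags; a 'plain' adapter rejects the
    split-input flags from its setter, mimicking the NotImplementedError path."""
    def __init__(self, supports_split_input=True):
        self._supports_split_input = supports_split_input

    def set_flag(self, name, value):
        if name in ("use_attn_in", "use_split_qkv_input") and not self._supports_split_input:
            raise NotImplementedError(f"{name} is unsupported on this adapter")
        setattr(self, name, value)

def check_gated_flags(cfg, flags):
    """Toggle each flag in its own pass; NotImplementedError means 'skipped'."""
    results = {}
    for flag in flags:
        try:
            cfg.set_flag(flag, True)
            results[flag] = "fired"    # a real benchmark would run a forward pass
            cfg.set_flag(flag, False)  # and assert the hook captured an activation
        except NotImplementedError:
            results[flag] = "skipped"  # applicability gate is intentional, not a failure
    return results

print(check_gated_flags(ToyAttentionConfig(supports_split_input=False),
                        ["use_attn_result", "use_attn_in", "use_split_qkv_input"]))
```

The key design point mirrored here is that an intentional applicability gate (the setter raising NotImplementedError) is distinguished from a genuine failure.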

transformer_lens.benchmarks.hook_registration.benchmark_hook_functionality(bridge: TransformerBridge, test_text: str, reference_model: HookedTransformer | None = None, atol: float = 0.002) → BenchmarkResult

Benchmark hook system functionality through ablation effects.

Parameters:
  • bridge – TransformerBridge model to test

  • test_text – Input text for testing

  • reference_model – Optional HookedTransformer reference model

  • atol – Absolute tolerance for effect comparison

Returns:

BenchmarkResult with hook functionality comparison details
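One way to read "ablation effects" here: apply the same hook-based ablation to both models, measure the resulting change in loss, and require the two changes to agree within `atol`. The sketch below assumes that reading; the helper and the scalar "effect" values are hypothetical.

```python
def ablation_effects_agree(bridge_effect, reference_effect, atol=0.002):
    """Hypothetical sketch: compare the loss change caused by the same
    ablation on the bridge and on the reference model, within `atol`."""
    return abs(bridge_effect - reference_effect) <= atol

bridge_effect = 0.5170     # assumed loss increase after ablating via a bridge hook
reference_effect = 0.5185  # assumed loss increase for the same ablation on the reference
print(ablation_effects_agree(bridge_effect, reference_effect))  # True: |diff| = 0.0015 <= 0.002
```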

transformer_lens.benchmarks.hook_registration.benchmark_hook_registry(bridge: TransformerBridge, reference_model: HookedTransformer | None = None) → BenchmarkResult

Benchmark hook registry completeness.

Parameters:
  • bridge – TransformerBridge model to test

  • reference_model – Optional HookedTransformer reference model

Returns:

BenchmarkResult with registry comparison details
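A registry-completeness check reduces to a set difference over hook names: which hooks exist on the reference model but are missing from the bridge. A minimal sketch, with illustrative hook names in the HookedTransformer naming style:

```python
def missing_hooks(bridge_hooks, reference_hooks):
    """Hypothetical sketch: hook names present on the reference model
    but absent from the bridge's registry, sorted for stable reporting."""
    return sorted(set(reference_hooks) - set(bridge_hooks))

reference = {"blocks.0.attn.hook_q", "blocks.0.attn.hook_k", "blocks.0.hook_resid_post"}
bridge = {"blocks.0.attn.hook_q", "blocks.0.hook_resid_post"}
print(missing_hooks(bridge, reference))  # ['blocks.0.attn.hook_k']
```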