transformer_lens.benchmarks.generation module

Generation and KV cache benchmarks for TransformerBridge.

transformer_lens.benchmarks.generation.benchmark_generation(bridge: TransformerBridge, test_text: str, max_new_tokens: int = 10, reference_model: HookedTransformer | None = None) → BenchmarkResult

Benchmark basic text generation.

Parameters:
  • bridge – TransformerBridge model to test

  • test_text – Input text for generation

  • max_new_tokens – Number of tokens to generate

  • reference_model – Optional HookedTransformer reference model (not used)

Returns:

BenchmarkResult with generation details
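The call pattern documented above can be sketched as a thin wrapper. This is only an illustration: `run_generation_benchmark` is a hypothetical helper, not part of the library, and the benchmark function is passed in as an argument so that the sketch runs without loading a model. The default prompt is likewise an arbitrary placeholder.

```python
from typing import Any, Callable, Optional


def run_generation_benchmark(
    benchmark_fn: Callable[..., Any],
    bridge: Any,
    test_text: str = "The quick brown fox",
    max_new_tokens: int = 10,
    reference_model: Optional[Any] = None,
) -> Any:
    """Call a generation benchmark using the documented argument order.

    `benchmark_fn` is assumed to be benchmark_generation from
    transformer_lens.benchmarks.generation, or any function sharing its
    signature. This wrapper is hypothetical and exists only to show the call.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    # The documentation marks reference_model as unused; it is forwarded
    # only so the call mirrors the full signature.
    return benchmark_fn(
        bridge,
        test_text,
        max_new_tokens=max_new_tokens,
        reference_model=reference_model,
    )
```

In real use, `benchmark_fn` would be imported from `transformer_lens.benchmarks.generation` and `bridge` would be a TransformerBridge instance constructed beforehand. The wrapper returns whatever the benchmark returns, which per the signature is a BenchmarkResult.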

transformer_lens.benchmarks.generation.benchmark_generation_with_kv_cache(bridge: TransformerBridge, test_text: str, max_new_tokens: int = 10, reference_model: HookedTransformer | None = None) → BenchmarkResult

Benchmark text generation with KV caching enabled.

This verifies that the KV cache is passed through the attention layers correctly during generation and that the cache update logic works as expected.

Parameters:
  • bridge – TransformerBridge model to test

  • test_text – Input text for generation

  • max_new_tokens – Number of tokens to generate

  • reference_model – Optional HookedTransformer reference model (not used)

Returns:

BenchmarkResult with generation details
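To make the property under test concrete, the following toy sketch compares causal attention computed with an incrementally updated key/value cache against a full recomputation at every step. It is not the library's implementation: it uses a single attention head in pure Python and assumes, for simplicity, that each token's query, key, and value all equal its embedding.

```python
import math
from typing import List

Vector = List[float]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Single-head scaled dot-product attention for one query position."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * v[i] for w, v in zip(weights, values)) / total
        for i in range(dim)
    ]


def full_recompute(tokens: List[Vector]) -> List[Vector]:
    """Causal attention that rebuilds the key/value lists at every step."""
    return [
        attend(tokens[t], tokens[: t + 1], tokens[: t + 1])
        for t in range(len(tokens))
    ]


def with_kv_cache(tokens: List[Vector]) -> List[Vector]:
    """Causal attention that appends one key/value pair per step."""
    cached_keys: List[Vector] = []
    cached_values: List[Vector] = []
    outputs: List[Vector] = []
    for token in tokens:
        # Toy assumption: key and value are the token embedding itself.
        cached_keys.append(token)
        cached_values.append(token)
        outputs.append(attend(token, cached_keys, cached_values))
    return outputs


tokens = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
cached = with_kv_cache(tokens)
recomputed = full_recompute(tokens)
# Largest element-wise difference between the two computations.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(cached, recomputed)
    for a, b in zip(row_a, row_b)
)
```

If the cache is updated correctly, `max_diff` should be zero up to floating-point error. A cache that dropped or duplicated an entry would make the two results diverge.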

transformer_lens.benchmarks.generation.benchmark_multiple_generation_calls(bridge: TransformerBridge, test_prompts: list, max_new_tokens: int = 5, reference_model: HookedTransformer | None = None) → BenchmarkResult

Benchmark multiple generation calls to verify that KV cache handling stays correct across repeated calls.

Parameters:
  • bridge – TransformerBridge model to test

  • test_prompts – List of input prompts for generation

  • max_new_tokens – Number of tokens to generate per prompt

  • reference_model – Optional HookedTransformer reference model (not used)

Returns:

BenchmarkResult with multiple generation details
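One way state can go wrong across repeated calls is a cache that survives from one call into the next. The sketch below illustrates that failure mode with a toy generator. It is an assumption about the kind of bug such a benchmark guards against, not a description of what the library function checks internally. `find_inconsistent_prompts` and `ToyGenerator` are hypothetical names, and the check assumes deterministic generation (for example greedy decoding).

```python
from typing import Callable, Dict, List, Sequence


def find_inconsistent_prompts(
    generate: Callable[[str], str],
    prompts: Sequence[str],
) -> List[str]:
    """Run all prompts in two passes and return those whose output changed.

    With deterministic generation, a changed output suggests that state was
    carried over between calls.
    """
    first_pass: Dict[str, str] = {p: generate(p) for p in prompts}
    return [p for p in prompts if generate(p) != first_pass[p]]


class ToyGenerator:
    """Toy stand-in for a model whose output depends on its cache size."""

    def __init__(self, reset_cache: bool) -> None:
        self.reset_cache = reset_cache
        self.cache: List[str] = []

    def __call__(self, prompt: str) -> str:
        if self.reset_cache:
            self.cache = []  # correct behaviour: start each call fresh
        self.cache.extend(prompt.split())
        return f"{prompt} [{len(self.cache)} cached]"


prompts = ["a b", "c d"]
leaky = find_inconsistent_prompts(ToyGenerator(reset_cache=False), prompts)
clean = find_inconsistent_prompts(ToyGenerator(reset_cache=True), prompts)
```

Here `leaky` lists both prompts, because the stale cache keeps growing between calls, while `clean` is empty.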