transformer_lens.benchmarks.generation module¶
Generation and KV cache benchmarks for TransformerBridge.
- transformer_lens.benchmarks.generation.benchmark_generation(bridge: TransformerBridge, test_text: str, max_new_tokens: int = 10, reference_model: HookedTransformer | None = None) → BenchmarkResult¶
Benchmark basic text generation.
- Parameters:
bridge – TransformerBridge model to test
test_text – Input text for generation
max_new_tokens – Number of tokens to generate
reference_model – Optional HookedTransformer reference model (not used)
- Returns:
BenchmarkResult with generation details
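As a rough illustration of the pattern a benchmark like this follows (run a generation call, sanity-check the output, wrap the outcome in a result object), here is a self-contained toy sketch. `ToyBenchmarkResult`, `run_generation_benchmark`, and `fake_generate` are hypothetical names invented for this example; the real `BenchmarkResult` fields may differ.

```python
from dataclasses import dataclass, field

# Hypothetical mirror of the benchmark pattern; not the real
# BenchmarkResult class from transformer_lens.benchmarks.
@dataclass
class ToyBenchmarkResult:
    name: str
    passed: bool
    details: dict = field(default_factory=dict)

def run_generation_benchmark(generate, test_text, max_new_tokens=10):
    """Call `generate` and wrap the outcome in a result object."""
    try:
        output = generate(test_text, max_new_tokens=max_new_tokens)
        # Basic sanity check: the output should extend the prompt.
        passed = output.startswith(test_text) and len(output) > len(test_text)
        return ToyBenchmarkResult("generation", passed, {"output": output})
    except Exception as exc:
        return ToyBenchmarkResult("generation", False, {"error": str(exc)})

# A stand-in generator that appends dummy tokens to the prompt.
def fake_generate(text, max_new_tokens=10):
    return text + " x" * max_new_tokens

result = run_generation_benchmark(fake_generate, "Hello", max_new_tokens=3)
```

The try/except wrapper matters in practice: a benchmark should report a failing result rather than crash the whole suite when one model call raises.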
- transformer_lens.benchmarks.generation.benchmark_generation_with_kv_cache(bridge: TransformerBridge, test_text: str, max_new_tokens: int = 10, reference_model: HookedTransformer | None = None) → BenchmarkResult¶
Benchmark text generation with KV caching enabled.
This ensures that the KV cache is properly passed through attention layers during generation, and that the cache update logic works correctly.
- Parameters:
bridge – TransformerBridge model to test
test_text – Input text for generation
max_new_tokens – Number of tokens to generate
reference_model – Optional HookedTransformer reference model (not used)
- Returns:
BenchmarkResult with generation details
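The invariant this benchmark exercises can be shown with a toy cache: during generation, the prefill pass populates one key/value entry per prompt token per layer, and each decode step appends exactly one more entry per layer, so cache length tracks the number of tokens processed. `ToyKVCache` and `generate_with_cache` are illustrative stand-ins, not the real TransformerLens cache class.

```python
# Toy sketch of KV-cache bookkeeping during generation.
class ToyKVCache:
    def __init__(self, n_layers):
        self.keys = [[] for _ in range(n_layers)]
        self.values = [[] for _ in range(n_layers)]

    def update(self, layer, k, v):
        self.keys[layer].append(k)
        self.values[layer].append(v)

    def length(self, layer=0):
        return len(self.keys[layer])

def generate_with_cache(prompt_tokens, max_new_tokens, n_layers=2):
    cache = ToyKVCache(n_layers)
    tokens = list(prompt_tokens)
    # Prefill: every prompt token populates the cache in one pass.
    for tok in prompt_tokens:
        for layer in range(n_layers):
            cache.update(layer, ("k", tok), ("v", tok))
    # Decode: each new token adds exactly one entry per layer,
    # so only the newest token needs attention computation.
    for step in range(max_new_tokens):
        new_tok = f"gen{step}"
        for layer in range(n_layers):
            cache.update(layer, ("k", new_tok), ("v", new_tok))
        tokens.append(new_tok)
    return tokens, cache

tokens, cache = generate_with_cache(["a", "b", "c"], max_new_tokens=4)
```

If the cache update logic were broken (an entry dropped or duplicated at some layer), the per-layer lengths would diverge from the token count, which is exactly the kind of mismatch a benchmark like this is meant to surface.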
- transformer_lens.benchmarks.generation.benchmark_multiple_generation_calls(bridge: TransformerBridge, test_prompts: list, max_new_tokens: int = 5, reference_model: HookedTransformer | None = None) → BenchmarkResult¶
Benchmark multiple generation calls to ensure KV cache handling is robust.
- Parameters:
bridge – TransformerBridge model to test
test_prompts – List of input prompts for generation
max_new_tokens – Number of tokens to generate per prompt
reference_model – Optional HookedTransformer reference model (not used)
- Returns:
BenchmarkResult with multiple generation details
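The robustness property at stake here is that state from one generation call must not leak into the next: each call should start from a fresh KV cache. A minimal sketch of that discipline, with invented names (`generate`, `run_multiple_calls`) and a plain list standing in for the cache:

```python
def generate(prompt, max_new_tokens, cache):
    """Append dummy tokens, recording one cache entry per token seen."""
    for tok in list(prompt) + [f"t{i}" for i in range(max_new_tokens)]:
        cache.append(tok)
    return prompt + " out" * max_new_tokens

def run_multiple_calls(prompts, max_new_tokens=5):
    outputs, cache_sizes = [], []
    for prompt in prompts:
        cache = []  # fresh cache per call: no leakage between prompts
        outputs.append(generate(prompt, max_new_tokens, cache))
        cache_sizes.append(len(cache))
    return outputs, cache_sizes

outputs, sizes = run_multiple_calls(["hi", "hello there"], max_new_tokens=2)
```

Reusing a stale cache across prompts of different lengths is a classic failure mode: the second call would attend over the first prompt's keys and values, producing subtly wrong output rather than an outright crash.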