transformer_lens.conversion_utils.conversion_steps.attention_auto_conversion module

Attention Auto Conversion

This module provides automatic conversion for attention hook inputs with revert capability. It handles bidirectional conversions for attention activation tensors flowing through hooks.

class transformer_lens.conversion_utils.conversion_steps.attention_auto_conversion.AttentionAutoConversion(config: Any)

Bases: BaseTensorConversion

Handles bidirectional conversions for attention hook inputs (activation tensors).

Converts tensors to match HookedTransformer format and can revert them back to their original format using stored state information.

__init__(config: Any)

Initialize the attention auto conversion.

Parameters:

config – Model configuration containing attention head information

clear_state(tensor_id: int | None = None) → None

Clear stored conversion state.

Parameters:

tensor_id – Specific tensor ID to clear, or None to clear all
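The two clearing modes can be illustrated with a toy state store. The `StateStore` class and its `_state` dict are hypothetical stand-ins for the conversion's internal bookkeeping; only the documented `clear_state` semantics (specific ID vs. clear all) are taken from the source.

```python
from typing import Any, Dict, Optional

class StateStore:
    """Hypothetical state store mirroring the documented clear_state contract."""

    def __init__(self) -> None:
        self._state: Dict[int, Dict[str, Any]] = {}

    def clear_state(self, tensor_id: Optional[int] = None) -> None:
        # None clears everything; a specific ID removes only that entry.
        if tensor_id is None:
            self._state.clear()
        else:
            self._state.pop(tensor_id, None)

store = StateStore()
store._state = {1: {"original_shape": (2, 768)}, 2: {"original_shape": (4, 768)}}
store.clear_state(1)                # only tensor 1's entry is removed
remaining = set(store._state)       # {2}
store.clear_state()                 # all remaining entries are removed
```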

get_conversion_info(tensor_id: int) → Dict[str, Any] | None

Get conversion information for a tensor.

Parameters:

tensor_id – ID of the tensor to get info for

Returns:

Dictionary with conversion information or None if not found
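A sketch of the documented lookup contract, assuming state is held in a dict keyed by tensor ID (the `_state` variable and the `original_shape` key are illustrative, not the library's actual internals):

```python
from typing import Any, Dict, Optional

# Hypothetical stored conversion state, keyed by tensor ID.
_state: Dict[int, Dict[str, Any]] = {7: {"original_shape": (2, 128, 768)}}

def get_conversion_info(tensor_id: int) -> Optional[Dict[str, Any]]:
    # Documented contract: a dict of conversion info if stored, else None.
    return _state.get(tensor_id)

info = get_conversion_info(7)     # {'original_shape': (2, 128, 768)}
missing = get_conversion_info(9)  # None
```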

handle_conversion(input_value: Any, *full_context) → Any

Convert tensor to HookedTransformer format and store revert state.

Parameters:
  • input_value – The tensor input (activation) flowing through the hook

  • *full_context – Additional context (not used)

Returns:

The tensor reshaped to match HookedTransformer expectations

revert_conversion(converted_value: Any, original_tensor_id: int | None = None) → Any

Revert tensor back to its original format using stored state.

Parameters:
  • converted_value – The tensor that was previously converted

  • original_tensor_id – ID of the original tensor (if available)

Returns:

The tensor reverted to its original format