transformer_lens.model_bridge.generalized_components.bloom_block module
BLOOM-specific block bridge component.
BLOOM blocks require special arguments (alibi, attention_mask, residual) that the standard BlockBridge does not handle. This custom component generates these arguments and passes them to the block.
- class transformer_lens.model_bridge.generalized_components.bloom_block.BloomBlockBridge(name: str, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = None, hook_alias_overrides: Dict[str, str] | None = None)
Bases: BlockBridge
Block bridge for BLOOM models that handles ALiBi positional encoding.
BLOOM uses ALiBi (Attention with Linear Biases) instead of standard positional embeddings. This requires generating an alibi tensor and passing it to each block along with the attention_mask.
- __init__(name: str, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = None, hook_alias_overrides: Dict[str, str] | None = None)
Initialize the BLOOM block bridge.
- Parameters:
name – The name of the component in the model
config – Model configuration (used to get n_heads for ALiBi)
submodules – Dictionary of submodules to register
hook_alias_overrides – Optional dictionary to override default hook aliases
- static build_alibi_tensor(attention_mask: Tensor, num_heads: int, dtype: dtype) → Tensor
Build ALiBi tensor for attention biasing.
Delegates to the shared ALiBi utility in alibi_utils.py.
- Parameters:
attention_mask – Attention mask of shape [batch_size, seq_length]
num_heads – Number of attention heads
dtype – Data type for the tensor
- Returns:
ALiBi tensor of shape [batch_size, num_heads, 1, seq_length].
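To make the shape and contents of the bias concrete, here is a minimal pure-Python sketch of a BLOOM-style ALiBi construction. It is not the code in alibi_utils.py: the function names alibi_slopes and build_alibi are hypothetical, nested lists stand in for tensors, and the dtype argument is omitted. It assumes the usual BLOOM formulation, in which each head gets a slope from a geometric sequence and each token's position is its index among the unmasked tokens.

```python
import math
from typing import List


def alibi_slopes(num_heads: int) -> List[float]:
    """Per-head slopes forming a geometric sequence (hypothetical helper)."""
    closest_pow2 = 2 ** math.floor(math.log2(num_heads))
    base = 2.0 ** (-8.0 / closest_pow2)
    slopes = [base ** i for i in range(1, closest_pow2 + 1)]
    if closest_pow2 != num_heads:
        # Head counts that are not a power of two take extra slopes from
        # the odd powers of a finer-grained geometric sequence.
        extra_base = 2.0 ** (-4.0 / closest_pow2)
        num_extra = min(closest_pow2, num_heads - closest_pow2)
        slopes += [extra_base ** i for i in range(1, 1 + 2 * num_extra, 2)]
    return slopes


def build_alibi(attention_mask: List[List[int]], num_heads: int):
    """Return nested lists shaped [batch_size][num_heads][1][seq_length]."""
    slopes = alibi_slopes(num_heads)
    alibi = []
    for mask_row in attention_mask:
        positions, seen = [], 0
        for m in mask_row:
            seen += m
            # Cumulative count of real tokens minus one; padded slots get 0.
            positions.append((seen - 1) * m)
        alibi.append([[[s * p for p in positions]] for s in slopes])
    return alibi


# Two sequences of length 3; the second is left-padded by one token.
bias = build_alibi([[1, 1, 1], [0, 1, 1]], num_heads=4)
```

Each head's row is the position vector scaled by that head's slope, so the bias grows linearly with position and at a different rate per head.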
- forward(*args: Any, **kwargs: Any) → Any
Forward pass through the BLOOM block.
BLOOM blocks require alibi and attention_mask arguments. When the block is called through the Hugging Face BloomModel.forward(), that method generates both and passes them through. If they are missing (e.g., when the block is called standalone), this method generates them.
- Parameters:
*args – Positional arguments (first should be hidden_states)
**kwargs – Keyword arguments
- Returns:
Output from the original BLOOM block
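The "generate only if missing" behaviour described for forward() can be sketched as follows. This is an illustration of the pattern, not the bridge's actual implementation: ensure_bloom_kwargs is a hypothetical helper, plain lists stand in for tensors, the ALiBi builder is passed in as a callable, and defaulting to an all-ones mask is an assumption about what a standalone call would want.

```python
from typing import Any, Callable, Dict, List

Mask = List[List[int]]


def ensure_bloom_kwargs(
    batch_size: int,
    seq_length: int,
    num_heads: int,
    kwargs: Dict[str, Any],
    build_alibi: Callable[[Mask, int], Any],
) -> Dict[str, Any]:
    """Return a copy of kwargs with attention_mask and alibi filled in if absent."""
    filled = dict(kwargs)
    if filled.get("attention_mask") is None:
        # Assumption: with no padding information, attend to every position.
        filled["attention_mask"] = [[1] * seq_length for _ in range(batch_size)]
    if filled.get("alibi") is None:
        # Build the bias from whichever mask is now present.
        filled["alibi"] = build_alibi(filled["attention_mask"], num_heads)
    return filled


# Record builder calls to show that supplied values are never recomputed.
calls = []


def fake_builder(mask: Mask, num_heads: int) -> str:
    calls.append((mask, num_heads))
    return "alibi-placeholder"


standalone = ensure_bloom_kwargs(2, 3, 4, {}, fake_builder)
passthrough = ensure_bloom_kwargs(
    1, 3, 4, {"attention_mask": [[1, 0, 1]], "alibi": "given"}, fake_builder
)
```

Copying kwargs before filling them keeps the caller's dictionary untouched, so values supplied by BloomModel.forward() pass through unchanged.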