transformer_lens.utilities.components_utils module

components_utils.

This module contains utility functions related to model components.

transformer_lens.utilities.components_utils.get_act_name(name: str, layer: int | str | None = None, layer_type: str | None = None)

Helper function to convert shorthand to an activation name. It is fairly hacky: it is intended for quickly hacking things together in a short feedback loop, more so than for writing good, readable code. But it is deterministic!

Returns a name corresponding to an activation point in a TransformerLens model.

Parameters:

name (str) – The name of the activation. This can be used to specify any activation name by itself. The code assumes the first sequence of digits passed to it (if any) is the layer number, and anything after that is the layer type. Given only a word and a number, it leaves layer_type as is. Given only a word, it leaves layer and layer_type as is.

Examples:

get_act_name('embed') == get_act_name('embed', None, None)
get_act_name('k6') == get_act_name('k', 6, None)
get_act_name('scale4ln1') == get_act_name('scale', 4, 'ln1')

layer (int, optional) – Takes in the layer number. Used for activations that appear in every block.
layer_type (string, optional) – Used to distinguish between activations that appear multiple times in one block.
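
The shorthand rule for name (a word, then an optional layer number, then an optional layer type) can be illustrated with a small standalone sketch. The helper split_shorthand and its regular expression are hypothetical and written only for illustration; they are not taken from the TransformerLens source, which may parse names differently.

```python
import re
from typing import Optional, Tuple


def split_shorthand(shorthand: str) -> Tuple[str, Optional[int], Optional[str]]:
    """Hypothetical helper: split e.g. 'scale4ln1' into ('scale', 4, 'ln1').

    Assumption: the name is lowercase letters/underscores, the first run of
    digits is the layer number, and anything after it is the layer type.
    """
    match = re.fullmatch(r"([a-z_]+)(\d+)(.*)", shorthand)
    if match is None:
        # Only a word was given: layer and layer_type stay unset.
        return shorthand, None, None
    word, digits, rest = match.groups()
    # An empty remainder means no layer type was given.
    return word, int(digits), rest or None


print(split_shorthand("embed"))
print(split_shorthand("k6"))
print(split_shorthand("scale4ln1"))
```

Note that the trailing digit in 'ln1' belongs to the layer type, because only the first sequence of digits is read as the layer number.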

Full Examples:

get_act_name('k', 6, 'a') == 'blocks.6.attn.hook_k'
get_act_name('pre', 2) == 'blocks.2.mlp.hook_pre'
get_act_name('embed') == 'hook_embed'
get_act_name('normalized', 27, 'ln2') == 'blocks.27.ln2.hook_normalized'
get_act_name('k6') == 'blocks.6.attn.hook_k'
get_act_name('scale4ln1') == 'blocks.4.ln1.hook_scale'
get_act_name('pre5') == 'blocks.5.mlp.hook_pre'
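
The pattern behind these results can be sketched as a toy function. Everything here is an assumption inferred only from the examples above: build_act_name, LAYER_TYPE_ALIASES, and DEFAULT_LAYER_TYPE are hypothetical names, the lookup tables contain only the entries those examples imply, and the real get_act_name recognises many more names and aliases.

```python
from typing import Optional

# Assumed tables, inferred from the documented examples only.
LAYER_TYPE_ALIASES = {"a": "attn"}                # 'a' expands to 'attn'
DEFAULT_LAYER_TYPE = {"k": "attn", "pre": "mlp"}  # used when layer_type is omitted


def build_act_name(
    name: str, layer: Optional[int] = None, layer_type: Optional[str] = None
) -> str:
    """Toy sketch of assembling a hook name; not the library implementation."""
    if layer is None:
        # Activations outside the blocks, such as the embedding.
        return f"hook_{name}"
    if layer_type is None:
        if name not in DEFAULT_LAYER_TYPE:
            raise ValueError(f"no default layer type known for {name!r}")
        layer_type = DEFAULT_LAYER_TYPE[name]
    # Expand a short alias if one is known, otherwise use the value as given.
    layer_type = LAYER_TYPE_ALIASES.get(layer_type, layer_type)
    return f"blocks.{layer}.{layer_type}.hook_{name}"


print(build_act_name("k", 6, "a"))
print(build_act_name("normalized", 27, "ln2"))
```

In this sketch, names without a layer map to a top-level hook, while names with a layer are placed under blocks.{layer}.{layer_type}, matching the form of the documented outputs.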