transformer_lens.pretrained.weight_conversions.apertus#

Apertus weight conversion.

Converts Apertus (Swiss AI) weights to HookedTransformer format. Apertus is structurally similar to Llama but uses a non-gated MLP with the xIELU activation, and different layer-norm names (attention_layernorm / feedforward_layernorm, rather than Llama's input_layernorm / post_attention_layernorm).

transformer_lens.pretrained.weight_conversions.apertus.convert_apertus_weights(apertus, cfg: HookedTransformerConfig)#
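A minimal sketch of the renaming step such a conversion performs, mapping Apertus's HF-style parameter names onto HookedTransformer names. The exact suffixes (e.g. `mlp.up_proj.weight`) and the helper `map_apertus_key` are assumptions for illustration; the real converter also reshapes attention weights into HookedTransformer's per-head layout, which is omitted here.

```python
def map_apertus_key(hf_key: str, layer: int) -> str:
    """Map one Apertus (HF) parameter name to a HookedTransformer name.

    Illustrative only: suffix names are assumed from the Llama convention
    plus Apertus's attention_layernorm / feedforward_layernorm names.
    """
    prefix = f"model.layers.{layer}."
    table = {
        # Apertus-specific layer-norm names
        "attention_layernorm.weight": f"blocks.{layer}.ln1.w",
        "feedforward_layernorm.weight": f"blocks.{layer}.ln2.w",
        # Non-gated MLP: a single input projection, no gate_proj
        "mlp.up_proj.weight": f"blocks.{layer}.mlp.W_in",
        "mlp.down_proj.weight": f"blocks.{layer}.mlp.W_out",
    }
    suffix = hf_key[len(prefix):]
    # Fall through unchanged for keys this sketch does not cover
    return table.get(suffix, hf_key)
```

The non-gated MLP is why only `W_in` / `W_out` appear: gated (SwiGLU-style) Llama models would additionally map a `gate_proj` weight to `W_gate`.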