transformer_lens.pretrained.weight_conversions.apertus#
Apertus weight conversion.
Converts Apertus (Swiss AI) weights to HookedTransformer format. Apertus is structurally similar to Llama but uses a non-gated MLP with the xIELU activation, and different layer-norm names (attention_layernorm / feedforward_layernorm).
- transformer_lens.pretrained.weight_conversions.apertus.convert_apertus_weights(apertus, cfg: HookedTransformerConfig)#
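To illustrate the naming difference the docstring mentions, here is a minimal hypothetical sketch of the layer-norm key renaming such a converter would perform. The HookedTransformer target names (`blocks.{l}.ln1.w` / `blocks.{l}.ln2.w`) follow TransformerLens conventions; the exact mapping inside `convert_apertus_weights` (which operates on tensors, not key strings) may differ.

```python
def apertus_key_to_hooked(key: str) -> str:
    """Illustrative mapping of Apertus (HF-style) state-dict keys to
    HookedTransformer names. An assumed sketch, not the real converter."""
    # Apertus layer-norm attribute names differ from Llama's
    # input_layernorm / post_attention_layernorm.
    key = key.replace("model.layers.", "blocks.")
    key = key.replace(".attention_layernorm.weight", ".ln1.w")
    key = key.replace(".feedforward_layernorm.weight", ".ln2.w")
    return key

print(apertus_key_to_hooked("model.layers.0.attention_layernorm.weight"))
# blocks.0.ln1.w
```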