transformer_lens.model_bridge.generalized_components.qwen3_5_vision_encoder module
Qwen3.5 vision-tower bridges (model.visual).
The Qwen vision tower differs from SigLIP/CLIP, so it needs its own bridge. The merger
(vision->text projector) is bridged separately as the adapter’s vision_projector, and
the parameter-free rotary_pos_emb is left as the native module.
- class transformer_lens.model_bridge.generalized_components.qwen3_5_vision_encoder.Qwen3_5VisionBlockBridge(name: str, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = None)
Bases: GeneralizedComponent
Bridge for a single Qwen3.5 vision block.
Norms are kept as black boxes (hooked, not recomputed): NormalizationBridge would recompute them with the wrong eps and break numerical parity with the native model.
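The sensitivity to eps can be illustrated with a minimal, library-free sketch. It is not the bridge's code: the layer_norm helper, the input values, and both eps values are assumptions chosen for illustration, and the actual norm type and eps used by the Qwen3.5 vision blocks are not stated here.

```python
import math


def layer_norm(x, eps):
    """Plain layer norm over a 1-D list, without learned scale or bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]


# Low-variance activations are where the eps term matters most.
x = [0.001, 0.002, 0.003, 0.004]

# Both eps values are hypothetical; they only stand in for
# "the native module's eps" and "the eps a generic recomputation assumes".
native = layer_norm(x, eps=1e-6)
recomputed = layer_norm(x, eps=1e-5)

max_abs_diff = max(abs(a - b) for a, b in zip(native, recomputed))
print(f"max abs diff between the two norms: {max_abs_diff:.4f}")
```

Hooking the native norm's input and output sidesteps this, because the values observed are the ones the model actually computed.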
- hook_aliases: Dict[str, str | List[str]] = {'hook_attn_in': 'attn.hook_in', 'hook_attn_out': 'attn.hook_out', 'hook_mlp_in': 'mlp.hook_in', 'hook_mlp_out': 'mlp.hook_out', 'hook_resid_post': 'hook_out', 'hook_resid_pre': 'hook_in'}
- is_list_item: bool = True
- real_components: Dict[str, tuple]
- training: bool
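As a rough sketch of how the alias table can be read, the snippet below expands an alias into a dotted hook path. The alias dictionary is copied from hook_aliases above; the resolve_alias helper, the rule of joining the component path and the target with a dot, and the example path "visual.blocks.0" are all hypothetical and may not match how the library resolves aliases internally.

```python
from typing import Dict, List, Union

# Copied from Qwen3_5VisionBlockBridge.hook_aliases.
BLOCK_HOOK_ALIASES: Dict[str, Union[str, List[str]]] = {
    "hook_attn_in": "attn.hook_in",
    "hook_attn_out": "attn.hook_out",
    "hook_mlp_in": "mlp.hook_in",
    "hook_mlp_out": "mlp.hook_out",
    "hook_resid_post": "hook_out",
    "hook_resid_pre": "hook_in",
}


def resolve_alias(component_path: str, alias: str) -> List[str]:
    """Return the dotted hook path(s) an alias points to.

    Hypothetical helper: joining with '.' is an assumption for illustration.
    """
    target = BLOCK_HOOK_ALIASES[alias]
    # An alias may map to one target or to a list of targets.
    targets = [target] if isinstance(target, str) else list(target)
    return [f"{component_path}.{t}" for t in targets]


# "visual.blocks.0" is a made-up component path for the first vision block.
print(resolve_alias("visual.blocks.0", "hook_resid_pre"))
print(resolve_alias("visual.blocks.0", "hook_attn_out"))
```

Read this way, hook_resid_pre and hook_resid_post name the block's own input and output hooks, while the attn and mlp aliases point into the corresponding submodules.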
- class transformer_lens.model_bridge.generalized_components.qwen3_5_vision_encoder.Qwen3_5VisionEncoderBridge(name: str, config: Any | None = None, submodules: Dict[str, GeneralizedComponent] | None = None)
Bases: GeneralizedComponent
Bridge for the Qwen3.5 vision tower (model.visual); merger is bridged separately.
- hook_aliases: Dict[str, str | List[str]] = {'hook_vision_embed': 'patch_embed.hook_out', 'hook_vision_out': 'hook_out'}
- real_components: Dict[str, tuple]
- training: bool