transformer_lens.utilities.devices#
Devices.
Utilities to get the correct device, and assist in distributing model layers across multiple devices.
- transformer_lens.utilities.devices.AvailableDeviceMemory#
This type is passed around between different CUDA memory operations. The first entry of each tuple will be the device index. The second entry will be how much memory is currently available.
alias of list[tuple[int, int]]
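Since AvailableDeviceMemory is a plain alias, values of this type can be constructed and inspected directly. A minimal illustration of its shape (the device indices and byte counts below are made up):

```python
# AvailableDeviceMemory is just list[tuple[int, int]]:
# each tuple is (device_index, memory_currently_available).
available: list[tuple[int, int]] = [
    (0, 8_000_000_000),  # device 0: ~8 GB free (illustrative numbers)
    (1, 2_500_000_000),  # device 1: ~2.5 GB free
]

# The entry with the largest second element is the device with the most room.
most_free_index = max(available, key=lambda entry: entry[1])[0]
```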
- transformer_lens.utilities.devices.calculate_available_device_cuda_memory(i: int) int #
Calculates how much memory is available at this moment for the device at the indicated index
- Parameters:
i (int) – The index of the CUDA device to check
- Returns:
How much memory is currently available
- Return type:
int
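A sketch of how such a per-device query can be built on `torch.cuda.mem_get_info`, which returns `(free_bytes, total_bytes)` for a device index. The function name and the injected `mem_get_info` parameter are hypothetical, used here so the sketch stays self-contained; in real use you would pass `torch.cuda.mem_get_info` itself:

```python
from typing import Callable


def available_cuda_memory(
    i: int,
    mem_get_info: Callable[[int], tuple[int, int]],
) -> int:
    """Return the bytes currently free on CUDA device ``i``.

    ``mem_get_info`` mirrors ``torch.cuda.mem_get_info``: given a device
    index, it returns ``(free_bytes, total_bytes)``.
    """
    free_bytes, _total_bytes = mem_get_info(i)
    return free_bytes


# Fake driver for demonstration: the device reports 4 GB free of 8 GB.
fake_info = lambda i: (4_000_000_000, 8_000_000_000)
```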
- transformer_lens.utilities.devices.determine_available_memory_for_available_devices(max_devices: int) list[tuple[int, int]] #
Gets all available CUDA devices with their current memory calculated
- Returns:
The list of all available devices with memory precalculated
- Return type:
AvailableDeviceMemory
- transformer_lens.utilities.devices.get_best_available_cuda_device(max_devices: Optional[int] = None) device #
Gets whichever CUDA device currently has the most available memory
- Raises:
EnvironmentError – Raised if no CUDA devices are available
- Returns:
The specific device that should be used
- Return type:
torch.device
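The selection logic described above can be sketched in a few lines: collect per-device memory in the AvailableDeviceMemory shape, take the maximum, and raise EnvironmentError when the list is empty. The function name below is hypothetical; this is an illustration of the behaviour, not the library's implementation:

```python
def pick_best_cuda_device(devices: list[tuple[int, int]]) -> int:
    """Pick the index of the device with the most free memory.

    ``devices`` follows the AvailableDeviceMemory shape:
    (device_index, bytes_available) tuples. Raises EnvironmentError when
    no devices are available, mirroring the documented behaviour.
    """
    if not devices:
        raise EnvironmentError("No available CUDA devices found")
    best_index, _best_free = max(devices, key=lambda entry: entry[1])
    return best_index
```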
- transformer_lens.utilities.devices.get_best_available_device(cfg: HookedTransformerConfig) device #
Gets the best available device to be used based on the passed in arguments
- Parameters:
cfg (HookedTransformerConfig) – The current model configuration, used to determine the target device
- Returns:
The best available device
- Return type:
torch.device
- transformer_lens.utilities.devices.get_device_for_block_index(index: int, cfg: HookedTransformerConfig, device: Optional[Union[device, str]] = None)#
Determine the device for a given layer index based on the model configuration.
This function assists in distributing model layers across multiple devices. The distribution is based on the configuration’s number of layers (cfg.n_layers) and devices (cfg.n_devices).
- Parameters:
index (int) – Model layer index.
cfg (HookedTransformerConfig) – Model and device configuration.
device (Optional[Union[torch.device, str]], optional) – Initial device used for determining the target device. If not provided, the function uses the device specified in the configuration (cfg.device).
- Returns:
The device for the specified layer index.
- Return type:
torch.device
- Deprecated:
This function does not account for several factors needed for multi-GPU support. Use get_best_available_device instead to properly run models on multiple devices. This function will be removed in 3.0.
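The static assignment this deprecated function performs can be pictured as splitting cfg.n_layers evenly across cfg.n_devices. The sketch below shows one plausible even-split mapping and is not necessarily the library's exact formula:

```python
import math


def device_index_for_layer(index: int, n_layers: int, n_devices: int) -> int:
    """Map a layer index to a device index by splitting layers evenly.

    Consecutive blocks of ceil(n_layers / n_devices) layers share a device.
    This illustrates the kind of static assignment get_device_for_block_index
    makes; the real function builds a torch.device from the resulting index.
    """
    layers_per_device = math.ceil(n_layers / n_devices)
    return index // layers_per_device
```

With 12 layers on 4 devices, layers 0-2 land on device 0, layers 3-5 on device 1, and so on.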
- transformer_lens.utilities.devices.move_to_and_update_config(model: Union[HookedTransformer, HookedEncoder, HookedEncoderDecoder], device_or_dtype: Union[device, str, dtype], print_details=True)#
Wrapper around model.to() that also updates model.cfg.
- transformer_lens.utilities.devices.sort_devices_based_on_available_memory(devices: list[tuple[int, int]]) list[tuple[int, int]] #
Sorts all available devices so that those with the most available memory come first
- Parameters:
devices (AvailableDeviceMemory) – All available devices with memory calculated
- Returns:
The same list of devices, sorted with the most available memory first
- Return type:
AvailableDeviceMemory
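The sort itself is a one-liner over the AvailableDeviceMemory shape. The function name below is hypothetical, illustrating the documented behaviour:

```python
def sort_by_available_memory(
    devices: list[tuple[int, int]],
) -> list[tuple[int, int]]:
    """Return the entries ordered by available memory, largest first.

    Each entry is (device_index, bytes_available); the input list
    is left unmodified.
    """
    return sorted(devices, key=lambda entry: entry[1], reverse=True)
```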