GPU training (Basic)

Audience: Users looking to save money and run large models faster using single or multiple GPUs.

What is a GPU?

A Graphics Processing Unit (GPU) is a specialized hardware accelerator designed to speed up the mathematical computations used in gaming and deep learning.

Train on 1 GPU

Make sure you’re running on a machine with at least one GPU. There’s no need to specify any NVIDIA flags as Lightning will do it for you.

from lightning.pytorch import Trainer
trainer = Trainer(accelerator="gpu", devices=1)

Train on multiple GPUs

To use multiple GPUs, pass either the number of devices or the indices of the GPUs to the Trainer's devices argument.

trainer = Trainer(accelerator="gpu", devices=4)
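The Trainer calls in these sections assume the requested GPUs are actually present. As a minimal sketch, a script could derive the arguments from the number of GPUs on the machine and fall back to the CPU when there are none. The helper name `trainer_device_kwargs` is hypothetical and not part of Lightning; the GPU count is passed in as a plain integer (for example the value of `torch.cuda.device_count()`) so the function has no dependencies.

```python
def trainer_device_kwargs(requested: int, available: int) -> dict:
    """Build accelerator/devices keyword arguments for a Trainer.

    Hypothetical helper for illustration, not part of Lightning.
    """
    if requested < 1:
        raise ValueError("requested must be at least 1")
    if available < 1:
        # No GPU present: run in a single CPU process instead.
        return {"accelerator": "cpu", "devices": 1}
    # Never ask for more GPUs than the machine has.
    return {"accelerator": "gpu", "devices": min(requested, available)}


print(trainer_device_kwargs(4, 2))  # {'accelerator': 'gpu', 'devices': 2}
print(trainer_device_kwargs(1, 0))  # {'accelerator': 'cpu', 'devices': 1}
```

The returned dictionary can be unpacked into the constructor, for example `Trainer(**trainer_device_kwargs(4, torch.cuda.device_count()))`.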

Choosing GPU devices

You can select the GPU devices using a range, a list of indices, or a string containing a comma-separated list of GPU IDs:

# DEFAULT (int) specifies how many GPUs to use per node
Trainer(accelerator="gpu", devices=k)

# Above is equivalent to
Trainer(accelerator="gpu", devices=list(range(k)))

# Specify which GPUs to use (don't use when running on a cluster)
Trainer(accelerator="gpu", devices=[0, 1])

# Equivalent using a string
Trainer(accelerator="gpu", devices="0, 1")

# To use all available GPUs, pass -1 or "-1";
# this is equivalent to list(range(torch.cuda.device_count()))
Trainer(accelerator="gpu", devices=-1)

The table below lists examples of possible input formats and how they are interpreted by Lightning.

devices   Type   Parsed          Meaning
3         int    [0, 1, 2]       first 3 GPUs
-1        int    [0, 1, 2, …]    all available GPUs
[0]       list   [0]             GPU 0
[1, 3]    list   [1, 3]          GPUs 1 and 3
"3"       str    [0, 1, 2]       first 3 GPUs
"1, 3"    str    [1, 3]          GPUs 1 and 3
"-1"      str    [0, 1, 2, …]    all available GPUs
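The rules in the table can be written out as a small function. This is only an illustrative sketch of the mapping shown above, not Lightning's own parser, which may differ in details such as validation and error messages. The name `parse_devices` and the `num_available` parameter are hypothetical.

```python
from typing import List, Union


def parse_devices(devices: Union[int, str, List[int]], num_available: int) -> List[int]:
    """Turn a `devices` value into a list of GPU indices, following the table.

    Illustrative only; not Lightning's implementation.
    """
    if isinstance(devices, str):
        text = devices.strip()
        if "," in text:
            # "1, 3" -> [1, 3]: a comma-separated string lists explicit indices.
            return [int(part) for part in text.split(",") if part.strip()]
        # "3" or "-1": a single number in a string behaves like the int.
        devices = int(text)
    if isinstance(devices, int):
        if devices == -1:
            # -1 selects every available GPU.
            return list(range(num_available))
        if devices < 0:
            raise ValueError(f"Unsupported device count: {devices}")
        # k selects the first k GPUs.
        return list(range(devices))
    # A list is taken as explicit GPU indices.
    return list(devices)


# Each row of the table, on a machine assumed to have 4 GPUs.
for value in (3, -1, [0], [1, 3], "3", "1, 3", "-1"):
    print(repr(value), "->", parse_devices(value, num_available=4))
```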

Find usable CUDA devices

If you want to run several experiments at the same time on one machine, for example for a hyperparameter sweep, you can use the following utility function to pick GPU indices that are currently accessible, so you don't have to change your code for every run.

from lightning.pytorch import Trainer
from lightning.pytorch.accelerators import find_usable_cuda_devices

# Find two GPUs on the system that are not already occupied
trainer = Trainer(accelerator="cuda", devices=find_usable_cuda_devices(2))

from lightning.lite.accelerators import find_usable_cuda_devices

# Works with LightningLite too
lite = LightningLite(accelerator="cuda", devices=find_usable_cuda_devices(2))

This is especially useful when GPUs are configured to be in “exclusive compute mode”, such that only one process at a time is allowed access to the device. This special mode is often enabled on server GPUs or systems shared among multiple users.
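The idea behind such a utility can be sketched as probing each device in turn and keeping the ones that respond. The version below is an illustration only and is not how `find_usable_cuda_devices` is implemented; `pick_usable_devices` and the `is_usable` probe are hypothetical names. With PyTorch, a probe might try to allocate a small tensor on the device and treat an error as "occupied", which is an assumption here. A set of busy indices stands in for real hardware.

```python
from typing import Callable, Iterable, List


def pick_usable_devices(
    num_devices: int,
    candidates: Iterable[int],
    is_usable: Callable[[int], bool],
) -> List[int]:
    """Return the first `num_devices` candidate indices that pass `is_usable`."""
    chosen: List[int] = []
    for index in candidates:
        if len(chosen) == num_devices:
            break  # Found enough devices; stop probing.
        if is_usable(index):
            chosen.append(index)
    if len(chosen) < num_devices:
        raise RuntimeError(
            f"Requested {num_devices} devices, found only {len(chosen)} usable"
        )
    return chosen


# Simulated machine with four GPUs where 0 and 2 are held by other processes.
busy = {0, 2}
print(pick_usable_devices(2, range(4), lambda i: i not in busy))  # [1, 3]
```

Passing the probe in as a function keeps the selection logic separate from the hardware check, so the same loop can be exercised without any GPU present.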