Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
[2.0.0] - 2023-03-15
[2.0.0] - Added
Added Fabric.all_reduce (#16459)
Added support for saving and loading DeepSpeed checkpoints through Fabric.save/load() (#16452)
Added support for automatically calling set_epoch on the dataloader.batch_sampler.sampler (#16841)
Added support for writing logs to remote file systems with the CSVLogger (#16880)
Added support for frozen dataclasses in the optimizer state (#16656)
Added lightning.fabric.is_wrapped to check whether a module, optimizer, or dataloader was already wrapped by Fabric (#16953)
[2.0.0] - Changed
Fabric now chooses accelerator="auto", strategy="auto", devices="auto" as defaults (#16842)
Checkpoint saving and loading redesign (#16434)
    Changed the method signature of Fabric.save and Fabric.load
    Changed the method signature of Strategy.save_checkpoint and Fabric.load_checkpoint
    Fabric.save accepts a state that can contain model and optimizer references
    Fabric.load can now load state in-place onto models and optimizers
    Fabric.load returns a dictionary of objects that weren’t loaded into the state
    Strategy.save_checkpoint and Fabric.load_checkpoint are now responsible for accessing the state of the model and optimizers
DataParallelStrategy.get_module_state_dict() and DDPStrategy.get_module_state_dict() now correctly extract the state dict without keys prefixed with ‘module’ (#16487)
“Native” suffix removal (#16490)
    strategy="fsdp_full_shard_offload" is now strategy="fsdp_cpu_offload"
    lightning.fabric.plugins.precision.native_amp is now lightning.fabric.plugins.precision.amp
Enabled all shorthand strategy names that can be supported in the CLI (#16485)
Renamed strategy='tpu_spawn' to strategy='xla' and strategy='tpu_spawn_debug' to strategy='xla_debug' (#16781)
Changed arguments for precision settings (from [64|32|16|bf16] to ["64-true"|"32-true"|"16-mixed"|"bf16-mixed"]) (#16767)
The selection Fabric(strategy="ddp_spawn", ...) no longer falls back to “ddp” when a cluster environment gets detected (#16780)
Renamed setup_dataloaders(replace_sampler=...) to setup_dataloaders(use_distributed_sampler=...) (#16829)
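The renames in this release can be summarized as lookup tables. The sketch below is a hypothetical migration helper, not part of Lightning: the function and table names are made up for illustration, and only the old and new values are taken from the entries above. It assumes the arguments are collected in a plain dictionary before being passed on.

```python
# Hypothetical helper for updating 1.x-style arguments to their 2.0.0 names.
# Only the old -> new values come from the changelog; everything else is illustrative.

PRECISION_RENAMES = {
    "64": "64-true",
    "32": "32-true",
    "16": "16-mixed",
    "bf16": "bf16-mixed",
}

STRATEGY_RENAMES = {
    "tpu_spawn": "xla",
    "tpu_spawn_debug": "xla_debug",
    "fsdp_full_shard_offload": "fsdp_cpu_offload",
}

SETUP_DATALOADERS_KWARG_RENAMES = {
    "replace_sampler": "use_distributed_sampler",
}


def migrate_fabric_kwargs(kwargs):
    """Return a copy of Fabric(...) keyword arguments with renamed values updated."""
    migrated = dict(kwargs)
    if "precision" in migrated:
        # Old values could be given as int (16) or str ("bf16"); compare as strings.
        old = str(migrated["precision"])
        migrated["precision"] = PRECISION_RENAMES.get(old, migrated["precision"])
    strategy = migrated.get("strategy")
    if isinstance(strategy, str):
        # Strategy objects are left untouched; only shorthand names are renamed.
        migrated["strategy"] = STRATEGY_RENAMES.get(strategy, strategy)
    return migrated


def migrate_setup_dataloaders_kwargs(kwargs):
    """Return a copy of setup_dataloaders(...) keyword arguments with renamed keys."""
    return {SETUP_DATALOADERS_KWARG_RENAMES.get(key, key): value for key, value in kwargs.items()}


old_kwargs = {"precision": 16, "strategy": "tpu_spawn"}
print(migrate_fabric_kwargs(old_kwargs))
```

Values that already use the new names pass through unchanged, so the helper can be applied repeatedly without side effects.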
[2.0.0] - Removed
[1.9.4] - 2023-03-01
[1.9.3] - 2023-02-21
[1.9.2] - 2023-02-15
[1.9.1] - 2023-02-10
[1.9.1] - Fixed
Fixed error handling for accelerator="mps" and ddp strategy pairing (#16455)
Fixed strict availability check for torch_xla requirement (#16476)
Fixed an issue where PL would wrap DataLoaders with XLA’s MpDeviceLoader more than once (#16571)
Fixed the batch_sampler reference for DataLoaders wrapped with XLA’s MpDeviceLoader (#16571)
Fixed an import error when torch.distributed is not available (#16658)
[1.9.0] - 2023-01-17
[1.9.0] - Added
Added Fabric.launch() to programmatically launch processes (e.g. in Jupyter notebook) (#14992)
Added the option to launch Fabric scripts from the CLI, without the need to wrap the code into the run method (#14992)
Added Fabric.setup_module() and Fabric.setup_optimizers() to support strategies that need to set up the model before an optimizer can be created (#15185)
Added support for Fully Sharded Data Parallel (FSDP) training in Lightning Lite (#14967)
Added lightning.fabric.accelerators.find_usable_cuda_devices utility function (#16147)
Added basic support for LightningModules (#16048)
Added support for managing callbacks via Fabric(callbacks=...) and emitting events through Fabric.call() (#16074)
Added Logger support (#16121)
    Added Fabric(loggers=...) to support different Logger frameworks in Fabric
    Added Fabric.log for logging scalars using multiple loggers
    Added Fabric.log_dict for logging a dictionary of multiple metrics at once
    Added Fabric.loggers and Fabric.logger attributes to access the individual logger instances
    Added support for calling self.log and self.log_dict in a LightningModule when using Fabric
    Added access to self.logger and self.loggers in a LightningModule when using Fabric
Added lightning.fabric.loggers.TensorBoardLogger (#16121)
Added lightning.fabric.loggers.CSVLogger (#16346)
Added support for a consistent .zero_grad(set_to_none=...) on the wrapped optimizer regardless of which strategy is used (#16275)
[1.9.0] - Changed
The Fabric.run() method is no longer abstract (#14992)
The XLAStrategy now inherits from ParallelStrategy instead of DDPSpawnStrategy (#15838)
Merged the implementation of DDPSpawnStrategy into DDPStrategy and removed DDPSpawnStrategy (#14952)
The dataloader wrapper returned from .setup_dataloaders() now calls .set_epoch() on the distributed sampler if one is used (#16101)
Renamed Strategy.reduce to Strategy.all_reduce in all strategies (#16370)
When using multiple devices, the strategy now defaults to “ddp” instead of “ddp_spawn” when none is set (#16388)
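To illustrate the set_epoch behavior listed above (#16101), here is a toy sketch in plain Python. It is not Fabric's implementation: ToyShufflingSampler and EpochAwareLoader are hypothetical stand-ins, and the sketch assumes that the sampler derives its shuffle seed from the epoch number, so the wrapper has to advance the epoch at the start of each pass for the order to be reseeded.

```python
import random


class ToyShufflingSampler:
    """Hypothetical stand-in for a distributed sampler with epoch-seeded shuffling."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        # The next iteration will shuffle with a seed derived from this epoch.
        self.epoch = epoch

    def __iter__(self):
        indices = list(range(self.num_samples))
        random.Random(self.seed + self.epoch).shuffle(indices)
        return iter(indices)


class EpochAwareLoader:
    """Hypothetical wrapper that calls set_epoch() whenever a new pass starts."""

    def __init__(self, sampler):
        self.sampler = sampler
        self._num_passes = 0

    def __iter__(self):
        # Only call set_epoch() if the sampler supports it.
        if hasattr(self.sampler, "set_epoch"):
            self.sampler.set_epoch(self._num_passes)
        self._num_passes += 1
        return iter(self.sampler)


loader = EpochAwareLoader(ToyShufflingSampler(num_samples=8))
epoch_orders = [list(loader) for _ in range(2)]
for epoch, order in enumerate(epoch_orders):
    print(f"epoch {epoch}: {order}")
```

Without the wrapper, the caller would need to call set_epoch() manually before every pass; otherwise each epoch would reuse the same seed and therefore the same order.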
[1.8.6] - 2022-12-21
minor cleaning
[1.8.5] - 2022-12-15
minor cleaning