Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
[2.0.4] - 2023-06-22
[2.0.4] - Fixed
[2.0.3] - 2023-06-07
[2.0.3] - Added
- Added support for `Callback` registration through entry points (#17756)
- Added Fabric internal hooks (#17759)
[2.0.3] - Changed
[2.0.3] - Fixed
[2.0.2] - 2023-04-24
[2.0.2] - Changed
- Enable precision autocast for `LightningModule` step methods in Fabric (#17439)
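As a hedged illustration of what this enables, the sketch below (not part of the release notes) runs a `LightningModule` step method through a module set up by Fabric so that it executes under the precision autocast context. The `LitModel` class, layer sizes, optimizer, and the `bf16-mixed` precision value are illustrative assumptions:

```python
import torch
import lightning as L


class LitModel(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        # Once the module has been set up by Fabric, this step method runs
        # inside the autocast context implied by the precision setting.
        return self.layer(batch).sum()


fabric = L.Fabric(accelerator="auto", devices=1, precision="bf16-mixed")
fabric.launch()

model = LitModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model, optimizer = fabric.setup(model, optimizer)

batch = fabric.to_device(torch.randn(4, 32))
loss = model.training_step(batch, 0)
fabric.backward(loss)
optimizer.step()
```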
[2.0.2] - Fixed
[2.0.1] - 2023-03-30
[2.0.1] - Changed
- Generalized `Optimizer` validation to accommodate both FSDP 1.x and 2.x (#16733)
[2.0.0] - 2023-03-15
[2.0.0] - Added
- Added `Fabric.all_reduce` (#16459)
- Added support for saving and loading DeepSpeed checkpoints through `Fabric.save/load()` (#16452)
- Added support for automatically calling `set_epoch` on the `dataloader.batch_sampler.sampler` (#16841)
- Added support for writing logs to remote file systems with the `CSVLogger` (#16880)
- Added support for frozen dataclasses in the optimizer state (#16656)
- Added `lightning.fabric.is_wrapped` to check whether a module, optimizer, or dataloader was already wrapped by Fabric (#16953)
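A minimal sketch (assuming Lightning 2.0 installed as `lightning`) tying together the `Fabric.all_reduce`, `Fabric.save`/`load`, and `is_wrapped` entries above; the model, optimizer, `"checkpoint.ckpt"` path, and `step` counter are illustrative assumptions:

```python
import torch
from lightning.fabric import Fabric, is_wrapped

fabric = Fabric(accelerator="auto", devices=1)
fabric.launch()

model = torch.nn.Linear(8, 2)
optimizer = torch.optim.Adam(model.parameters())
model, optimizer = fabric.setup(model, optimizer)
assert is_wrapped(model) and is_wrapped(optimizer)  # both were wrapped by Fabric

# Fabric.save takes a single state dict that may hold model/optimizer references.
state = {"model": model, "optimizer": optimizer, "step": 0}
fabric.save("checkpoint.ckpt", state)  # illustrative path

# Fabric.load restores the state in place and returns whatever it could not match.
remainder = fabric.load("checkpoint.ckpt", state)

# all_reduce a tensor across processes; with one process this is effectively a no-op.
mean_loss = fabric.all_reduce(torch.tensor(1.0), reduce_op="mean")
```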
[2.0.0] - Changed
- Fabric now chooses `accelerator="auto", strategy="auto", devices="auto"` as defaults (#16842)
- Checkpoint saving and loading redesign (#16434)
  - Changed the method signature of `Fabric.save` and `Fabric.load`
  - Changed the method signature of `Strategy.save_checkpoint` and `Fabric.load_checkpoint`
  - `Fabric.save` accepts a state that can contain model and optimizer references
  - `Fabric.load` can now load state in-place onto models and optimizers
  - `Fabric.load` returns a dictionary of objects that weren't loaded into the state
  - `Strategy.save_checkpoint` and `Fabric.load_checkpoint` are now responsible for accessing the state of the model and optimizers
- `DataParallelStrategy.get_module_state_dict()` and `DDPStrategy.get_module_state_dict()` now correctly extract the state dict without keys prefixed with 'module' (#16487)
- "Native" suffix removal (#16490)
  - `strategy="fsdp_full_shard_offload"` is now `strategy="fsdp_cpu_offload"`
  - `lightning.fabric.plugins.precision.native_amp` is now `lightning.fabric.plugins.precision.amp`
- Enabled all shorthand strategy names that can be supported in the CLI (#16485)
- Renamed `strategy='tpu_spawn'` to `strategy='xla'` and `strategy='tpu_spawn_debug'` to `strategy='xla_debug'` (#16781)
- Changed arguments for precision settings (from [64|32|16|bf16] to ["64-true"|"32-true"|"16-mixed"|"bf16-mixed"]) (#16767)
- The selection `Fabric(strategy="ddp_spawn", ...)` no longer falls back to "ddp" when a cluster environment gets detected (#16780)
- Renamed `setup_dataloaders(replace_sampler=...)` to `setup_dataloaders(use_distributed_sampler=...)` (#16829)
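The fragment below sketches the new argument conventions from this list; it is illustrative only, and the commented lines restate the renames rather than prescribe a configuration:

```python
from lightning.fabric import Fabric

# New defaults: equivalent to Fabric(accelerator="auto", strategy="auto", devices="auto").
fabric = Fabric()

# Precision is now a string such as "64-true", "32-true", "16-mixed", or "bf16-mixed".
fabric = Fabric(precision="bf16-mixed")

# Renamed shorthands (require the corresponding hardware), for example:
# fabric = Fabric(strategy="fsdp_cpu_offload")        # formerly "fsdp_full_shard_offload"
# fabric = Fabric(accelerator="tpu", strategy="xla")  # formerly "tpu_spawn"

# setup_dataloaders(replace_sampler=...) became setup_dataloaders(use_distributed_sampler=...):
# train_loader = fabric.setup_dataloaders(train_loader, use_distributed_sampler=True)
```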
[2.0.0] - Removed
[2.0.0] - Fixed
[1.9.4] - 2023-03-01
[1.9.4] - Added
- Added `Fabric(strategy="auto")` support (#16916)
[1.9.4] - Fixed
[1.9.3] - 2023-02-21
[1.9.3] - Fixed
[1.9.2] - 2023-02-15
[1.9.2] - Fixed
- Fixed an attribute error and improved input validation for invalid strategy types being passed to Trainer (#16693)
[1.9.1] - 2023-02-10
[1.9.1] - Fixed
- Fixed error handling for `accelerator="mps"` and `ddp` strategy pairing (#16455)
- Fixed strict availability check for `torch_xla` requirement (#16476)
- Fixed an issue where PL would wrap DataLoaders with XLA's MpDeviceLoader more than once (#16571)
- Fixed the `batch_sampler` reference for DataLoaders wrapped with XLA's MpDeviceLoader (#16571)
- Fixed an import error when `torch.distributed` is not available (#16658)
[1.9.0] - 2023-01-17
[1.9.0] - Added
- Added `Fabric.launch()` to programmatically launch processes (e.g. in Jupyter notebook) (#14992)
- Added the option to launch Fabric scripts from the CLI, without the need to wrap the code into the `run` method (#14992)
- Added `Fabric.setup_module()` and `Fabric.setup_optimizers()` to support strategies that need to set up the model before an optimizer can be created (#15185)
- Added support for Fully Sharded Data Parallel (FSDP) training in Lightning Lite (#14967)
- Added `lightning.fabric.accelerators.find_usable_cuda_devices` utility function (#16147)
- Added basic support for LightningModules (#16048)
- Added support for managing callbacks via `Fabric(callbacks=...)` and emitting events through `Fabric.call()` (#16074) (see the sketch after this list)
- Added Logger support (#16121)
  - Added `Fabric(loggers=...)` to support different Logger frameworks in Fabric
  - Added `Fabric.log` for logging scalars using multiple loggers
  - Added `Fabric.log_dict` for logging a dictionary of multiple metrics at once
  - Added `Fabric.loggers` and `Fabric.logger` attributes to access the individual logger instances
  - Added support for calling `self.log` and `self.log_dict` in a LightningModule when using Fabric
  - Added access to `self.logger` and `self.loggers` in a LightningModule when using Fabric
- Added `lightning.fabric.loggers.TensorBoardLogger` (#16121)
- Added `lightning.fabric.loggers.CSVLogger` (#16346)
- Added support for a consistent `.zero_grad(set_to_none=...)` on the wrapped optimizer regardless of which strategy is used (#16275)
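A hedged sketch of the callback and logger additions, referenced from the callbacks entry above; the `PrintCallback` class, hook name, log directory, and metric names are illustrative, and `TensorBoardLogger` assumes the `tensorboard` package is installed:

```python
from lightning.fabric import Fabric
from lightning.fabric.loggers import CSVLogger, TensorBoardLogger


class PrintCallback:
    # Any object can act as a callback; Fabric.call dispatches by method name.
    def on_train_batch_end(self, loss):
        print(f"loss: {loss:.4f}")


fabric = Fabric(
    callbacks=[PrintCallback()],
    loggers=[TensorBoardLogger(root_dir="logs"), CSVLogger(root_dir="logs")],
)
fabric.launch()

# Emit an event: every callback that defines `on_train_batch_end` gets called.
fabric.call("on_train_batch_end", loss=0.25)

# Log a scalar (or a dict of scalars) to all configured loggers at once.
fabric.log("loss", 0.25)
fabric.log_dict({"loss": 0.25, "lr": 1e-3})

# The CUDA helper mentioned above can pick free devices, e.g.:
# from lightning.fabric.accelerators import find_usable_cuda_devices
# fabric = Fabric(devices=find_usable_cuda_devices(2))
```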
[1.9.0] - Changed
- The `Fabric.run()` method is no longer abstract (#14992)
- The `XLAStrategy` now inherits from `ParallelStrategy` instead of `DDPSpawnStrategy` (#15838)
- Merged the implementation of `DDPSpawnStrategy` into `DDPStrategy` and removed `DDPSpawnStrategy` (#14952)
- The dataloader wrapper returned from `.setup_dataloaders()` now calls `.set_epoch()` on the distributed sampler if one is used (#16101)
- Renamed `Strategy.reduce` to `Strategy.all_reduce` in all strategies (#16370)
- When using multiple devices, the strategy now defaults to "ddp" instead of "ddp_spawn" when none is set (#16388)
[1.9.0] - Removed
- Removed support for FairScale's sharded training (`strategy='ddp_sharded'|'ddp_sharded_spawn'`). Use Fully-Sharded Data Parallel instead (`strategy='fsdp'`) (#16329)
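For migration, a minimal sketch assuming a machine with several CUDA devices (the accelerator and device count are illustrative):

```python
from lightning.fabric import Fabric

# FairScale's "ddp_sharded"/"ddp_sharded_spawn" strategies are removed;
# native Fully Sharded Data Parallel takes their place.
fabric = Fabric(accelerator="cuda", devices=4, strategy="fsdp")
fabric.launch()
```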
[1.9.0] - Fixed
[1.8.6] - 2022-12-21
- Minor cleaning
[1.8.5] - 2022-12-15
- Minor cleaning
[1.8.4] - 2022-12-08
[1.8.4] - Fixed
- Fixed `shuffle=False` having no effect when using DDP/DistributedSampler (#15931)
[1.8.3] - 2022-11-22
[1.8.3] - Changed
- Temporarily removed support for Hydra multi-run (#15737)
[1.8.2] - 2022-11-17
[1.8.2] - Fixed
- Fixed the automatic fallback from `LightningLite(strategy="ddp_spawn", ...)` to `LightningLite(strategy="ddp", ...)` when on an LSF cluster (#15103)