Finetuning Scheduler¶
Author: Dan Dale
License: CC BY-SA
Generated: 2022-06-13T17:31:27.948986
This notebook introduces the Finetuning Scheduler extension and demonstrates its use to finetune a small foundational model on the RTE task of SuperGLUE, with iterative early-stopping defined according to a user-specified schedule. It uses Hugging Face's datasets and transformers libraries to retrieve the relevant benchmark data and foundational model weights. The required dependencies are installed via the finetuning-scheduler [examples] extra.
Give us a ⭐ on Github | Check out the documentation | Join us on Slack
Setup¶
This notebook requires some packages besides pytorch-lightning.
! pip install --quiet "ipython[notebook]" "torch>=1.8" "setuptools==59.5.0" "pytorch-lightning>=1.4" "hydra-core>=1.1.0" "finetuning-scheduler[examples]" "torchmetrics>=0.7"
Scheduled Finetuning with the Finetuning Scheduler Extension¶
The Finetuning Scheduler extension accelerates and enhances model experimentation with flexible finetuning schedules.
Training with the extension is simple and confers a host of benefits:
- dramatically increases finetuning flexibility
- expedites and facilitates exploration of model tuning dynamics
- enables marginal performance improvements of finetuned models
Setup is straightforward: just install from PyPI! Since this notebook-based example requires a few additional packages (e.g. transformers, sentencepiece), we installed the finetuning-scheduler package with the [examples] extra above. Once the finetuning-scheduler package is installed, the FinetuningScheduler callback is available for use with PyTorch Lightning. For additional installation options, please see the Finetuning Scheduler README.
Fundamentally, Finetuning Scheduler enables scheduled, multi-phase finetuning of foundational models. Gradual unfreezing (i.e. thawing) can help maximize foundational model knowledge retention while allowing (typically upper layers of) the model to optimally adapt to new tasks during transfer learning [1, 2, 3].
The FinetuningScheduler callback orchestrates the gradual unfreezing of models via a finetuning schedule that is either implicitly generated (the default) or explicitly provided by the user (more computationally efficient). Finetuning phase transitions are driven by FTSEarlyStopping criteria (a multi-phase extension of EarlyStopping packaged with FinetuningScheduler), by user-specified epoch transitions, or by a composition of the two (the default mode). A FinetuningScheduler training session completes when the final phase of the schedule has its stopping criteria met. See the early stopping documentation for more details on that callback's configuration.
Basic Usage¶
If no finetuning schedule is provided by the user, FinetuningScheduler will generate a default schedule and proceed to finetune according to the generated schedule, using default FTSEarlyStopping and FTSCheckpoint callbacks with monitor=val_loss.
from pytorch_lightning import Trainer
from finetuning_scheduler import FinetuningScheduler
trainer = Trainer(callbacks=[FinetuningScheduler()])
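The implicit defaults above are roughly equivalent to composing the callbacks yourself, which is convenient when you want to adjust the early-stopping or checkpointing behavior. A minimal sketch follows; the monitor, patience, and save_top_k values are illustrative choices rather than the callback defaults.
from pytorch_lightning import Trainer
from finetuning_scheduler import FinetuningScheduler, FTSCheckpoint, FTSEarlyStopping

# Passing FTSEarlyStopping/FTSCheckpoint explicitly allows their settings to be tuned
# (illustrative values below); otherwise FinetuningScheduler adds them automatically
# with monitor="val_loss".
trainer = Trainer(
    callbacks=[
        FinetuningScheduler(),
        FTSEarlyStopping(monitor="val_loss", min_delta=0.001, patience=2),
        FTSCheckpoint(monitor="val_loss", save_top_k=1),
    ]
)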
The Default Finetuning Schedule¶
Schedule definition is facilitated via the gen_ft_schedule method, which dumps a default finetuning schedule (by default using a naive, 2-parameters-per-level heuristic) that can be adjusted as desired by the user and/or subsequently passed to the callback. Using the default/implicitly generated schedule will likely be less computationally efficient than a user-defined finetuning schedule, but it is useful for exploring a model's finetuning behavior and can serve as a good baseline for subsequent explicit schedule refinement. While the current version of FinetuningScheduler only supports single optimizer and (optional) lr_scheduler configurations, per-phase maximum learning rates can be set as demonstrated in the next section.
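As a rough sketch, an implicitly generated schedule has the following shape, with each phase thawing roughly two parameters; the parameter names below are hypothetical placeholders for whatever your LightningModule actually registers.
import yaml

# Illustrative shape of an implicitly generated schedule: each integer key is a finetuning
# phase (executed in ascending order) and "params" lists the parameters that phase thaws.
# The parameter names here are hypothetical placeholders.
default_schedule_yaml = """\
0:
  params:
  - model.classifier.bias
  - model.classifier.weight
1:
  params:
  - model.pooler.dense.bias
  - model.pooler.dense.weight
"""
default_schedule = yaml.safe_load(default_schedule_yaml)
print(default_schedule[0]["params"])  # ['model.classifier.bias', 'model.classifier.weight']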
Specifying a Finetuning Schedule¶
To specify a finetuning schedule, it’s convenient to first generate the default schedule and then alter the thawed/unfrozen parameter groups associated with each finetuning phase as desired. Finetuning phases are zero-indexed and executed in ascending order.
First, generate the default schedule to Trainer.log_dir. It will be named after your LightningModule subclass with the suffix _ft_schedule.yaml.
from pytorch_lightning import Trainer
from finetuning_scheduler import FinetuningScheduler
trainer = Trainer(callbacks=[FinetuningScheduler(gen_ft_sched_only=True)])
Alter the schedule as desired.
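For example, a hand-edited schedule might consolidate parameters into fewer, broader phases and cap the maximum learning rate of later phases. The sketch below assumes the per-phase maximum learning rate is set via an lr key; the parameter names, learning rates, and destination path are illustrative only.
# Write an illustrative hand-edited schedule to disk. Parameter names and lr values are
# hypothetical; reuse the names emitted in your generated default schedule.
my_schedule = """\
0:
  params:
  - model.classifier.bias
  - model.classifier.weight
1:
  params:
  - model.pooler.dense.bias
  - model.pooler.dense.weight
  lr: 1.0e-05
2:
  params:
  - model.encoder.layer.11.output.dense.bias
  - model.encoder.layer.11.output.dense.weight
  lr: 1.0e-05
"""
with open("my_schedule.yaml", "w") as f:
    f.write(my_schedule)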
Once the finetuning schedule has been altered as desired, pass it to FinetuningScheduler to commence scheduled training:
from pytorch_lightning import Trainer
from finetuning_scheduler import FinetuningScheduler
trainer = Trainer(callbacks=[FinetuningScheduler(ft_schedule="/path/to/my/schedule/my_schedule.yaml")])
Early-Stopping and Epoch-Driven Phase Transition Criteria¶
By default, FTSEarlyStopping and epoch-driven transition criteria are composed. If a max_transition_epoch is specified for a given phase, the next finetuning phase will begin at that epoch unless FTSEarlyStopping criteria are met first. If FinetuningScheduler.epoch_transitions_only is True, FTSEarlyStopping will not be used and transitions will be exclusively epoch-driven.
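As a minimal sketch, an exclusively epoch-driven configuration pairs per-phase max_transition_epoch values in the schedule with epoch_transitions_only=True (the schedule path below is a placeholder):
from pytorch_lightning import Trainer
from finetuning_scheduler import FinetuningScheduler

# With epoch_transitions_only=True, FTSEarlyStopping is not used and each phase advances
# at the max_transition_epoch specified for it in the schedule.
trainer = Trainer(
    callbacks=[
        FinetuningScheduler(
            ft_schedule="/path/to/my/schedule/my_schedule.yaml",
            epoch_transitions_only=True,
        )
    ]
)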