{"cells": [{"cell_type": "markdown", "id": "b2dec494", "metadata": {"papermill": {"duration": 0.016555, "end_time": "2022-06-13T15:31:48.373430", "exception": false, "start_time": "2022-06-13T15:31:48.356875", "status": "completed"}, "tags": []}, "source": ["\n", "# Finetuning Scheduler\n", "\n", "* **Author:** [Dan Dale](https://github.com/speediedan)\n", "* **License:** CC BY-SA\n", "* **Generated:** 2022-06-13T17:31:27.948986\n", "\n", "This notebook introduces the [Finetuning Scheduler](https://finetuning-scheduler.readthedocs.io/en/stable/index.html) extension\n", "and demonstrates the use of it to finetune a small foundational model on the\n", "[RTE](https://huggingface.co/datasets/viewer/?dataset=super_glue&config=rte) task of\n", "[SuperGLUE](https://super.gluebenchmark.com/) with iterative early-stopping defined according to a user-specified\n", "schedule. It uses Hugging Face's ``datasets`` and ``transformers`` libraries to retrieve the relevant benchmark data\n", "and foundational model weights. The required dependencies are installed via the finetuning-scheduler ``[examples]`` extra.\n", "\n", "\n", "---\n", "Open in [{height=\"20px\" width=\"117px\"}](https://colab.research.google.com/github/PytorchLightning/lightning-tutorials/blob/publication/.notebooks/lightning_examples/finetuning-scheduler.ipynb)\n", "\n", "Give us a \u2b50 [on Github](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", "| Check out [the documentation](https://pytorch-lightning.readthedocs.io/en/stable/)\n", "| Join us [on Slack](https://www.pytorchlightning.ai/community)"]}, {"cell_type": "markdown", "id": "2f9d0607", "metadata": {"papermill": {"duration": 0.01088, "end_time": "2022-06-13T15:31:48.395609", "exception": false, "start_time": "2022-06-13T15:31:48.384729", "status": "completed"}, "tags": []}, "source": ["## Setup\n", "This notebook requires some packages besides pytorch-lightning."]}, {"cell_type": "code", "execution_count": 1, "id": "1e79a2d8", "metadata": {"colab": {}, "colab_type": "code", "execution": {"iopub.execute_input": "2022-06-13T15:31:48.419871Z", "iopub.status.busy": "2022-06-13T15:31:48.419025Z", "iopub.status.idle": "2022-06-13T15:31:52.693199Z", "shell.execute_reply": "2022-06-13T15:31:52.692233Z"}, "id": "LfrJLKPFyhsK", "lines_to_next_cell": 0, "papermill": {"duration": 4.288856, "end_time": "2022-06-13T15:31:52.695457", "exception": false, "start_time": "2022-06-13T15:31:48.406601", "status": "completed"}, "tags": []}, "outputs": [], "source": ["! 
pip install --quiet \"ipython[notebook]\" \"torch>=1.8\" \"setuptools==59.5.0\" \"pytorch-lightning>=1.4\" \"hydra-core>=1.1.0\" \"finetuning-scheduler[examples]\" \"torchmetrics>=0.7\""]}, {"cell_type": "markdown", "id": "68ff7058", "metadata": {"papermill": {"duration": 0.011277, "end_time": "2022-06-13T15:31:52.718548", "exception": false, "start_time": "2022-06-13T15:31:52.707271", "status": "completed"}, "tags": []}, "source": ["## Scheduled Finetuning with the Finetuning Scheduler Extension\n", "\n", "The [Finetuning Scheduler](https://finetuning-scheduler.readthedocs.io/en/stable/index.html) extension accelerates and enhances model experimentation with flexible finetuning schedules.\n", "\n", "Training with the extension is simple and confers a host of benefits:\n", "\n", "- it dramatically increases finetuning flexibility\n", "- it expedites and facilitates exploration of model tuning dynamics\n", "- it enables marginal performance improvements of finetuned models\n", "\n", "Setup is straightforward: just install from PyPI! Since this notebook-based example requires a few additional packages (e.g.\n", "``transformers``, ``sentencepiece``), we installed the ``finetuning-scheduler`` package with the ``[examples]`` extra above.\n", "Once the ``finetuning-scheduler`` package is installed, the [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) callback is available for use with PyTorch Lightning.\n", "For additional installation options, please see the Finetuning Scheduler [README](https://github.com/speediedan/finetuning-scheduler/blob/main/README.md).\n", "\n", "\n", "\n", "<div style=\"display:inline\" id=\"a1\">\n", "\n", "Fundamentally, [Finetuning Scheduler](https://finetuning-scheduler.readthedocs.io/en/stable/index.html) enables\n", "scheduled, multi-phase finetuning of foundational models. Gradual unfreezing (i.e. thawing) can help maximize\n", "foundational model knowledge retention while allowing (typically upper layers of) the model to\n", "optimally adapt to new tasks during transfer learning [1, 2, 3](#f1).\n", "\n", "</div>\n", "\n", "The [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) callback orchestrates the gradual unfreezing\n", "of models via a finetuning schedule that is either implicitly generated (the default) or explicitly provided by the user\n", "(more computationally efficient). Finetuning phase transitions are driven by\n", "[FTSEarlyStopping](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSEarlyStopping)\n", "criteria (a multi-phase extension of ``EarlyStopping`` packaged with FinetuningScheduler), user-specified epoch transitions, or a composition of the two (the default mode).\n", "A [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) training session completes when the\n", "final phase of the schedule has its stopping criteria met. 
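\n", "\n", "For example, a minimal sketch (the monitored metric and patience values here are illustrative) composing the scheduler with explicitly configured early-stopping criteria:\n", "\n", "```python\n", "from pytorch_lightning import Trainer\n", "from finetuning_scheduler import FinetuningScheduler, FTSEarlyStopping\n", "\n", "# FTSEarlyStopping extends ``EarlyStopping``, so it accepts the same base arguments\n", "trainer = Trainer(callbacks=[FinetuningScheduler(), FTSEarlyStopping(monitor=\"val_loss\", patience=2)])\n", "```\n", "\n", "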
See\n", "the [early stopping documentation](https://pytorch-lightning.readthedocs.io/en/stable/api/pytorch_lightning.callbacks.EarlyStopping.html) for more details on that callback's configuration.\n", "\n", "{height=\"272px\" width=\"376px\"}"]}, {"cell_type": "markdown", "id": "fe7eee49", "metadata": {"papermill": {"duration": 0.010879, "end_time": "2022-06-13T15:31:52.740447", "exception": false, "start_time": "2022-06-13T15:31:52.729568", "status": "completed"}, "tags": []}, "source": ["\n", "## Basic Usage\n", "\n", "<div id=\"basic_usage\">\n", "\n", "If no finetuning schedule is provided by the user, [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) will generate a\n", "[default schedule](#The-Default-Finetuning-Schedule) and proceed to finetune according to the generated schedule,\n", "using default [FTSEarlyStopping](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSEarlyStopping) and [FTSCheckpoint](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSCheckpoint) callbacks with ``monitor=val_loss``.\n", "\n", "</div>\n", "\n", "```python\n", "from pytorch_lightning import Trainer\n", "from finetuning_scheduler import FinetuningScheduler\n", "trainer = Trainer(callbacks=[FinetuningScheduler()])\n", "```"]}, {"cell_type": "markdown", "id": "3e1d1245", "metadata": {"papermill": {"duration": 0.010959, "end_time": "2022-06-13T15:31:52.762279", "exception": false, "start_time": "2022-06-13T15:31:52.751320", "status": "completed"}, "tags": []}, "source": ["## The Default Finetuning Schedule\n", "\n", "Schedule definition is facilitated via the [gen_ft_schedule](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.ScheduleImplMixin.gen_ft_schedule) method which dumps a default finetuning schedule (by default using a naive, 2-parameters per level heuristic) which can be adjusted as\n", "desired by the user and/or subsequently passed to the callback. Using the default/implicitly generated schedule will likely be less computationally efficient than a user-defined finetuning schedule but is useful for exploring a model's finetuning behavior and can serve as a good baseline for subsequent explicit schedule refinement.\n", "While the current version of [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) only supports single optimizer and (optional) lr_scheduler configurations, per-phase maximum learning rates can be set as demonstrated in the next section."]}, {"cell_type": "markdown", "id": "416bc2d8", "metadata": {"papermill": {"duration": 0.010885, "end_time": "2022-06-13T15:31:52.784236", "exception": false, "start_time": "2022-06-13T15:31:52.773351", "status": "completed"}, "tags": []}, "source": ["## Specifying a Finetuning Schedule\n", "\n", "To specify a finetuning schedule, it's convenient to first generate the default schedule and then alter the thawed/unfrozen parameter groups associated with each finetuning phase as desired. Finetuning phases are zero-indexed and executed in ascending order.\n", "\n", "1. First, generate the default schedule to ``Trainer.log_dir``. 
It will be named after your\n", "   ``LightningModule`` subclass with the suffix ``_ft_schedule.yaml``.\n", "\n", "```python\n", "   from pytorch_lightning import Trainer\n", "   from finetuning_scheduler import FinetuningScheduler\n", "   trainer = Trainer(callbacks=[FinetuningScheduler(gen_ft_sched_only=True)])\n", "```\n", "\n", "2. Alter the schedule as desired.\n", "\n", "3. Once the finetuning schedule has been altered as desired, pass it to\n", "   [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) to commence scheduled training:\n", "\n", "```python\n", "from pytorch_lightning import Trainer\n", "from finetuning_scheduler import FinetuningScheduler\n", "\n", "trainer = Trainer(callbacks=[FinetuningScheduler(ft_schedule=\"/path/to/my/schedule/my_schedule.yaml\")])\n", "```"]}, {"cell_type": "markdown", "id": "c2af2543", "metadata": {"papermill": {"duration": 0.010768, "end_time": "2022-06-13T15:31:52.805955", "exception": false, "start_time": "2022-06-13T15:31:52.795187", "status": "completed"}, "tags": []}, "source": ["## Early-Stopping and Epoch-Driven Phase Transition Criteria\n", "\n", "\n", "By default, [FTSEarlyStopping](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSEarlyStopping) and epoch-driven\n", "transition criteria are composed. If a ``max_transition_epoch`` is specified for a given phase, the next finetuning phase will begin at that epoch unless [FTSEarlyStopping](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSEarlyStopping) criteria are met first.\n", "If [FinetuningScheduler.epoch_transitions_only](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler.params.epoch_transitions_only) is ``True``, [FTSEarlyStopping](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSEarlyStopping) will not be used\n", "and transitions will be exclusively epoch-driven.\n", "\n", "\n", "<div class=\"alert alert-info\">\n", "\n", "**Tip:** Regular expressions can be convenient for specifying more complex schedules. 
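For example, a single pattern can thaw an entire family of parameters at once (a sketch reusing the schedule format demonstrated later in this notebook):\n", "\n", "```yaml\n", "0:\n", "  params:\n", "    - model.deberta.encoder.layer.{0,11}.(output|attention|intermediate).*\n", "```\n", "\n", "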
Also, a per-phase base maximum ``lr`` can be specified.\n", "\n", "</div>\n", "\n", "\n", "\n", "The end-to-end example in this notebook ([Scheduled Finetuning For SuperGLUE](#superglue)) uses [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) in explicit mode to finetune a small foundational model on the [RTE](https://huggingface.co/datasets/viewer/?dataset=super_glue&config=rte) task of [SuperGLUE](https://super.gluebenchmark.com/).\n", "Please see the [official Finetuning Scheduler documentation](https://finetuning-scheduler.readthedocs.io/en/stable/index.html) if you are interested in a similar [CLI-based example](https://finetuning-scheduler.readthedocs.io/en/stable/index.html#scheduled-finetuning-superglue) using the LightningCLI."]}, {"cell_type": "markdown", "id": "c462ef7d", "metadata": {"papermill": {"duration": 0.010778, "end_time": "2022-06-13T15:31:52.827711", "exception": false, "start_time": "2022-06-13T15:31:52.816933", "status": "completed"}, "tags": []}, "source": ["## Resuming Scheduled Finetuning Training Sessions\n", "\n", "Resumption of scheduled finetuning training is identical to the continuation of\n", "[other training sessions](https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html) with the caveat that the provided checkpoint must have been saved by a [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) session.\n", "[FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) uses [FTSCheckpoint](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSCheckpoint) (an extension of ``ModelCheckpoint``) to maintain schedule state with special metadata.\n", "\n", "\n", "```python\n", "from pytorch_lightning import Trainer\n", "from finetuning_scheduler import FinetuningScheduler\n", "trainer = Trainer(callbacks=[FinetuningScheduler()])\n", "trainer.fit(..., ckpt_path=\"some/path/to/my_checkpoint.ckpt\")\n", "```\n", "\n", "Training will resume at the depth/level of the provided checkpoint according to the specified schedule. Schedules can be altered between training sessions but schedule compatibility is left to the user for maximal flexibility. 
If executing a user-defined schedule, typically the same schedule should be provided for the original and resumed training sessions.\n", "\n", "By default ([FinetuningScheduler.restore_best](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html?highlight=restore_best#finetuning_scheduler.fts.FinetuningScheduler.params.restore_best) is ``True``), [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) will attempt to restore the best available checkpoint before finetuning depth transitions.\n", "\n", "```python\n", "trainer = Trainer(callbacks=[FinetuningScheduler()])\n", "trainer.fit(..., ckpt_path=\"some/path/to/my_kth_best_checkpoint.ckpt\")\n", "```\n", "\n", "Note that similar to the behavior of [ModelCheckpoint](https://pytorch-lightning.readthedocs.io/en/stable/api/pytorch_lightning.callbacks.ModelCheckpoint.html), (specifically [this PR](https://github.com/PyTorchLightning/pytorch-lightning/pull/12045)),\n", "when resuming training with a different [FTSCheckpoint](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSCheckpoint) ``dirpath`` from the provided\n", "checkpoint, the new training session's checkpoint state will be re-initialized at the resumption depth with the provided checkpoint being set as the best checkpoint."]}, {"cell_type": "markdown", "id": "0b65e8c9", "metadata": {"papermill": {"duration": 0.010849, "end_time": "2022-06-13T15:31:52.849341", "exception": false, "start_time": "2022-06-13T15:31:52.838492", "status": "completed"}, "tags": []}, "source": ["<div class=\"alert alert-warning\">\n", "\n", "**Note:** Currently, [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) supports the following strategy types:\n", "\n", "- ``DP``\n", "- ``DDP``\n", "- ``DDP_SPAWN``\n", "- ``DDP_SHARDED``\n", "- ``DDP_SHARDED_SPAWN``\n", "\n", "Custom or officially unsupported strategies can be used by setting [FinetuningScheduler.allow_untested](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html?highlight=allow_untested#finetuning_scheduler.fts.FinetuningScheduler.params.allow_untested) to ``True``.\n", "Note that most currently unsupported strategies are so because they require varying degrees of modification to be compatible (e.g. ``deepspeed`` requires an ``add_param_group`` method, ``tpu_spawn`` an override of the current broadcast method to include python objects)\n", "</div>"]}, {"cell_type": "markdown", "id": "c487881e", "metadata": {"papermill": {"duration": 0.010839, "end_time": "2022-06-13T15:31:52.870869", "exception": false, "start_time": "2022-06-13T15:31:52.860030", "status": "completed"}, "tags": []}, "source": ["<div id=\"superglue\"></div>\n", "\n", "## Scheduled Finetuning For SuperGLUE\n", "\n", "The following example demonstrates the use of [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) to finetune a small foundational model on the [RTE](https://huggingface.co/datasets/viewer/?dataset=super_glue&config=rte) task of [SuperGLUE](https://super.gluebenchmark.com/). 
Iterative early-stopping will be applied according to a user-specified schedule.\n"]}, {"cell_type": "code", "execution_count": 2, "id": "9cb0605e", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:52.894740Z", "iopub.status.busy": "2022-06-13T15:31:52.893961Z", "iopub.status.idle": "2022-06-13T15:31:56.334774Z", "shell.execute_reply": "2022-06-13T15:31:56.334003Z"}, "papermill": {"duration": 3.455097, "end_time": "2022-06-13T15:31:56.336869", "exception": false, "start_time": "2022-06-13T15:31:52.881772", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["/usr/local/lib/python3.8/dist-packages/numpy/core/getlimits.py:499: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.\n", " setattr(self, word, getattr(machar, word).flat[0])\n", "/usr/local/lib/python3.8/dist-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.\n", " return self._float_to_str(self.smallest_subnormal)\n", "/usr/local/lib/python3.8/dist-packages/numpy/core/getlimits.py:499: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.\n", " setattr(self, word, getattr(machar, word).flat[0])\n", "/usr/local/lib/python3.8/dist-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.\n", " return self._float_to_str(self.smallest_subnormal)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["WARNING:root:Bagua cannot detect bundled NCCL library, Bagua will try to use system NCCL instead. If you encounter any error, please run `import bagua_core; bagua_core.install_deps()` or the `bagua_install_deps.py` script to install bundled libraries.\n"]}], "source": ["import os\n", "import warnings\n", "from datetime import datetime\n", "from importlib import import_module\n", "from typing import Any, Dict, List, Optional\n", "\n", "import datasets\n", "\n", "import sentencepiece as sp # noqa: F401 # isort: split\n", "import pytorch_lightning as pl\n", "import torch\n", "from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint\n", "from pytorch_lightning.loggers.tensorboard import TensorBoardLogger\n", "from pytorch_lightning.utilities import rank_zero_warn\n", "from pytorch_lightning.utilities.cli import _Registry\n", "from pytorch_lightning.utilities.exceptions import MisconfigurationException\n", "from torch.optim.adamw import AdamW\n", "from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts\n", "from torch.utils.data import DataLoader\n", "from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer\n", "from transformers import logging as transformers_logging\n", "from transformers.tokenization_utils_base import BatchEncoding"]}, {"cell_type": "code", "execution_count": 3, "id": "668ef9c5", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:56.361222Z", "iopub.status.busy": "2022-06-13T15:31:56.360631Z", "iopub.status.idle": "2022-06-13T15:31:56.366475Z", "shell.execute_reply": "2022-06-13T15:31:56.365762Z"}, "papermill": {"duration": 0.019182, "end_time": "2022-06-13T15:31:56.367874", "exception": false, "start_time": "2022-06-13T15:31:56.348692", "status": "completed"}, "tags": []}, "outputs": [], "source": ["# a couple helper functions to prepare code to work with a user module registry\n", "MOCK_REGISTRY = _Registry()\n", "\n", "\n", "def mock_register_module(key: str, 
require_fqn: bool = False) -> List:\n", "    if key.lower() == \"finetuningscheduler\":\n", "        mod = import_module(\"finetuning_scheduler\")\n", "        MOCK_REGISTRY.register_classes(mod, pl.callbacks.Callback)\n", "    else:\n", "        raise MisconfigurationException(f\"user module key '{key}' not found\")\n", "    registered_list = []\n", "    # make registered classes available by unqualified class name by default\n", "    if not require_fqn:\n", "        for n, c in MOCK_REGISTRY.items():\n", "            globals()[f\"{n}\"] = c\n", "        registered_list = \", \".join([n for n in MOCK_REGISTRY.names])\n", "    else:\n", "        registered_list = \", \".join([c.__module__ + \".\" + c.__name__ for c in MOCK_REGISTRY.classes])\n", "    print(f\"Imported and registered the following callbacks: {registered_list}\")"]}, {"cell_type": "code", "execution_count": 4, "id": "4cfa6171", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:56.390943Z", "iopub.status.busy": "2022-06-13T15:31:56.390530Z", "iopub.status.idle": "2022-06-13T15:31:56.398330Z", "shell.execute_reply": "2022-06-13T15:31:56.397678Z"}, "papermill": {"duration": 0.020943, "end_time": "2022-06-13T15:31:56.399743", "exception": false, "start_time": "2022-06-13T15:31:56.378800", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Imported and registered the following callbacks: FTSCheckpoint, FTSEarlyStopping, FinetuningScheduler\n"]}], "source": ["# Load the `FinetuningScheduler` PyTorch Lightning extension module we want to use. This will import all necessary callbacks.\n", "mock_register_module(\"finetuningscheduler\")\n", "# set notebook-level variables\n", "TASK_NUM_LABELS = {\"boolq\": 2, \"rte\": 2}\n", "DEFAULT_TASK = \"rte\"\n", "\n", "transformers_logging.set_verbosity_error()\n", "# ignore warnings related to the tokenizers_parallelism/DataLoader parallelism trade-off and\n", "# expected logging behavior\n", "for warnf in [\".*does not have many workers*\", \".*The number of training samples.*\"]:\n", "    warnings.filterwarnings(\"ignore\", warnf)"]}, {"cell_type": "code", "execution_count": 5, "id": "610ed9de", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:56.423564Z", "iopub.status.busy": "2022-06-13T15:31:56.423297Z", "iopub.status.idle": "2022-06-13T15:31:56.434701Z", "shell.execute_reply": "2022-06-13T15:31:56.434041Z"}, "papermill": {"duration": 0.024741, "end_time": "2022-06-13T15:31:56.436129", "exception": false, "start_time": "2022-06-13T15:31:56.411388", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class RteBoolqDataModule(pl.LightningDataModule):\n", "    \"\"\"A ``LightningDataModule`` designed for either the RTE or BoolQ SuperGLUE Hugging Face datasets.\"\"\"\n", "\n", "    TASK_TEXT_FIELD_MAP = {\"rte\": (\"premise\", \"hypothesis\"), \"boolq\": (\"question\", \"passage\")}\n", "    LOADER_COLUMNS = (\n", "        \"datasets_idx\",\n", "        \"input_ids\",\n", "        \"token_type_ids\",\n", "        \"attention_mask\",\n", "        \"start_positions\",\n", "        \"end_positions\",\n", "        \"labels\",\n", "    )\n", "\n", "    def __init__(\n", "        self,\n", "        model_name_or_path: str,\n", "        task_name: str = DEFAULT_TASK,\n", "        max_seq_length: int = 128,\n", "        train_batch_size: int = 16,\n", "        eval_batch_size: int = 16,\n", "        tokenizers_parallelism: bool = True,\n", "        **dataloader_kwargs: Any,\n", "    ):\n", "        r\"\"\"Initialize the ``LightningDataModule`` designed for either the RTE or BoolQ SuperGLUE Hugging Face\n", "        datasets.\n", "\n", "        Args:\n", "            model_name_or_path (str):\n", "                Can be either:\n", "                - A string, the ``model 
id`` of a pretrained model hosted inside a model repo on huggingface.co.\n", " Valid model ids can be located at the root-level, like ``bert-base-uncased``, or namespaced under\n", " a user or organization name, like ``dbmdz/bert-base-german-cased``.\n", " - A path to a ``directory`` containing model weights saved using\n", " :meth:`~transformers.PreTrainedModel.save_pretrained`, e.g., ``./my_model_directory/``.\n", " task_name (str, optional): Name of the SuperGLUE task to execute. This module supports 'rte' or 'boolq'.\n", " Defaults to DEFAULT_TASK which is 'rte'.\n", " max_seq_length (int, optional): Length to which we will pad sequences or truncate input. Defaults to 128.\n", " train_batch_size (int, optional): Training batch size. Defaults to 16.\n", " eval_batch_size (int, optional): Batch size to use for validation and testing splits. Defaults to 16.\n", " tokenizers_parallelism (bool, optional): Whether to use parallelism in the tokenizer. Defaults to True.\n", " \\**dataloader_kwargs: Arguments passed when initializing the dataloader\n", " \"\"\"\n", " super().__init__()\n", " task_name = task_name if task_name in TASK_NUM_LABELS.keys() else DEFAULT_TASK\n", " self.text_fields = self.TASK_TEXT_FIELD_MAP[task_name]\n", " self.dataloader_kwargs = {\n", " \"num_workers\": dataloader_kwargs.get(\"num_workers\", 0),\n", " \"pin_memory\": dataloader_kwargs.get(\"pin_memory\", False),\n", " }\n", " self.save_hyperparameters()\n", " os.environ[\"TOKENIZERS_PARALLELISM\"] = \"true\" if self.hparams.tokenizers_parallelism else \"false\"\n", " self.tokenizer = AutoTokenizer.from_pretrained(\n", " self.hparams.model_name_or_path, use_fast=True, local_files_only=False\n", " )\n", "\n", " def prepare_data(self):\n", " \"\"\"Load the SuperGLUE dataset.\"\"\"\n", " # N.B. PL calls prepare_data from a single process (rank 0) so do not use it to assign\n", " # state (e.g. 
self.x=y)\n", " datasets.load_dataset(\"super_glue\", self.hparams.task_name)\n", "\n", " def setup(self, stage):\n", " \"\"\"Setup our dataset splits for training/validation.\"\"\"\n", " self.dataset = datasets.load_dataset(\"super_glue\", self.hparams.task_name)\n", " for split in self.dataset.keys():\n", " self.dataset[split] = self.dataset[split].map(\n", " self._convert_to_features, batched=True, remove_columns=[\"label\"]\n", " )\n", " self.columns = [c for c in self.dataset[split].column_names if c in self.LOADER_COLUMNS]\n", " self.dataset[split].set_format(type=\"torch\", columns=self.columns)\n", "\n", " self.eval_splits = [x for x in self.dataset.keys() if \"validation\" in x]\n", "\n", " def train_dataloader(self):\n", " return DataLoader(self.dataset[\"train\"], batch_size=self.hparams.train_batch_size, **self.dataloader_kwargs)\n", "\n", " def val_dataloader(self):\n", " return DataLoader(self.dataset[\"validation\"], batch_size=self.hparams.eval_batch_size, **self.dataloader_kwargs)\n", "\n", " def _convert_to_features(self, example_batch: datasets.arrow_dataset.Batch) -> BatchEncoding:\n", " \"\"\"Convert raw text examples to a :class:`~transformers.tokenization_utils_base.BatchEncoding` container\n", " (derived from python dict) of features that includes helpful methods for translating between word/character\n", " space and token space.\n", "\n", " Args:\n", " example_batch ([type]): The set of examples to convert to token space.\n", "\n", " Returns:\n", " ``BatchEncoding``: A batch of encoded examples (note default tokenizer batch_size=1000)\n", " \"\"\"\n", " text_pairs = list(zip(example_batch[self.text_fields[0]], example_batch[self.text_fields[1]]))\n", " # Tokenize the text/text pairs\n", " features = self.tokenizer.batch_encode_plus(\n", " text_pairs, max_length=self.hparams.max_seq_length, padding=\"longest\", truncation=True\n", " )\n", " # Rename label to labels to make it easier to pass to model forward\n", " features[\"labels\"] = example_batch[\"label\"]\n", " return features"]}, {"cell_type": "code", "execution_count": 6, "id": "78b500ab", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:56.460109Z", "iopub.status.busy": "2022-06-13T15:31:56.459721Z", "iopub.status.idle": "2022-06-13T15:31:56.473229Z", "shell.execute_reply": "2022-06-13T15:31:56.472586Z"}, "lines_to_next_cell": 2, "papermill": {"duration": 0.027482, "end_time": "2022-06-13T15:31:56.474663", "exception": false, "start_time": "2022-06-13T15:31:56.447181", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class RteBoolqModule(pl.LightningModule):\n", " \"\"\"A ``LightningModule`` that can be used to finetune a foundational model on either the RTE or BoolQ SuperGLUE\n", " tasks using Hugging Face implementations of a given model and the `SuperGLUE Hugging Face dataset.\"\"\"\n", "\n", " def __init__(\n", " self,\n", " model_name_or_path: str,\n", " optimizer_init: Dict[str, Any],\n", " lr_scheduler_init: Dict[str, Any],\n", " model_cfg: Optional[Dict[str, Any]] = None,\n", " task_name: str = DEFAULT_TASK,\n", " experiment_tag: str = \"default\",\n", " ):\n", " \"\"\"\n", " Args:\n", " model_name_or_path (str): Path to pretrained model or identifier from https://huggingface.co/models\n", " optimizer_init (Dict[str, Any]): The desired optimizer configuration.\n", " lr_scheduler_init (Dict[str, Any]): The desired learning rate scheduler config\n", " model_cfg (Optional[Dict[str, Any]], optional): Defines overrides of the default model config. 
Defaults to\n", " ``None``.\n", " task_name (str, optional): The SuperGLUE task to execute, one of ``'rte'``, ``'boolq'``. Defaults to \"rte\".\n", " experiment_tag (str, optional): The tag to use for the experiment and tensorboard logs. Defaults to\n", " \"default\".\n", " \"\"\"\n", " super().__init__()\n", " if task_name not in TASK_NUM_LABELS.keys():\n", " rank_zero_warn(f\"Invalid task_name {task_name!r}. Proceeding with the default task: {DEFAULT_TASK!r}\")\n", " task_name = DEFAULT_TASK\n", " self.num_labels = TASK_NUM_LABELS[task_name]\n", " self.model_cfg = model_cfg or {}\n", " conf = AutoConfig.from_pretrained(model_name_or_path, num_labels=self.num_labels, local_files_only=False)\n", " self.model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path, config=conf)\n", " self.model.config.update(self.model_cfg) # apply model config overrides\n", " self.init_hparams = {\n", " \"optimizer_init\": optimizer_init,\n", " \"lr_scheduler_init\": lr_scheduler_init,\n", " \"model_config\": self.model.config,\n", " \"model_name_or_path\": model_name_or_path,\n", " \"task_name\": task_name,\n", " \"experiment_id\": f\"{datetime.now().strftime('%Y%m%d_%H%M%S')}_{experiment_tag}\",\n", " }\n", " self.save_hyperparameters(self.init_hparams)\n", " self.metric = datasets.load_metric(\n", " \"super_glue\", self.hparams.task_name, experiment_id=self.hparams.experiment_id\n", " )\n", " self.no_decay = [\"bias\", \"LayerNorm.weight\"]\n", "\n", " @property\n", " def finetuningscheduler_callback(self) -> FinetuningScheduler: # type: ignore # noqa\n", " fts = [c for c in self.trainer.callbacks if isinstance(c, FinetuningScheduler)] # type: ignore # noqa\n", " return fts[0] if fts else None\n", "\n", " def forward(self, **inputs):\n", " return self.model(**inputs)\n", "\n", " def training_step(self, batch, batch_idx):\n", " outputs = self(**batch)\n", " loss = outputs[0]\n", " self.log(\"train_loss\", loss)\n", " return loss\n", "\n", " def on_train_epoch_start(self) -> None:\n", " if self.finetuningscheduler_callback:\n", " self.logger.log_metrics(\n", " metrics={\"finetuning_schedule_depth\": float(self.finetuningscheduler_callback.curr_depth)},\n", " step=self.global_step,\n", " )\n", "\n", " def validation_step(self, batch, batch_idx, dataloader_idx=0):\n", " outputs = self(**batch)\n", " val_loss, logits = outputs[:2]\n", " if self.num_labels >= 1:\n", " preds = torch.argmax(logits, axis=1)\n", " elif self.num_labels == 1:\n", " preds = logits.squeeze()\n", " labels = batch[\"labels\"]\n", " self.log(\"val_loss\", val_loss, prog_bar=True)\n", " metric_dict = self.metric.compute(predictions=preds, references=labels)\n", " self.log_dict(metric_dict, prog_bar=True)\n", "\n", " def _init_param_groups(self) -> List[Dict]:\n", " \"\"\"Initialize the parameter groups. 
Used to ensure weight_decay is not applied to our specified bias\n", "        parameters when we initialize the optimizer.\n", "\n", "        Returns:\n", "            List[Dict]: A list of parameter group dictionaries.\n", "        \"\"\"\n", "        return [\n", "            {\n", "                \"params\": [\n", "                    p\n", "                    for n, p in self.model.named_parameters()\n", "                    if not any(nd in n for nd in self.no_decay) and p.requires_grad\n", "                ],\n", "                \"weight_decay\": self.hparams.optimizer_init[\"weight_decay\"],\n", "            },\n", "            {\n", "                \"params\": [\n", "                    p\n", "                    for n, p in self.model.named_parameters()\n", "                    if any(nd in n for nd in self.no_decay) and p.requires_grad\n", "                ],\n", "                \"weight_decay\": 0.0,\n", "            },\n", "        ]\n", "\n", "    def configure_optimizers(self):\n", "        # the phase 0 parameters will have been set to require gradients during setup\n", "        # you can initialize the optimizer with a simple requires_grad filter as is often done,\n", "        # but in this case we pass a list of parameter groups to ensure weight_decay is\n", "        # not applied to the bias or LayerNorm parameters (for completeness, in this case it won't make much\n", "        # performance difference)\n", "        optimizer = AdamW(params=self._init_param_groups(), **self.hparams.optimizer_init)\n", "        scheduler = {\n", "            \"scheduler\": CosineAnnealingWarmRestarts(optimizer, **self.hparams.lr_scheduler_init),\n", "            \"interval\": \"epoch\",\n", "        }\n", "        return [optimizer], [scheduler]"]}, {"cell_type": "markdown", "id": "b5050d52", "metadata": {"papermill": {"duration": 0.011023, "end_time": "2022-06-13T15:31:56.496766", "exception": false, "start_time": "2022-06-13T15:31:56.485743", "status": "completed"}, "tags": []}, "source": ["### Our Training Sessions\n", "\n", "We'll be comparing three different finetuning training configurations. Every configuration in this example depends\n", "upon a shared set of defaults, differing only in their respective finetuning schedules.\n", "\n", "| Experiment Tag | Training Scenario Description |\n", "|:-----------------:| ---------------------------------------------------------------------- |\n", "| ``fts_explicit`` | Training with a finetuning schedule explicitly provided by the user |\n", "| ``nofts_baseline``| A baseline finetuning training session (without scheduled finetuning) |\n", "| ``fts_implicit`` | Training with an implicitly generated finetuning schedule (the default)|\n", "\n", "Let's begin by configuring the ``fts_explicit`` scenario. 
We'll subsequently run the other two scenarios for\n", "comparison."]}, {"cell_type": "code", "execution_count": 7, "id": "2417aff5", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:56.522065Z", "iopub.status.busy": "2022-06-13T15:31:56.521373Z", "iopub.status.idle": "2022-06-13T15:31:56.525626Z", "shell.execute_reply": "2022-06-13T15:31:56.525006Z"}, "papermill": {"duration": 0.019177, "end_time": "2022-06-13T15:31:56.527050", "exception": false, "start_time": "2022-06-13T15:31:56.507873", "status": "completed"}, "tags": []}, "outputs": [], "source": ["# Let's create a finetuning schedule for our model and run an explicitly scheduled finetuning training scenario with it\n", "# Please see the [FinetuningScheduler documentation](https://finetuning-scheduler.readthedocs.io/en/stable/index.html) for a full description of the schedule format\n", "\n", "\n", "ft_schedule_yaml = \"\"\"\n", "0:\n", " params:\n", " - model.classifier.bias\n", " - model.classifier.weight\n", " - model.pooler.dense.bias\n", " - model.pooler.dense.weight\n", " - model.deberta.encoder.LayerNorm.bias\n", " - model.deberta.encoder.LayerNorm.weight\n", " - model.deberta.encoder.rel_embeddings.weight\n", " - model.deberta.encoder.layer.{0,11}.(output|attention|intermediate).*\n", "1:\n", " params:\n", " - model.deberta.embeddings.LayerNorm.bias\n", " - model.deberta.embeddings.LayerNorm.weight\n", "2:\n", " params:\n", " - model.deberta.embeddings.word_embeddings.weight\n", "\"\"\"\n", "ft_schedule_name = \"RteBoolqModule_ft_schedule_deberta_base.yaml\"\n", "# Let's write the schedule to a file so we can simulate loading an explicitly defined finetuning\n", "# schedule.\n", "with open(ft_schedule_name, \"w\") as f:\n", " f.write(ft_schedule_yaml)"]}, {"cell_type": "code", "execution_count": 8, "id": "f82f706b", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:56.551096Z", "iopub.status.busy": "2022-06-13T15:31:56.550682Z", "iopub.status.idle": "2022-06-13T15:31:59.862604Z", "shell.execute_reply": "2022-06-13T15:31:59.861692Z"}, "papermill": {"duration": 3.325476, "end_time": "2022-06-13T15:31:59.864167", "exception": false, "start_time": "2022-06-13T15:31:56.538691", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["Global seed set to 42\n"]}, {"data": {"application/vnd.jupyter.widget-view+json": {"model_id": "f869052074d3460bbef2f627ba2569a8", "version_major": 2, "version_minor": 0}, "text/plain": ["Downloading: 0%| | 0.00/52.0 [00:00<?, ?B/s]"]}, "metadata": {}, "output_type": "display_data"}, {"data": {"application/vnd.jupyter.widget-view+json": {"model_id": "71092ba87cef416db56db89e424a26c4", "version_major": 2, "version_minor": 0}, "text/plain": ["Downloading: 0%| | 0.00/579 [00:00<?, ?B/s]"]}, "metadata": {}, "output_type": "display_data"}, {"data": {"application/vnd.jupyter.widget-view+json": {"model_id": "96cbdef4c87c474c8729e46d442b5d6f", "version_major": 2, "version_minor": 0}, "text/plain": ["Downloading: 0%| | 0.00/2.35M [00:00<?, ?B/s]"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stderr", "output_type": "stream", "text": ["/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/transformers/convert_slow_tokenizer.py:434: UserWarning: The sentencepiece tokenizer that you are converting to a fast tokenizer uses the byte fallback option which is not implemented in the fast tokenizers. 
In practice this means that the fast version of the tokenizer can produce unknown tokens whereas the sentencepiece version would have converted these unknown tokens into a sequence of byte tokens matching the original piece of text.\n", "  warnings.warn(\n"]}], "source": ["datasets.logging.disable_progress_bar()\n", "pl.seed_everything(42)\n", "dm = RteBoolqDataModule(model_name_or_path=\"microsoft/deberta-v3-base\", tokenizers_parallelism=True)"]}, {"cell_type": "markdown", "id": "73801958", "metadata": {"papermill": {"duration": 0.011657, "end_time": "2022-06-13T15:31:59.888286", "exception": false, "start_time": "2022-06-13T15:31:59.876629", "status": "completed"}, "tags": []}, "source": ["### Optimizer Configuration\n", "\n", "<div id=\"a2\">\n", "\n", "Though other optimizers can arguably yield some marginal advantage contingent on the context,\n", "the Adam optimizer (and the [AdamW version](https://pytorch.org/docs/stable/_modules/torch/optim/adamw.html#AdamW) which\n", "implements decoupled weight decay) remains robust to hyperparameter choices and is commonly used for finetuning\n", "foundational language models. See [(Sivaprasad et al., 2020)](#f2) and [(Mosbach, Andriushchenko & Klakow, 2020)](#f3) for theoretical and systematic empirical justifications of Adam and its use in finetuning\n", "large transformer-based language models. The values used here have some justification\n", "in the referenced literature but have been largely empirically determined; while a good\n", "starting point, they could be further tuned.\n", "\n", "</div>"]}, {"cell_type": "code", "execution_count": 9, "id": "5ea6d2c4", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:59.912938Z", "iopub.status.busy": "2022-06-13T15:31:59.912200Z", "iopub.status.idle": "2022-06-13T15:31:59.915945Z", "shell.execute_reply": "2022-06-13T15:31:59.915300Z"}, "papermill": {"duration": 0.017747, "end_time": "2022-06-13T15:31:59.917394", "exception": false, "start_time": "2022-06-13T15:31:59.899647", "status": "completed"}, "tags": []}, "outputs": [], "source": ["optimizer_init = {\"weight_decay\": 1e-05, \"eps\": 1e-07, \"lr\": 1e-05}"]}, {"cell_type": "markdown", "id": "9bb3fbea", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.011529, "end_time": "2022-06-13T15:31:59.941141", "exception": false, "start_time": "2022-06-13T15:31:59.929612", "status": "completed"}, "tags": []}, "source": ["### LR Scheduler Configuration\n", "\n", "<div id=\"a3\">\n", "\n", "The [CosineAnnealingWarmRestarts scheduler](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.html?highlight=cosineannealingwarm#torch.optim.lr_scheduler.CosineAnnealingWarmRestarts) nicely fits with our iterative finetuning since it does not depend upon a global max_epoch\n", "value. The importance of initial warmup is reduced due to the innate warmup effect of Adam bias correction [[5]](#f3)\n", "and the gradual thawing we are performing. Note that commonly used LR schedulers that depend on providing\n", "max_iterations/epochs (e.g. the\n", "[CosineWarmupScheduler](https://github.com/PyTorchLightning/lightning-tutorials/blob/0c325829101d5a6ebf32ed99bbf5b09badf04a59/course_UvA-DL/05-transformers-and-MH-attention/Transformers_MHAttention.py#L688)\n", "used in other pytorch-lightning tutorials) also work with FinetuningScheduler. 
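\n", "\n", "For reference, per the PyTorch documentation for this scheduler, the learning rate within each restart cycle of length $T_i$ is annealed as\n", "\n", "$$\\eta_t = \\eta_{min} + \\frac{1}{2}(\\eta_{max} - \\eta_{min})\\left(1 + \\cos\\left(\\frac{T_{cur}}{T_i}\\pi\\right)\\right)$$\n", "\n", "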
Though the LR scheduler is theoretically\n", "justified [(Loshchilov & Hutter, 2016)](#f4), the particular values provided here are primarily empirically driven.\n", "\n", "[FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) also supports LR scheduler\n", "reinitialization in both explicit and implicit finetuning schedule modes. See the [advanced usage documentation](https://finetuning-scheduler.readthedocs.io/en/stable/advanced/lr_scheduler_reinitialization.html) for explanations and demonstration of the extension's support for more complex requirements.\n", "</div>"]}, {"cell_type": "code", "execution_count": 10, "id": "b5dfcfe5", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:59.965753Z", "iopub.status.busy": "2022-06-13T15:31:59.965148Z", "iopub.status.idle": "2022-06-13T15:31:59.968593Z", "shell.execute_reply": "2022-06-13T15:31:59.967948Z"}, "papermill": {"duration": 0.017317, "end_time": "2022-06-13T15:31:59.969984", "exception": false, "start_time": "2022-06-13T15:31:59.952667", "status": "completed"}, "tags": []}, "outputs": [], "source": ["lr_scheduler_init = {\"T_0\": 1, \"T_mult\": 2, \"eta_min\": 1e-07}"]}, {"cell_type": "code", "execution_count": 11, "id": "a6aab95c", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:31:59.994530Z", "iopub.status.busy": "2022-06-13T15:31:59.994029Z", "iopub.status.idle": "2022-06-13T15:32:07.660475Z", "shell.execute_reply": "2022-06-13T15:32:07.659584Z"}, "papermill": {"duration": 7.680988, "end_time": "2022-06-13T15:32:07.662658", "exception": false, "start_time": "2022-06-13T15:31:59.981670", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/vnd.jupyter.widget-view+json": {"model_id": "5532000b543848deb749949c2136a5aa", "version_major": 2, "version_minor": 0}, "text/plain": ["Downloading: 0%| | 0.00/354M [00:00<?, ?B/s]"]}, "metadata": {}, "output_type": "display_data"}], "source": ["# Load our lightning module...\n", "lightning_module_kwargs = {\n", "    \"model_name_or_path\": \"microsoft/deberta-v3-base\",\n", "    \"optimizer_init\": optimizer_init,\n", "    \"lr_scheduler_init\": lr_scheduler_init,\n", "}\n", "model = RteBoolqModule(**lightning_module_kwargs, experiment_tag=\"fts_explicit\")"]}, {"cell_type": "markdown", "id": "5399d106", "metadata": {"papermill": {"duration": 0.011845, "end_time": "2022-06-13T15:32:07.687336", "exception": false, "start_time": "2022-06-13T15:32:07.675491", "status": "completed"}, "tags": []}, "source": ["### Callback Configuration\n", "\n", "The only callback required to invoke scheduled finetuning is the [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) callback itself.\n", "Default versions of [FTSCheckpoint](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSCheckpoint) and [FTSEarlyStopping](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts_supporters.html#finetuning_scheduler.fts_supporters.FTSEarlyStopping)\n", "(if not specifying ``epoch_transitions_only``) will be included ([as discussed above](#basic_usage)) if not provided\n", "in the callbacks list. 
For demonstration purposes I'm including example configurations of all three callbacks below."]}, {"cell_type": "code", "execution_count": 12, "id": "7c3ef856", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:32:07.712570Z", "iopub.status.busy": "2022-06-13T15:32:07.711831Z", "iopub.status.idle": "2022-06-13T15:32:07.716816Z", "shell.execute_reply": "2022-06-13T15:32:07.716164Z"}, "papermill": {"duration": 0.019366, "end_time": "2022-06-13T15:32:07.718333", "exception": false, "start_time": "2022-06-13T15:32:07.698967", "status": "completed"}, "tags": []}, "outputs": [], "source": ["# let's save our callback configurations for the explicit scenario since we'll be reusing the same\n", "# configurations for the implicit and nofts_baseline scenarios (except the config for the\n", "# FinetuningScheduler callback itself of course in the case of nofts_baseline)\n", "earlystopping_kwargs = {\"monitor\": \"val_loss\", \"min_delta\": 0.001, \"patience\": 2}\n", "checkpoint_kwargs = {\"monitor\": \"val_loss\", \"save_top_k\": 1}\n", "fts_kwargs = {\"max_depth\": 1}\n", "callbacks = [\n", " FinetuningScheduler(ft_schedule=ft_schedule_name, **fts_kwargs), # type: ignore # noqa\n", " FTSEarlyStopping(**earlystopping_kwargs), # type: ignore # noqa\n", " FTSCheckpoint(**checkpoint_kwargs), # type: ignore # noqa\n", "]"]}, {"cell_type": "code", "execution_count": 13, "id": "3735509e", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:32:07.743696Z", "iopub.status.busy": "2022-06-13T15:32:07.743187Z", "iopub.status.idle": "2022-06-13T15:32:07.746878Z", "shell.execute_reply": "2022-06-13T15:32:07.746250Z"}, "papermill": {"duration": 0.017915, "end_time": "2022-06-13T15:32:07.748320", "exception": false, "start_time": "2022-06-13T15:32:07.730405", "status": "completed"}, "tags": []}, "outputs": [], "source": ["logger = TensorBoardLogger(\"lightning_logs\", name=\"fts_explicit\")\n", "# optionally start tensorboard and monitor progress graphically while viewing multi-phase finetuning specific training\n", "# logs in the cell output below by uncommenting the next 2 lines\n", "# %load_ext tensorboard\n", "# %tensorboard --logdir lightning_logs\n", "# disable progress bar by default to focus on multi-phase training logs. 
Set to True to re-enable if desired\n", "enable_progress_bar = False"]}, {"cell_type": "code", "execution_count": 14, "id": "ff1776fc", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:32:07.774835Z", "iopub.status.busy": "2022-06-13T15:32:07.774394Z", "iopub.status.idle": "2022-06-13T15:34:35.119541Z", "shell.execute_reply": "2022-06-13T15:34:35.118797Z"}, "papermill": {"duration": 147.359812, "end_time": "2022-06-13T15:34:35.121516", "exception": false, "start_time": "2022-06-13T15:32:07.761704", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["Using 16bit native Automatic Mixed Precision (AMP)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["GPU available: True, used: True\n"]}, {"name": "stderr", "output_type": "stream", "text": ["TPU available: False, using: 0 TPU cores\n"]}, {"name": "stderr", "output_type": "stream", "text": ["IPU available: False, using: 0 IPUs\n"]}, {"name": "stderr", "output_type": "stream", "text": ["HPU available: False, using: 0 HPUs\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Note given the computation associated w/ the multiple phases of finetuning demonstrated, this notebook is best used with an accelerator\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Downloading and preparing dataset super_glue/rte (download: 733.32 KiB, generated: 1.83 MiB, post-processed: Unknown size, total: 2.54 MiB) to /home/AzDevOps_azpcontainer/.cache/huggingface/datasets/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7...\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Missing logger folder: lightning_logs/fts_explicit\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Dataset super_glue downloaded and prepared to /home/AzDevOps_azpcontainer/.cache/huggingface/datasets/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7. 
Subsequent calls will reuse this data.\n"]}, {"name": "stderr", "output_type": "stream", "text": ["WARNING:datasets.builder:Reusing dataset super_glue (/home/AzDevOps_azpcontainer/.cache/huggingface/datasets/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Finetuning schedule dumped to lightning_logs/fts_explicit/version_0/RteBoolqModule_ft_schedule.yaml.\n"]}, {"name": "stderr", "output_type": "stream", "text": ["LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]\n"]}, {"name": "stderr", "output_type": "stream", "text": ["\n", " | Name | Type | Params\n", "-------------------------------------------------------------\n", "0 | model | DebertaV2ForSequenceClassification | 184 M \n", "-------------------------------------------------------------\n", "86.0 M Trainable params\n", "98.4 M Non-trainable params\n", "184 M Total params\n", "368.847 Total estimated model params size (MB)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Restoring states from the checkpoint path at lightning_logs/fts_explicit/version_0/checkpoints/epoch=2-step=468.ckpt\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Restored all states from the checkpoint file at lightning_logs/fts_explicit/version_0/checkpoints/epoch=2-step=468.ckpt\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Multi-phase fine-tuned training continuing at level 1.\n"]}], "source": ["\n", "\n", "def train() -> None:\n", " trainer = pl.Trainer(\n", " enable_progress_bar=enable_progress_bar,\n", " max_epochs=100,\n", " precision=16,\n", " accelerator=\"auto\",\n", " devices=1 if torch.cuda.is_available() else None,\n", " callbacks=callbacks,\n", " logger=logger,\n", " )\n", " trainer.fit(model, datamodule=dm)\n", "\n", "\n", "print(\n", " \"Note given the computation associated w/ the multiple phases of finetuning demonstrated, this notebook is best used with an accelerator\"\n", ")\n", "train()"]}, {"cell_type": "markdown", "id": "69dc3104", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.012625, "end_time": "2022-06-13T15:34:35.148426", "exception": false, "start_time": "2022-06-13T15:34:35.135801", "status": "completed"}, "tags": []}, "source": ["### Running the Baseline and Implicit Finetuning Scenarios\n", "\n", "Let's now compare our ``nofts_baseline`` and ``fts_implicit`` scenarios with the ``fts_explicit`` one we just ran.\n", "\n", "We'll need to update our callbacks list, using the core PL ``EarlyStopping`` and ``ModelCheckpoint`` callbacks for the\n", "``nofts_baseline`` (which operate identically to their FTS analogs apart from the recursive training support).\n", "For both core PyTorch Lightning and user-registered callbacks, we can define our callbacks using a dictionary as we do\n", "with the LightningCLI. This allows us to avoid managing imports and support more complex configuration separated from\n", "code.\n", "\n", "Note that we'll be using identical callback configurations to the ``fts_explicit`` scenario. Keeping [max_depth](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html?highlight=max_depth#finetuning_scheduler.fts.FinetuningScheduler.params.max_depth) for\n", "the implicit schedule will limit finetuning to just the last 4 parameters of the model, which is only a small fraction\n", "of the parameters you'd want to tune for maximum performance. 
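\n", "\n", "For a full implicit run, ``max_depth`` would simply be left at its default (a sketch, assuming the documented default of ``-1`` denotes unlimited depth):\n", "\n", "```python\n", "FinetuningScheduler(max_depth=-1)  # hypothetical full-depth implicit run (no explicit schedule provided)\n", "```\n", "\n", "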
Since the implicit schedule is quite computationally\n", "intensive and most useful for exploring model behavior, leaving [max_depth](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html?highlight=max_depth#finetuning_scheduler.fts.FinetuningScheduler.params.max_depth) at 1 allows us to demo implicit mode\n", "behavior while keeping the computational cost and runtime of this notebook reasonable. To review how a full implicit\n", "mode run compares to the ``nofts_baseline`` and ``fts_explicit`` scenarios, please see the following\n", "[tensorboard experiment summary](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/)."]}, {"cell_type": "code", "execution_count": 15, "id": "48bdd31e", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:34:35.175562Z", "iopub.status.busy": "2022-06-13T15:34:35.174819Z", "iopub.status.idle": "2022-06-13T15:34:35.179619Z", "shell.execute_reply": "2022-06-13T15:34:35.178970Z"}, "papermill": {"duration": 0.020024, "end_time": "2022-06-13T15:34:35.181045", "exception": false, "start_time": "2022-06-13T15:34:35.161021", "status": "completed"}, "tags": []}, "outputs": [], "source": ["nofts_callbacks = [EarlyStopping(**earlystopping_kwargs), ModelCheckpoint(**checkpoint_kwargs)]\n", "fts_implicit_callbacks = [\n", "    FinetuningScheduler(**fts_kwargs),  # type: ignore # noqa\n", "    FTSEarlyStopping(**earlystopping_kwargs),  # type: ignore # noqa\n", "    FTSCheckpoint(**checkpoint_kwargs),  # type: ignore # noqa\n", "]\n", "scenario_callbacks = {\"nofts_baseline\": nofts_callbacks, \"fts_implicit\": fts_implicit_callbacks}"]}, {"cell_type": "code", "execution_count": 16, "id": "1d493435", "metadata": {"execution": {"iopub.execute_input": "2022-06-13T15:34:35.207359Z", "iopub.status.busy": "2022-06-13T15:34:35.206768Z", "iopub.status.idle": "2022-06-13T15:37:04.797303Z", "shell.execute_reply": "2022-06-13T15:37:04.796554Z"}, "papermill": {"duration": 149.605777, "end_time": "2022-06-13T15:37:04.799299", "exception": false, "start_time": "2022-06-13T15:34:35.193522", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["Using 16bit native Automatic Mixed Precision (AMP)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["GPU available: True, used: True\n"]}, {"name": "stderr", "output_type": "stream", "text": ["TPU available: False, using: 0 TPU cores\n"]}, {"name": "stderr", "output_type": "stream", "text": ["IPU available: False, using: 0 IPUs\n"]}, {"name": "stderr", "output_type": "stream", "text": ["HPU available: False, using: 0 HPUs\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Beginning training the 'nofts_baseline' scenario\n"]}, {"name": "stderr", "output_type": "stream", "text": ["WARNING:datasets.builder:Reusing dataset super_glue (/home/AzDevOps_azpcontainer/.cache/huggingface/datasets/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Missing logger folder: lightning_logs/nofts_baseline\n"]}, {"name": "stderr", "output_type": "stream", "text": ["WARNING:datasets.builder:Reusing dataset super_glue (/home/AzDevOps_azpcontainer/.cache/huggingface/datasets/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]\n"]}, {"name": "stderr", "output_type": "stream", "text": ["\n", " | Name | Type | Params\n", 
"-------------------------------------------------------------\n", "0 | model | DebertaV2ForSequenceClassification | 184 M \n", "-------------------------------------------------------------\n", "184 M Trainable params\n", "0 Non-trainable params\n", "184 M Total params\n", "368.847 Total estimated model params size (MB)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Using 16bit native Automatic Mixed Precision (AMP)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["GPU available: True, used: True\n"]}, {"name": "stderr", "output_type": "stream", "text": ["TPU available: False, using: 0 TPU cores\n"]}, {"name": "stderr", "output_type": "stream", "text": ["IPU available: False, using: 0 IPUs\n"]}, {"name": "stderr", "output_type": "stream", "text": ["HPU available: False, using: 0 HPUs\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Beginning training the 'fts_implicit' scenario\n"]}, {"name": "stderr", "output_type": "stream", "text": ["WARNING:datasets.builder:Reusing dataset super_glue (/home/AzDevOps_azpcontainer/.cache/huggingface/datasets/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Missing logger folder: lightning_logs/fts_implicit\n"]}, {"name": "stderr", "output_type": "stream", "text": ["WARNING:datasets.builder:Reusing dataset super_glue (/home/AzDevOps_azpcontainer/.cache/huggingface/datasets/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Finetuning schedule dumped to lightning_logs/fts_implicit/version_0/RteBoolqModule_ft_schedule.yaml.\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Generated default finetuning schedule 'lightning_logs/fts_implicit/version_0/RteBoolqModule_ft_schedule.yaml' for iterative finetuning\n"]}, {"name": "stderr", "output_type": "stream", "text": ["LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]\n"]}, {"name": "stderr", "output_type": "stream", "text": ["\n", " | Name | Type | Params\n", "-------------------------------------------------------------\n", "0 | model | DebertaV2ForSequenceClassification | 184 M \n", "-------------------------------------------------------------\n", "1.5 K Trainable params\n", "184 M Non-trainable params\n", "184 M Total params\n", "368.847 Total estimated model params size (MB)\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Restoring states from the checkpoint path at lightning_logs/fts_implicit/version_0/checkpoints/epoch=1-step=312.ckpt\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Restored all states from the checkpoint file at lightning_logs/fts_implicit/version_0/checkpoints/epoch=1-step=312.ckpt\n"]}, {"name": "stderr", "output_type": "stream", "text": ["Multi-phase fine-tuned training continuing at level 1.\n"]}], "source": ["for scenario_name, scenario_callbacks in scenario_callbacks.items():\n", " model = RteBoolqModule(**lightning_module_kwargs, experiment_tag=scenario_name)\n", " logger = TensorBoardLogger(\"lightning_logs\", name=scenario_name)\n", " callbacks = scenario_callbacks\n", " print(f\"Beginning training the '{scenario_name}' scenario\")\n", " train()"]}, {"cell_type": "markdown", "id": "455002ca", "metadata": {"lines_to_next_cell": 0, "papermill": {"duration": 0.013783, "end_time": "2022-06-13T15:37:04.827801", "exception": false, "start_time": "2022-06-13T15:37:04.814018", "status": "completed"}, "tags": []}, "source": ["### 
Reviewing the Training Results\n", "\n", "See the [tensorboard experiment summaries](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/) to get a sense\n", "of the relative computational and performance tradeoffs associated with these [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) configurations.\n", "The summary compares a full ``fts_implicit`` execution to the ``fts_explicit`` and ``nofts_baseline`` scenarios using DDP\n", "training with 2 GPUs. The full logs/schedules for all three scenarios are available\n", "[here](https://drive.google.com/file/d/1LrUcisRLHeJgh_BDOOD_GUBPp5iHAkoR/view?usp=sharing) and the checkpoints\n", "produced in the scenarios [here](https://drive.google.com/file/d/1t7myBgcqcZ9ax_IT9QVk-vFH_l_o5UXB/view?usp=sharing)\n", "(caution, ~3.5GB).\n", "\n", "[{height=\"315px\" width=\"492px\"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOnRydWUsIm5vZnRzX2Jhc2VsaW5lIjpmYWxzZSwiZnRzX2ltcGxpY2l0IjpmYWxzZX0%3D)\n", "[{height=\"316px\" width=\"505px\"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOmZhbHNlLCJub2Z0c19iYXNlbGluZSI6dHJ1ZSwiZnRzX2ltcGxpY2l0IjpmYWxzZX0%3D)\n", "\n", "Note that the results generated by this notebook, which uses DP training with a single GPU, may vary by roughly 1%\n", "from the linked tensorboard summaries.\n", "\n", "[FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) expands the space of possible finetuning schedules, and composing more sophisticated schedules can\n", "yield marginal finetuning performance gains. That said, it should be emphasized that the primary utility of [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) is to grant\n", "greater finetuning flexibility for model exploration in research. For example, glancing at DeBERTa-v3's implicit training\n", "run, a critical tuning transition point is immediately apparent:\n", "\n", "[{height=\"272px\" width=\"494px\"}](https://tensorboard.dev/experiment/n7U8XhrzRbmvVzC4SQSpWw/#scalars&_smoothingWeight=0&runSelectionState=eyJmdHNfZXhwbGljaXQiOmZhbHNlLCJub2Z0c19iYXNlbGluZSI6ZmFsc2UsImZ0c19pbXBsaWNpdCI6dHJ1ZX0%3D)\n", "\n", "Our `val_loss` begins a precipitous decline at step 3119, which corresponds to phase 17 in the schedule. Referring to our\n", "schedule, in phase 17 we begin tuning the attention parameters of our 10th encoder layer (of 11). 
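\n", "\n", "If you'd like to verify that mapping yourself, the dumped schedule can be inspected directly. Here is a minimal sketch, assuming PyYAML and the schedule path logged during the ``fts_implicit`` run above (the generated default schedule should enumerate every phase even though ``max_depth`` caps how many are actually executed):\n", "\n", "```python\n", "import yaml\n", "\n", "# Load the implicit schedule dumped by FinetuningScheduler during the run above.\n", "sched_path = \"lightning_logs/fts_implicit/version_0/RteBoolqModule_ft_schedule.yaml\"\n", "with open(sched_path) as f:\n", "    schedule = yaml.safe_load(f)\n", "\n", "# Phase keys are integers; each phase maps to the list of parameter names it thaws.\n", "print(schedule[17][\"params\"])\n", "```\n", "\n", "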
Interesting!\n", "Though beyond the scope of this tutorial, it might be worth investigating these dynamics further, and\n", "[FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) makes it easy to do just that.\n"]}, {"cell_type": "markdown", "id": "c022b74e", "metadata": {"lines_to_next_cell": 0, "papermill": {"duration": 0.013435, "end_time": "2022-06-13T15:37:04.854795", "exception": false, "start_time": "2022-06-13T15:37:04.841360", "status": "completed"}, "tags": []}, "source": ["\n", "Note that although this example is intended to capture a common usage scenario, substantial variation is expected\n", "among use cases and models.\n", "In summary, [FinetuningScheduler](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.fts.html#finetuning_scheduler.fts.FinetuningScheduler) provides increased finetuning flexibility that can be useful in a variety of\n", "contexts, from exploring model tuning behavior to maximizing performance."]}, {"cell_type": "markdown", "id": "a1692fa7", "metadata": {"papermill": {"duration": 0.013473, "end_time": "2022-06-13T15:37:04.881995", "exception": false, "start_time": "2022-06-13T15:37:04.868522", "status": "completed"}, "tags": []}, "source": ["## Footnotes\n", "\n", "<ol>\n", "<li id=\"f1\">\n", "\n", "[Howard, J., & Ruder, S. (2018)](https://arxiv.org/pdf/1801.06146.pdf). Fine-tuned Language\n", " Models for Text Classification. arXiv preprint arXiv:1801.06146. [\u21a9](#a1)\n", "\n", " </li>\n", "<li>\n", "\n", "[Chronopoulou, A., Baziotis, C., & Potamianos, A. (2019)](https://arxiv.org/pdf/1902.10547.pdf).\n", " An embarrassingly simple approach for transfer learning from pretrained language models. arXiv\n", " preprint arXiv:1902.10547. [\u21a9](#a1)\n", "\n", " </li>\n", "<li>\n", "\n", "[Peters, M. E., Ruder, S., & Smith, N. A. (2019)](https://arxiv.org/pdf/1903.05987.pdf). To tune or not to\n", " tune? Adapting pretrained representations to diverse tasks. arXiv preprint arXiv:1903.05987. [\u21a9](#a1)\n", "\n", "</li>\n", "<li id=\"f2\">\n", "\n", "[Sivaprasad, P. T., Mai, F., Vogels, T., Jaggi, M., & Fleuret, F. (2020)](https://arxiv.org/pdf/1910.11758.pdf).\n", " Optimizer benchmarking needs to account for hyperparameter tuning. In International Conference on Machine Learning\n", "(pp. 9036-9045). PMLR. [\u21a9](#a2)\n", "\n", "</li>\n", "<li id=\"f3\">\n", "\n", "[Mosbach, M., Andriushchenko, M., & Klakow, D. (2020)](https://arxiv.org/pdf/2006.04884.pdf). On the stability of\n", "fine-tuning BERT: Misconceptions, explanations, and strong baselines. arXiv preprint arXiv:2006.04884. [\u21a9](#a2)\n", "\n", "</li>\n", "<li id=\"f4\">\n", "\n", "[Loshchilov, I., & Hutter, F. (2016)](https://arxiv.org/pdf/1608.03983.pdf). SGDR: Stochastic gradient descent with\n", "warm restarts. arXiv preprint arXiv:1608.03983. 
[\u21a9](#a3)\n", "\n", "</li>\n", "\n", "</ol>"]}, {"cell_type": "markdown", "id": "7abeeb6f", "metadata": {"papermill": {"duration": 0.013522, "end_time": "2022-06-13T15:37:04.908956", "exception": false, "start_time": "2022-06-13T15:37:04.895434", "status": "completed"}, "tags": []}, "source": []}, {"cell_type": "markdown", "id": "cb95d6f2", "metadata": {"papermill": {"duration": 0.013455, "end_time": "2022-06-13T15:37:04.936072", "exception": false, "start_time": "2022-06-13T15:37:04.922617", "status": "completed"}, "tags": []}, "source": ["## Congratulations - Time to Join the Community!\n", "\n", "Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the Lightning\n", "movement, you can do so in the following ways!\n", "\n", "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool\n", "tools we're building.\n", "\n", "### Join our [Slack](https://www.pytorchlightning.ai/community)!\n", "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself\n", "and share your interests in the `#general` channel.\n", "\n", "\n", "### Contributions!\n", "The best way to contribute to our community is to become a code contributor! At any time you can go to the\n", "[Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolts](https://github.com/PyTorchLightning/lightning-bolts)\n", "GitHub Issues page and filter for \"good first issue\".\n", "\n", "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", "* [Bolts good first issue](https://github.com/PyTorchLightning/lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", "* You can also contribute your own notebooks with useful examples!\n", "\n", "### Great thanks from the entire PyTorch Lightning Team for your interest!\n", "\n", "[{height=\"60px\" width=\"240px\"}](https://pytorchlightning.ai)"]}, {"cell_type": "raw", "metadata": {"raw_mimetype": "text/restructuredtext"}, "source": [".. 
customcarditem::\n", " :header: Finetuning Scheduler\n", " :card_description: This notebook introduces the [Finetuning Scheduler](https://finetuning-scheduler.readthedocs.io/en/stable/index.html) extension and demonstrates the use of it to finetune a...\n", " :tags: Finetuning,GPU/TPU,Lightning-Examples"]}], "metadata": {"jupytext": {"cell_metadata_filter": "id,colab,colab_type,-all", "formats": "ipynb,py:percent", "main_language": "python"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10"}, "papermill": {"default_parameters": {}, "duration": 319.15472, "end_time": "2022-06-13T15:37:06.170583", "environment_variables": {}, "exception": null, "input_path": "lightning_examples/finetuning-scheduler/finetuning-scheduler.ipynb", "output_path": ".notebooks/lightning_examples/finetuning-scheduler.ipynb", "parameters": {}, "start_time": "2022-06-13T15:31:47.015863", "version": "2.3.4"}}, "nbformat": 4, "nbformat_minor": 5}