{"cells": [{"cell_type": "markdown", "id": "161eabba", "metadata": {"papermill": {"duration": 0.030074, "end_time": "2021-09-16T12:36:05.411850", "exception": false, "start_time": "2021-09-16T12:36:05.381776", "status": "completed"}, "tags": []}, "source": ["\n", "# Tutorial 3: Initialization and Optimization\n", "\n", "* **Author:** Phillip Lippe\n", "* **License:** CC BY-SA\n", "* **Generated:** 2021-09-16T14:32:21.097031\n", "\n", "In this tutorial, we will review techniques for optimization and initialization of neural networks.\n", "When increasing the depth of neural networks, there are various challenges we face.\n", "Most importantly, we need to have a stable gradient flow through the network, as otherwise, we might encounter vanishing or exploding gradients.\n", "This is why we will take a closer look at the following concepts: initialization and optimization.\n", "This notebook is part of a lecture series on Deep Learning at the University of Amsterdam.\n", "The full list of tutorials can be found at https://uvadlc-notebooks.rtfd.io.\n", "\n", "\n", "---\n", "Open in [![Open In Colab](){height=\"20px\" width=\"117px\"}](https://colab.research.google.com/github/PytorchLightning/lightning-tutorials/blob/publication/.notebooks/course_UvA-DL/03-initialization-and-optimization.ipynb)\n", "\n", "Give us a \u2b50 [on Github](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", "| Check out [the documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", "| Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-pw5v393p-qRaDgEk24~EjiZNBpSQFgQ)"]}, {"cell_type": "markdown", "id": "513d8cd0", "metadata": {"papermill": {"duration": 0.029401, "end_time": "2021-09-16T12:36:05.470171", "exception": false, "start_time": "2021-09-16T12:36:05.440770", "status": "completed"}, "tags": []}, "source": ["## Setup\n", "This notebook requires some packages besides pytorch-lightning."]}, {"cell_type": "code", "execution_count": 1, 
"id": "780d4b1e", "metadata": {"colab": {}, "colab_type": "code", "execution": {"iopub.execute_input": "2021-09-16T12:36:05.530490Z", "iopub.status.busy": "2021-09-16T12:36:05.530024Z", "iopub.status.idle": "2021-09-16T12:36:05.532156Z", "shell.execute_reply": "2021-09-16T12:36:05.532530Z"}, "id": "LfrJLKPFyhsK", "lines_to_next_cell": 0, "papermill": {"duration": 0.034212, "end_time": "2021-09-16T12:36:05.532707", "exception": false, "start_time": "2021-09-16T12:36:05.498495", "status": "completed"}, "tags": []}, "outputs": [], "source": ["# ! pip install --quiet \"seaborn\" \"torchvision\" \"torchmetrics>=0.3\" \"torch>=1.6, <1.9\" \"pytorch-lightning>=1.3\" \"matplotlib\""]}, {"cell_type": "markdown", "id": "0d6d5d66", "metadata": {"papermill": {"duration": 0.028732, "end_time": "2021-09-16T12:36:05.590890", "exception": false, "start_time": "2021-09-16T12:36:05.562158", "status": "completed"}, "tags": []}, "source": ["
\n", "In the first half of the notebook, we will review different initialization techniques, and go step by step from the simplest initialization to methods that are nowadays used in very deep networks.\n", "In the second half, we focus on optimization, comparing the optimizers SGD, SGD with Momentum, and Adam.\n", "\n", "Let's start with importing our standard libraries:"]}, {"cell_type": "code", "execution_count": 2, "id": "622ad7af", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:05.655225Z", "iopub.status.busy": "2021-09-16T12:36:05.654757Z", "iopub.status.idle": "2021-09-16T12:36:07.304855Z", "shell.execute_reply": "2021-09-16T12:36:07.304422Z"}, "papermill": {"duration": 1.685542, "end_time": "2021-09-16T12:36:07.304972", "exception": false, "start_time": "2021-09-16T12:36:05.619430", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["/tmp/ipykernel_879/869332958.py:24: DeprecationWarning: `set_matplotlib_formats` is deprecated since IPython 7.23, directly use `matplotlib_inline.backend_inline.set_matplotlib_formats()`\n", " set_matplotlib_formats(\"svg\", \"pdf\") # For export\n"]}], "source": ["import copy\n", "import json\n", "import math\n", "import os\n", "import urllib.request\n", "from urllib.error import HTTPError\n", "\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pytorch_lightning as pl\n", "import seaborn as sns\n", "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import torch.utils.data as data\n", "\n", "# %matplotlib inline\n", "from IPython.display import set_matplotlib_formats\n", "from matplotlib import cm\n", "from torchvision import transforms\n", "from torchvision.datasets import FashionMNIST\n", "from tqdm.notebook import tqdm\n", "\n", "set_matplotlib_formats(\"svg\", \"pdf\") # For export\n", "sns.set()"]}, {"cell_type": "markdown", "id": "d90abb51", "metadata": {"papermill": {"duration": 0.029212, 
"end_time": "2021-09-16T12:36:07.364402", "exception": false, "start_time": "2021-09-16T12:36:07.335190", "status": "completed"}, "tags": []}, "source": ["Instead of the `set_seed` function as in Tutorial 3, we can use PyTorch Lightning's built-in function `pl.seed_everything`.\n", "We will reuse the path variables `DATASET_PATH` and `CHECKPOINT_PATH` as in Tutorial 3.\n", "Adjust the paths if necessary."]}, {"cell_type": "code", "execution_count": 3, "id": "ea2ba888", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:07.429294Z", "iopub.status.busy": "2021-09-16T12:36:07.428723Z", "iopub.status.idle": "2021-09-16T12:36:07.498990Z", "shell.execute_reply": "2021-09-16T12:36:07.499373Z"}, "papermill": {"duration": 0.10601, "end_time": "2021-09-16T12:36:07.499520", "exception": false, "start_time": "2021-09-16T12:36:07.393510", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["Global seed set to 42\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Using device cuda:0\n"]}], "source": ["# Path to the folder where the datasets are/should be downloaded (e.g. 
MNIST)\n", "DATASET_PATH = os.environ.get(\"PATH_DATASETS\", \"data/\")\n", "# Path to the folder where the pretrained models are saved\n", "CHECKPOINT_PATH = os.environ.get(\"PATH_CHECKPOINT\", \"saved_models/InitOptim/\")\n", "\n", "# Seed everything\n", "pl.seed_everything(42)\n", "\n", "# Ensure that all operations are deterministic on GPU (if used) for reproducibility\n", "torch.backends.cudnn.deterministic = True\n", "torch.backends.cudnn.benchmark = False\n", "\n", "# Fetching the device that will be used throughout this notebook\n", "device = torch.device(\"cpu\") if not torch.cuda.is_available() else torch.device(\"cuda:0\")\n", "print(\"Using device\", device)"]}, {"cell_type": "markdown", "id": "7aaf0232", "metadata": {"papermill": {"duration": 0.029659, "end_time": "2021-09-16T12:36:07.559273", "exception": false, "start_time": "2021-09-16T12:36:07.529614", "status": "completed"}, "tags": []}, "source": ["In the last part of the notebook, we will train models using three different optimizers.\n", "The pretrained models for those are downloaded below."]}, {"cell_type": "code", "execution_count": 4, "id": "139dec18", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:07.623803Z", "iopub.status.busy": "2021-09-16T12:36:07.623326Z", "iopub.status.idle": "2021-09-16T12:36:08.686564Z", "shell.execute_reply": "2021-09-16T12:36:08.686141Z"}, "papermill": {"duration": 1.09754, "end_time": "2021-09-16T12:36:08.686681", "exception": false, "start_time": "2021-09-16T12:36:07.589141", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_SGD.config...\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_SGD_results.json...\n", "Downloading 
https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_SGD.tar...\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_SGDMom.config...\n", "Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_SGDMom_results.json...\n", "Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_SGDMom.tar...\n"]}, {"name": "stdout", "output_type": "stream", "text": ["Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_Adam.config...\n", "Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_Adam_results.json...\n", "Downloading https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/FashionMNIST_Adam.tar...\n"]}], "source": ["# Github URL where saved models are stored for this tutorial\n", "base_url = \"https://raw.githubusercontent.com/phlippe/saved_models/main/tutorial4/\"\n", "# Files to download\n", "pretrained_files = [\n", "    \"FashionMNIST_SGD.config\",\n", "    \"FashionMNIST_SGD_results.json\",\n", "    \"FashionMNIST_SGD.tar\",\n", "    \"FashionMNIST_SGDMom.config\",\n", "    \"FashionMNIST_SGDMom_results.json\",\n", "    \"FashionMNIST_SGDMom.tar\",\n", "    \"FashionMNIST_Adam.config\",\n", "    \"FashionMNIST_Adam_results.json\",\n", "    \"FashionMNIST_Adam.tar\",\n", "]\n", "# Create checkpoint path if it doesn't exist yet\n", "os.makedirs(CHECKPOINT_PATH, exist_ok=True)\n", "\n", "# For each file, check whether it already exists. 
If not, try downloading it.\n", "for file_name in pretrained_files:\n", "    file_path = os.path.join(CHECKPOINT_PATH, file_name)\n", "    if not os.path.isfile(file_path):\n", "        file_url = base_url + file_name\n", "        print(f\"Downloading {file_url}...\")\n", "        try:\n", "            urllib.request.urlretrieve(file_url, file_path)\n", "        except HTTPError as e:\n", "            print(\n", "                \"Something went wrong. Please try to download the file from the GDrive folder, or contact the author with the full output including the following error:\\n\",\n", "                e,\n", "            )"]}, {"cell_type": "markdown", "id": "bef9b45d", "metadata": {"papermill": {"duration": 0.030457, "end_time": "2021-09-16T12:36:08.748288", "exception": false, "start_time": "2021-09-16T12:36:08.717831", "status": "completed"}, "tags": []}, "source": ["## Preparation"]}, {"cell_type": "markdown", "id": "7cbe8831", "metadata": {"papermill": {"duration": 0.030589, "end_time": "2021-09-16T12:36:08.809491", "exception": false, "start_time": "2021-09-16T12:36:08.778902", "status": "completed"}, "tags": []}, "source": ["Throughout this notebook, we will use a deep fully connected network, similar to our previous tutorial.\n", "We will also again apply the network to FashionMNIST, so you can relate to the results of Tutorial 3.\n", "We start by loading the FashionMNIST dataset:"]}, {"cell_type": "code", "execution_count": 5, "id": "b07918f9", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:08.874871Z", "iopub.status.busy": "2021-09-16T12:36:08.874401Z", "iopub.status.idle": "2021-09-16T12:36:08.910586Z", "shell.execute_reply": "2021-09-16T12:36:08.910149Z"}, "papermill": {"duration": 0.070623, "end_time": "2021-09-16T12:36:08.910707", "exception": false, "start_time": "2021-09-16T12:36:08.840084", "status": "completed"}, "tags": []}, "outputs": [], "source": ["\n", "# Transformations applied on each image => first make them a tensor, then normalize them with mean 0 and std 1\n", "transform = 
transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.2861,), (0.3530,))])\n", "\n", "# Loading the training dataset. We need to split it into a training and validation part\n", "train_dataset = FashionMNIST(root=DATASET_PATH, train=True, transform=transform, download=True)\n", "train_set, val_set = torch.utils.data.random_split(train_dataset, [50000, 10000])\n", "\n", "# Loading the test set\n", "test_set = FashionMNIST(root=DATASET_PATH, train=False, transform=transform, download=True)"]}, {"cell_type": "markdown", "id": "e1f85744", "metadata": {"papermill": {"duration": 0.0303, "end_time": "2021-09-16T12:36:08.971816", "exception": false, "start_time": "2021-09-16T12:36:08.941516", "status": "completed"}, "tags": []}, "source": ["We define a set of data loaders that we can use for various purposes later.\n", "Note that for actually training a model, we will use different data loaders\n", "with a lower batch size."]}, {"cell_type": "code", "execution_count": 6, "id": "1e446c8f", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:09.037306Z", "iopub.status.busy": "2021-09-16T12:36:09.036842Z", "iopub.status.idle": "2021-09-16T12:36:09.038587Z", "shell.execute_reply": "2021-09-16T12:36:09.038964Z"}, "papermill": {"duration": 0.036552, "end_time": "2021-09-16T12:36:09.039078", "exception": false, "start_time": "2021-09-16T12:36:09.002526", "status": "completed"}, "tags": []}, "outputs": [], "source": ["train_loader = data.DataLoader(train_set, batch_size=1024, shuffle=True, drop_last=False)\n", "val_loader = data.DataLoader(val_set, batch_size=1024, shuffle=False, drop_last=False)\n", "test_loader = data.DataLoader(test_set, batch_size=1024, shuffle=False, drop_last=False)"]}, {"cell_type": "markdown", "id": "2fa7ffc1", "metadata": {"papermill": {"duration": 0.03043, "end_time": "2021-09-16T12:36:09.100101", "exception": false, "start_time": "2021-09-16T12:36:09.069671", "status": "completed"}, "tags": []}, "source": ["In comparison to 
the previous tutorial, we have changed the parameters of the normalization transformation `transforms.Normalize`.\n", "The normalization is now designed to give us an expected mean of 0 and a standard deviation of 1 across pixels.\n", "This will be particularly relevant for the discussion about initialization we will look at below, and hence we change it here.\n", "It should be noted that in most classification tasks, both normalization techniques (between -1 and 1 or mean 0 and stddev 1) have been shown to work well.\n", "We can calculate the normalization parameters by determining the mean and standard deviation on the original images:"]}, {"cell_type": "code", "execution_count": 7, "id": "f94c10d3", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:09.164949Z", "iopub.status.busy": "2021-09-16T12:36:09.164391Z", "iopub.status.idle": "2021-09-16T12:36:09.265798Z", "shell.execute_reply": "2021-09-16T12:36:09.265361Z"}, "papermill": {"duration": 0.135044, "end_time": "2021-09-16T12:36:09.265917", "exception": false, "start_time": "2021-09-16T12:36:09.130873", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Mean 0.28604060411453247\n", "Std 0.3530242443084717\n"]}], "source": ["print(\"Mean\", (train_dataset.data.float() / 255.0).mean().item())\n", "print(\"Std\", (train_dataset.data.float() / 255.0).std().item())"]}, {"cell_type": "markdown", "id": "53a9eb37", "metadata": {"papermill": {"duration": 0.030511, "end_time": "2021-09-16T12:36:09.328055", "exception": false, "start_time": "2021-09-16T12:36:09.297544", "status": "completed"}, "tags": []}, "source": ["We can verify the transformation by looking at the statistics of a single batch:"]}, {"cell_type": "code", "execution_count": 8, "id": "7167dac5", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:09.392969Z", "iopub.status.busy": "2021-09-16T12:36:09.392508Z", "iopub.status.idle": "2021-09-16T12:36:09.576984Z", 
"shell.execute_reply": "2021-09-16T12:36:09.576574Z"}, "papermill": {"duration": 0.218349, "end_time": "2021-09-16T12:36:09.577097", "exception": false, "start_time": "2021-09-16T12:36:09.358748", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Mean: 0.009\n", "Standard deviation: 1.012\n", "Maximum: 2.022\n", "Minimum: -0.810\n"]}], "source": ["imgs, _ = next(iter(train_loader))\n", "print(f\"Mean: {imgs.mean().item():5.3f}\")\n", "print(f\"Standard deviation: {imgs.std().item():5.3f}\")\n", "print(f\"Maximum: {imgs.max().item():5.3f}\")\n", "print(f\"Minimum: {imgs.min().item():5.3f}\")"]}, {"cell_type": "markdown", "id": "90f0ee3d", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.030932, "end_time": "2021-09-16T12:36:09.639639", "exception": false, "start_time": "2021-09-16T12:36:09.608707", "status": "completed"}, "tags": []}, "source": ["Note that the maximum and minimum are not 1 and -1 anymore, but shifted towards the positive values.\n", "This is because FashionMNIST contains a lot of black pixels, similar to MNIST.\n", "\n", "Next, we create a linear neural network. 
We use the same setup as in the previous tutorial."]}, {"cell_type": "code", "execution_count": 9, "id": "878c7079", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:09.708008Z", "iopub.status.busy": "2021-09-16T12:36:09.707536Z", "iopub.status.idle": "2021-09-16T12:36:09.709627Z", "shell.execute_reply": "2021-09-16T12:36:09.709167Z"}, "lines_to_next_cell": 2, "papermill": {"duration": 0.039177, "end_time": "2021-09-16T12:36:09.709744", "exception": false, "start_time": "2021-09-16T12:36:09.670567", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class BaseNetwork(nn.Module):\n", "    def __init__(self, act_fn, input_size=784, num_classes=10, hidden_sizes=[512, 256, 256, 128]):\n", "        \"\"\"\n", "        Args:\n", "            act_fn: Object of the activation function that should be used as non-linearity in the network.\n", "            input_size: Size of the input images in pixels\n", "            num_classes: Number of classes we want to predict\n", "            hidden_sizes: A list of integers specifying the hidden layer sizes in the NN\n", "        \"\"\"\n", "        super().__init__()\n", "\n", "        # Create the network based on the specified hidden sizes\n", "        layers = []\n", "        layer_sizes = [input_size] + hidden_sizes\n", "        for layer_index in range(1, len(layer_sizes)):\n", "            layers += [nn.Linear(layer_sizes[layer_index - 1], layer_sizes[layer_index]), act_fn]\n", "        layers += [nn.Linear(layer_sizes[-1], num_classes)]\n", "        # A module list registers a list of modules as submodules (e.g. 
for parameters)\n", "        self.layers = nn.ModuleList(layers)\n", "\n", "        self.config = {\n", "            \"act_fn\": act_fn.__class__.__name__,\n", "            \"input_size\": input_size,\n", "            \"num_classes\": num_classes,\n", "            \"hidden_sizes\": hidden_sizes,\n", "        }\n", "\n", "    def forward(self, x):\n", "        x = x.view(x.size(0), -1)\n", "        for layer in self.layers:\n", "            x = layer(x)\n", "        return x"]}, {"cell_type": "markdown", "id": "f411d171", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.031158, "end_time": "2021-09-16T12:36:09.773277", "exception": false, "start_time": "2021-09-16T12:36:09.742119", "status": "completed"}, "tags": []}, "source": ["For the activation functions, we make use of PyTorch's `torch.nn` library instead of implementing them ourselves.\n", "However, we also define an `Identity` activation function.\n", "Although this activation function would significantly limit the\n", "network's modeling capabilities, we will use it in the first steps of\n", "our discussion about initialization (for simplicity)."]}, {"cell_type": "code", "execution_count": 10, "id": "17e393bf", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:09.838964Z", "iopub.status.busy": "2021-09-16T12:36:09.838503Z", "iopub.status.idle": "2021-09-16T12:36:09.840568Z", "shell.execute_reply": "2021-09-16T12:36:09.840106Z"}, "papermill": {"duration": 0.036232, "end_time": "2021-09-16T12:36:09.840668", "exception": false, "start_time": "2021-09-16T12:36:09.804436", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class Identity(nn.Module):\n", "    def forward(self, x):\n", "        return x\n", "\n", "\n", "act_fn_by_name = {\"tanh\": nn.Tanh, \"relu\": nn.ReLU, \"identity\": Identity}"]}, {"cell_type": "markdown", "id": "00535e64", "metadata": {"papermill": {"duration": 0.031324, "end_time": "2021-09-16T12:36:09.903283", "exception": false, "start_time": "2021-09-16T12:36:09.871959", "status": "completed"}, "tags": []}, "source": ["Finally, we define a few plotting 
functions that we will use for our discussions.\n", "These functions help us to (1) visualize the weight/parameter distribution inside a network, (2) visualize the gradients that the parameters at different layers receive, and (3) visualize the activations, i.e. the outputs of the linear layers.\n", "The detailed code is not important, but feel free to take a closer look if interested."]}, {"cell_type": "code", "execution_count": 11, "id": "cb3680e1", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:09.980697Z", "iopub.status.busy": "2021-09-16T12:36:09.978256Z", "iopub.status.idle": "2021-09-16T12:36:09.982397Z", "shell.execute_reply": "2021-09-16T12:36:09.982770Z"}, "papermill": {"duration": 0.048335, "end_time": "2021-09-16T12:36:09.982887", "exception": false, "start_time": "2021-09-16T12:36:09.934552", "status": "completed"}, "tags": []}, "outputs": [], "source": ["##############################################################\n", "\n", "\n", "def plot_dists(val_dict, color=\"C0\", xlabel=None, stat=\"count\", use_kde=True):\n", "    columns = len(val_dict)\n", "    fig, ax = plt.subplots(1, columns, figsize=(columns * 3, 2.5))\n", "    fig_index = 0\n", "    for key in sorted(val_dict.keys()):\n", "        key_ax = ax[fig_index % columns]\n", "        sns.histplot(\n", "            val_dict[key],\n", "            ax=key_ax,\n", "            color=color,\n", "            bins=50,\n", "            stat=stat,\n", "            kde=use_kde and ((val_dict[key].max() - val_dict[key].min()) > 1e-8),\n", "        ) # Only plot kde if there is variance\n", "        hidden_dim_str = (\n", "            r\"(%i $\\to$ %i)\" % (val_dict[key].shape[1], val_dict[key].shape[0]) if len(val_dict[key].shape) > 1 else \"\"\n", "        )\n", "        key_ax.set_title(f\"{key} {hidden_dim_str}\")\n", "        if xlabel is not None:\n", "            key_ax.set_xlabel(xlabel)\n", "        fig_index += 1\n", "    fig.subplots_adjust(wspace=0.4)\n", "    return fig\n", "\n", "\n", "##############################################################\n", "\n", "\n", "def visualize_weight_distribution(model, color=\"C0\"):\n", "    weights = 
{}\n", "    for name, param in model.named_parameters():\n", "        if name.endswith(\".bias\"):\n", "            continue\n", "        key_name = f\"Layer {name.split('.')[1]}\"\n", "        weights[key_name] = param.detach().view(-1).cpu().numpy()\n", "\n", "    # Plotting\n", "    fig = plot_dists(weights, color=color, xlabel=\"Weight vals\")\n", "    fig.suptitle(\"Weight distribution\", fontsize=14, y=1.05)\n", "    plt.show()\n", "    plt.close()\n", "\n", "\n", "##############################################################\n", "\n", "\n", "def visualize_gradients(model, color=\"C0\", print_variance=False):\n", "    \"\"\"\n", "    Args:\n", "        model: Object of class BaseNetwork\n", "        color: Color in which we want to visualize the histogram (for easier separation of activation functions)\n", "    \"\"\"\n", "    model.eval()\n", "    small_loader = data.DataLoader(train_set, batch_size=1024, shuffle=False)\n", "    imgs, labels = next(iter(small_loader))\n", "    imgs, labels = imgs.to(device), labels.to(device)\n", "\n", "    # Pass one batch through the network, and calculate the gradients for the weights\n", "    model.zero_grad()\n", "    preds = model(imgs)\n", "    loss = F.cross_entropy(preds, labels) # Same as nn.CrossEntropyLoss, but as a function instead of module\n", "    loss.backward()\n", "    # We limit our visualization to the weight parameters and exclude the bias to reduce the number of plots\n", "    grads = {\n", "        name: params.grad.view(-1).cpu().clone().numpy()\n", "        for name, params in model.named_parameters()\n", "        if \"weight\" in name\n", "    }\n", "    model.zero_grad()\n", "\n", "    # Plotting\n", "    fig = plot_dists(grads, color=color, xlabel=\"Grad magnitude\")\n", "    fig.suptitle(\"Gradient distribution\", fontsize=14, y=1.05)\n", "    plt.show()\n", "    plt.close()\n", "\n", "    if print_variance:\n", "        for key in sorted(grads.keys()):\n", "            print(f\"{key} - Variance: {np.var(grads[key])}\")\n", "\n", "\n", "##############################################################\n", "\n", "\n", "def visualize_activations(model, color=\"C0\", 
print_variance=False):\n", "    model.eval()\n", "    small_loader = data.DataLoader(train_set, batch_size=1024, shuffle=False)\n", "    imgs, labels = next(iter(small_loader))\n", "    imgs, labels = imgs.to(device), labels.to(device)\n", "\n", "    # Pass one batch through the network, and record the output of each linear layer\n", "    feats = imgs.view(imgs.shape[0], -1)\n", "    activations = {}\n", "    with torch.no_grad():\n", "        for layer_index, layer in enumerate(model.layers):\n", "            feats = layer(feats)\n", "            if isinstance(layer, nn.Linear):\n", "                activations[f\"Layer {layer_index}\"] = feats.view(-1).detach().cpu().numpy()\n", "\n", "    # Plotting\n", "    fig = plot_dists(activations, color=color, stat=\"density\", xlabel=\"Activation vals\")\n", "    fig.suptitle(\"Activation distribution\", fontsize=14, y=1.05)\n", "    plt.show()\n", "    plt.close()\n", "\n", "    if print_variance:\n", "        for key in sorted(activations.keys()):\n", "            print(f\"{key} - Variance: {np.var(activations[key])}\")\n", "\n", "\n", "##############################################################"]}, {"cell_type": "markdown", "id": "e2e7f03b", "metadata": {"papermill": {"duration": 0.031273, "end_time": "2021-09-16T12:36:10.045659", "exception": false, "start_time": "2021-09-16T12:36:10.014386", "status": "completed"}, "tags": []}, "source": ["## Initialization\n", "\n", "Before starting our discussion about initialization, it should be noted that there exist many very good blog posts about the topic of neural network initialization (for example [deeplearning.ai](https://www.deeplearning.ai/ai-notes/initialization/), or a more [math-focused blog post](https://pouannes.github.io/blog/initialization/#mjx-eqn-eqfwd_K)).\n", "In case something remains unclear after this tutorial, we recommend skimming through these blog posts as well.\n", "\n", "When initializing a neural network, there are a few properties we would like to have.\n", "First, the variance of the input should be propagated through the model to the last 
layer, so that we have a similar standard deviation for the output neurons.\n", "If the variance vanishes the deeper we go in our model, it becomes much harder to optimize the model as the input to the next layer is basically a single constant value.\n", "Similarly, if the variance increases, it is likely to explode (i.e. head to infinity) the deeper we design our model.\n", "The second property we look out for in initialization techniques is a gradient distribution with equal variance across layers.\n", "If the first layer receives much smaller gradients than the last layer, we will have difficulties in choosing an appropriate learning rate.\n", "\n", "As a starting point for finding a good method, we will analyze different initializations based on our linear neural network with no activation function (i.e. an identity).\n", "We do this because initializations depend on the specific activation\n", "function used in the network, and we can adjust the initialization\n", "schemes later on for our specific choice."]}, {"cell_type": "code", "execution_count": 12, "id": "67fe474f", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:10.111297Z", "iopub.status.busy": "2021-09-16T12:36:10.110839Z", "iopub.status.idle": "2021-09-16T12:36:13.211461Z", "shell.execute_reply": "2021-09-16T12:36:13.211014Z"}, "papermill": {"duration": 3.134635, "end_time": "2021-09-16T12:36:13.211585", "exception": false, "start_time": "2021-09-16T12:36:10.076950", "status": "completed"}, "tags": []}, "outputs": [], "source": ["model = BaseNetwork(act_fn=Identity()).to(device)"]}, {"cell_type": "markdown", "id": "c440676a", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.031683, "end_time": "2021-09-16T12:36:13.275669", "exception": false, "start_time": "2021-09-16T12:36:13.243986", "status": "completed"}, "tags": []}, "source": ["### Constant initialization\n", "\n", "The first initialization we can consider is to initialize all weights with the same 
constant value.\n", "Intuitively, setting all weights to zero is not a good idea as the propagated gradient will be zero.\n", "However, what happens if we set all weights to a value slightly larger or smaller than 0?\n", "To find out, we can implement a function for setting all parameters below and visualize the gradients."]}, {"cell_type": "code", "execution_count": 13, "id": "c9298d5a", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:13.342938Z", "iopub.status.busy": "2021-09-16T12:36:13.342463Z", "iopub.status.idle": "2021-09-16T12:36:23.537805Z", "shell.execute_reply": "2021-09-16T12:36:23.538201Z"}, "papermill": {"duration": 10.231163, "end_time": "2021-09-16T12:36:23.538350", "exception": false, "start_time": "2021-09-16T12:36:13.307187", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:36:14.644531\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n",
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:36:22.580271\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["Layer 0 - Variance: 2.0582756996154785\n", "Layer 2 - Variance: 13.489118576049805\n", "Layer 4 - Variance: 22.100566864013672\n", "Layer 6 - Variance: 36.209571838378906\n", "Layer 8 - Variance: 14.831439018249512\n"]}], "source": ["def const_init(model, fill=0.0):\n", " for name, param in model.named_parameters():\n", " param.data.fill_(fill)\n", "\n", "\n", "const_init(model, fill=0.005)\n", "visualize_gradients(model)\n", "visualize_activations(model, print_variance=True)"]}, {"cell_type": "markdown", "id": "ef67c266", "metadata": {"papermill": {"duration": 0.049718, "end_time": "2021-09-16T12:36:23.634899", "exception": false, "start_time": "2021-09-16T12:36:23.585181", "status": "completed"}, "tags": []}, "source": ["As we can see, only the first and the last layer have diverse gradient distributions while the other three layers have the same gradient for all weights (note that this value is unequal 0, but often very close to it).\n", "Having the same gradient for parameters that have been initialized with the same values means that we will always have the same value for those parameters.\n", "This would make our layer useless and reduce our effective number of parameters to 1.\n", "Thus, we cannot use a constant initialization to train our networks."]}, {"cell_type": "markdown", "id": "aa79959c", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.042907, "end_time": "2021-09-16T12:36:23.720988", "exception": false, "start_time": "2021-09-16T12:36:23.678081", "status": "completed"}, "tags": []}, "source": ["### Constant variance\n", "\n", "From the experiment above, we have seen that a constant value is not working.\n", "So instead, how about we initialize the parameters by randomly sampling from a distribution like a Gaussian?\n", "The most intuitive way would be to choose one variance that is used for all layers in the network.\n", 
"Let's implement it below, and visualize the activation distribution across layers."]}, {"cell_type": "code", "execution_count": 14, "id": "e5dc3377", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:23.810870Z", "iopub.status.busy": "2021-09-16T12:36:23.810401Z", "iopub.status.idle": "2021-09-16T12:36:31.446937Z", "shell.execute_reply": "2021-09-16T12:36:31.446513Z"}, "papermill": {"duration": 7.683221, "end_time": "2021-09-16T12:36:31.447053", "exception": false, "start_time": "2021-09-16T12:36:23.763832", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:36:30.447769\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["Layer 0 - Variance: 0.07831248641014099\n", "Layer 2 - Variance: 0.004064005799591541\n", "Layer 4 - Variance: 0.00022317888215184212\n", "Layer 6 - Variance: 0.00011556116805877537\n", "Layer 8 - Variance: 8.162161248037592e-05\n"]}], "source": ["def var_init(model, std=0.01):\n", " for name, param in model.named_parameters():\n", " param.data.normal_(mean=0.0, std=std)\n", "\n", "\n", "var_init(model, std=0.01)\n", "visualize_activations(model, print_variance=True)"]}, {"cell_type": "markdown", "id": "8959f829", "metadata": {"papermill": {"duration": 0.05122, "end_time": "2021-09-16T12:36:31.548759", "exception": false, "start_time": "2021-09-16T12:36:31.497539", "status": "completed"}, "tags": []}, "source": ["The variance of the activation becomes smaller and smaller across layers, and almost vanishes in the last layer.\n", "Alternatively, we could use a higher standard deviation:"]}, {"cell_type": "code", "execution_count": 15, "id": "025eae6f", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:31.649936Z", "iopub.status.busy": "2021-09-16T12:36:31.649437Z", "iopub.status.idle": "2021-09-16T12:36:39.172380Z", "shell.execute_reply": "2021-09-16T12:36:39.171893Z"}, "papermill": {"duration": 7.574836, "end_time": "2021-09-16T12:36:39.172497", "exception": false, "start_time": "2021-09-16T12:36:31.597661", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:36:38.181095\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["Layer 0 - Variance: 8.082208633422852\n", "Layer 2 - Variance: 37.87363815307617\n", "Layer 4 - Variance: 96.36101531982422\n", "Layer 6 - Variance: 237.2630615234375\n", "Layer 8 - Variance: 303.44244384765625\n"]}], "source": ["var_init(model, std=0.1)\n", "visualize_activations(model, print_variance=True)"]}, {"cell_type": "markdown", "id": "5a7f323d", "metadata": {"papermill": {"duration": 0.054768, "end_time": "2021-09-16T12:36:39.283572", "exception": false, "start_time": "2021-09-16T12:36:39.228804", "status": "completed"}, "tags": []}, "source": ["With a higher standard deviation, the activations are likely to explode.\n", "You can play around with the specific standard deviation values, but it will be hard to find one that gives us a good activation distribution across layers, and any value that works is very specific to our model.\n", "If we changed the hidden sizes or the number of layers, we would have\n", "to search all over again, which is neither efficient nor recommended."]}, {"cell_type": "markdown", "id": "06378f27", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.054595, "end_time": "2021-09-16T12:36:39.392403", "exception": false, "start_time": "2021-09-16T12:36:39.337808", "status": "completed"}, "tags": []}, "source": ["### How to find appropriate initialization values\n", "\n", "From our experiments above, we have seen that we need to sample the weights from a distribution, but we are not yet sure which one exactly.\n", "As a next step, we will try to find the optimal initialization from the perspective of the activation distribution.\n", "For this, we state two requirements:\n", "\n", "1. The mean of the activations should be zero\n", "2. 
The variance of the activations should stay the same across every layer\n", "\n", "Suppose we want to design an initialization for the following layer: $y=Wx+b$ with $y\\in\\mathbb{R}^{d_y}$, $x\\in\\mathbb{R}^{d_x}$.\n", "Our goal is that the variance of each element of $y$ is the same as that of the input, i.e. $\\text{Var}(y_i)=\\text{Var}(x_i)=\\sigma_x^{2}$, and that the mean is zero.\n", "We assume $x$ to also have a mean of zero, because, in deep neural networks, $y$ would be the input of another layer.\n", "This requires the bias and weight to have an expectation of 0.\n", "Actually, as $b$ is a single element per output neuron and is constant across different inputs, we set it to 0 overall.\n", "\n", "Next, we need to calculate the variance with which we need to initialize the weight parameters.\n", "Along the calculation, we will need the following variance rule: given two independent variables, the variance of their product is $\\text{Var}(X\\cdot Y) = \\mathbb{E}(Y)^2\\text{Var}(X) + \\mathbb{E}(X)^2\\text{Var}(Y) + \\text{Var}(X)\\text{Var}(Y) = \\mathbb{E}(Y^2)\\mathbb{E}(X^2)-\\mathbb{E}(Y)^2\\mathbb{E}(X)^2$ ($X$ and $Y$ do not refer to $x$ and $y$, but to any random variables).\n", "\n", "The needed variance of the weights, $\\text{Var}(w_{ij})$, is calculated as follows:\n", "\n", "$$\n", "\\begin{split}\n", " y_i & = \\sum_{j} w_{ij}x_{j}\\hspace{10mm}\\text{Calculation of a single output neuron without bias}\\\\\n", " \\text{Var}(y_i) = \\sigma_x^{2} & = \\text{Var}\\left(\\sum_{j} w_{ij}x_{j}\\right)\\\\\n", " & = \\sum_{j} \\text{Var}(w_{ij}x_{j}) \\hspace{10mm}\\text{Inputs and weights are independent of each other}\\\\\n", " & = \\sum_{j} \\text{Var}(w_{ij})\\cdot\\text{Var}(x_{j}) \\hspace{10mm}\\text{Variance rule (see above) with expectations being zero}\\\\\n", " & = d_x \\cdot \\text{Var}(w_{ij})\\cdot\\text{Var}(x_{j}) \\hspace{10mm}\\text{Variance equal for all $d_x$ elements}\\\\\n", " & = \\sigma_x^{2} \\cdot d_x \\cdot 
\\text{Var}(w_{ij})\\\\\n", " \\Rightarrow \\text{Var}(w_{ij}) = \\sigma_{W}^2 & = \\frac{1}{d_x}\\\\\n", "\\end{split}\n", "$$\n", "\n", "Thus, we should initialize the weight distribution with a variance of the inverse of the input dimension $d_x$.\n", "Let's implement it below and check whether this holds:"]}, {"cell_type": "code", "execution_count": 16, "id": "b38f6321", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:39.508740Z", "iopub.status.busy": "2021-09-16T12:36:39.508266Z", "iopub.status.idle": "2021-09-16T12:36:52.066650Z", "shell.execute_reply": "2021-09-16T12:36:52.066233Z"}, "papermill": {"duration": 12.618969, "end_time": "2021-09-16T12:36:52.066766", "exception": false, "start_time": "2021-09-16T12:36:39.447797", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:36:43.501004\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:36:51.103981\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n"], "text/plain": [""]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["Layer 0 - Variance: 1.0088235139846802\n", "Layer 2 - Variance: 1.0696827173233032\n", "Layer 4 - Variance: 1.125657081604004\n", "Layer 6 - Variance: 1.1308791637420654\n", "Layer 8 - Variance: 1.0503977537155151\n"]}], "source": ["def equal_var_init(model):\n", "    # Biases are set to zero; weights are drawn from N(0, 1/d_x),\n", "    # taking d_x = param.shape[1] as the input dimension of the layer.\n", "    for name, param in model.named_parameters():\n", "        if name.endswith(\".bias\"):\n", "            param.data.fill_(0)\n", "        else:\n", "            param.data.normal_(std=1.0 / math.sqrt(param.shape[1]))\n", "\n", "\n", "equal_var_init(model)\n", "visualize_weight_distribution(model)\n", "visualize_activations(model, print_variance=True)"]}, {"cell_type": "markdown", "id": "b1a998c1", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.065378, "end_time": "2021-09-16T12:36:52.200763", "exception": false, "start_time": "2021-09-16T12:36:52.135385", "status": "completed"}, "tags": []}, "source": ["As expected, the variance indeed stays roughly constant across layers.\n", "Note that our initialization does not restrict us to a normal distribution, but allows any other distribution with a mean of 0 and a variance of $1/d_x$.\n", "In practice, a uniform distribution is often used for initialization.\n", "A small benefit of using a uniform instead of a normal distribution is that its bounded support rules out initializing weights with a very large magnitude.\n", "\n", "Besides the variance of the activations, another variance we would like to stabilize is that of the gradients.\n", "This ensures stable optimization for deep networks.\n", "It turns out that we can do the same calculation as above starting from $\\Delta x=W^{\\top}\\Delta y$, and come to the conclusion that we should initialize our layers with a variance of $1/d_y$, where $d_y$ is the number of output neurons.\n", "You can do the calculation as an exercise, or check a thorough explanation in [this blog post](https://pouannes.github.io/blog/initialization/#mjx-eqn-eqfwd_K).\n", "As a compromise between both constraints, [Glorot and Bengio (2010)](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf?hc_location=ufi) proposed to use the harmonic mean of both values.\n", "This leads us to the well-known Xavier initialization:\n", "\n", "$$W\\sim \\mathcal{N}\\left(0,\\frac{2}{d_x+d_y}\\right)$$\n", "\n", "If we use a uniform distribution, we would initialize the weights with:\n", "\n", "$$W\\sim U\\left[-\\frac{\\sqrt{6}}{\\sqrt{d_x+d_y}}, \\frac{\\sqrt{6}}{\\sqrt{d_x+d_y}}\\right]$$\n", "\n", "Let's briefly implement it and validate its effectiveness:"]}, {"cell_type": "code", "execution_count": 17, "id": "463d4b1e", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:36:52.335958Z", "iopub.status.busy": "2021-09-16T12:36:52.335480Z", "iopub.status.idle": "2021-09-16T12:37:05.009307Z", "shell.execute_reply": "2021-09-16T12:37:05.008916Z"}, "papermill": {"duration": 12.743484, "end_time": "2021-09-16T12:37:05.009425", "exception": false, "start_time": "2021-09-16T12:36:52.265941", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " 2021-09-16T14:36:56.408768\n", " image/svg+xml\n", " Matplotlib v3.4.3, https://matplotlib.org/\n", "
\n", "
\n"], "text/plain": [""]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["layers.0.weight - Variance: 0.0003991015546489507\n", "layers.2.weight - Variance: 0.0007022571517154574\n", "layers.4.weight - Variance: 0.0009397325338795781\n", "layers.6.weight - Variance: 0.0014803955564275384\n", "layers.8.weight - Variance: 0.012549502775073051\n"]}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " 2021-09-16T14:37:04.023764\n", " image/svg+xml\n", " Matplotlib v3.4.3, https://matplotlib.org/\n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["Layer 0 - Variance: 1.2209526300430298\n", "Layer 2 - Variance: 1.5839706659317017\n", "Layer 4 - Variance: 1.5429933071136475\n", "Layer 6 - Variance: 2.021383047103882\n", "Layer 8 - Variance: 2.6867828369140625\n"]}], "source": ["def xavier_init(model):\n", " for name, param in model.named_parameters():\n", " if name.endswith(\".bias\"):\n", " param.data.fill_(0)\n", " else:\n", " bound = math.sqrt(6) / math.sqrt(param.shape[0] + param.shape[1])\n", " param.data.uniform_(-bound, bound)\n", "\n", "\n", "xavier_init(model)\n", "visualize_gradients(model, print_variance=True)\n", "visualize_activations(model, print_variance=True)"]}, {"cell_type": "markdown", "id": "25a50e49", "metadata": {"papermill": {"duration": 0.077482, "end_time": "2021-09-16T12:37:05.166555", "exception": false, "start_time": "2021-09-16T12:37:05.089073", "status": "completed"}, "tags": []}, "source": ["We see that the Xavier initialization balances the variance of gradients and activations.\n", "Note that the significantly higher variance for the output layer is due to the large difference of input and output dimension ($128$ vs $10$).\n", "However, we currently assumed the activation function to be linear.\n", "So what happens if we add a non-linearity?\n", "In a tanh-based network, a common assumption is that for small values during the initial steps in training, the $\\tanh$ works as a linear function such that we don't have to adjust our calculation.\n", "We can check if that is the case for us as well:"]}, {"cell_type": "code", "execution_count": 18, "id": "9121ed95", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:05.322709Z", "iopub.status.busy": "2021-09-16T12:37:05.322186Z", "iopub.status.idle": "2021-09-16T12:37:18.007787Z", "shell.execute_reply": "2021-09-16T12:37:18.007359Z"}, "papermill": {"duration": 12.765022, "end_time": "2021-09-16T12:37:18.007907", 
"exception": false, "start_time": "2021-09-16T12:37:05.242885", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:09.375483\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["layers.0.weight - Variance: 2.1826384909218177e-05\n", "layers.2.weight - Variance: 3.5952674807049334e-05\n", "layers.4.weight - Variance: 4.872870340477675e-05\n", "layers.6.weight - Variance: 6.269156438065693e-05\n", "layers.8.weight - Variance: 0.0004620618128683418\n"]}, {"data": {"application/pdf": "JVBERi0xLjQKJazcIKu6CjEgMCBvYmoKPDwgL1BhZ2VzIDIgMCBSIC9UeXBlIC9DYXRhbG9nID4+CmVuZG9iago4IDAgb2JqCjw8IC9FeHRHU3RhdGUgNCAwIFIgL0ZvbnQgMyAwIFIgL1BhdHRlcm4gNSAwIFIKL1Byb2NTZXQgWyAvUERGIC9UZXh0IC9JbWFnZUIgL0ltYWdlQyAvSW1hZ2VJIF0gL1NoYWRpbmcgNiAwIFIKL1hPYmplY3QgNyAwIFIgPj4KZW5kb2JqCjExIDAgb2JqCjw8IC9Bbm5vdHMgMTAgMCBSIC9Db250ZW50cyA5IDAgUgovR3JvdXAgPDwgL0NTIC9EZXZpY2VSR0IgL1MgL1RyYW5zcGFyZW5jeSAvVHlwZSAvR3JvdXAgPj4KL01lZGlhQm94IFsgMCAwIDg5NC4wMjUgMjE2LjY2NTYyNSBdIC9QYXJlbnQgMiAwIFIgL1Jlc291cmNlcyA4IDAgUgovVHlwZSAvUGFnZSA+PgplbmRvYmoKOSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDEyIDAgUiA+PgpzdHJlYW0KeJzVnU+TJLmR3e/1KfK4e2AQfxyA40gapTFb04W7Y9JBpsNodlYa2vSskUNybb+93gMiIzyQqGZXd3ZR4NjQqt5kIcN/gUA4AIe7v/3h5de/8bf/88sN/3dztz/g3//Az9/w9xeH3z68aJXNhYSffzp+Dj5vOaeMH3/Cxy6//t+Xl397cVv1JUtxSfU2/iLV+Zpd0duf+KXfPHzg+OVl+PTLi9RN8TUSttq/8MMLPr1liV6DkX+ysi+66V3fW7ho7Zr/eHto3Ie8qd//QTsxbfX2px9u/+P28+3Xvwkd3D/h3z/g3wbu5de/++GvP37/wz9/89vb97+85LQ59TVeLvgQL9fw8i8vv7/98d6s23zCLbm33H79Zldf/vjiwe1XDv8p5a2WkFKKMeVbSJt3bO77Dy+//fb26//qb97fvv23l7rhHpWatfBGfvuvL//z9g/pH2//6/btP738l29hutucZ6PO/PT9B7bwq9/98Ifv/vtf/uW7n3/51Ycff/7LL7ff/fvt9y+/b1f7fGLe1WZRztebfMpPoAZIm7A1tCLyOjZ3wHInrK9meRLYFYPzV8tP+RmWJ79Ftuaqj/5TLE/WctNSRp9SLUUFTd3qVmwb4drGb/7xhiGihuI08n+3f/j+zz/+9bs///jvP9/++t1Pv3x9uJ8+bnwe1xBa+1G3oqnG8tG+tL1Db9pbLGlTxYA6GHzKX2hw8VtxVbSgY5WP2uzfzWbv8Jj4UIJcjTb6F1rtnWw5RhezuigfNTu8n9mCexFzKKPZp/6lZseyJQyamiUl/1Gz4/uZXXAvUtWHu33qX2p2cZuo8zFjrKwfNVuugyUb+RWbw+gX0ZIqhtvor+PtMFb+7oeff/nxz//5hfxitzNUdNAafCpwpaQ7VSHCB9jy3a2S5oVtIYcIVpJ22Jc/vtk/fhn++OUlpS3BNcxXNyeVTWsKopfh9aqWUDPFSwO7CKfst///m
z4xEh4eWnC40/Fi+lVVt9t+aeGuLmH8zEwYH7bkMi7uavxFrSGF8IDkrq5h/MRMGC98xWQ33PlTjXCvA9ygKxKjrmH8xEwYXzbvNepw508VZkpUjKEjkru6hvETM2F83SQI3jpX408VZmpBkw9I7uoaxk/M/PBCNzDUmurFeKPGzecMM0Ykh7qE8TMzYbxsLqYi1ztv1LhFCXBNRiSHuobxEzNhPDwb8SkMd/5U45ZyTc35vCA51DWMn5gJ4xXeZol+uPOnGjHphfyA5FDXMH5i5ocXdVtN0dXrnTeqbAU/tTe6acGoSxg/MxPGxy3gL/R6542atpJT9Fcip7iG6RMjYTqc9Sw5D/f9VDFJrNKnm6YFo65h/MRMGA9vvThMf6/GH2rC1FfxctMLEquuYfzEzA8vFVPkksO1zx9iEsySa8p6BWLUJUyfGAnLA6wILlxv+6liHhN91Gp5GG0Nwyc2wnKM13BQ/dXwu0hfDl5M9lcaRl3D9EcjYXnZXI2pDvf8UDGulxivPIy2huETG2F53WKt8TqNO8Sat+KCoG1Lw6prmP5oJLeZuJ6dfL7edCN7j3vdvFjL5KIuYf7UUgKAj+5dHe2/qz62ebu4MGCx+iIAHi2l/XDTfS4xDwAOmTtoQUtJ4QrGyosAmJhKAvDVQ0jBDwQOGXP2zZWYvB/IWH0RBBNbgcDDZQ8a3dAJTtmXgCdffFurtWisvgaCma1EAM89tq3dK4JD9lm3WLLGMKCx+iIIJrYSAfz3WLWMveCQfUqbpFz8iMbqiyCY2EoE8ONhSR57wSHDwYW7K9JWby0aqy+CYGIrEATuC3pYckVwyt4r3oJafBzQWH0NBDNbiQCefSohDr3glL1zW9IUow5orL4IgomtRAD/Lkfny4DgkBU/wQus4UrGyosAmFhKAPD0s6obguxOGW8+FzVGuXKx8iIAJpYSAHz9IknHHnDIGa/+EnLVKxcrLwJgYikARL8x3KUMPeCU8dYrrgQpVy5WXgPAzFICiFvVzL+/AjhkxtZJSTkOXIy8CICJpQSQ0YavMvaAU9YtYEok9crFyosAmFhKALqlWkoYe8ApZ7SbXKoDFyMvAmBiKQAI3HyHGd7QA4wsG5rMXgYuRl4DwMxSAoCT7yqaGQCcssD5yeL0gcshLwJgYikBwMX3iaEqVwCnHDHuZcyBBy5GXgTAxFICgIMfnOaxB5xy3CQUjPcPXA55EQATSxmPDvc+5JyGHmBk3PQg6h+w3NU1zJ/ZSfPh28cgw+q4leOGWWBocadXKoe8CICJpf08QkGjYbz/pxw3B48v5wcuh7wIgImlBADfXsS5sQecsoloG7gsFeg2tZQA4NtLLXXsAafMeEaM9frI5S4vAmBiKQBk+PYppXINgbCyCegcuawU5zm1lADg22c4uDIAuMi1atsOHBq5y4sAmFhKAPDtc/Ey9oCLXIom/8jlLi8CYGIpAcC3L6HGsQdc5JLE5QmXXV4EwMRSnjRxmxYtfugBV3k/yjBi+bIY/+vZmC3d/uPlVXuuMP75m9v1DM14dKNWH/dY5hoz90bhtQSX2vJdaZ8o3ZeJUqSyF5eyZc7tS4trlH10K8oFQE+/V7boRJvfq4FmJYaD5c1nF9t2saI1DI99+pyKT225WIVnjfDU3JoPmfHioMo5tWTOqQENrHzpqpaIcYWRVinUoLkHIUmKMWhbgSlOU4vV0i3DJExFUw9CzT1oJyaHt/MNbysMVLHuKuYueGPfMIS7IoxchQrkTnEPGeGjtQXwQ4V7K7kkveWyFVVN0gNDxCXB3xFeCLUtATBWIgT1+aYJpvPl0GMpVOEwhhvaKrHE0PwH5zYMGMFzo4F/xyiLpmNChVvA1XfxGyYaxcm+HwuGmavvGHfx5TX2fVpcVnTA0fZpq/euT9Gc4v5UXDl0WInOomXXVflbO/+Uq5a+tu3QN7JGbnAwmLnEXO47gE4KL6JtggYfa931jIuW0nSn2r/We8wMc6ytF
edwrWmXMTqWGNvVwGvIuVnl0WROjjtL4At3ot946hV48bjBZEy2NbZbz824FGLJoVFIMaTYP58xC+N13jg5R7d1pX8v3tG4EpjIj6Ar+b6IFfDqioKZ+60q5i9opu4L/OgkeFIVvdBJ6i/6gC4bfcB7rvBwLFyedi18mEUzBh92jczQqCZjTNCMmRFvE7peuq8nF/RaeMnsicBcGq7IlQRYHRhQB0Al7ktsIXhh8B16ic9pX4/CI6c8xY1nxHu+dylybw43MbfnCf893xfv8EwLD8HCT+MGZl+6wWdS9py34UlNtbY4ddxTODDCvo2JbYBlLeCFKH2JqUX35oLuGvfJfxb0+Ob4lVx6LBzAgjue/D4ldLm3jIEo4AWZupeYtHf+dlQ+Bb2/N/qkehhjwTm8Ll+P/gUOmxsw9ffA3zwkPMsWgBanZ4c/vJpzAH/xpgPIj9/60bYdjPrUM40Y/jRV1/7BI6xyb+n1E4n/7bv//OFP+M0eSdzTRrwlz0NPLjHme3hIFXHN9xBC4dA9RqKir21eNQ9hOka+MDKtPOZ9YO84PsB/PvdN/DLrU0F41NrX60LqqT7hTH+IAYNdSBUXD4ZvSgKBv3xCGoiviA8v8eAxSl8fNCM/AyCakMzWxNVPygbxeIL/6yGIfFFgLC7DI3DKT0AQPV9kbC1kvAQ/gcGl51wyQ8APwgtG0C0rWDw5NcRX7GpvGmo+s6fBmzNX/7npIt4FApyVDC8OA/EFwik/CQImx3Dg8PjBmf84h4cUEu/CoYYtlMJsKRcOp/wkDngzO3hicI1c/HhyhYecEu/CgY4xfF7VeAVh9CeRoFePaaavqSTMY96UZ+J9UGDGlbr1VxSn/iwUcMTFYYauCaPzp+eeeD8UXNpBp4WLfkVx6s9CkTF9pKPoA+Y/H0WRXkvDgZm1Xp64z03H8ZlsuXL1qYs1f3vl6qPLXvCdMT3EXLAOoVz4LCfQZQjlGuRSfdutHxq5y8fa1RogJhYTBKbjkmXIV2Dltl6vbT3KNmLkxUBMLOZsBD4GZhDluqFjZVjMRZH8yOcurwViZjFBhC0mdWnoEUZmygLfTi9f8dzVxTBM7CUG2XKOKmN/OOUI5ydpW1m60jnkxUBMLCaIzCXvHMb+cMoRTYlrwY1XPoe8GIiJxQRReWfFjz3ilIU7PFHiwMfIi4GYWAwQiR6e90MAmJVlq3Bc8yOfQ14LxMxigoArpbkOgWBWxjX72teCbSNGXgzExGKCyJuvoQz5EKys+N6Mrxj4GHkxEBOLCUI3qZqSDCAOOaFdxkD5Kx8rLwZiYjFAcF/QSYxDjzjlxFOjMeaRj5HXAjGzmCAY8wZXeegRp5wqN45ciAMfIy8GYmIxQWDu4LntMoA4ZG50JWnrBqYNoy6GYWIvMWDmEDw8owHDIXOHGO3rQMfKi4GYWAwQBVOHUGQIIDOy+k0KXekrHyuvBWJmMUFg7hBjGALJjFw9YxVSiwIwjVh5MRATiwmCGYWrSwOHu8pwjxpdivGK56IvBuLRYnLIDOjQOII4ZB6rVRd68JEBZOXFQExMJgnMHZLLwyrdofqITqC5L0UYPEZeDMOjvdzo9dzllPHBOGXPs0WuasswZfBYeS0QM5NJIjLgbYg3PFXGWfuMZgY8Rl4Mw6O9pMDAwVLL2B8OmTFDnt9ar3isvBiIickkgXlDiUN2ilNltF1QBiRe8Rh5MQyP9oJCZXhkizO4YDjlFnUYuVz/gOeQ1wIxM5kkmEoYjoAfSBxyi0IO4rovZQhZfTEUE5uJgmG13vvh2ThlBoVGV0L3ry0ioy+GYmIzURR8kp8dUBwyj6hh1l1aFLNFZPXFUExs/vASHQxyIesQM3jKvOgsUVvYrkVk9aVQTG0mCjiJTqX4AcUhMyBasxfnB0RWXwzFxGaiwAzCS0hjrzjkFlHvcotmt4SMvBiIicUEwTA8x1MBVxCH3I44FOcGPlZeDMTEYoLAHCIkDWOPOGR615JcOxRjG
rHyYiAmFgOE58kJbmxfQZwyfKhURdVf+Vh5LRAziwkCs4hYYi0DiENmDCN8yz5GnI1YeTEQE4sJAvOIxxMLRk6Z21upvSlNI1ZeDMTEYoLAPEK05rFHHHJiatSc26E1y8fIi4GYWMwodEwjkjDt6QWEkXVDF+ihELYRI68FYmYxQWASkR2PmV1BnDL8Bbwo2tln24iRFwMxsZggMIXIOYaxR5wyT7AJpt4DHyMvBmJiMUFgAlGCd2OPOGUeC2dyvIGPkRcDMbGYIGo79D7ki7ByhCudfC0PfA55MRATiwEicu0t5nINHrLyGUh4xbNkeOHUXmKQx3qBP11k7oCLa4vWA527vBiIicUEkWclEq3MAFsuSD3wOeTFQEwsJog6K5hoZRNyPfBZMhJ7ajFA8ADJY/HEQa4ht1PaQyN3eS0QM4sJIs4KKQ6yJm0JFYZG7vJiICYWE0SeFVUc5BJaeoOhjV1dDMPEXmLQWXnFQT4wzOB8KQZ7LuqlpV25fSKUx7Qr46ka8eE4iKFlH/C8C0xm0APLmeUhtopysLalQQhM2BBrS6rDzCs1ttoTDLVtuRZaBK44n1yPtwyYjIeWhIllGkpN93jdWmF4Yv4H79rFUy6boJUQW2aJ6Lz0RjjgJH4R/i7moLIH7sUsDqNwghMbxeUex+Z5LJ3lttFaxQW1whAM60pFaXBmPW6X+xZlzqxSrfhMCVsppNpkbtgJvqcwMUbLocRtfiZ08YmnM3uwXO1xMYRWmV8G7DYnDjbsUSIqhblkesIQfFHdwyYwVDCbjMc8hJUrejvqtxAcU4kwW0UsqYesKlM590PnNDT73BLCcJMZExjuqTLXVyqSQt91Fe4gMPsLM0Cpq971PdqEmU7i1bBWAozq91Mzyybh4lruE9wV3y+mMCqUh2yYKsUnUbnrCUxTT6FSXS33HU5fAtPItK1eRuHf9ZK87juiR5GKoDAQd0b1/nkfZdd5vbHneamq7SQftw0xgdciTYYztt+n6jGPw4NQWrqYhAEg1l3PLfsPbQLRvveILwogGVs2l8rq5LpvxEUuE/VsLiH66rouXFkN3L5mMjUe4euNZ7qGkTlnWH4Fb8J+/ypTxPhaWxQIEwr1bT7dsmv5RniqMmCACLLvbkS2yQFANia60bovcRMTa1o4zyzmqSVu4YqvZ36bxCi0FDBny/v6pwPldCuski6pZfyOnkV+hXbQIqm1O7Wev8Deckt4CKqG0JcO2xOb+dgxTN7VfdkE/ZTZkiofOwFb6atNBcZprYnLSgGfEN0XWbSUrC3/S8JdVd2XHJiN2fdSfImpDqgyh48w3xEXIjIeobpPyxNo4CHFIBIThrE+N2N+F++ZWgZgQy57I+i56N0t/4uL/WlhfgIvxec2PGXH8yDNZ/EtC6K2sQy3LPf9BGaLFpdqrxWLbtVZw8NBj7/vX9VSO9Vh2MdH5HX5DRlgXskB8Fr+ELQ8TQ/w4dVMJPiLN+cZmH/7R7/jLRlhOPLK/jbk6PDpKWHC3yUlDK4V//0huBgjFV4gDxVrjHyBZFqZp4Q5P+CuX/XFp6cxMjG7hg6hTUZ+QkIPYUWelk4g+1LemBTmCSlhviK+zHE84WV2xXfKz8DHcDG2BncCr9tPyQkzHhH/egTwGsflMAvb9Qk45ScQSHhF5dYa3uGflBDmlXQwGJHpNrUe9txcMF+xj71pjPnMlDvwE8zVf24umHeBgO7E1HrxyuBQn4QA7nt1gplOZYLMN2VAeRcKrcSXw8QgXzkY/UkkfHMmQ2RId31j2o/3QQFXCx4lnK0Bxak/C0Wb/LoEJ17142k/8itpPyJnKmeLX5D24zPZvusKCt/8k/I1wpnvY/maQcZkwXf50shdPlPWLgFiYjFBpG1SxmaQK+Yo8sjnLi8GYmIxQZRZOZtBPkFM+SwGYmJx97wnZW2szNT9mMz6gY+RFwMxsRggQpiVt7EyQ6WLTvDs6loYZvYSg8yK3FgZL3w4xy199ZXOI
S8GYmIxQeRZsRsrx5bduTzyOeTFQEwsJog6K3pjZXgpDsbJwMfIi4GYWAwQcPUm5W+sDNuKlvjI55DXAjGzmCDirAyOlfMWI75DBj5GXgzExGKCyLNyOFaGP899kAHPqS6GYWIvMeisKI6Rk2PKZ8lXOEZdDMPEXmAQNyuNY+QkWxbPAslXOkZeC8TMYoLgLw8lcoycKk3oGz+Wj5EXAzGxmCDSrFSOkXPa8Gfe1SsfKy8GYmIxQZRZyRwjF9ki+kDbDTaNWHkxEBOLCaLOSucYmae2Wbnliseoi2GY2AsMKWzcj3dDfzjlmjbNmGf6Kx0rrwViZjFByCbV3at8nCAOmcVocsRooFdAF30xFBObiSJvhR8b+8Qhs/RMSIxXGBBZfTEUE5uJgtuhfjite6q8ZB6gcHEAZPXFQDxazH1NhtmUPGQ4MDK8x17SKl4BWXktEDOTSQKTBx+HrLSn6plJzzM18cDH6ouBeLSYHDLraOLvBxCHzGQnQUpuh4oMICsvBmJict/zZxWuK4auMQyt5NqrqFk2Vl8MwmgtCLA2W3Rce71AOOVLtN8Jx4hrQZgZTA74JeZxc+tQGWkZHL5EBjpWXwzEo8XkwEc8xDT2h0NuKU5yCq1wiQVk5MVATEwmCUwcRP3gSR2qTxgWk6YUBz5WXwzEo8XkUDltqkM6AyOzakmF4xjiFZCVFwMxMRkkNLCUIbfrLiROmTUZpYoLZSBk9bVQzGwmCkwccpI6dIpT9u3Meo2tUrFFZPXFUExsJgpMHQru8NgrDpk5NvFDD7q2iKy+GIqJzUTBMrfMTzCgOGSmXQ0ltmh904iVFwMxsZhVP/Ey1KBp6BOnrPQgaq+LbBqx8logZhYTRNzyY1C6kQsL5LbiHxc+Vl4MxMRigkiMeZEw9ohDbid8ejk404ZRF8MwsZcYWC3ahSGngZFTYe3h2keIsxErLwZiYvGHl+Qco81rvYYMGTnx5SC5HaexfIy8FIipxQQRmIm7lDKAOOXK0tSlrdvbRoy8GIiJxQTBitgl5bFHnDLr0VffjunZRoy8GIiJxQRReP4vytgjTpln22I/q2cbMfJiICYWEwTLeMOksUecsmCqybnmwMfIi4GYWMyDKeGxEOFPF1k2lxgp9MDnkNcCMbOYIDB5eCy9aGV4DqVI27ywjRh5MRATiwkiz0ovWplHgffSglc+h7wYiInFBFFnpRet3E7Jcmdr5HPIi4GYWAwQwc9KL1o5bq4m1zynK59DXgvEzGKCiLPii4N8D78fGlkyKn9qMUGkbVJ8cZBPEFM+XwrimUk/Hs7UhCD30weYIeTWlQuaaquPgSf94RfuVTW9a2MeQ4tjCLlnCCkFY0GbcQbdsjgm4RVGC/mUe5whKygrU9LCuXKBmRaajBeLzy63IFXnJLUqGRIxzMIBiS0XSHHi+3kHEEmwllNB3UoN2Zc9hrEkp5zTAIkyV9ke0ofxSfFFzDiSZQ/aiBUXWPDIMrOB6zm8RLhekBI6TdINDWftoU+wDBcbInMmJM2uHTEUJgjMuOzKxCGeZ317YAxMCPh4YuKQXKSXiWK8jNQcU7wpjAHW2CMmPO42bgFL6jiMHLuNPErPWLPSDvNFzX27LGXcX5VcWzepKebOlVkohNk0bl6YEdrz67m9FrYgTMlwwx3Y4M25fkSAiWBxc7Rvv7iWLaXpsA9smU46s3hodiHs25W8WW17qrKMdt/S5kwhsadwE7PCP2jVbrip52Bo6JlAgEh7dAhPRHv81Z7CAzD6SZ5cN1/Fedkzh0iqu1xKjNwM4oG5KL6v6vC7YnDS85Jk9IBS9r0jx2wotWXxiBr3+1SYELpK24eWLQCN132PBd2SmSQIB89ITH03Dr+E1PKeMHeN4l72ZTWHD7mWaiRyhNmrmHAxGi227CGBmWdc8PtyZCrCZCCVWUajSJ9646EDSzwZvTZM7N204rbBpYqt5hhuoLq+ciPt2DeeDNwCzMl8b5shVxJh+S0zO
YHL+6cL86ILWmSGG24p3ldD4KslPhl4ooprTwCnvvhk4oPREqcE6TNirpRxLOCMuCZ0tz4biPyFBwfwLOYksWUC4WwJHTS0MHJcvo/dY84cISpXMTF6oVfV3oYyYVzsQfmacQX93QloeHPm5nZjDOp57eltupxaJhDm7cXj0J1Q3DXxklq+DvzXur+AwZhJaXq+DoxDLXK3vY6i7Nk9HB7Z9tRxzI6MU+pbdJpaxpRhJFc0p6/Lb8ji8crp7ddSPqDl6cHuD68mj8BfvPmE+PzbP/odb8niwUGzvil7h/xdsnckPL3yGCVcmD38sY6LkS9wTCvz7B3HB/iPfNbr/rXkCxhncsDgOcxoTvkZyRfQBJ7TgtdZqB9JP/G1snd8PXxZmp/GzFkWn5GfgA+vVzgIaI3Zxz8pecVDWoWvR6DgRVBc8UN0m5GfQQCvptBaY6qsL8jegTcdPBDMwENLQPTcBB5f8yl9yzDzmQ8pPFlz9Z+bwONdIMD3cpicwNG7QDjlJ0EoPM5UWTWI1XvflMLjXThgTGV6PPh5VxBGfxIJZiCsNajLHk/O21J4vA+K2IrTc0/uiuLUn4UCl+ng0MEFcphpfnIKj/dD0fJJ1pjDgOLUn4UCMyVfK97rxf2NxC76SjaThFl4PVv8gmwmn8n2fReHYN+kNg386FltGitj/pFy6gvIphEjH4tDS4CYWUwQYVabxsqciPnYMqEOfO7yYiAmFhNEmtWmsXLc+K0tg+SVzyEvBmJiMUEw++lDbRorc2nRuVIf+BzyYiAmFhNEndWmsTITk3rtSxcXPoe8GIiJxZhcMRnyY20aK8sWg1bVgY+RlwIxtZggZFabxsrSlofbGteVzyEvBmJiMUHkWW0aKydanNrSpW3EyIuBmFhMEDqrTWNlLm1r1QHPqS6GYWIvMHg/q0xjZO7xBF9koGPltUDMLCaIOKtNY+QUmd27Z8+2fIy8GIiJxQSRZrVpjJzgU+fUd9ssHyMvBmJiMUHorDaNkbnJxfLA/srHyouBmFgMEMHNatMYubDwZcpSr3ysvBaImcUEEWa1aYysrC0fe7kA04iVFwMxsZgg0qw2jZF5MIRZbeTKx8qLgZhYTBBlVp3GyL6FubpeL8G0ctEXQzGxmSgwewi1pjqgOGTPRKIqIiMiqy+GYmIzUEQ+76nI0CtO2bOyg2fQ74DI6muhmNlMFMJCNykMveKUvfCLJfS3qEFk9cVQTGwmisxIqOjHXnHIvhV50ejygMjqi6GY2EwUypgqV8deccgs98NqyS3g6ILI6IuhmNjM/W9Gs2gLnLEoTpnJOxh51L0rg8jqa6GY2dxDAeAt5jz0ilNuUW65tGgwS8jIi4GYWEwQmEuMhYJ+svIljccF0Cy9xxooJjYTBaua5TA+HXeVDwH8S217lhaQ1RcD8WgxOCT8lQYXhi5xyowvLYk7jAMgI68FYmYySYStaBlO7Z+q5094GFqJN8vH6ouBeLSYHLi3HlMde8QhM4+HOCdty9MAsvJiICYmkwSc5lqHzNSn6gODl5O0cuKWj9UXA/FoMTlgHuGSz2OPOGTm8Kjq++EhA8jKi4GYmAwSGdMIFgS9gjhUTr6TxFrywMfqa4GYWEwOmEPwrHUeQBxyjTxZIO3o7dmGVRfDMDGYHBijG9KQxMHIKpvDjKvvep2NWHkxEBOLCYLHStpM8grikBkY6Gvp215nI1ZeDMTEYoDAlCE8husbGbazEm+fa5yNWHktEDOLCYI1jKuWoUecMhxIr7UX9rJ8jLwYiInFBMGjPomJ6K8gDrkF9EXfDsGZRqy8GIiJxQSBuUPyImOPOOTEKYUve484+Rh5MRATiwFC8WepMDPiBYSRFZ507AGzthEjrwViZjFBYO6QoxvSOFiZR/BcL/RsGzHyYiAmFhMEw+ZVhzQOVsbTEDS3Y5i2ESMvBmJiMUGwMrUkHXvEKcumRXgU5MrHyIuBmFhMEPWx5uJPF1nwDNC2Bz6HvBiIi
cUAwa3dxyqTVhbGHfu+PnXhc8hrgZhZTBCyTapMWpkn3oKkkY+RFwMxsZgg8qzKpJVZNU9TfeRzyIuBmFhMEDqrMmlllhFW186WD3zu8mIgJhbzgI2fVZm0sjmkceWz5tmNqcUEEWd1Jge5ptqSQgyN3OUvBfHMDCfDORtXVWoLxMdbruJRrv2hjj1rAW6sK5L3Mc9zxa0Fl7oNX+24bc7Kz760FAwZUOA5OW1elMtO+wTMMa8JEzlQzs5L7Z/OG3Oj0NVoaUq6b+4K5u+xpU8qW/F7gcrMYtsaWiQL+lpkYqEWweZYxZGFRBLDPEPtXowPW4KJsaV4xD3we+Afqyxk5mtJmSeytHvBHtPCJFUKM0d4jdn1RngMLHCFMctWCq6k7EFz6OkY/5niJABYD5LxlafBceEtxQlBta8M3PBLBYyZ5CKXkLvMyDvHZTvmMsnwuHp8SfSbqwVDSctnXZh9UfdgC40pxIKuAUAi4vo2fMJMJjDfCjrjxkmu6L4Nj3kuM54wj4dLPb0pZa0ucyE5Fdyr2FKfcKea19L2oBzzy4TQN/NrG9WY+STiNuMqWivCjAGZ6Zx8q1UgsSVi5zZvCimy/zIJCfpk54tu7p20fC7FtUwVknc9aztuyG1hl9AX7rvI6IDie06UoKV62XWVUNJ+GNDhDvX2ZZMQWmIYtqMtEqPvMWJEbsfVeaTKVbnLSfF07LlSfEFX3jfipOBOaTOXOWp6ZxLmscKF+rZDV9HBe7dmXLVPKZRWvQS9I3XPMfFAogQfWzRIdJo7HvyCvtcSzMByYYaghpM3yGllTpfI8+UYD/pKlsMzBLbaSs3newUErnl6X0uqN7pmPlUf9xVAiYU3WplxpWjv87hDIJZc3PP91t4IiEjFzQotiUoQzICaHLaAhyu3RSPBWHpfYap8zbcUsbgDfcOuoO/hZ66ryIahzu+T64xbljiTTBje8Py0RCKcc7sQK54KcIuKn3rTFYNA4BiAOfl+n3H9GFwK1wELs3fXYw7rSuXNwniB26M9dlNZtTTv5yGiR4dtdmtmYr5W+5AfSNH3NzmedddClZgqBZff15ErxprgWxFU9HTdb9kwyqtIeFV9Q+6TVw6+v5YwAy1Pz8R/eDX1BrOlvPVw/fzbP/odb8l9kgP6E1vhP2/KgZL/LjlQCp++x7hi9Ay8cK7w7toFj/n7efaT8wP++iVfnryCb7EU6zDtM/ITklcUPNwOnlbFbEA/kr3ia2U/+Xr44E9sqikNkT5GfgI+eEYtrRroJbz0Pif7yVckoMxQRt9j6PuH/AwCXIRlaxiSw5dkPyl4hfApxnUycdNzs5983af0EweYz3xCY7SX/rmpT96FAFyJiPmBZkPg1J5EAP4P3CbMVauHn/GmvCfvAoH5TQouTtRQMOKTMLAkCpxJOvCsYPympCfvw4Hh5K6lADUcTvFZHFjgO3gejWThvTdlPHkfDpxIehYYthxO8VkcMtMEhqrRSyqfle4ELv5+fqq1+AXpTj4T7PuuFOU0q8xTss4q81gZ84bqtCX4tI0Y+VgpWgPExGKAKG5WmcfKTN8aerHPK59DXgvEzGKCCLPKPFa2IOZ8FgMxsZggZFaZ51V5im0xEBPTCKLMavNYOW6l1thWIG0jRl4MxMRigqiz2jxWlk0Fb30/8DHyYiAmFgMEJlWT2jxWZtBSz2lt2zjVtTDM7CUGmVXmsTJdDC0t3bVtxMiLgZhYTBB5VpnHyoqXg6Q48jHyYiAmFhOEzirzGDlxkyfUtlVhGrHyYiAmFgNE9bPKPEZOvPWay8jHyGuBmFlMEHGbVOYxcmZxBJ5Pv/Kx8mIgJhYTRJpV5jFySRtXXNvunmnEyouBmFhMEDqrzGNkxTU759rGnGnEyouBmFj84UW5If9YmcfIWhkNXtpxGcvHyEuBmFpMEGFWmcfItbLd2GoemEasvBiIicUEgbnDY2UeI3vfc/60ooamlYu+GIqJzURRmCxZ/NgnDtkH3
44ItO1ni8jqi6GY2EwUdUvqfR17xSH7lmGUJeUHRFZfDMXEZqBAN4d3VIfALiPzGHKq0jO8WERWXwvFzGaiEEY6lSGbg5EZRBQZclFGREZfDMXEZqLIm1RNSQYUh8wQIzwMuaXgtYisvhiKic1EgZmEQzcfe8UhM0wqhFj6A2IQWX0xFBObufePe+udH3JbGJmbKJp89ygviIy+FoqZzUSB98BYU+knK18Sm1hE04Qna6CY2EwUqQWN1bFXHDKjKWuEQ6UDIqsvhmJiM1Ewuq9IGXvFIXsprFAY2p6GRWT1xVBMbAYK1nOMMeShV5xyK96BcVIHQkZeC8TMYoLAlCJWNwTLHSoDnHPITgc6Rl4Mw6O9pCAM19Y4Yjhk37Kw4h3hr3isvBiIickkUVjbcVjhP1WmNomhRaEPfIy+GIhHi8kBM4mU5eHBOGSGeweJLukVkJUXAzExGSQEE4kc/PDSONRaNxddT/pjmrDyWhgm9pIC5hCYNpShP5xy5dEbn1qJB0PHqIthmBhMDphAlDhktzhVjTAgaTt9apqw8mIYHu0lBR77eQjQN3JpVZqlXOlYdTEME4PBIfEkT4pDRgcjF91a4Y8y4DHyWiBmFhMES/pirjA8F6fcCjiL9Pfm2YiVFwMxsZggEj7Jzw4gDjkziQMc6DLwMfJiICYWE4SypDULsV9BHDJ+YnHHlK98rLwYiInFAIFLj06lDD3ilFPYfMxu7xEnHyOvBWJmMUEwKFh4Z68gTrluEnJuBxptI0ZeDMTEYoIQHh10MvaIU05MRdxzYdlGjLwYiInFBFEeSzL+dJFNIJ1tZNX4uqnFBFFnRSitfAEx5bMYiInFAFH8rAillU0cum1k1fD0qcUEIbMilFY2wbZXPmvG4E4tJghMHh6LUL4qT7EtBmJiGkHorAillU1Avm1k1Tj9qcX9hOekCKWV4+aLOHnkc8hfCuKZyU6up2xYOjz3ozpcSSttmy+ydIcLPcyc6RoCc4xw8hB6FajChKri2k4xl1zoOTU5o2llnlXmxYqur9wzRNvV7HukevZ92bKowygintlV6I5H7YFYGrZUQmDACpDE2PJWFMVAEx2TIUjBS0hD6rKwbnvz5WCR68eINOPimLCOaRVKKNx/pYz3mMs86w1fN9daSm9Ct4h7kPTWcmu70K8CY32QKML8DjXiIW8fZsIS5XCXPWil2kOvK1NXeJajx7vSeZf3SBHPO4HecPMOF4IR05c9rkilBniY+Cg6SUrNcgbZJC2hrfviHjkXtcch5a06yVFaMgwp+EhvB9P65qq2k4A1Qq97WAa+Owp3mFj5sfgez+TxoYxWmQDFM0VyT1umnjjwLdIyoAg/0a7fx00TY2W5N5dVe44L9UymmyTmtqmNm5Jr2De7mT4zhFa+wGvtqZ/ase7ImwOdSSpcT0iMfr4Jbmjt1cVjQL/v7dQtJ9dO4eFW4z727TF0+FBTiE1mhon+rTyCjjdxzv3jNbnu0AfPSN/QdhbLlvBNLbMIdQxTKYV2Ne1ITf88up1krb34AuTS11gDHuM2rEAXni9w/R0YhFWf6p4fJrMkdd/c5QlKXGjPdBJYMd3vuzkZ95kJWZi4xpXUV2kid4ATE4L4gO9Fp2s9jYv9PsbCvBAAUjFG7EvfFQ9naZlXHHOOxRD7qlfYOPDBb6+CIUhrb4aJ+rJnMk/lrpqU9ixwjSzUyvwqeLBAu7b4V+VjVJmR58Ynudm3L6AkdDE8DMVxAya2E6dcTggpOHQuEENHYdIDyjyhg6cg8CHJmGI2+1n5JohKm2nyiGm/2WCHZ5s5V1pJTpf6I8KkMsIcRYzhx82rfeecKXd44LWNAiHg8Sy7q47xgGWsYFasWfuH0yboU6XNZFKqPbUM3VmfcuS6Oy7UJdfnPbAsg2Vu2U6qoIP18Tzwl8J9G6bMzaVjbWljs6SeGMr3agTDO8Fpr9bwivyG5CivnIt/LZ8GWn48Mv9hnpODiVTedOp+/qWvt
/6WhCiF2xjCA83tUPMbEqLo7ZVMCLEUppLiIFa1XaBpSmaJEDCEuYIZMDqRSYTwrz/+8uc//fi//8JfLgd0X/4fxSTBZQplbmRzdHJlYW0KZW5kb2JqCjEyIDAgb2JqCjExMzAyCmVuZG9iagoxMCAwIG9iagpbIF0KZW5kb2JqCjE3IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggOTEgPj4Kc3RyZWFtCnicNYy7DcAwCER7prgR+DiA94miFPb+bYgtF9w96YnzbGBknYcjtOMWsqZwU0xSTqh3DGqlNx076CXN/TTJei4a9A9x9RW2mwOSUSSRh0SXy5Vn5V98PgxvHGIKZW5kc3RyZWFtCmVuZG9iagoxOCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDE2NCA+PgpzdHJlYW0KeJw9kMERQyEIRO9WsSWAgEA9yWRy+L//a0CTXGQdYPepO4GQUYczw2fiyYPTsTRwbxWMawivI/QITQKTwMTBmngMCwGnYZFjLt9VllWnla6ajZ7XvWNB1WmXNQ1t2oHyrY8/wjXeo/Aa7B5CB7EodG5lWguZWDxrnDvMo8znfk7bdz0YrabUrDdy2dc9OsvUUF5a+4TOaLT9J9cvuzFeH4UUOQgKZW5kc3RyZWFtCmVuZG9iagoxOSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDYxID4+CnN0cmVhbQp4nDM1NVcwULC0ABKmpkYK5kaWCimGXEA+iJXLZWhpDmblgFkWxkAGSBmcYQCkwZpzYHpyuDK40gDLFRDMCmVuZHN0cmVhbQplbmRvYmoKMjAgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAzMDcgPj4Kc3RyZWFtCnicPZJLbgMxDEP3PoUuEMD62Z7zpCi6mN5/2ycl6Yoc2RZFapa6TFlTHpA0k4R/6fBwsZ3yO2zPZmbgWqKXieWU59AVYu6ifNnMRl1ZJ8XqhGY6t+hRORcHNk2qn6sspd0ueA7XJp5b9hE/vNCgHtQ1Lgk3dFejZSk0Y6r7f9J7/Iwy4GpMXWxSq3sfPF5EVejoB0eJImOXF+fjQQnpSsJoWoiVd0UDQe7ytMp7Ce7b3mrIsgepmM47KWaw63RSLm4XhyEeyPKo8OWj2GtCz/iwKyX0SNiGM3In7mjG5tTI4pD+3o0ES4+uaCHz4K9u1i5gvFM6RWJkTnKsaYtVTvdQFNO5w70MEPVsRUMpc5HV6l/DzgtrlmwWeEr6BR6j3SZLDlbZ26hO76082dD3H1rXdB8KZW5kc3RyZWFtCmVuZG9iagoyMSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDI0NCA+PgpzdHJlYW0KeJxFkU1yBSEIhPeeoi/wquRXPc+kUllM7r8NzbwkK1qF5gPTAhNH8BJD7ImVEx8yfC/oMny3MjvwOtmZcE+4blzDZcMzYVvgOyrLO15Dd7ZSP52hqu8aOd4uUjV0ZWSfeqGaC8yQiK4RWXQrl3VA05TuUuEabFuCFPVKrCedoDToEcrwd5RrfHUTT6+x5FTNIVrNrRMairBseEHUySQRtQ2LJ5ZzIVH5qhurOi5gkyXi9IDcoJVmfHpSSREwg3ysyWjMAjbQk7tnF8aaSx5Fjlc0mLA7STXwgPfitr73NnGP8xf4hXff/ysOfdcCPn8AS/5dBgplbmRzdHJlYW0KZW5kb2JqCjIyIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMjMyID4+CnN0cmVhbQp4nDVRSW7EMAy7+xX8wADW7rwnxaCH9v/XUsoUCEAltrglYmMjAi8x+DmI3PiSNaMmfmdyV/wsT4VHwq3gSRSBl+FedoLLG8ZlPw4zH7yXVs6kxpMMy
EU2PTwRMtglEDowuwZ12Gbaib4h4bMjUs1GltPXEvTSKgTKU7bf6YISbav6c/usC2372hNOdnvqSeUTiOeWrMBl4xWTxVgGPVG5SzF9kOpsoSehvCifg2w+aohElyhn4InBwSjQDuy57WfiVSFoXd2nbWOoRkrH078NTU2SCPlECWe2NO4W/n/Pvb7X+w9OIVQRCmVuZHN0cmVhbQplbmRvYmoKMjMgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAyMzEgPj4Kc3RyZWFtCnicNU85kgQhDMt5hT4wVRjbQL+np7Y22Pl/upKZTpDwIcnTEx2ZeJkjI7Bmx9taZCBm4FNMxb/2tA8TqvfgHiKUiwthhpFw1qzjbp6OF/92lc9YB+82+IpZXhDYwkzWVxZnLtsFY2mcxDnJboxdE7GNda2nU1hHMKEMhHS2w5Qgc1Sk9MmOMuboOJEnnovv9tssdjl+DusLNo0hFef4KnqCNoOi7HnvAhpyQf9d3fgeRbvoJSAbCRbWUWLunOWEX712dB61KBJzQppBLhMhzekqphCaUKyzo6BSUXCpPqforJ9/5V9cLQplbmRzdHJlYW0KZW5kb2JqCjI0IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMjQ5ID4+CnN0cmVhbQp4nD1QO45EIQzrOYUv8CTyI3AeRqstZu/frgOaKVBMfrYzJNARgUcMMZSv4yWtoK6Bv4tC8W7i64PCIKtDUiDOeg+IdOymNpETOh2cMz9hN2OOwEUxBpzpdKY9ByY5+8IKhHMbZexWSCeJqiKO6jOOKZ4qe594FiztyDZbJ5I95CDhUlKJyaWflMo/bcqUCjpm0QQsErngZBNNOMu7SVKMGZQy6h6mdiJ9rDzIozroZE3OrCOZ2dNP25n4HHC3X9pkTpXHdB7M+Jy0zoM5Fbr344k2B02N2ujs9xNpKi9Sux1anX51EpXdGOcYEpdnfxnfZP/5B/6HWiIKZW5kc3RyZWFtCmVuZG9iagoyNSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDM5NSA+PgpzdHJlYW0KeJw9UktuxUAI2+cUXKDS8JvPeVJV3bz7b2tDUqkqvIkxxjB9ypC55UtdEnGFybderls8pnwuW1qZeYi7i40lPrbcl+4htl10LrE4HUfyCzKdKkSozarRofhCloUHkE7woQvCfTn+4y+AwdewDbjhPTJBsCTmKULGblEZmhJBEWHnkRWopFCfWcLfUe7r9zIFam+MpQtjHPQJtAVCbUjEAupAAETslFStkI5nJBO/Fd1nYhxg59GyAa4ZVESWe+zHiKnOqIy8RMQ+T036KJZMLVbGblMZX/yUjNR8dAUqqTTylPLQVbPQC1iJeRL2OfxI+OfWbCGGOm7W8onlHzPFMhLOYEs5YKGX40fg21l1Ea4dubjOdIEfldZwTLTrfsj1T/5021rNdbxyCKJA5U1B8LsOrkaxxMQyPp2NKXqiLLAamrxGM8FhEBHW98PIAxr9crwQNKdrIrRYIpu1YkSNimxzPb0E1kzvxTnWwxPCbO+d1qGyMzMqIYLauoZq60B2s77zcLafPzPoom0KZW5kc3RyZWFtCmVuZG9iagoyNiAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDI0OSA+PgpzdHJlYW0KeJxNUUmKAzAMu+cV+kAhXpO8p0OZQ+f/18oOhTkECa+Sk5aYWAsPMYQfLD34kSFzN/0bfqLZu1l6ksnZ/5jnIlNR+FKoLmJCXYgbz6ER8D2haxJZsb3xOSyjmXO+Bx+FuAQzoQFjfUkyuajmlSETTgx1HA5apMK4a2LD4lrRPI3cbvtGZmUmhA2PZELcGICIIOsCshgslDY2EzJZzgPtDckNWmDXqRtRi4IrlNYJdKJWxKrM4LPm1nY3Q
y3y4Kh98fpoVpdghdFL9Vh4X4U+mKmZdu6SQnrhTTsizB4KpDI7LSu1e8TqboH6P8tS8P3J9/gdrw/N/FycCmVuZHN0cmVhbQplbmRvYmoKMjcgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCA5NCA+PgpzdHJlYW0KeJxFjcERwCAIBP9UQQkKCtpPJpOH9v+NEDJ8YOcO7oQFC7Z5Rh8FlSZeFVgHSmPcUI9AveFyLcncBQ9wJ3/a0FScltN3aZFJVSncpBJ5/w5nJpCoedFjnfcLY/sjPAplbmRzdHJlYW0KZW5kb2JqCjI4IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggNzIgPj4Kc3RyZWFtCnicMzK3UDBQsDQBEoYWJgrmZgYKKYZcQL6piblCLhdIDMTKAbMMgLQlnIKIZ4CYIG0QxSAWRLGZiRlEHZwBkcvgSgMAJdsWyQplbmRzdHJlYW0KZW5kb2JqCjI5IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggNDcgPj4Kc3RyZWFtCnicMzK3UDBQsDQBEoYWJgrmZgYKKYZclhBWLhdMLAfMAtGWcAoinsGVBgC5Zw0nCmVuZHN0cmVhbQplbmRvYmoKMzAgMCBvYmoKPDwgL0JCb3ggWyAtMTAyMSAtNDYzIDE3OTQgMTIzMyBdIC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMzkKL1N1YnR5cGUgL0Zvcm0gL1R5cGUgL1hPYmplY3QgPj4Kc3RyZWFtCnic4zI0MFMwNjVVyOUyNzYCs3LALCNzIyALJItgQWQzuNIAFfMKfAplbmRzdHJlYW0KZW5kb2JqCjMxIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMTYzID4+CnN0cmVhbQp4nEWQOxIDIQxDe06hI/gjAz7PZjIpNvdvY9hsUsDTWCCDuxOC1NqCieiCh7Yl3QXvrQRnY/zpNm41EuQEdYBWpONolFJ9ucVplXTxaDZzKwutEx1mDnqUoxmgEDoV3u2i5HKm7s75Q3D1X/W/Yt05m4mBycodCM3qU9z5NjuiurrJ/qTH3KzXfivsVWFpWUvLCbedu2ZACdxTOdqrPT8fCjr2CmVuZHN0cmVhbQplbmRvYmoKMzIgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAyMTggPj4Kc3RyZWFtCnicPVC5jQQxDMtdhRpYwHrtqWcWi0um//RI+fYi0RZFUio1mZIpL3WUJVlT3jp8lsQOeYblbmQ2JSpFL5OwJffQCvF9ieYU993VlrNDNJdoOX4LMyqqGx3TSzaacCoTuqDcwzP6DW10A1aHHrFbINCkYNe2IHLHDxgMwZkTiyIMSk0G/65yj59eixs+w/FDFJGSDuY1/1j98nMNr1OPJ5Fub77iXpypDgMRHJKavCNdWLEuEhFpNUFNz8BaLYC7t17+G7QjugxA9onEcZpSjqG/a3Clzy/lJ1PYCmVuZHN0cmVhbQplbmRvYmoKMzMgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCA4MyA+PgpzdHJlYW0KeJxFjLsNwDAIRHumYAR+JvY+UZTC3r8NECVuuCfdPVwdCZkpbjPDQwaeDCyGXXGB9JYwC1xHUI6d7KNh1b7qBI31plLz7w+Unuys4obrAQJCGmYKZW5kc3RyZWFtCmVuZG9iagozNCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDUxID4+CnN0cmVhbQp4nDM2tFAwUDA0MAeSRoZAlpGJQoohF0gAxMzlggnmgFkGQBqiOAeuJocrgysNAOG0DZgKZW5kc3RyZWFtCmVuZG9iagozNSAwIG9iago8PCAvR
mlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDE2MCA+PgpzdHJlYW0KeJxFkDkSAzEIBHO9gidIXIL3rMu1wfr/qQfWR6LpAjQcuhZNynoUaD7psUahutBr6CxKkkTBFpIdUKdjiDsoSExIY5JIth6DI5pYs12YmVQqs1LhtGnFwr/ZWtXIRI1wjfyJ6QZU/E/qXJTwTYOvkjH6GFS8O4OMSfheRdxaMe3+RDCxGfYJb0UmBYSJsanZvs9ghsz3Ctc4x/MNTII36wplbmRzdHJlYW0KZW5kb2JqCjM2IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMzM0ID4+CnN0cmVhbQp4nC1SS3LFIAzbcwpdoDP4B+Q86XS6eL3/tpKTRUYOYPQx5YaJSnxZILej1sS3jcxAheGvq8yFz0jbyDqIy5CLuJIthXtELOQxxDzEgu+r8R4e+azMybMHxi/Zdw8r9tSEZSHjxRnaYRXHYRXkWLB1Iap7eFOkw6kk2OOL/z7Fcy0ELXxG0IBf5J+vjuD5khZp95ht0656sEw7qqSwHGxPc14mX1pnuToezwfJ9q7YEVK7AhSFuTPOc+Eo01ZGtBZ2NkhqXGxvjv1YStCFblxGiiOQn6kiPKCkycwmCuKPnB5yKgNh6pqudHIbVXGnnsw1m4u3M0lm675IsZnCeV04s/4MU2a1eSfPcqLUqQjvsWdL0NA5rp69lllodJsTvKSEz8ZOT06+VzPrITkVCaliWlfBaRSZYgnbEl9TUVOaehn++/Lu8Tt+/gEsc3xzCmVuZHN0cmVhbQplbmRvYmoKMzcgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAzMjAgPj4Kc3RyZWFtCnicNVJLbgUxCNvPKbhApfBPzvOqqou++29rE70VTDBg4ykvWdJLvtQl26XD5Fsf9yWxQt6P7ZrMUsX3FrMUzy2vR88Rty0KBFETPViZLxUi1M/06DqocEqfgVcItxQbvINJAINq+AcepTMgUOdAxrtiMlIDgiTYc2lxCIlyJol/pLye3yetpKH0PVmZy9+TS6XQHU1O6AHFysVJoF1J+aCZmEpEkpfrfbFC9IbAkjw+RzHJgOw2iW2iBSbnHqUlzMQUOrDHArxmmtVV6GDCHocpjFcLs6gebPJbE5WkHa3jGdkw3sswU2Kh4bAF1OZiZYLu5eM1r8KI7VGTXcNw7pbNdwjRaP4bFsrgYxWSgEensRINaTjAiMCeXjjFXvMTOQ7AiGOdmiwMY2gmp3qOicDQnrOlYcbHHlr18w9U6XyHCmVuZHN0cmVhbQplbmRvYmoKMzggMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAxOCA+PgpzdHJlYW0KeJwzNrRQMIDDFEOuNAAd5gNSCmVuZHN0cmVhbQplbmRvYmoKMzkgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAxMzMgPj4Kc3RyZWFtCnicRY9LDgQhCET3nKKOwMcf53Ey6YVz/+2AnW4TYz2FVIG5gqE9LmsDnRUfIRm28beplo5FWT5UelJWD8ngh6zGyyHcoCzwgkkqhiFQi5gakS1lbreA2zYNsrKVU6WOsIujMI/2tGwVHl+iWyJ1kj+DxCov3OO6Hcil1rveoou+f6QBMQkKZW5kc3RyZWFtCmVuZG9iago0MCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDM0MCA+PgpzdHJlYW0KeJw1UjluBDEM6/0KfSCAbtvv2SBIkfy/DanZFANxdFKUO1pUdsuHhVS17HT5tJXaEjfkd2WFxAnJqxLtUoZIqLxWIdXvmTKvtzVnBMhSpcLkpORxyYI/w6WnC8f5trGv5cgdjx5YFSOhRMAyxcToGpbO7rBmW36WacCPeIScK9Ytx
1gFUhvdOO2K96F5LbIGiL2ZlooKHVaJFn5B8aBHjX32GFRYINHtHElwjIlQkYB2gdpIDDl7LHZRH/QzKDET6NobRdxBgSWSmDnFunT03/jQsaD+2Iw3vzoq6VtaWWPSPhvtlMYsMul6WPR089bHgws076L859UMEjRljZLGB63aOYaimVFWeLdDkw3NMcch8w6ewxkJSvo8FL+PJRMdlMjfDg2hf18eo4ycNt4C5qI/bRUHDuKzw165gRVKF2uS9wGpTOiB6f+v8bW+19cfHe2AxgplbmRzdHJlYW0KZW5kb2JqCjQxIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMjUxID4+CnN0cmVhbQp4nC1RSXIDQQi7zyv0hGan32OXK4fk/9cIygcGDYtAdFrioIyfICxXvOWRq2jD3zMxgt8Fh34r121Y5EBUIEljUDWhdvF69B7YcZgJzJPWsAxmrA/8jCnc6MXhMRlnt9dl1BDsXa89mUHJrFzEJRMXTNVhI2cOP5kyLrRzPTcg50ZYl2GQblYaMxKONIVIIYWqm6TOBEESjK5GjTZyFPulL490hlWNqDHscy1tX89NOGvQ7Fis8uSUHl1xLicXL6wc9PU2AxdRaazyQEjA/W4P9XOyk994S+fOFtPje83J8sJUYMWb125ANtXi37yI4/uMr+fn+fwDX2BbiAplbmRzdHJlYW0KZW5kb2JqCjQyIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMTc0ID4+CnN0cmVhbQp4nE2QSQ5DIQxD95zCF6iEM8DnPL+qumjvv61DB3WB/OQgcDw80HEkLnRk6IyOK5sc48CzIGPi0Tj/ybg+xDFB3aItWJd2x9nMEnPCMjECtkbJ2TyiwA/HXAgSZJcfvsAgIl2P+VbzWZP0z7c73Y+6tGZfPaLAiewIxbABV4D9useBS8L5XtPklyolYxOH8oHqIlI2O6EQtVTscqqKs92bK3AV9PzRQ+7tBbUjPN8KZW5kc3RyZWFtCmVuZG9iago0MyAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDc1ID4+CnN0cmVhbQp4nDO1NFIwUDA2ABKmZkYKpibmCimGXEA+iJXLZWhkCmblcBlZmilYWAAZJmbmUCGYhhwuY1NzoAFARcamYBqqP4crgysNAJWQEu8KZW5kc3RyZWFtCmVuZG9iago0NCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDE0MSA+PgpzdHJlYW0KeJw9j8EOwzAIQ+/5Cv9ApNgpoXxPp2qH7v+vI0u7C3oCY4yF0NAbqprDhmCb48XSJVRr+BTFQCU3yJlgDqWk0h1HkXpiOBhcHrQbjuKx6PoRu5JmfdDGQrolaIB7rFNp3KZxE8QdNQXqKeqco7wQuZ+pZ9g0kt00s5JzuA2/e89T1/+nq7zL+QW9dy7+CmVuZHN0cmVhbQplbmRvYmoKNDUgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAyMTUgPj4Kc3RyZWFtCnicNVE5DgMhDOz3Ff5AJIwveE+iKM3+v82M0VYewVyGtJQhmfJSk6gh5VM+epkunLrc18xqNOeWtC1zgLi2vC+tksCJZoiDwWmYuAGaPAFD19GoUUMXHtDUpVMosNwEPoq3bg/dY7WBl7Yh54kgYigZLEHNqUUTFm3PJ6Q1v16LG96X7d3IU6XGlhiBBgFWOBzX6NfwlT1PJtF0FTLUqzXLGAkTRSI8+Y6m1RPrWjTSMhLUxhGsagO8O/0wTgAAE3HLAmSfSpSz5MRvsfSzBlf6/gGfR1SWCmVuZHN0cmVhbQplbmRvYmoKMTUgMCBvYmoKPDwgL0Jhc2VGb250IC9EZWphVnVTYW5zIC9DaGFyUHJvY3MgMTYgMCBSC
i9FbmNvZGluZyA8PAovRGlmZmVyZW5jZXMgWyAzMiAvc3BhY2UgNDYgL3BlcmlvZCA0OCAvemVybyAvb25lIC90d28gL3RocmVlIC9mb3VyIC9maXZlIC9zaXggNTYKL2VpZ2h0IDY1IC9BIDY4IC9EIDc2IC9MIDk3IC9hIC9iIC9jIC9kIC9lIDEwNSAvaSAxMDggL2wgMTEwIC9uIC9vIDExNCAvcgovcyAvdCAvdSAvdiAxMjEgL3kgXQovVHlwZSAvRW5jb2RpbmcgPj4KL0ZpcnN0Q2hhciAwIC9Gb250QkJveCBbIC0xMDIxIC00NjMgMTc5NCAxMjMzIF0gL0ZvbnREZXNjcmlwdG9yIDE0IDAgUgovRm9udE1hdHJpeCBbIDAuMDAxIDAgMCAwLjAwMSAwIDAgXSAvTGFzdENoYXIgMjU1IC9OYW1lIC9EZWphVnVTYW5zCi9TdWJ0eXBlIC9UeXBlMyAvVHlwZSAvRm9udCAvV2lkdGhzIDEzIDAgUiA+PgplbmRvYmoKMTQgMCBvYmoKPDwgL0FzY2VudCA5MjkgL0NhcEhlaWdodCAwIC9EZXNjZW50IC0yMzYgL0ZsYWdzIDMyCi9Gb250QkJveCBbIC0xMDIxIC00NjMgMTc5NCAxMjMzIF0gL0ZvbnROYW1lIC9EZWphVnVTYW5zIC9JdGFsaWNBbmdsZSAwCi9NYXhXaWR0aCAxMzQyIC9TdGVtViAwIC9UeXBlIC9Gb250RGVzY3JpcHRvciAvWEhlaWdodCAwID4+CmVuZG9iagoxMyAwIG9iagpbIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwCjYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgMzE4IDQwMSA0NjAgODM4IDYzNgo5NTAgNzgwIDI3NSAzOTAgMzkwIDUwMCA4MzggMzE4IDM2MSAzMTggMzM3IDYzNiA2MzYgNjM2IDYzNiA2MzYgNjM2IDYzNiA2MzYKNjM2IDYzNiAzMzcgMzM3IDgzOCA4MzggODM4IDUzMSAxMDAwIDY4NCA2ODYgNjk4IDc3MCA2MzIgNTc1IDc3NSA3NTIgMjk1CjI5NSA2NTYgNTU3IDg2MyA3NDggNzg3IDYwMyA3ODcgNjk1IDYzNSA2MTEgNzMyIDY4NCA5ODkgNjg1IDYxMSA2ODUgMzkwIDMzNwozOTAgODM4IDUwMCA1MDAgNjEzIDYzNSA1NTAgNjM1IDYxNSAzNTIgNjM1IDYzNCAyNzggMjc4IDU3OSAyNzggOTc0IDYzNCA2MTIKNjM1IDYzNSA0MTEgNTIxIDM5MiA2MzQgNTkyIDgxOCA1OTIgNTkyIDUyNSA2MzYgMzM3IDYzNiA4MzggNjAwIDYzNiA2MDAgMzE4CjM1MiA1MTggMTAwMCA1MDAgNTAwIDUwMCAxMzQyIDYzNSA0MDAgMTA3MCA2MDAgNjg1IDYwMCA2MDAgMzE4IDMxOCA1MTggNTE4CjU5MCA1MDAgMTAwMCA1MDAgMTAwMCA1MjEgNDAwIDEwMjMgNjAwIDUyNSA2MTEgMzE4IDQwMSA2MzYgNjM2IDYzNiA2MzYgMzM3CjUwMCA1MDAgMTAwMCA0NzEgNjEyIDgzOCAzNjEgMTAwMCA1MDAgNTAwIDgzOCA0MDEgNDAxIDUwMCA2MzYgNjM2IDMxOCA1MDAKNDAxIDQ3MSA2MTIgOTY5IDk2OSA5NjkgNTMxIDY4NCA2ODQgNjg0IDY4NCA2ODQgNjg0IDk3NCA2OTggNjMyIDYzMiA2MzIgNjMyCjI5NSAyOTUgMjk1IDI5NSA3NzUgNzQ4IDc4NyA3ODcgNzg3IDc4NyA3ODcgODM4IDc4NyA3MzIgNzMyIDczM
iA3MzIgNjExIDYwNQo2MzAgNjEzIDYxMyA2MTMgNjEzIDYxMyA2MTMgOTgyIDU1MCA2MTUgNjE1IDYxNSA2MTUgMjc4IDI3OCAyNzggMjc4IDYxMiA2MzQKNjEyIDYxMiA2MTIgNjEyIDYxMiA4MzggNjEyIDYzNCA2MzQgNjM0IDYzNCA1OTIgNjM1IDU5MiBdCmVuZG9iagoxNiAwIG9iago8PCAvQSAxNyAwIFIgL0QgMTggMCBSIC9MIDE5IDAgUiAvYSAyMCAwIFIgL2IgMjEgMCBSIC9jIDIyIDAgUiAvZCAyMyAwIFIKL2UgMjQgMCBSIC9laWdodCAyNSAwIFIgL2ZpdmUgMjYgMCBSIC9mb3VyIDI3IDAgUiAvaSAyOCAwIFIgL2wgMjkgMCBSCi9uIDMxIDAgUiAvbyAzMiAwIFIgL29uZSAzMyAwIFIgL3BlcmlvZCAzNCAwIFIgL3IgMzUgMCBSIC9zIDM2IDAgUgovc2l4IDM3IDAgUiAvc3BhY2UgMzggMCBSIC90IDM5IDAgUiAvdGhyZWUgNDAgMCBSIC90d28gNDEgMCBSIC91IDQyIDAgUgovdiA0MyAwIFIgL3kgNDQgMCBSIC96ZXJvIDQ1IDAgUiA+PgplbmRvYmoKMyAwIG9iago8PCAvRjEgMTUgMCBSID4+CmVuZG9iago0IDAgb2JqCjw8IC9BMSA8PCAvQ0EgMCAvVHlwZSAvRXh0R1N0YXRlIC9jYSAxID4+Ci9BMiA8PCAvQ0EgMSAvVHlwZSAvRXh0R1N0YXRlIC9jYSAxID4+Ci9BMyA8PCAvQ0EgMSAvVHlwZSAvRXh0R1N0YXRlIC9jYSAwLjUgPj4gPj4KZW5kb2JqCjUgMCBvYmoKPDwgPj4KZW5kb2JqCjYgMCBvYmoKPDwgPj4KZW5kb2JqCjcgMCBvYmoKPDwgL0YxLURlamFWdVNhbnMtbWludXMgMzAgMCBSID4+CmVuZG9iagoyIDAgb2JqCjw8IC9Db3VudCAxIC9LaWRzIFsgMTEgMCBSIF0gL1R5cGUgL1BhZ2VzID4+CmVuZG9iago0NiAwIG9iago8PCAvQ3JlYXRpb25EYXRlIChEOjIwMjEwOTE2MTQzNzE3KzAyJzAwJykKL0NyZWF0b3IgKE1hdHBsb3RsaWIgdjMuNC4zLCBodHRwczovL21hdHBsb3RsaWIub3JnKQovUHJvZHVjZXIgKE1hdHBsb3RsaWIgcGRmIGJhY2tlbmQgdjMuNC4zKSA+PgplbmRvYmoKeHJlZgowIDQ3CjAwMDAwMDAwMDAgNjU1MzUgZiAKMDAwMDAwMDAxNiAwMDAwMCBuIAowMDAwMDIxNDQwIDAwMDAwIG4gCjAwMDAwMjExNzcgMDAwMDAgbiAKMDAwMDAyMTIwOSAwMDAwMCBuIAowMDAwMDIxMzQ5IDAwMDAwIG4gCjAwMDAwMjEzNzAgMDAwMDAgbiAKMDAwMDAyMTM5MSAwMDAwMCBuIAowMDAwMDAwMDY1IDAwMDAwIG4gCjAwMDAwMDAzOTkgMDAwMDAgbiAKMDAwMDAxMTc5OCAwMDAwMCBuIAowMDAwMDAwMjA4IDAwMDAwIG4gCjAwMDAwMTE3NzYgMDAwMDAgbiAKMDAwMDAxOTc5MCAwMDAwMCBuIAowMDAwMDE5NTkwIDAwMDAwIG4gCjAwMDAwMTkxMzcgMDAwMDAgbiAKMDAwMDAyMDg0MyAwMDAwMCBuIAowMDAwMDExODE4IDAwMDAwIG4gCjAwMDAwMTE5ODEgMDAwMDAgbiAKMDAwMDAxMjIxOCAwMDAwMCBuIAowMDAwMDEyMzUxIDAwMDAwIG4gCjAwMDAwMTI3MzEgMDAwMDAgbiAKMDAwMDAxMzA0OCAwMDAwMCBuIAowMDAwMDEzMzUzIDAwMDAwIG4gCjAwMDAwMTM2NTcgMDAwMDAgbiAKMDAwMDAxMzk3OSAwMDAwMCBuIAowM
DAwMDE0NDQ3IDAwMDAwIG4gCjAwMDAwMTQ3NjkgMDAwMDAgbiAKMDAwMDAxNDkzNSAwMDAwMCBuIAowMDAwMDE1MDc5IDAwMDAwIG4gCjAwMDAwMTUxOTggMDAwMDAgbiAKMDAwMDAxNTM3MCAwMDAwMCBuIAowMDAwMDE1NjA2IDAwMDAwIG4gCjAwMDAwMTU4OTcgMDAwMDAgbiAKMDAwMDAxNjA1MiAwMDAwMCBuIAowMDAwMDE2MTc1IDAwMDAwIG4gCjAwMDAwMTY0MDggMDAwMDAgbiAKMDAwMDAxNjgxNSAwMDAwMCBuIAowMDAwMDE3MjA4IDAwMDAwIG4gCjAwMDAwMTcyOTggMDAwMDAgbiAKMDAwMDAxNzUwNCAwMDAwMCBuIAowMDAwMDE3OTE3IDAwMDAwIG4gCjAwMDAwMTgyNDEgMDAwMDAgbiAKMDAwMDAxODQ4OCAwMDAwMCBuIAowMDAwMDE4NjM1IDAwMDAwIG4gCjAwMDAwMTg4NDkgMDAwMDAgbiAKMDAwMDAyMTUwMCAwMDAwMCBuIAp0cmFpbGVyCjw8IC9JbmZvIDQ2IDAgUiAvUm9vdCAxIDAgUiAvU2l6ZSA0NyA+PgpzdGFydHhyZWYKMjE2NTcKJSVFT0YK\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:17.021702\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["Layer 0 - Variance: 1.2046984434127808\n", "Layer 2 - Variance: 0.5917537212371826\n", "Layer 4 - Variance: 0.2959783673286438\n", "Layer 6 - Variance: 0.24997730553150177\n", "Layer 8 - Variance: 0.2727622389793396\n"]}], "source": ["model = BaseNetwork(act_fn=nn.Tanh()).to(device)\n", "xavier_init(model)\n", "visualize_gradients(model, print_variance=True)\n", "visualize_activations(model, print_variance=True)"]}, {"cell_type": "markdown", "id": "6f1db629", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.089367, "end_time": "2021-09-16T12:37:18.189510", "exception": false, "start_time": "2021-09-16T12:37:18.100143", "status": "completed"}, "tags": []}, "source": ["Although the variance decreases over depth, it is apparent that the activation distribution becomes more focused on the low values.\n", "Therefore, our variance will stabilize around 0.25 if we would go even deeper.\n", "Hence, we can conclude that the Xavier initialization works well for Tanh networks.\n", "But what about ReLU networks?\n", "Here, we cannot take the previous assumption of the non-linearity becoming linear for small values.\n", "The ReLU activation function sets (in expectation) half of the inputs to 0 so that also the expectation of the input is not zero.\n", "However, as long as the expectation of $W$ is zero and $b=0$, the expectation of the output is zero.\n", "The part where the calculation of the ReLU initialization differs from the identity is when determining $\\text{Var}(w_{ij}x_{j})$:\n", "\n", "$$\\text{Var}(w_{ij}x_{j})=\\underbrace{\\mathbb{E}[w_{ij}^2]}_{=\\text{Var}(w_{ij})}\\mathbb{E}[x_{j}^2]-\\underbrace{\\mathbb{E}[w_{ij}]^2}_{=0}\\mathbb{E}[x_{j}]^2=\\text{Var}(w_{ij})\\mathbb{E}[x_{j}^2]$$\n", "\n", "If we assume now that $x$ is the output of a ReLU activation (from a previous layer, $x=max(0,\\tilde{y})$), we can calculate the expectation as 
follows:\n", "\n", "\n", "$$\n", "\\begin{split}\n", " \\mathbb{E}[x^2] & =\\mathbb{E}[\\max(0,\\tilde{y})^2]\\\\\n", " & =\\frac{1}{2}\\mathbb{E}[{\\tilde{y}}^2]\\hspace{2cm}\\tilde{y}\\text{ is zero-centered and symmetric}\\\\\n", " & =\\frac{1}{2}\\text{Var}(\\tilde{y})\n", "\\end{split}$$\n", "\n", "Thus, the equation gains an additional factor of 1/2, so our desired weight variance becomes $2/d_x$.\n", "This gives us the Kaiming initialization (see [He, K. et al. (2015)](https://arxiv.org/pdf/1502.01852.pdf)).\n", "Note that the Kaiming initialization does not use the harmonic mean between input and output size.\n", "In their paper (Section 2.2, Backward Propagation, last paragraph), they argue that using either $d_x$ or $d_y$ leads to stable gradients throughout the network, and that the two choices differ only by a factor that depends on the overall input and output size of the network.\n", "Hence, we can use only the input size $d_x$ here:"]}, {"cell_type": "code", "execution_count": 19, "id": "264d9108", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:18.387896Z", "iopub.status.busy": "2021-09-16T12:37:18.387392Z", "iopub.status.idle": "2021-09-16T12:37:30.823238Z", "shell.execute_reply": "2021-09-16T12:37:30.822814Z"}, "papermill": {"duration": 12.536784, "end_time": "2021-09-16T12:37:30.823357", "exception": false, "start_time": "2021-09-16T12:37:18.286573", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", " 2021-09-16T14:37:22.161434\n", " image/svg+xml\n", " Matplotlib v3.4.3, https://matplotlib.org/\n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["layers.0.weight - Variance: 3.414905950194225e-05\n", "layers.2.weight - Variance: 3.843478407361545e-05\n", "layers.4.weight - Variance: 4.713246744358912e-05\n", "layers.6.weight - Variance: 0.00010930334246950224\n", "layers.8.weight - Variance: 0.0017839515348896384\n"]}, {"data": {"application/pdf": "JVBERi0xLjQKJazcIKu6CjEgMCBvYmoKPDwgL1BhZ2VzIDIgMCBSIC9UeXBlIC9DYXRhbG9nID4+CmVuZG9iago4IDAgb2JqCjw8IC9FeHRHU3RhdGUgNCAwIFIgL0ZvbnQgMyAwIFIgL1BhdHRlcm4gNSAwIFIKL1Byb2NTZXQgWyAvUERGIC9UZXh0IC9JbWFnZUIgL0ltYWdlQyAvSW1hZ2VJIF0gL1NoYWRpbmcgNiAwIFIKL1hPYmplY3QgNyAwIFIgPj4KZW5kb2JqCjExIDAgb2JqCjw8IC9Bbm5vdHMgMTAgMCBSIC9Db250ZW50cyA5IDAgUgovR3JvdXAgPDwgL0NTIC9EZXZpY2VSR0IgL1MgL1RyYW5zcGFyZW5jeSAvVHlwZSAvR3JvdXAgPj4KL01lZGlhQm94IFsgMCAwIDg5NC4wMjUgMjE2LjY2NTYyNSBdIC9QYXJlbnQgMiAwIFIgL1Jlc291cmNlcyA4IDAgUgovVHlwZSAvUGFnZSA+PgplbmRvYmoKOSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDEyIDAgUiA+PgpzdHJlYW0KeJzVnU+TJLmR3e/1KfK4e2AQ/+E4kkZpzNZ04e6YdJDpMJqdlYbWPWvkkFzbb6/3gMgID6RXc6o7uyhwbGhVb7KQ4b9AIOCAw93f/vDy69/42//5+Yb/u7nbH/Dvf+Dnb/j7i8NvH1+kpc2FjJ8/HD8HX7ZScsGPH/Cxy6//9+Xl317c1nwtqboscpt/Sc35VlyV25/4pd88fOD45WX69MtLapvga1LY2vjCjy/49FZS9BKU/EHLvsomd31v4aL1a/7j7aFxH8omfv8H7cS8tduffrj9j9tPt1//Jgxw/4R//4B/O7iXX//uh7/++P0P//zNb2/f//xS/BZKSrFcrvhUL1fx8i8vv7/98d6w23zGTbm33X/9Zldf/vjiQe5XDv8pR0jZFR9CjbeQN+/Y3PcfX3777e3X/9XfvL99+28vbcNdqq1I5a389l9f/uftH/I/3v7X7dt/evkv38J4tznPRp366fuPbOFXv/vhD9/997/8y3c//fyrjz/+9Jefb7/799vvX37fr/b5zLxPW4ulzLf5lJ9AzXu/JbaG7lXK69jcAcudsL6a5UW24kR8ulp+ys+wvNBWtBaz/CLDszZcNVTQpURqleSav7Wt6jbCtY3f/CO+d2uhOon83+0fvv/zj3/97s8//vtPt79+9+Hnr8/2lw8cn4c1hN5+lK1KbrF+sitt79CZ9hZr3FJxrqarwaf8hQb3XuCiFGlOPmmzfzebvYtbyymHyWilf6HVDd29JokliP/0nQ7vZ3VMW0rS8Ba4Wn3qX2i1j7CntCC51BA/aXZ8P7ML3gsppjCbfepfanZxWy2xRrxlP211ug6VbONXbA1jX0RDIhhso7+OttNI+bsffvr5xz//5xfii8PM0PBotuBzxUwqjTlViJgAbOU+q0p9EobbGqKElPLO+vLHN/3HL9Mfv7zkvGXMDMFGj665btLwqMllcL2q1bvkqV5auKuYlf32/3
/jDTMxw0MLDvc6Xoy/qpjk5vKIZFeXMN4yE8aHDU8KLu5q/EWtDc/nI5JdXcN4w0wYn3B9objpzl9UzoPiA5K7uobxhpkwvm7eS5Tpzp9qpGtR0PoFiVLXMN4wE8a3LYXka7oaf6owM/mUwgOSu7qG8YaZH1+q32poLbeL8UqFmbXF7B+Q3NUljLfMhPFpczHXdL3zSo1byDn2cf2C5FDXMN4wE8ZjbpN8DtOdP1V4GgE+a5yRHOoaxhtmwnh45qlGP935U4XD4aXibk9IDnUN4w0zP76IgzcVXbveeaWmLZac+1xOtaDUJYy3zITxfHpF5HrnlZo3NCIzEqWuYbxhJozHfL2kUqY7f6rwwioGN7kiUeoaxhtmwnjM16tLOV+NP1XZ4Cu6WK5IlLqG8YaZH18avORawrXXH2L2mNBGNH4BotUlTDeMhOVhSxJcuN72U4UP69BeaxcgSlzDdMNK2J62KlX81fS7CB8OrnvwcuGh1TVMfzQSltfNYYLaprt+qEU2ByfGhwsQJa5humElbG9bbC1enblDrHULFXO5Kw+trmH6o5EfX7zDLN1l+KTXLcVTxhMeXeCWjEKixSWMN+2k+Zine9dm6+8qfwrNdztVE1pexPxHO2k9Juq+1Mum7Qct++C2mhOmsVcsWl4EgGEqCWC2HkIOfiJwyD4K/LfaSpzIaH0RBIat3G3GnD1IdFMnOGWfMdzDoYsyodH6GggsW4kAM/fYN3evCA7Zl4Ivxp/HCY3WF0Fg2EoEmL/HJnXuBYfsa8Dgn9DWhEbriyAwbCUCzONTLmXuBYfsS8OND87NaLS+CALDViAI3Br0KU294JR9bpjs+jaBOdU1zLfspPmY0+ca4tQDTtmn2Hdd+1arxqL1RRAYthJB4g6w83VCcMg+YArsorQwodH6IggMW4kAhhQRN0XZnTJfAA7Dn8QJjdYXQWDYSgSY6MO1kbkXHHKLmABiElCvZLS8CADDUgCIfitook594JRr20op1aUrFy2vAcCylADi1qTw768ADhmuvziXYrxy0fIiAAxLCaDQxWtp7gGHXDxG/5JcmbgoeREAhqUEIFtutYa5BxxyZkQl93avXLS8CADDUgBImOa7mP3UA045e0yCMQEoExclrwHAspQAMMl3Dc1MAE5ZMPCnsautG1HyIgAMSwkAU3yfncQJwCkz3C2GHtChG1HyIgAMSwkAU/zgpMw94JS57J/F1YmLkhcBYFgKABlT/FC4e38BoOS0Rd+yaxMXJa8BwLKUADDBjyFNa+NahufTWooTllNdxHzDTprPJS4JYb7/pxy3jLn/WB68UDnkRQAYlhIA5vYpOTff/1PGqFdiqu2ByyEvAsCwlAAwt0+ttrkHnHLcvK+hPGC5q4uYb9jJEzmY2eec6zUAQssM6yvSYx2uVA55DQCWpQSAmX3B9DZNAE5ZBXVOXJaK9TQtJQAucVef5h5wkdsIc5raaAsFP5l2jiNpuYYW5/t/ke+B/FMjS8X3m5byoInbpEr10/2/yvdg/pnLF8b4Xw/HbPn2Hy+vWnTF8c/f3K6HaObDG81jdvrxBWY4RiuN+OQqufQIRu5p8N71OT14hhHWiJ/Ed7VkH/v0Rhzf9IFuT9pygRkUOQw0aX0iSEvHRyNarY1+Ztmca2hhRE5VBksCZAW7FHsMjQBLjU16lFVxqdYeYgbPKrKDEWRrcbTQGHYkoE7fG3/vu+fVwlZKTT7fMv2xUvqqXEtbcjmldCv475LxjqLao7d8vKH/AhIPivZYh+iAufEMnW8u9BjWJjxPJynfKn4SqW5sC2Lq1/BbHGd/QhovRBfgEpSQuGce8ImIOdG+lSwxlr5oKlvA72M92dUNs0iQu3FFAWaP6+Ouo+C6vNzwbG0ue+/3nbgSKj42tiA49Ri65+HEGmM/HFVbkjZ03HnYjk7LPRvwJcqxnZVDrIUnqAEz5jrmMz71c0wu920utIVL2/XC50z651trPsi+JyTJ865QRw
cpY6fAo+sE3sQbnxsGCTq/6/jjfoZLOFy6fmCPSHyW3DcY41ad83HsuuH1ioaBijONgI41/I6A56BmPI6dTobpu44bj94d6w03GLev9H7Mw22YpVZ8Bp1XWpYxdkWP+xkjY1sSD/25fW0P/QQXL/0cbHEujhseIxf6aqm3graFz+C+DoTL97Gx10Vh+HuXGS3Tw0HZRdHLy33RJMWMu8nD3aDeY6h8AlKgHLGEeLLT8C4xuW54aEJ/TvDEj7XXvvPS8vC6edUDIT7jQ8aTxydQWnNjmQoeSsG8fAQsC65+eK4Z/QUPWX+wgws9nomTWby+cSGUY0ptb0LwvAb06j5i8NS+3N988H32Ux6ulWQNk+hNn5Cvh/cCxz3ctn0o/5unfK3z/mjRPPz78dWsAfiLN50gfvzWT7btYNQvPZWIkUQybjn/8Tcv6d7S64cK/9t3//nDn/CbPlW4J354S6aGkR5iztjwkOzhmrEhBAyYj4Gk6L+bFylTlI2SL4xUK4+ZG9g7jg/wn899lb5YfQqPGh9wmWaASn7CqXxefmoeIzL///3TOHxFfMwZggc/XteRlfwMfI3zDbSGVzta/Jx0Dl+PQOQCiGeahOkJOOQnEIh4KTS2hlH9i9I6BLwwhMemA4/KPzmvw1fsZm8aZj6zl2E6p67+c3M9vAuEyrlbwMzmwuBQn4SgcGUjSAyVa1pvyv7wLhR6cCYmun7qC0p/EonWOJ+qoWK+Wt+WEeJ9SETM22KJuJ4riVN/EglOpBkAFbjmW96WJeJ9UDADTosBjsQVxak/C0XxeAnHjHG5/Y08Ia+mjmhxnKMZLX5BConPZMvVll+6vPC3V1s+uVSDySL8J/TINgUfca6VkqtT8NEkw83pDuHUyF0+VlvWAGFYTBB1g5eYphP2kywtdOdwauQuLwbCsBggEl6suac8u4BQMv3N0no+Cd2IktcCYVlMEGGLWfBnE4hThsWxpL7UM/G5y4uBMCwmCLzISpQ094hT5uZLiK0+8rnLi4EwLCaIwoXcEuYeccrchYuhb8Ne+RzyYiAMiwmi8TAaF3CvIE65r+COVaaJz11eDIRhMUDwBK54P4UsaTlu8BxzbA98DnktEJbFBIHJlJQ2hS5pGQ5Sw7TNP/A55MVAGBYTREEXD3U6wa9lLtxK6PsVVz6HvBgIw2KCkC01yTlNIE45bcmHEbKhG1HyYiAMiwGiuE0YoTz1CCVnpmwJoU18lLwWCMtiguBr0Pkpx4GWywY/P6WZj5IXA2FYTBC4s547DROIU0Yn8C73E066ESUvBsKwmCDgPQRf2twjDpkvCp5/lysfLS8GwrAYIBgWMJb6LiBOOafuZtaZj5LXAmFZTBDwHiKmylOPOOXCDCApjtfn2YiWFwNhWEwQXD1sLk8c7irz3Dmf+3aypqPkxTA82ksKhQEUEmcMh8zcX41bMxc6Wl0Mg2EwOcBzyK5Mq3SHKlyF8XG8L84mtLwYhkd7QUEw7POs09QbTplhQYlxTnLBc5HXAmGZTBLwG0qYYuROlYFBtTEeZeKj9cVAPFpMDnAbSm117hGHzLOSqeZS6hWQlhcDYZhMEpgb1jjlUzhVD7cq5OrHGp3io/XFQDxazOgHt9XaN9gvIE6ZoWxRcg/9UnyUuhYGy2ByiDwSFafcAkpm7HDFdNpPeJS8GAjDYoLImAt476cn45QZOunxc2oTIK0vhsKwmSgqPsnPTigOmSaLYz7RR0SHvhgKw+aPLxEzAudCkSlI7pR7JGz2DGa9ItL6UihMm4kC3oOTVP2E4pAZ3NqS+JmQkhcDYVhMEHAffAp57hOHzACMFhNcrQmQ1hdDYdhMFIw9c/1EwgXFIfdQdcylSpgQaX0xFIbNRAEvImQJc684ZIwNzaXaI/JVI1peDIRhMUB4nlTwZcpPoGQJmzAHebvy0fJaICyLCQJeRKzc176COOQK2Yec0pWPlhcDYVhMEHAjHoP0lcxSFC2M8x6qES0vBsKwmC
DgRiRpZe4Rh5z5qgyuTxpUI1peDIRhMUCMRJ01Tz3ilLnh6cTXCc+proXBspcY4EQUl+PUH04581RWrP2Em6aj5MVAGBYTBFyIUmKY+8Mp94zO/XSWbuNUF8Ng2EsMlQn6eQL0iuGUMSTkVPoxOt2IkhcDYVhMEI1hozJlN9AyXg2SWj9WqRtR8mIgDIsBIoYtSCz1Gjik5dQz3feEBlc+h7wWCMtigkiPNe4+XGS8GxpjZCY+Sl4MhGExQRSrrJ+W45Z9kh4zduVzyIuBMCwmiGaV+NNy3CIPdssDn0NeDIRhMUAkb5X70zKzIvvYa81e+RzyWiAsiwkiWqX/tMxcCr71o9hXPoe8GAjDYoIoVhlALauQ64nPkpHYpsUEIVZJQC2r0wgTn2cdUtBno156spDbL8TymCxkPlnDg2N76L2HRXGv6MfQ4RFHHHPpmfUx5uGRH+HF8KKSR3cPlGuQESwEHzOVGkOfOyQpEmQPr4sl9UQk3BNOJYQuM0uH5xpY3by4kVKNIWiSeKi4z8zRxtgrYSHgJpXeMDPSturu4Ulc8MjplsOWa4tuNMK9tiQ9aCkKU3B0tW0lxhzKLZet+Oj6yiEjeyosYdZLLiQx/K/LfmvZMdEJLjQx9KdvX9WIq2Yc1KgQ42utezyIpIpGbz3BihTZ4yOy5JIC01H44PYYGmEORebnENlqzbnuIQSJ969ntPCYi/b8GkEaXzUeHcrnzAMwI+8Ot9Hg0Ar6H0sbN9AcgSnNbyVJHifhAt5W3ZgW4PXWwHzeuOxaSh1xbo1JjXKLIzsIxvQQx+fxzk+ZwV59PwaX62Tfp/FohclK6jh8MaKjoFd0xjIlzQ+tcBLFfBZjX8f5cagJOldu0541BI9tT4bCzQ/mMGn9MsWj543LYaIYV3pC+sT0/CGP7aQGd7bFMrKGCE/UdzfX+S21GplaAV1OQHgsAzgOU4m563oCEXS58WZ3ecuwltlHfOafjswX0eOusc1yEzwduIFjRuSZAAW9ONxGSbyS76tyTnBp/sYCkfCq2liVw6OM7g0XszA6IuehVu74MW0Js4lU3M+0L1TA2ohnBX3UYXgJZffbXeTJeK5fMA/jGHaD35iKEUZnWJww2IxP89Zw/YtevivZyfBxEnwcz/wyeOJKLwDe5YLrkzRyW6Jr1+EZopuEkkfyH4BqtezugUj2I/1PxoNVxowJvS7ge7v7hIEhjwuJuJ2+MFMYRw+8HaPf5w81JjeqKKLLx36YOTLHJKOJxqo4Qw3bNNJCltwD+G31DVlGXjlr/lqOCrRsHkP/+Gq2C/zFm8+z29/+ye94S9aRUMqW9pcP999/edqR8HdJO4JrxX9/iOXN6G31saiJki+QVCt22pHzA+76VV98YDn5sLns4lRrVslPyBqBNyvAOYwa7ZN5M75W2pGviI/Rxxj1p8SVSn4GPr5R2BrGefeLsm48JIP4egRywERL+vv88gSc8hMIYDjHCMrWPHOYfUHeEYzInJz1LvbcpCNfsZO9aZD5zMQuteqr/9ykI+8CgfWdnWPXukA45SdBKMyu5THPw6TwbVlH3gWDx5wQUy5Mta4clP4kEAwZBldmRMTL721pR94HBfMVVo8p64Ti1J+FAo1L8/BlHGeHb0o78j4o+qmHwrx9VxSn/iwU8Drh1WGYyPQ6PiftSIQbks4WvyDtyGeyfdf1m8SMmfJQ7oVzLaPcyySjF/dbNzVyl88kr0uAMCwmiLwZZV9elU1si4EwTCOIapV/meRaXV8bmBq5y4uBMCwmiGaVgZnkxlzXj3zu8mIgDIsBIgSrHIyW4+bCHkakG1HyWiAsiwkiWWVhtNxj1kfK5onPXV4MhGExQRSrPIyWuQ7nepTAFc9dXQyDYS8xNKtIjJZhcMqtZ0ie6NzlxUAYFgMEM0g/FovRMrOLpNYX4a98DnktEJbFBBGtsjFaZr24VkUe+BzyYiAMiwmiWOVjtMwtKzgFaeKj5MVAGBYThFhlZL
RctsAj42Xio+TFQBgWA0RyVkEZLXNzJ4yEpboRJa8FwrKYIPjLQ2EZJffdqjyqSqtGtLwYCMNigshWgRklMxcRpg9jGqH4KHkxEIbFBFGtUjNK5tZorFKveJS6GAbDXmJoVsEZJbN0BytVzHSUvBgIw2KAyGHzgkd96g+nLB5fkpL4Kx8trwXCspggEkt/BJl6xCmz1k1Fw/7KR8uLgTAsJoiyVX5s7hGHjCkTw3BHZIFq5aIvhsKwmShaLzNcJhJ31QfPmtw+TXyUvBiGR3u5yYvJgGOVrSuGU+7bD3hPjNWpE4+W1wJhmUwS8B16Sa4riLvaa5MyUidNfLS+GIhHi8mBEVH8+wnEIfPAfBTfeqSeAqTlxUAYJpMEIwhnDEPrcXzR1x5jqdlofTEIs7UgUPGxiI9MEE6ZcYiuMXXfBY4S14JgGUwO+CWWeXPrUHvVRt/DRK90tL4YiEeLyQE+Q8IfzP3hkFnBOKccx0TqBKTlxUAYJpMEvAZMkKd51KHy9ZjzON6m8Sh5MQyP9pICXIaMBqZJ1Cl7bmfl4Gc8Wl4MhGEySAh8hox33zWOScksX9nK3iFOPqe6FgbLXmKAw1ByalOHOGVpaDe4fgBWNaLlxUAYFhMEa0n6IHN/OOSKV4OrrUeYq0a0vBgIw2KCaJvjdv7cIw651B507+qVj5YXA2FYzOKacBwlSJ56xCmzAC0LgNYrHy2vBcKymCD62Z45Ll/JTB7RoqQJz6kuhsGwlxhwSS1x9fmK4ZDhWIaAZutER8mLgTAsJgh4Dc6FKZeClivr6PZcXLqNU10Mg2Hvx5fs8EmXW7uGCWk5bzWw8OlER8lLgTAtJojA6s088XcFccqJlbdjX4PTjSh5MRCGxQQBv8HXXOYeccqJFc5dP4155XPIi4EwLCaIyurqDBK9gjhlFjkfR1t1G6e6GAbDXmJgLH/zce4PpxwxjWxjKfJK55AXA2FYDBA+PJY+/HCRI96RIj3575XPIa8FwrKYIOA3PBZ71HLkFnfrR2SvfA55MRCGxQRRrGKPWo485+t75vwrn0NeDIRhMUE0q9ijlpkwIZd+znjic5cXA2FYzLN73ir2OMn3kwdTI0seSDAtHocYjWKPk3wPuJ/5rBiHb1pMELikx2KPk3z0iInPk3rEM1OMPJ6icfUeZl9qGO87NCx72DBPJrc48irhb0c2kRQDviUnV8a0MY0MBIykFKhjWp2K30NuIwaM7GS4HbDYuRF3ydS4LY0oRPF+pFhPEU9VEc/AtbpJLaVnIGAoWmDC6cwsI74m31/IKdHVzzl3z1dSzmMFgO8q0JCxAAAw48Np68X1epYRjGO5Fy5O+Hr4ArhvTCfiW9436PD1NUvEDKj09TXZL0R4MN75G5OTiMRx8CD3SDkeB6tti94xZwnluOHaak43pvMIuGHdxlxZsUj8SGpRML+o+0Z5C8zt0cuGu5TLOPnEwtlRCtfAS9pycW6AhV4Bqm8m9UNudddxva1ydxUsmdfD7ypxctcVlx6SHwksuAHnYo6h7QlFpLS46/hK5kK5JA7BrBCQkuPVUE8uSN71WiOzpLCdnPa60Nzait0mfit8zp7Egiprgfc9YGEA7kgpkfinntaOGgToLem+LVTgnqDfsup3j+DuV1lxg5nkhklYeMI5yoi+qIzTY9IXz2jO7GNp+yo6cPecLR7GljiuUfDgVjTSWC0n4DkYFyl94YC5U2piopw8+hE+49F3cHNoM2bJY/mteZ65cKknCKme+TK6DEhoPPaUNyH7fk43tYQblpnvJDM4hMs0+1JFKIlpkhlZFasvZffcWR6z9Op4MH2UwUuNfa1x4zHzxG50d+8/obdy7Sj3ON6Sdl8PfaWN8GfGt4bh+qBjZnS9XooRz27uEVwZ40RjrMaeR6xfdA6p/9z2sav1W2AMhu11+Q15QF45/v1a0gi0bJ4M//hq+gn8xZuPmNvf/snveEseEI5c7U35P9LfJf9HlsARfg6rrU
z4/Vh4RckXOKoVO//H8QH+kz7rjfla+gZmY4pe5pnPKT8jfQPGFcwQMRAwzuPd0398PXoFwxDGRXHX9WklP4FeYdO9tcA9jc9J//H1CGAGgGG9v9qvD8AhP4NAY+ostoZR3n9J+o+C68LYjnd+T2H03AwgX/EpfdMw85kPac766j83A8i7QOAsSOBftSuEU34SBEzJsutzKXS9t6UAeRcODIFg6XQfryCU/iQSmHlx+S/Ar8FFvykDyPuQgC/mQ6MTcCVx6k8i4UNl2sVQmNUvvS0DyPuggGOAWw/XZEJx6s9CkZmTHG5OjPAiPysDSKYfebb4BRlAPpPt+y6viFjVZDCNtqrJaJmrEW6koNSNKPlYXlkChGUxQQSrmswkN/j3MvFR8mIgDIsJIlvVZLTMHLlw+ePER8mLgTAsJgjmW32oJqPlCwiTz2IgDIsJolnVZLTM3Pf46/DA55AXA2FYDOcKcxujnoyWYXFhdtZHPnd5KRCmxQSRrHoyWk54PcSWZj5KXgyEYTFBFKuijJbTRt+5R3Vc+RzyYiAMiwlCrIoyWmbi7io7iLMRJS8GwrAYILy3KspoOW+SCmMjZz6HvBYIy2KCiFZFGS3jRRHQZJz4KHkxEIbFBJGtijJaFibZlx7poxtR8mIgDIsJQqyKMkruO4jRx4mPlhcDYVgMEMFZFWWUzET21d9BnHyUvBYIy2KCCFZFGSXD6y+l1OInPkpeDIRhMUFkq6KMkjMre0gbr0/FR8mLgTAsJohqVZRRcmHlNTe2Q1UjWl4MhGExQTQGiLdLdvwPWmYdjZSzn/hoeTEQhsUAEQOjI1h45gLilMVtTUJ1Ex8trwXCspggGDTgc5h6xCk3piGvuU58tLwYCMNigigsLBP93CMO2XNazbThE6CLvhgKw2aigEE5ujb3iUOmDTVzO21CpPXFUBg2c+ebYSwiMvWKU/asXupHXJgmpOS1QFgWEwRXm1IpU584ZZ9ly6H2Cl+aj5IXA2FYTBDwIOYKQx+0zPC85LLrgW8akNYXQ2HYTBTS69XOz8ZdvcQXakBaXwzEo8XgAFcySXBh6hKnzDjOyqys9QpIy2uBsEwmibBVqdNJ91NlgqTgaqkzH60vBuLRYnLIm2sxt7lHHLKPDMQbhTwUH6UuhsEwmBwqnvQ25XA+Vc/TN+j/PXOYpqP1xUA8WkwOcCAcQ9MnEIfcSq+r2Isgnm1odTEMhsHgUAJLXs6PxaHCMnQAGS+Mswktr4XBsJcUEmuo1inrgZLhbmPG0Hqh07MNrS6GwTCYHAqj9vOU9kDJ+DvmWkxXOkpdDINhLzHAawgS3dwdDhlT6OaSa+lKR8uLgTAsBgh08PAYna9knruJzveDPJqPktcCYVlMEDSoSZ16xClz86Lm3HMkaT5KXgyEYTFBcP8ylzL3iFNm3en9WJtuRMmLgTAsJgj4DNmnNPeIUy4bvtCPfU/ViJIXA2FYDBCCP8s1xKlHKDmz4m7yZeKj5LVAWBYTRGDFazclPtBy2kpqrp9t040oeTEQhsUEkXmyVKbEB1qOTD0pD3yUvBgIw2KCYCnrxGxQVxCnzHPM1fXTp1c+h7wYCMNigmiPdQk/XOTIjEnjZPKVzyEvBsKwGCCat0oKaplVo4ofM8sLn0NeC4RlMUGkzSi5qGWWU0uxB1Ne+RzyYiAMiwmiWJUYJ1lqcHXio+TFQBgWE4RYlRhflU1si4EwTBsnG41KjJNcyz6hujZyl5cCYVpMENGqxDjJd4tf4fOlIJ6ZE2Q6VwP/uIfdly3G0o/vx80LI35uDCuXrfDMa8Q0MbY9jDRvxTE9Qc8sEGsbg4LHFSVW42baj+bS8MY9xswqIj2/h0ul9OwUxXMEKWF4arnCRevvXF+2DqdH58E2GTHdvm6Vk9RRmKqm2DMZ4kWEW+D4S+b1gYfsAUvM3MRcC2Qa8oh1DNxxy5W1nmBUimNHMqStosMAW4GT7DDxa3usj0j0XGti06nWtke+lA
C3qZ+YDBLTaFlweU24QIevro6FgkZ0SIuSwaHCdPjfZWyL+801DJnC9cuYwO8eQhEygzF5OCzjtvSM8EABmCAReTSxRRfHhnKG/xKKG0VX8E39bnGfGRcDxug/uItl5Pcrief5C7OVcEPF4ab6ofMcaWL6Hs9cJMn5+96tpNDPNJe8RUHXkn2njrOgNHYsW6+YNuQs6Gsj0UhP4TG2swqvgP2ICUISHoUUdx13O5WRgKRKTWOTHP0j9o439JrDCLtiXhhccR6VEwK6eLzvJ+K2pCb9e+G4uPGWTszTVFgKlvto0nwubdfxzHkZaVXw0qpS9u2m2pIwAz1LuMSRHZK7DgldtEcvOOY32WXWLyhMn4L7kVzwaSw6OXYp5okRliCNUuO+RhvF42d43tmFEIeKnginEw8QLhy9THoeF65kFhYkDD2xsfiyfxquemRxd+arwUWngR33IEbcnML4RDwzdXwaLaYWMroJnglXXBhdgGvEGJ0ZzejZR6q/L5gwgbDvT2eOaZwfLbXwKDvzizD/SOzP6fCa0Ysia2nlvjNV7j52C3zgOB5IaLlnzinCTCLsGJS9dzKuG3wwLPZo/Ajc1fUTO5xb4UGWvZyrlLFy0yqT0O0uboXN1tALB9a/Lr8hA8krx89fS1uBls2T6R9fTYDBnCVvPeJuf/snv+MtGUgwnKHDoBX+86ZMJOXvkomk4klzj8G9Iqx4e4V31y541N/bOUjOD/jrl3x5DonauPoYpmS2Sn5CDolamZECwwj81/CJHBJWEpLwhCQkXw+f4L1eWCnlik/JT8AnmFN4tua5SPw5OUi+IgAmACsuTQcFlfwMAHzJsDW8lX4RgPBKCpLRrTHZKD170nNTkHzFh/SXjy+f+YDGqC/9c/OPvAsB4aym1y07CZzakwhgssHCFpj1y6dzjzxk3HgXBt5zxpfgPigISnwSBYa2YG7dIqb1mI794hwT78cBs2OHSWsImsMpPosD5qf04CMTFn6aQ3kl10aF1fls8QtybXwm2PddtuDw/1hUpTLP42NRFS0zc0CtPchQN6LkY9liDRCGxZxeOauoipa5vN/kEc9dXQuDZS8xBKukyquyCW0xEIZpBJGsoipa1g+GamTZB8OymCCqVVRFy3FLmBD3hbIrn0NeDIRh8XDAjLIqWoZTjEZ3EGcjSl4MhGExQIi3yqpouR98HAt2Vz6HvBYIy2KCSFZZFS0nDAsjLv2K564uhsGwlxiKVVRFy5kro3H0B9WIkhcDYVhMEGKVVdEyk+XxRTHxUfJiIAyLAaJ5q6yKlvuydN1BnI0oeS0QlsUEgZfgY1kVJTPPd4Pt5cpHy4uBMCwmiGyVVVEytyck9x0EjedUF8Ng2EsMYhVVUTI8ZWZY71uDmo6SFwNhWPzxRZyziqoouTCvSBqnNlQjWl4KhGkxQQSrqIqS4WsmfGvfndR8lLwYCMNigoDn8FhURcnCXVAXS73y0fJiIAyLCYLFJnLyc484ZOl1Hfh3Vz5KXgyEYTFBtC0Lt1kmEIfs3djE91c+Wl4MhGExQLAoiZQ2xRYpmbEHrgbnZAKk9bVQWDYTBXzIFuqUSEDJPcW1HzE+mpCSFwNhWEwQLKogOacJxCF7RlK1WMoMSOuLoTBsJgr4Dy7FOPeJQ+7FhBh6NSPS+mIoDJu5y+0ZQ+OnpApKvmRP0IjMrApLoLBsJgrWJptq+HzQMktJsYJGjxLUiLS+GArDZqJgYWdf2twrDrlXz3L7CsQFkdIXQ2HYTBTCImssN3ZFcci47bAh1R7gqRFpfTEUhs1AER0XW0KZesUpe+jehdRjAy+IlL4WCstmooA3EZubAsMOlZMHnmTooYkakNYXA/FoMTnAmWCFuhnEIXsfeUa49SxECpCWFwNhmEwS8Caym1b2T5V1AhMd0Ac+Sl8MxKPF5ABfIpf08GgcchPW2pR+IPRsQ6uLYTAMBofEupDBT6+NQ20ZT0JwO4UTjpLXwmDYSwrwIkptde
oNp8yzCrWNQplnG1pdDINhMDnAhahxSrBwqpWFbgMmTlc4Wl4Mw6O9pNC9pzkYXcndpyiSy5WOUhfDYBgMDtlvTnKckgooufAIgx/HOlQjWl4LhGUxQUTcWd+PfF1AHDKjhlMIwU98lLwYCMNigsC43/jZCcQhZ6YzTGOAONtQ6mIYDHuJAS6DY33uCcMh57rVUkOrEx0lLwbCsJix93AZnLDk9gXEKTOLfguxHwfTfJS8FgjLYoLgKYzEI3tXEIecmSo8xr7jq/koeTEQhsUEkXgajmXMryBOmecWY+gB9LoRJS8GwrCYIOpjDcAPFxmDgfAI58RHyYuBMCwmiGZVPdRy2dDgyIWsG1HyYiAMiwGieqvqoZaZPD3kBz5KXguEZTFBJKvqoZbPQMIrniXDC017iQGOw2PNQy3z8HsvbaibOMTFIBjWEoJY9Q61HDeMBl7qA5tDXgyEYTGPFHmr3qGWVdz5lc/TwtGfmWfjeqbG1QB/YD9/kPjOH0dMEjyEHhKGO1ur1NZPwEtyY27IIGvXit/HxrqfgJfI/HU84M6kDCWHUPYI3OhdyCMw1zvfM3MwHtUlCTKyATInSQ/GE75oS/S5v3JZDq1fiGDcDYmpLVKj91b7bJ0hjJJ89X0ml1u+ywl/yoJZt8zKBhiu6x7gJq16BjyWvuc0vrLVzeUsrCWE6Q4uxIU9Cix49utb8Swk0kYulda2hlEf948ocSlthEp5gsfNZ+Zb1qce46RDPwBjrkPUfotD3cNoslR0WC5WCTpMz9khjhF4uaV+PgzNRZf2WJMkkbl08eUu+BrHdqpD/3GO68CB6SLY/bvOoDU4dky4ARC1+J6fQ3zdYgziZGy0Mb1/3PXm0ZOYmyIyV0bpXHC1W0I/aCPlRilwn/vleFwEz8TWvnfZfI49Twr1kh2Py2G6kEvJ/dYJri0EIItdL7gDI6YqMLWJSCldj7jssY4XPH7Bfx1bprAjj68NYcsJf557qozKCfp9i7Xy/vHyA+vO1tE8cwagQ9de1UFcSeOO4FFzvB5cDjpHKZL25ukFusbHDz0Pj+DoS9x1aDWzrr3HFTSMLnUsu45klbif6NAY/Pm/rsvmmd8Dl+loCb4l7QtzANuC3PD08DhlP57IdSr0IKa4wCMYMfUe9wS9HA8xvFKm3cAFjTqZknHL8cC5vsgTXQl++PSBMRawnF3VtxgGYfpzObjai2SlnF0dn8azjmvOPefMGDd2hziXhuGzh38mDg27VxQwA8ZjwNwdwNsTqXBqDFNb7EX6Yho5cXAzmTeHwx4ZogO44VnhBrS05+gQ3KBhTXVcr45llMbMmZtclLl+y8Gtzy1IpIOtGdjqXlEUz2gcl81kkQw/6vmA2L/HUHwZodHweO5M9Q3JOF45iP1a/ga0/HhG+6OdA4KJO950zNv+0tdbf0sCDo5yzNY8/nlLAg65vXL0PmIALD2zg2/SL1A1layT95W9FO8kvCLUyft//fHnP//px//9F/5yORr78v8A7sMH1QplbmRzdHJlYW0KZW5kb2JqCjEyIDAgb2JqCjEwODczCmVuZG9iagoxMCAwIG9iagpbIF0KZW5kb2JqCjE3IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggOTEgPj4Kc3RyZWFtCnicNYy7DcAwCER7prgR+DiA94miFPb+bYgtF9w96YnzbGBknYcjtOMWsqZwU0xSTqh3DGqlNx076CXN/TTJei4a9A9x9RW2mwOSUSSRh0SXy5Vn5V98PgxvHGIKZW5kc3RyZWFtCmVuZG9iagoxOCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDE2NCA+PgpzdHJlYW0KeJw9kMERQyEIRO9WsSWAgEA9yWRy+L//a0CTXGQdYPepO4GQUYczw2fiyYPTsTRwbxWMawivI/QITQKTwMTBmngMCwGnYZFjLt9VllWnla6ajZ
7XvWNB1WmXNQ1t2oHyrY8/wjXeo/Aa7B5CB7EodG5lWguZWDxrnDvMo8znfk7bdz0YrabUrDdy2dc9OsvUUF5a+4TOaLT9J9cvuzFeH4UUOQgKZW5kc3RyZWFtCmVuZG9iagoxOSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDYxID4+CnN0cmVhbQp4nDM1NVcwULC0ABKmpkYK5kaWCimGXEA+iJXLZWhpDmblgFkWxkAGSBmcYQCkwZpzYHpyuDK40gDLFRDMCmVuZHN0cmVhbQplbmRvYmoKMjAgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAzMDcgPj4Kc3RyZWFtCnicPZJLbgMxDEP3PoUuEMD62Z7zpCi6mN5/2ycl6Yoc2RZFapa6TFlTHpA0k4R/6fBwsZ3yO2zPZmbgWqKXieWU59AVYu6ifNnMRl1ZJ8XqhGY6t+hRORcHNk2qn6sspd0ueA7XJp5b9hE/vNCgHtQ1Lgk3dFejZSk0Y6r7f9J7/Iwy4GpMXWxSq3sfPF5EVejoB0eJImOXF+fjQQnpSsJoWoiVd0UDQe7ytMp7Ce7b3mrIsgepmM47KWaw63RSLm4XhyEeyPKo8OWj2GtCz/iwKyX0SNiGM3In7mjG5tTI4pD+3o0ES4+uaCHz4K9u1i5gvFM6RWJkTnKsaYtVTvdQFNO5w70MEPVsRUMpc5HV6l/DzgtrlmwWeEr6BR6j3SZLDlbZ26hO76082dD3H1rXdB8KZW5kc3RyZWFtCmVuZG9iagoyMSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDI0NCA+PgpzdHJlYW0KeJxFkU1yBSEIhPeeoi/wquRXPc+kUllM7r8NzbwkK1qF5gPTAhNH8BJD7ImVEx8yfC/oMny3MjvwOtmZcE+4blzDZcMzYVvgOyrLO15Dd7ZSP52hqu8aOd4uUjV0ZWSfeqGaC8yQiK4RWXQrl3VA05TuUuEabFuCFPVKrCedoDToEcrwd5RrfHUTT6+x5FTNIVrNrRMairBseEHUySQRtQ2LJ5ZzIVH5qhurOi5gkyXi9IDcoJVmfHpSSREwg3ysyWjMAjbQk7tnF8aaSx5Fjlc0mLA7STXwgPfitr73NnGP8xf4hXff/ysOfdcCPn8AS/5dBgplbmRzdHJlYW0KZW5kb2JqCjIyIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMjMyID4+CnN0cmVhbQp4nDVRSW7EMAy7+xX8wADW7rwnxaCH9v/XUsoUCEAltrglYmMjAi8x+DmI3PiSNaMmfmdyV/wsT4VHwq3gSRSBl+FedoLLG8ZlPw4zH7yXVs6kxpMMyEU2PTwRMtglEDowuwZ12Gbaib4h4bMjUs1GltPXEvTSKgTKU7bf6YISbav6c/usC2372hNOdnvqSeUTiOeWrMBl4xWTxVgGPVG5SzF9kOpsoSehvCifg2w+aohElyhn4InBwSjQDuy57WfiVSFoXd2nbWOoRkrH078NTU2SCPlECWe2NO4W/n/Pvb7X+w9OIVQRCmVuZHN0cmVhbQplbmRvYmoKMjMgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAyMzEgPj4Kc3RyZWFtCnicNU85kgQhDMt5hT4wVRjbQL+np7Y22Pl/upKZTpDwIcnTEx2ZeJkjI7Bmx9taZCBm4FNMxb/2tA8TqvfgHiKUiwthhpFw1qzjbp6OF/92lc9YB+82+IpZXhDYwkzWVxZnLtsFY2mcxDnJboxdE7GNda2nU1hHMKEMhHS2w5Qgc1Sk9MmOMuboOJEnnovv9tssdjl+DusLNo0hFef4KnqCNoOi7HnvAhpyQf9d3fgeRbvoJSAbCRbWUWLunOWEX712dB61KBJzQppBLhMhzekqph
CaUKyzo6BSUXCpPqforJ9/5V9cLQplbmRzdHJlYW0KZW5kb2JqCjI0IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMjQ5ID4+CnN0cmVhbQp4nD1QO45EIQzrOYUv8CTyI3AeRqstZu/frgOaKVBMfrYzJNARgUcMMZSv4yWtoK6Bv4tC8W7i64PCIKtDUiDOeg+IdOymNpETOh2cMz9hN2OOwEUxBpzpdKY9ByY5+8IKhHMbZexWSCeJqiKO6jOOKZ4qe594FiztyDZbJ5I95CDhUlKJyaWflMo/bcqUCjpm0QQsErngZBNNOMu7SVKMGZQy6h6mdiJ9rDzIozroZE3OrCOZ2dNP25n4HHC3X9pkTpXHdB7M+Jy0zoM5Fbr344k2B02N2ujs9xNpKi9Sux1anX51EpXdGOcYEpdnfxnfZP/5B/6HWiIKZW5kc3RyZWFtCmVuZG9iagoyNSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDM5NSA+PgpzdHJlYW0KeJw9UktuxUAI2+cUXKDS8JvPeVJV3bz7b2tDUqkqvIkxxjB9ypC55UtdEnGFybderls8pnwuW1qZeYi7i40lPrbcl+4htl10LrE4HUfyCzKdKkSozarRofhCloUHkE7woQvCfTn+4y+AwdewDbjhPTJBsCTmKULGblEZmhJBEWHnkRWopFCfWcLfUe7r9zIFam+MpQtjHPQJtAVCbUjEAupAAETslFStkI5nJBO/Fd1nYhxg59GyAa4ZVESWe+zHiKnOqIy8RMQ+T036KJZMLVbGblMZX/yUjNR8dAUqqTTylPLQVbPQC1iJeRL2OfxI+OfWbCGGOm7W8onlHzPFMhLOYEs5YKGX40fg21l1Ea4dubjOdIEfldZwTLTrfsj1T/5021rNdbxyCKJA5U1B8LsOrkaxxMQyPp2NKXqiLLAamrxGM8FhEBHW98PIAxr9crwQNKdrIrRYIpu1YkSNimxzPb0E1kzvxTnWwxPCbO+d1qGyMzMqIYLauoZq60B2s77zcLafPzPoom0KZW5kc3RyZWFtCmVuZG9iagoyNiAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDI0OSA+PgpzdHJlYW0KeJxNUUmKAzAMu+cV+kAhXpO8p0OZQ+f/18oOhTkECa+Sk5aYWAsPMYQfLD34kSFzN/0bfqLZu1l6ksnZ/5jnIlNR+FKoLmJCXYgbz6ER8D2haxJZsb3xOSyjmXO+Bx+FuAQzoQFjfUkyuajmlSETTgx1HA5apMK4a2LD4lrRPI3cbvtGZmUmhA2PZELcGICIIOsCshgslDY2EzJZzgPtDckNWmDXqRtRi4IrlNYJdKJWxKrM4LPm1nY3Qy3y4Kh98fpoVpdghdFL9Vh4X4U+mKmZdu6SQnrhTTsizB4KpDI7LSu1e8TqboH6P8tS8P3J9/gdrw/N/FycCmVuZHN0cmVhbQplbmRvYmoKMjcgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCA5NCA+PgpzdHJlYW0KeJxFjcERwCAIBP9UQQkKCtpPJpOH9v+NEDJ8YOcO7oQFC7Z5Rh8FlSZeFVgHSmPcUI9AveFyLcncBQ9wJ3/a0FScltN3aZFJVSncpBJ5/w5nJpCoedFjnfcLY/sjPAplbmRzdHJlYW0KZW5kb2JqCjI4IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggNzIgPj4Kc3RyZWFtCnicMzK3UDBQsDQBEoYWJgrmZgYKKYZcQL6piblCLhdIDMTKAbMMgLQlnIKIZ4CYIG0QxSAWRLGZiRlEHZwBkcvgSgMAJdsWyQplbmRzdHJlYW0KZW5kb2JqCjI5IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndG
ggNDcgPj4Kc3RyZWFtCnicMzK3UDBQsDQBEoYWJgrmZgYKKYZclhBWLhdMLAfMAtGWcAoinsGVBgC5Zw0nCmVuZHN0cmVhbQplbmRvYmoKMzAgMCBvYmoKPDwgL0JCb3ggWyAtMTAyMSAtNDYzIDE3OTQgMTIzMyBdIC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMzkKL1N1YnR5cGUgL0Zvcm0gL1R5cGUgL1hPYmplY3QgPj4Kc3RyZWFtCnic4zI0MFMwNjVVyOUyNzYCs3LALCNzIyALJItgQWQzuNIAFfMKfAplbmRzdHJlYW0KZW5kb2JqCjMxIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMTYzID4+CnN0cmVhbQp4nEWQOxIDIQxDe06hI/gjAz7PZjIpNvdvY9hsUsDTWCCDuxOC1NqCieiCh7Yl3QXvrQRnY/zpNm41EuQEdYBWpONolFJ9ucVplXTxaDZzKwutEx1mDnqUoxmgEDoV3u2i5HKm7s75Q3D1X/W/Yt05m4mBycodCM3qU9z5NjuiurrJ/qTH3KzXfivsVWFpWUvLCbedu2ZACdxTOdqrPT8fCjr2CmVuZHN0cmVhbQplbmRvYmoKMzIgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAyMTggPj4Kc3RyZWFtCnicPVC5jQQxDMtdhRpYwHrtqWcWi0um//RI+fYi0RZFUio1mZIpL3WUJVlT3jp8lsQOeYblbmQ2JSpFL5OwJffQCvF9ieYU993VlrNDNJdoOX4LMyqqGx3TSzaacCoTuqDcwzP6DW10A1aHHrFbINCkYNe2IHLHDxgMwZkTiyIMSk0G/65yj59eixs+w/FDFJGSDuY1/1j98nMNr1OPJ5Fub77iXpypDgMRHJKavCNdWLEuEhFpNUFNz8BaLYC7t17+G7QjugxA9onEcZpSjqG/a3Clzy/lJ1PYCmVuZHN0cmVhbQplbmRvYmoKMzMgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCA4MyA+PgpzdHJlYW0KeJxFjLsNwDAIRHumYAR+JvY+UZTC3r8NECVuuCfdPVwdCZkpbjPDQwaeDCyGXXGB9JYwC1xHUI6d7KNh1b7qBI31plLz7w+Unuys4obrAQJCGmYKZW5kc3RyZWFtCmVuZG9iagozNCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDUxID4+CnN0cmVhbQp4nDM2tFAwUDA0MAeSRoZAlpGJQoohF0gAxMzlggnmgFkGQBqiOAeuJocrgysNAOG0DZgKZW5kc3RyZWFtCmVuZG9iagozNSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDE2MCA+PgpzdHJlYW0KeJxFkDkSAzEIBHO9gidIXIL3rMu1wfr/qQfWR6LpAjQcuhZNynoUaD7psUahutBr6CxKkkTBFpIdUKdjiDsoSExIY5JIth6DI5pYs12YmVQqs1LhtGnFwr/ZWtXIRI1wjfyJ6QZU/E/qXJTwTYOvkjH6GFS8O4OMSfheRdxaMe3+RDCxGfYJb0UmBYSJsanZvs9ghsz3Ctc4x/MNTII36wplbmRzdHJlYW0KZW5kb2JqCjM2IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMzM0ID4+CnN0cmVhbQp4nC1SS3LFIAzbcwpdoDP4B+Q86XS6eL3/tpKTRUYOYPQx5YaJSnxZILej1sS3jcxAheGvq8yFz0jbyDqIy5CLuJIthXtELOQxxDzEgu+r8R4e+azMybMHxi/Zdw8r9tSEZSHjxRnaYRXHYRXkWLB1Iap7eFOkw6kk2OOL/z7Fcy0ELXxG0IBf5J+vjuD5khZp95ht0656sEw7qq
SwHGxPc14mX1pnuToezwfJ9q7YEVK7AhSFuTPOc+Eo01ZGtBZ2NkhqXGxvjv1YStCFblxGiiOQn6kiPKCkycwmCuKPnB5yKgNh6pqudHIbVXGnnsw1m4u3M0lm675IsZnCeV04s/4MU2a1eSfPcqLUqQjvsWdL0NA5rp69lllodJsTvKSEz8ZOT06+VzPrITkVCaliWlfBaRSZYgnbEl9TUVOaehn++/Lu8Tt+/gEsc3xzCmVuZHN0cmVhbQplbmRvYmoKMzcgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAzMjAgPj4Kc3RyZWFtCnicNVJLbgUxCNvPKbhApfBPzvOqqou++29rE70VTDBg4ykvWdJLvtQl26XD5Fsf9yWxQt6P7ZrMUsX3FrMUzy2vR88Rty0KBFETPViZLxUi1M/06DqocEqfgVcItxQbvINJAINq+AcepTMgUOdAxrtiMlIDgiTYc2lxCIlyJol/pLye3yetpKH0PVmZy9+TS6XQHU1O6AHFysVJoF1J+aCZmEpEkpfrfbFC9IbAkjw+RzHJgOw2iW2iBSbnHqUlzMQUOrDHArxmmtVV6GDCHocpjFcLs6gebPJbE5WkHa3jGdkw3sswU2Kh4bAF1OZiZYLu5eM1r8KI7VGTXcNw7pbNdwjRaP4bFsrgYxWSgEensRINaTjAiMCeXjjFXvMTOQ7AiGOdmiwMY2gmp3qOicDQnrOlYcbHHlr18w9U6XyHCmVuZHN0cmVhbQplbmRvYmoKMzggMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAxOCA+PgpzdHJlYW0KeJwzNrRQMIDDFEOuNAAd5gNSCmVuZHN0cmVhbQplbmRvYmoKMzkgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAxMzMgPj4Kc3RyZWFtCnicRY9LDgQhCET3nKKOwMcf53Ey6YVz/+2AnW4TYz2FVIG5gqE9LmsDnRUfIRm28beplo5FWT5UelJWD8ngh6zGyyHcoCzwgkkqhiFQi5gakS1lbreA2zYNsrKVU6WOsIujMI/2tGwVHl+iWyJ1kj+DxCov3OO6Hcil1rveoou+f6QBMQkKZW5kc3RyZWFtCmVuZG9iago0MCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDM0MCA+PgpzdHJlYW0KeJw1UjluBDEM6/0KfSCAbtvv2SBIkfy/DanZFANxdFKUO1pUdsuHhVS17HT5tJXaEjfkd2WFxAnJqxLtUoZIqLxWIdXvmTKvtzVnBMhSpcLkpORxyYI/w6WnC8f5trGv5cgdjx5YFSOhRMAyxcToGpbO7rBmW36WacCPeIScK9Ytx1gFUhvdOO2K96F5LbIGiL2ZlooKHVaJFn5B8aBHjX32GFRYINHtHElwjIlQkYB2gdpIDDl7LHZRH/QzKDET6NobRdxBgSWSmDnFunT03/jQsaD+2Iw3vzoq6VtaWWPSPhvtlMYsMul6WPR089bHgws076L859UMEjRljZLGB63aOYaimVFWeLdDkw3NMcch8w6ewxkJSvo8FL+PJRMdlMjfDg2hf18eo4ycNt4C5qI/bRUHDuKzw165gRVKF2uS9wGpTOiB6f+v8bW+19cfHe2AxgplbmRzdHJlYW0KZW5kb2JqCjQxIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMjUxID4+CnN0cmVhbQp4nC1RSXIDQQi7zyv0hGan32OXK4fk/9cIygcGDYtAdFrioIyfICxXvOWRq2jD3zMxgt8Fh34r121Y5EBUIEljUDWhdvF69B7YcZgJzJPWsAxmrA/8jCnc6MXhMRlnt9dl1BDsXa89mUHJrFzEJRMXTNVhI2cOP5kyLrRzPTcg50ZYl2GQblYaMxKONIVIIYWqm6
TOBEESjK5GjTZyFPulL490hlWNqDHscy1tX89NOGvQ7Fis8uSUHl1xLicXL6wc9PU2AxdRaazyQEjA/W4P9XOyk994S+fOFtPje83J8sJUYMWb125ANtXi37yI4/uMr+fn+fwDX2BbiAplbmRzdHJlYW0KZW5kb2JqCjQyIDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMTc0ID4+CnN0cmVhbQp4nE2QSQ5DIQxD95zCF6iEM8DnPL+qumjvv61DB3WB/OQgcDw80HEkLnRk6IyOK5sc48CzIGPi0Tj/ybg+xDFB3aItWJd2x9nMEnPCMjECtkbJ2TyiwA/HXAgSZJcfvsAgIl2P+VbzWZP0z7c73Y+6tGZfPaLAiewIxbABV4D9useBS8L5XtPklyolYxOH8oHqIlI2O6EQtVTscqqKs92bK3AV9PzRQ+7tBbUjPN8KZW5kc3RyZWFtCmVuZG9iago0MyAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDc1ID4+CnN0cmVhbQp4nDO1NFIwUDA2ABKmZkYKpibmCimGXEA+iJXLZWhkCmblcBlZmilYWAAZJmbmUCGYhhwuY1NzoAFARcamYBqqP4crgysNAJWQEu8KZW5kc3RyZWFtCmVuZG9iago0NCAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDE0MSA+PgpzdHJlYW0KeJw9j8EOwzAIQ+/5Cv9ApNgpoXxPp2qH7v+vI0u7C3oCY4yF0NAbqprDhmCb48XSJVRr+BTFQCU3yJlgDqWk0h1HkXpiOBhcHrQbjuKx6PoRu5JmfdDGQrolaIB7rFNp3KZxE8QdNQXqKeqco7wQuZ+pZ9g0kt00s5JzuA2/e89T1/+nq7zL+QW9dy7+CmVuZHN0cmVhbQplbmRvYmoKNDUgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAyMTUgPj4Kc3RyZWFtCnicNVE5DgMhDOz3Ff5AJIwveE+iKM3+v82M0VYewVyGtJQhmfJSk6gh5VM+epkunLrc18xqNOeWtC1zgLi2vC+tksCJZoiDwWmYuAGaPAFD19GoUUMXHtDUpVMosNwEPoq3bg/dY7WBl7Yh54kgYigZLEHNqUUTFm3PJ6Q1v16LG96X7d3IU6XGlhiBBgFWOBzX6NfwlT1PJtF0FTLUqzXLGAkTRSI8+Y6m1RPrWjTSMhLUxhGsagO8O/0wTgAAE3HLAmSfSpSz5MRvsfSzBlf6/gGfR1SWCmVuZHN0cmVhbQplbmRvYmoKMTUgMCBvYmoKPDwgL0Jhc2VGb250IC9EZWphVnVTYW5zIC9DaGFyUHJvY3MgMTYgMCBSCi9FbmNvZGluZyA8PAovRGlmZmVyZW5jZXMgWyAzMiAvc3BhY2UgNDYgL3BlcmlvZCA0OCAvemVybyAvb25lIC90d28gL3RocmVlIC9mb3VyIC9maXZlIC9zaXggNTYKL2VpZ2h0IDY1IC9BIDY4IC9EIDc2IC9MIDk3IC9hIC9iIC9jIC9kIC9lIDEwNSAvaSAxMDggL2wgMTEwIC9uIC9vIDExNCAvcgovcyAvdCAvdSAvdiAxMjEgL3kgXQovVHlwZSAvRW5jb2RpbmcgPj4KL0ZpcnN0Q2hhciAwIC9Gb250QkJveCBbIC0xMDIxIC00NjMgMTc5NCAxMjMzIF0gL0ZvbnREZXNjcmlwdG9yIDE0IDAgUgovRm9udE1hdHJpeCBbIDAuMDAxIDAgMCAwLjAwMSAwIDAgXSAvTGFzdENoYXIgMjU1IC9OYW1lIC9EZWphVnVTYW5zCi9TdWJ0eXBlIC9UeXBlMyAvVHlwZSAvRm9udCAvV2lkdGhzIDEzIDAgUiA+PgplbmRvYmoKMTQgMCBvYmoKPDwgL0FzY2VudCA5MjkgL0NhcE
hlaWdodCAwIC9EZXNjZW50IC0yMzYgL0ZsYWdzIDMyCi9Gb250QkJveCBbIC0xMDIxIC00NjMgMTc5NCAxMjMzIF0gL0ZvbnROYW1lIC9EZWphVnVTYW5zIC9JdGFsaWNBbmdsZSAwCi9NYXhXaWR0aCAxMzQyIC9TdGVtViAwIC9UeXBlIC9Gb250RGVzY3JpcHRvciAvWEhlaWdodCAwID4+CmVuZG9iagoxMyAwIG9iagpbIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwCjYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgNjAwIDYwMCA2MDAgMzE4IDQwMSA0NjAgODM4IDYzNgo5NTAgNzgwIDI3NSAzOTAgMzkwIDUwMCA4MzggMzE4IDM2MSAzMTggMzM3IDYzNiA2MzYgNjM2IDYzNiA2MzYgNjM2IDYzNiA2MzYKNjM2IDYzNiAzMzcgMzM3IDgzOCA4MzggODM4IDUzMSAxMDAwIDY4NCA2ODYgNjk4IDc3MCA2MzIgNTc1IDc3NSA3NTIgMjk1CjI5NSA2NTYgNTU3IDg2MyA3NDggNzg3IDYwMyA3ODcgNjk1IDYzNSA2MTEgNzMyIDY4NCA5ODkgNjg1IDYxMSA2ODUgMzkwIDMzNwozOTAgODM4IDUwMCA1MDAgNjEzIDYzNSA1NTAgNjM1IDYxNSAzNTIgNjM1IDYzNCAyNzggMjc4IDU3OSAyNzggOTc0IDYzNCA2MTIKNjM1IDYzNSA0MTEgNTIxIDM5MiA2MzQgNTkyIDgxOCA1OTIgNTkyIDUyNSA2MzYgMzM3IDYzNiA4MzggNjAwIDYzNiA2MDAgMzE4CjM1MiA1MTggMTAwMCA1MDAgNTAwIDUwMCAxMzQyIDYzNSA0MDAgMTA3MCA2MDAgNjg1IDYwMCA2MDAgMzE4IDMxOCA1MTggNTE4CjU5MCA1MDAgMTAwMCA1MDAgMTAwMCA1MjEgNDAwIDEwMjMgNjAwIDUyNSA2MTEgMzE4IDQwMSA2MzYgNjM2IDYzNiA2MzYgMzM3CjUwMCA1MDAgMTAwMCA0NzEgNjEyIDgzOCAzNjEgMTAwMCA1MDAgNTAwIDgzOCA0MDEgNDAxIDUwMCA2MzYgNjM2IDMxOCA1MDAKNDAxIDQ3MSA2MTIgOTY5IDk2OSA5NjkgNTMxIDY4NCA2ODQgNjg0IDY4NCA2ODQgNjg0IDk3NCA2OTggNjMyIDYzMiA2MzIgNjMyCjI5NSAyOTUgMjk1IDI5NSA3NzUgNzQ4IDc4NyA3ODcgNzg3IDc4NyA3ODcgODM4IDc4NyA3MzIgNzMyIDczMiA3MzIgNjExIDYwNQo2MzAgNjEzIDYxMyA2MTMgNjEzIDYxMyA2MTMgOTgyIDU1MCA2MTUgNjE1IDYxNSA2MTUgMjc4IDI3OCAyNzggMjc4IDYxMiA2MzQKNjEyIDYxMiA2MTIgNjEyIDYxMiA4MzggNjEyIDYzNCA2MzQgNjM0IDYzNCA1OTIgNjM1IDU5MiBdCmVuZG9iagoxNiAwIG9iago8PCAvQSAxNyAwIFIgL0QgMTggMCBSIC9MIDE5IDAgUiAvYSAyMCAwIFIgL2IgMjEgMCBSIC9jIDIyIDAgUiAvZCAyMyAwIFIKL2UgMjQgMCBSIC9laWdodCAyNSAwIFIgL2ZpdmUgMjYgMCBSIC9mb3VyIDI3IDAgUiAvaSAyOCAwIFIgL2wgMjkgMCBSCi9uIDMxIDAgUiAvbyAzMiAwIFIgL29uZSAzMyAwIFIgL3BlcmlvZCAzNCAwIFIgL3IgMzUgMCBSIC9zIDM2IDAgUgovc2l4IDM3IDAgUiAvc3BhY2UgMzggMCBSIC90IDM5IDAgUiAvdGhyZWUgNDAgMCBSIC90d28gND
EgMCBSIC91IDQyIDAgUgovdiA0MyAwIFIgL3kgNDQgMCBSIC96ZXJvIDQ1IDAgUiA+PgplbmRvYmoKMyAwIG9iago8PCAvRjEgMTUgMCBSID4+CmVuZG9iago0IDAgb2JqCjw8IC9BMSA8PCAvQ0EgMCAvVHlwZSAvRXh0R1N0YXRlIC9jYSAxID4+Ci9BMiA8PCAvQ0EgMSAvVHlwZSAvRXh0R1N0YXRlIC9jYSAxID4+Ci9BMyA8PCAvQ0EgMSAvVHlwZSAvRXh0R1N0YXRlIC9jYSAwLjUgPj4gPj4KZW5kb2JqCjUgMCBvYmoKPDwgPj4KZW5kb2JqCjYgMCBvYmoKPDwgPj4KZW5kb2JqCjcgMCBvYmoKPDwgL0YxLURlamFWdVNhbnMtbWludXMgMzAgMCBSID4+CmVuZG9iagoyIDAgb2JqCjw8IC9Db3VudCAxIC9LaWRzIFsgMTEgMCBSIF0gL1R5cGUgL1BhZ2VzID4+CmVuZG9iago0NiAwIG9iago8PCAvQ3JlYXRpb25EYXRlIChEOjIwMjEwOTE2MTQzNzMwKzAyJzAwJykKL0NyZWF0b3IgKE1hdHBsb3RsaWIgdjMuNC4zLCBodHRwczovL21hdHBsb3RsaWIub3JnKQovUHJvZHVjZXIgKE1hdHBsb3RsaWIgcGRmIGJhY2tlbmQgdjMuNC4zKSA+PgplbmRvYmoKeHJlZgowIDQ3CjAwMDAwMDAwMDAgNjU1MzUgZiAKMDAwMDAwMDAxNiAwMDAwMCBuIAowMDAwMDIxMDExIDAwMDAwIG4gCjAwMDAwMjA3NDggMDAwMDAgbiAKMDAwMDAyMDc4MCAwMDAwMCBuIAowMDAwMDIwOTIwIDAwMDAwIG4gCjAwMDAwMjA5NDEgMDAwMDAgbiAKMDAwMDAyMDk2MiAwMDAwMCBuIAowMDAwMDAwMDY1IDAwMDAwIG4gCjAwMDAwMDAzOTkgMDAwMDAgbiAKMDAwMDAxMTM2OSAwMDAwMCBuIAowMDAwMDAwMjA4IDAwMDAwIG4gCjAwMDAwMTEzNDcgMDAwMDAgbiAKMDAwMDAxOTM2MSAwMDAwMCBuIAowMDAwMDE5MTYxIDAwMDAwIG4gCjAwMDAwMTg3MDggMDAwMDAgbiAKMDAwMDAyMDQxNCAwMDAwMCBuIAowMDAwMDExMzg5IDAwMDAwIG4gCjAwMDAwMTE1NTIgMDAwMDAgbiAKMDAwMDAxMTc4OSAwMDAwMCBuIAowMDAwMDExOTIyIDAwMDAwIG4gCjAwMDAwMTIzMDIgMDAwMDAgbiAKMDAwMDAxMjYxOSAwMDAwMCBuIAowMDAwMDEyOTI0IDAwMDAwIG4gCjAwMDAwMTMyMjggMDAwMDAgbiAKMDAwMDAxMzU1MCAwMDAwMCBuIAowMDAwMDE0MDE4IDAwMDAwIG4gCjAwMDAwMTQzNDAgMDAwMDAgbiAKMDAwMDAxNDUwNiAwMDAwMCBuIAowMDAwMDE0NjUwIDAwMDAwIG4gCjAwMDAwMTQ3NjkgMDAwMDAgbiAKMDAwMDAxNDk0MSAwMDAwMCBuIAowMDAwMDE1MTc3IDAwMDAwIG4gCjAwMDAwMTU0NjggMDAwMDAgbiAKMDAwMDAxNTYyMyAwMDAwMCBuIAowMDAwMDE1NzQ2IDAwMDAwIG4gCjAwMDAwMTU5NzkgMDAwMDAgbiAKMDAwMDAxNjM4NiAwMDAwMCBuIAowMDAwMDE2Nzc5IDAwMDAwIG4gCjAwMDAwMTY4NjkgMDAwMDAgbiAKMDAwMDAxNzA3NSAwMDAwMCBuIAowMDAwMDE3NDg4IDAwMDAwIG4gCjAwMDAwMTc4MTIgMDAwMDAgbiAKMDAwMDAxODA1OSAwMDAwMCBuIAowMDAwMDE4MjA2IDAwMDAwIG4gCjAwMDAwMTg0MjAgMDAwMDAgbiAKMDAwMDAyMTA3MSAwMDAwMCBuIAp0cmFpbGVyCjw8IC
9JbmZvIDQ2IDAgUiAvUm9vdCAxIDAgUiAvU2l6ZSA0NyA+PgpzdGFydHhyZWYKMjEyMjgKJSVFT0YK\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:29.851882\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
"\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["Layer 0 - Variance: 1.0256913900375366\n", "Layer 2 - Variance: 1.0101124048233032\n", "Layer 4 - Variance: 1.0158814191818237\n", "Layer 6 - Variance: 1.1398581266403198\n", "Layer 8 - Variance: 0.46903371810913086\n"]}], "source": ["def kaiming_init(model):\n", "    for name, param in model.named_parameters():\n", "        if name.endswith(\".bias\"):\n", "            param.data.fill_(0)\n", "        elif name.startswith(\"layers.0\"):  # The first layer does not have ReLU applied on its input\n", "            param.data.normal_(0, 1 / math.sqrt(param.shape[1]))\n", "        else:\n", "            param.data.normal_(0, math.sqrt(2) / math.sqrt(param.shape[1]))\n", "\n", "\n", "model = BaseNetwork(act_fn=nn.ReLU()).to(device)\n", "kaiming_init(model)\n", "visualize_gradients(model, print_variance=True)\n", "visualize_activations(model, print_variance=True)"]}, {"cell_type": "markdown", "id": "fee08055", "metadata": {"papermill": {"duration": 0.100612, "end_time": "2021-09-16T12:37:31.027664", "exception": false, "start_time": "2021-09-16T12:37:30.927052", "status": "completed"}, "tags": []}, "source": ["The variance stays stable across layers.\n", "We can conclude that the Kaiming initialization indeed works well for ReLU-based networks.\n", "Note that for Leaky-ReLU etc., we have to slightly adjust the factor of $2$ in the variance as half of the values are not set to zero anymore.\n", "PyTorch provides a function to calculate the corresponding gain (the square root of this factor) for many activation\n", "functions, see `torch.nn.init.calculate_gain`\n", "([link](https://pytorch.org/docs/stable/nn.init.html#torch.nn.init.calculate_gain))."]}, {"cell_type": "markdown", "id": "7357c14d", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.101571, "end_time": "2021-09-16T12:37:31.230155", "exception": false, "start_time": "2021-09-16T12:37:31.128584", "status": "completed"}, "tags": []}, "source": ["## Optimization\n", "\n", "
\n", "\n", "Besides initialization, selecting a suitable optimization algorithm can be an important choice for deep neural networks.\n", "Before taking a closer look at the different optimizers, we should define the code for training the models.\n", "Most of the following code is copied from the previous tutorial, and only slightly altered to fit our needs."]}, {"cell_type": "code", "execution_count": 20, "id": "612da712", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:31.465358Z", "iopub.status.busy": "2021-09-16T12:37:31.463919Z", "iopub.status.idle": "2021-09-16T12:37:31.467057Z", "shell.execute_reply": "2021-09-16T12:37:31.467443Z"}, "lines_to_next_cell": 2, "papermill": {"duration": 0.136711, "end_time": "2021-09-16T12:37:31.467583", "exception": false, "start_time": "2021-09-16T12:37:31.330872", "status": "completed"}, "tags": []}, "outputs": [], "source": ["def _get_config_file(model_path, model_name):\n", "    return os.path.join(model_path, model_name + \".config\")\n", "\n", "\n", "def _get_model_file(model_path, model_name):\n", "    return os.path.join(model_path, model_name + \".tar\")\n", "\n", "\n", "def _get_result_file(model_path, model_name):\n", "    return os.path.join(model_path, model_name + \"_results.json\")\n", "\n", "\n", "def load_model(model_path, model_name, net=None):\n", "    config_file = _get_config_file(model_path, model_name)\n", "    model_file = _get_model_file(model_path, model_name)\n", "    assert os.path.isfile(\n", "        config_file\n", "    ), f'Could not find the config file \"{config_file}\". Are you sure this is the correct path and you have your model config stored here?'\n", "    assert os.path.isfile(\n", "        model_file\n", "    ), f'Could not find the model file \"{model_file}\". 
Are you sure this is the correct path and you have your model stored here?'\n", "    with open(config_file) as f:\n", "        config_dict = json.load(f)\n", "    if net is None:\n", "        act_fn_name = config_dict[\"act_fn\"].pop(\"name\").lower()\n", "        assert (\n", "            act_fn_name in act_fn_by_name\n", "        ), f'Unknown activation function \"{act_fn_name}\". Please add it to the \"act_fn_by_name\" dict.'\n", "        act_fn = act_fn_by_name[act_fn_name]()\n", "        net = BaseNetwork(act_fn=act_fn, **config_dict)\n", "    net.load_state_dict(torch.load(model_file))\n", "    return net\n", "\n", "\n", "def save_model(model, model_path, model_name):\n", "    config_dict = model.config\n", "    os.makedirs(model_path, exist_ok=True)\n", "    config_file = _get_config_file(model_path, model_name)\n", "    model_file = _get_model_file(model_path, model_name)\n", "    with open(config_file, \"w\") as f:\n", "        json.dump(config_dict, f)\n", "    torch.save(model.state_dict(), model_file)\n", "\n", "\n", "def train_model(net, model_name, optim_func, max_epochs=50, batch_size=256, overwrite=False):\n", "    \"\"\"Train a model on the training set of FashionMNIST.\n", "\n", "    Args:\n", "        net: Object of BaseNetwork\n", "        model_name: (str) Name of the model, used for creating the checkpoint names\n", "        optim_func: Function that takes the network parameters as input and returns the optimizer to train with\n", "        max_epochs: Number of epochs we want to (maximally) train for\n", "        batch_size: Size of batches used in training\n", "        overwrite: Determines how to handle the case when there already exists a checkpoint. If True, it will be overwritten. Otherwise, we skip training.\n", "    \"\"\"\n", "    file_exists = os.path.isfile(_get_model_file(CHECKPOINT_PATH, model_name))\n", "    if file_exists and not overwrite:\n", "        print(f'Model file of \"{model_name}\" already exists. 
Skipping training...')\n", "        with open(_get_result_file(CHECKPOINT_PATH, model_name)) as f:\n", "            results = json.load(f)\n", "    else:\n", "        if file_exists:\n", "            print(\"Model file exists, but will be overwritten...\")\n", "\n", "        # Defining optimizer, loss and data loader\n", "        optimizer = optim_func(net.parameters())\n", "        loss_module = nn.CrossEntropyLoss()\n", "        train_loader_local = data.DataLoader(\n", "            train_set, batch_size=batch_size, shuffle=True, drop_last=True, pin_memory=True\n", "        )\n", "\n", "        results = None\n", "        val_scores = []\n", "        train_losses, train_scores = [], []\n", "        best_val_epoch = -1\n", "        for epoch in range(max_epochs):\n", "            train_acc, val_acc, epoch_losses = epoch_iteration(\n", "                net, loss_module, optimizer, train_loader_local, val_loader, epoch\n", "            )\n", "            train_scores.append(train_acc)\n", "            val_scores.append(val_acc)\n", "            train_losses += epoch_losses\n", "\n", "            if len(val_scores) == 1 or val_acc > val_scores[best_val_epoch]:\n", "                print(\"\\t (New best performance, saving model...)\")\n", "                save_model(net, CHECKPOINT_PATH, model_name)\n", "                best_val_epoch = epoch\n", "\n", "    if results is None:\n", "        load_model(CHECKPOINT_PATH, model_name, net=net)\n", "        test_acc = test_model(net, test_loader)\n", "        results = {\n", "            \"test_acc\": test_acc,\n", "            \"val_scores\": val_scores,\n", "            \"train_losses\": train_losses,\n", "            \"train_scores\": train_scores,\n", "        }\n", "        with open(_get_result_file(CHECKPOINT_PATH, model_name), \"w\") as f:\n", "            json.dump(results, f)\n", "\n", "    # Plot the training and validation accuracy per epoch\n", "    sns.set()\n", "    plt.plot([i for i in range(1, len(results[\"train_scores\"]) + 1)], results[\"train_scores\"], label=\"Train\")\n", "    plt.plot([i for i in range(1, len(results[\"val_scores\"]) + 1)], results[\"val_scores\"], label=\"Val\")\n", "    plt.xlabel(\"Epochs\")\n", "    plt.ylabel(\"Validation accuracy\")\n", "    plt.ylim(min(results[\"val_scores\"]), max(results[\"train_scores\"]) * 1.01)\n", "    
plt.title(f\"Validation performance of {model_name}\")\n", "    plt.legend()\n", "    plt.show()\n", "    plt.close()\n", "\n", "    print((f\" Test accuracy: {results['test_acc']*100.0:4.2f}% \").center(50, \"=\") + \"\\n\")\n", "    return results\n", "\n", "\n", "def epoch_iteration(net, loss_module, optimizer, train_loader_local, val_loader, epoch):\n", "    ############\n", "    # Training #\n", "    ############\n", "    net.train()\n", "    true_preds, count = 0.0, 0\n", "    epoch_losses = []\n", "    t = tqdm(train_loader_local, leave=False)\n", "    for imgs, labels in t:\n", "        imgs, labels = imgs.to(device), labels.to(device)\n", "        optimizer.zero_grad()\n", "        preds = net(imgs)\n", "        loss = loss_module(preds, labels)\n", "        loss.backward()\n", "        optimizer.step()\n", "        # Record statistics during training\n", "        true_preds += (preds.argmax(dim=-1) == labels).sum().item()\n", "        count += labels.shape[0]\n", "        t.set_description(f\"Epoch {epoch+1}: loss={loss.item():4.2f}\")\n", "        epoch_losses.append(loss.item())\n", "    train_acc = true_preds / count\n", "\n", "    ##############\n", "    # Validation #\n", "    ##############\n", "    val_acc = test_model(net, val_loader)\n", "    print(\n", "        f\"[Epoch {epoch+1:2d}] Training accuracy: {train_acc*100.0:05.2f}%, Validation accuracy: {val_acc*100.0:05.2f}%\"\n", "    )\n", "    return train_acc, val_acc, epoch_losses\n", "\n", "\n", "def test_model(net, data_loader):\n", "    \"\"\"Test a model on a specified dataset.\n", "\n", "    Args:\n", "        net: Trained model of type BaseNetwork\n", "        data_loader: DataLoader object of the dataset to test on (validation or test)\n", "    \"\"\"\n", "    net.eval()\n", "    true_preds, count = 0.0, 0\n", "    for imgs, labels in data_loader:\n", "        imgs, labels = imgs.to(device), labels.to(device)\n", "        with torch.no_grad():\n", "            preds = net(imgs).argmax(dim=-1)\n", "            true_preds += (preds == labels).sum().item()\n", "            count += labels.shape[0]\n", "    test_acc = true_preds / count\n", "    return test_acc"]}, {"cell_type": "markdown", "id": "ed5321ba", 
"metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.101675, "end_time": "2021-09-16T12:37:31.672791", "exception": false, "start_time": "2021-09-16T12:37:31.571116", "status": "completed"}, "tags": []}, "source": ["First, we need to understand what an optimizer actually does.\n", "The optimizer is responsible for updating the network's parameters given the gradients.\n", "Hence, we effectively implement a function $w^{(t)} = f(w^{(t-1)}, g^{(t)}, ...)$ with $w$ being the parameters, and $g^{(t)} = \\nabla_{w^{(t-1)}} \\mathcal{L}^{(t)}$ the gradients at time step $t$.\n", "A common, additional parameter to this function is the learning rate, here denoted by $\\eta$.\n", "Usually, the learning rate can be seen as the \"step size\" of the update.\n", "A higher learning rate means that we take larger steps along the (negative) gradient direction, while a smaller one means we take shorter steps.\n", "\n", "As most optimizers only differ in the implementation of $f$, we can define a template for an optimizer in PyTorch below.\n", "We take as input the parameters of a model and a learning rate.\n", "The function `zero_grad` sets the gradients of all parameters to zero, which we have to do before calling `loss.backward()` because PyTorch accumulates gradients across backward passes.\n", "Finally, the `step()` function tells the optimizer to update all weights based on their gradients.\n", "The template is set up below:"]}, {"cell_type": "code", "execution_count": 21, "id": "d0480429", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:31.880757Z", "iopub.status.busy": "2021-09-16T12:37:31.880264Z", "iopub.status.idle": "2021-09-16T12:37:31.881827Z", "shell.execute_reply": "2021-09-16T12:37:31.882297Z"}, "lines_to_next_cell": 2, "papermill": {"duration": 0.109458, "end_time": "2021-09-16T12:37:31.882433", "exception": false, "start_time": "2021-09-16T12:37:31.772975", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class OptimizerTemplate:\n", "    def __init__(self, params, lr):\n", "        self.params = 
list(params)\n", "        self.lr = lr\n", "\n", "    def zero_grad(self):\n", "        # Set gradients of all parameters to zero\n", "        for p in self.params:\n", "            if p.grad is not None:\n", "                p.grad.detach_()  # Important for second-order optimizers\n", "                p.grad.zero_()\n", "\n", "    @torch.no_grad()\n", "    def step(self):\n", "        # Apply update step to all parameters\n", "        for p in self.params:\n", "            if p.grad is None:  # We skip parameters without any gradients\n", "                continue\n", "            self.update_param(p)\n", "\n", "    def update_param(self, p):\n", "        # To be implemented in optimizer-specific classes\n", "        raise NotImplementedError"]}, {"cell_type": "markdown", "id": "ea589ab3", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.100149, "end_time": "2021-09-16T12:37:32.084995", "exception": false, "start_time": "2021-09-16T12:37:31.984846", "status": "completed"}, "tags": []}, "source": ["The first optimizer we are going to implement is the standard Stochastic Gradient Descent (SGD).\n", "SGD updates the parameters using the following equation:\n", "\n", "$$\n", "\\begin{split}\n", "    w^{(t)} & = w^{(t-1)} - \\eta \\cdot g^{(t)}\n", "\\end{split}\n", "$$\n", "\n", "Our implementation of SGD is as simple as the equation:"]}, {"cell_type": "code", "execution_count": 22, "id": "347cd0a8", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:32.292286Z", "iopub.status.busy": "2021-09-16T12:37:32.291809Z", "iopub.status.idle": "2021-09-16T12:37:32.293396Z", "shell.execute_reply": "2021-09-16T12:37:32.293806Z"}, "lines_to_next_cell": 2, "papermill": {"duration": 0.107225, "end_time": "2021-09-16T12:37:32.293938", "exception": false, "start_time": "2021-09-16T12:37:32.186713", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class SGD(OptimizerTemplate):\n", "    def __init__(self, params, lr):\n", "        super().__init__(params, lr)\n", "\n", "    def update_param(self, p):\n", "        p_update = -self.lr * p.grad\n", "        p.add_(p_update)  # In-place update => saves 
memory and does not create a computation graph"]}, {"cell_type": "markdown", "id": "83492492", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.102455, "end_time": "2021-09-16T12:37:32.496757", "exception": false, "start_time": "2021-09-16T12:37:32.394302", "status": "completed"}, "tags": []}, "source": ["In the lecture, we also discussed the concept of momentum, which replaces the gradient in the update with an exponential average of all past gradients, including the current one:\n", "\n", "$$\n", "\\begin{split}\n", "    m^{(t)} & = \\beta_1 m^{(t-1)} + (1 - \\beta_1)\\cdot g^{(t)}\\\\\n", "    w^{(t)} & = w^{(t-1)} - \\eta \\cdot m^{(t)}\\\\\n", "\\end{split}\n", "$$\n", "\n", "Let's also implement it below:"]}, {"cell_type": "code", "execution_count": 23, "id": "28a7d8c5", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:32.707228Z", "iopub.status.busy": "2021-09-16T12:37:32.706744Z", "iopub.status.idle": "2021-09-16T12:37:32.708337Z", "shell.execute_reply": "2021-09-16T12:37:32.708746Z"}, "lines_to_next_cell": 2, "papermill": {"duration": 0.10963, "end_time": "2021-09-16T12:37:32.708874", "exception": false, "start_time": "2021-09-16T12:37:32.599244", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class SGDMomentum(OptimizerTemplate):\n", "    def __init__(self, params, lr, momentum=0.0):\n", "        super().__init__(params, lr)\n", "        self.momentum = momentum  # Corresponds to beta_1 in the equation above\n", "        self.param_momentum = {p: torch.zeros_like(p.data) for p in self.params}  # Dict to store m_t\n", "\n", "    def update_param(self, p):\n", "        self.param_momentum[p] = (1 - self.momentum) * p.grad + self.momentum * self.param_momentum[p]\n", "        p_update = -self.lr * self.param_momentum[p]\n", "        p.add_(p_update)"]}, {"cell_type": "markdown", "id": "1dd09f27", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.103723, "end_time": "2021-09-16T12:37:32.921682", "exception": false, "start_time": 
"2021-09-16T12:37:32.817959", "status": "completed"}, "tags": []}, "source": ["Finally, we arrive at Adam.\n", "Adam combines the idea of momentum with an adaptive learning rate, which is based on an exponential average of the squared gradients, i.e. the gradient's norm.\n", "Furthermore, we add a bias correction for the momentum and adaptive learning rate for the first iterations:\n", "\n", "$$\n", "\\begin{split}\n", "    m^{(t)} & = \\beta_1 m^{(t-1)} + (1 - \\beta_1)\\cdot g^{(t)}\\\\\n", "    v^{(t)} & = \\beta_2 v^{(t-1)} + (1 - \\beta_2)\\cdot \\left(g^{(t)}\\right)^2\\\\\n", "    \\hat{m}^{(t)} & = \\frac{m^{(t)}}{1-\\beta^{t}_1}, \\hat{v}^{(t)} = \\frac{v^{(t)}}{1-\\beta^{t}_2}\\\\\n", "    w^{(t)} & = w^{(t-1)} - \\frac{\\eta}{\\sqrt{\\hat{v}^{(t)}} + \\epsilon}\\circ \\hat{m}^{(t)}\\\\\n", "\\end{split}\n", "$$\n", "\n", "Epsilon is a small constant used to improve numerical stability for very small gradient norms.\n", "Remember that the adaptive learning rate does not replace the learning\n", "rate hyperparameter $\\eta$, but rather acts as an extra factor and\n", "ensures that the updates of the various parameters have a similar magnitude."]}, {"cell_type": "code", "execution_count": 24, "id": "5405912f", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:33.132645Z", "iopub.status.busy": "2021-09-16T12:37:33.132120Z", "iopub.status.idle": "2021-09-16T12:37:33.134264Z", "shell.execute_reply": "2021-09-16T12:37:33.133784Z"}, "papermill": {"duration": 0.111044, "end_time": "2021-09-16T12:37:33.134374", "exception": false, "start_time": "2021-09-16T12:37:33.023330", "status": "completed"}, "tags": []}, "outputs": [], "source": ["class Adam(OptimizerTemplate):\n", "    def __init__(self, params, lr, beta1=0.9, beta2=0.999, eps=1e-8):\n", "        super().__init__(params, lr)\n", "        self.beta1 = beta1\n", "        self.beta2 = beta2\n", "        self.eps = eps\n", "        self.param_step = {p: 0 for p in self.params}  # Remembers \"t\" for each parameter for bias correction\n", "        
self.param_momentum = {p: torch.zeros_like(p.data) for p in self.params}\n", "        self.param_2nd_momentum = {p: torch.zeros_like(p.data) for p in self.params}\n", "\n", "    def update_param(self, p):\n", "        self.param_step[p] += 1\n", "\n", "        self.param_momentum[p] = (1 - self.beta1) * p.grad + self.beta1 * self.param_momentum[p]\n", "        self.param_2nd_momentum[p] = (1 - self.beta2) * (p.grad) ** 2 + self.beta2 * self.param_2nd_momentum[p]\n", "\n", "        bias_correction_1 = 1 - self.beta1 ** self.param_step[p]\n", "        bias_correction_2 = 1 - self.beta2 ** self.param_step[p]\n", "\n", "        p_2nd_mom = self.param_2nd_momentum[p] / bias_correction_2\n", "        p_mom = self.param_momentum[p] / bias_correction_1\n", "        p_lr = self.lr / (torch.sqrt(p_2nd_mom) + self.eps)\n", "        p_update = -p_lr * p_mom\n", "\n", "        p.add_(p_update)"]}, {"cell_type": "markdown", "id": "83ea9b45", "metadata": {"papermill": {"duration": 0.099947, "end_time": "2021-09-16T12:37:33.348929", "exception": false, "start_time": "2021-09-16T12:37:33.248982", "status": "completed"}, "tags": []}, "source": ["### Comparing optimizers on model training\n", "\n", "Now that we have implemented three optimizers (SGD, SGD with momentum, and Adam), we can start to analyze and compare them.\n", "First, we test them on how well they can optimize a neural network on the FashionMNIST dataset.\n", "We again use our linear network, this time with a ReLU activation and the Kaiming initialization, which we have found before to work well for ReLU-based networks.\n", "Note that the model is over-parameterized for this task, and we can achieve similar performance with a much smaller network (for example `100,100,100`).\n", "However, our main interest is in how well the optimizer can train *deep*\n", "neural networks, hence the over-parameterization."]}, {"cell_type": "code", "execution_count": 25, "id": "017e170c", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:33.553073Z", "iopub.status.busy": 
"2021-09-16T12:37:33.552597Z", "iopub.status.idle": "2021-09-16T12:37:33.565096Z", "shell.execute_reply": "2021-09-16T12:37:33.564688Z"}, "papermill": {"duration": 0.116391, "end_time": "2021-09-16T12:37:33.565206", "exception": false, "start_time": "2021-09-16T12:37:33.448815", "status": "completed"}, "tags": []}, "outputs": [], "source": ["base_model = BaseNetwork(act_fn=nn.ReLU(), hidden_sizes=[512, 256, 256, 128])\n", "kaiming_init(base_model)"]}, {"cell_type": "markdown", "id": "fe2e1e1d", "metadata": {"papermill": {"duration": 0.116162, "end_time": "2021-09-16T12:37:33.782066", "exception": false, "start_time": "2021-09-16T12:37:33.665904", "status": "completed"}, "tags": []}, "source": ["For a fair comparison, we train the exact same model with the same seed with the three optimizers below.\n", "Feel free to change the hyperparameters if you want (however, you have to train your own model then)."]}, {"cell_type": "code", "execution_count": 26, "id": "2d013e73", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:33.992549Z", "iopub.status.busy": "2021-09-16T12:37:33.992058Z", "iopub.status.idle": "2021-09-16T12:37:34.371016Z", "shell.execute_reply": "2021-09-16T12:37:34.370547Z"}, "papermill": {"duration": 0.48688, "end_time": "2021-09-16T12:37:34.371139", "exception": false, "start_time": "2021-09-16T12:37:33.884259", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Model file of \"FashionMNIST_SGD\" already exists. 
Skipping training...\n"]}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:34.140831\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
"\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["============= Test accuracy: 89.09% ==============\n", "\n"]}], "source": ["SGD_model = copy.deepcopy(base_model).to(device)\n", "SGD_results = train_model(\n", " SGD_model, \"FashionMNIST_SGD\", lambda params: SGD(params, lr=1e-1), max_epochs=40, batch_size=256\n", ")"]}, {"cell_type": "code", "execution_count": 27, "id": "448f5f88", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:34.592457Z", "iopub.status.busy": "2021-09-16T12:37:34.591978Z", "iopub.status.idle": "2021-09-16T12:37:34.958280Z", "shell.execute_reply": "2021-09-16T12:37:34.958659Z"}, "papermill": {"duration": 0.474961, "end_time": "2021-09-16T12:37:34.958809", "exception": false, "start_time": "2021-09-16T12:37:34.483848", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Model file of \"FashionMNIST_SGDMom\" already exists. Skipping training...\n"]}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:34.734286\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", 
"\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["============= Test accuracy: 88.83% ==============\n", "\n"]}], "source": ["SGDMom_model = copy.deepcopy(base_model).to(device)\n", "SGDMom_results = train_model(\n", " SGDMom_model,\n", " \"FashionMNIST_SGDMom\",\n", " lambda params: SGDMomentum(params, lr=1e-1, momentum=0.9),\n", " max_epochs=40,\n", " batch_size=256,\n", ")"]}, {"cell_type": "code", "execution_count": 28, "id": "097c35c5", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:35.182816Z", "iopub.status.busy": "2021-09-16T12:37:35.182331Z", "iopub.status.idle": "2021-09-16T12:37:35.546702Z", "shell.execute_reply": "2021-09-16T12:37:35.546288Z"}, "papermill": {"duration": 0.477353, "end_time": "2021-09-16T12:37:35.546818", "exception": false, "start_time": "2021-09-16T12:37:35.069465", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Model file of \"FashionMNIST_Adam\" already exists. 
Skipping training...\n"]}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:35.325603\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
"\n"], "text/plain": ["
"]}, "metadata": {}, "output_type": "display_data"}, {"name": "stdout", "output_type": "stream", "text": ["============= Test accuracy: 89.46% ==============\n", "\n"]}], "source": ["Adam_model = copy.deepcopy(base_model).to(device)\n", "Adam_results = train_model(\n", "    Adam_model, \"FashionMNIST_Adam\", lambda params: Adam(params, lr=1e-3), max_epochs=40, batch_size=256\n", ")"]}, {"cell_type": "markdown", "id": "3372cdee", "metadata": {"papermill": {"duration": 0.107962, "end_time": "2021-09-16T12:37:35.766302", "exception": false, "start_time": "2021-09-16T12:37:35.658340", "status": "completed"}, "tags": []}, "source": ["The result is that all optimizers perform similarly well with the given model.\n", "The differences are too small to draw any significant conclusion.\n", "However, keep in mind that this can also be attributed to the initialization we chose.\n", "When switching to a worse initialization (e.g. constant initialization), Adam usually proves to be more robust because of its adaptive learning rate.\n", "To show the specific benefits of the optimizers, we will continue to\n", "look at some possible loss surfaces in which momentum and adaptive\n", "learning rate are crucial."]}, {"cell_type": "markdown", "id": "6afe619e", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.10673, "end_time": "2021-09-16T12:37:35.980280", "exception": false, "start_time": "2021-09-16T12:37:35.873550", "status": "completed"}, "tags": []}, "source": ["### Pathological curvatures\n", "\n", "A pathological curvature is a type of surface that is similar to ravines and is particularly tricky for plain SGD optimization.\n", "In words, pathological curvatures typically have a steep gradient in one direction with an optimum at the center, while in a second direction we have a much smaller gradient towards a (global) optimum.\n", "Let's first create an example surface of this and visualize it:"]}, {"cell_type": "code", "execution_count": 29, "id": "5cfd764f", 
"metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:36.202397Z", "iopub.status.busy": "2021-09-16T12:37:36.201928Z", "iopub.status.idle": "2021-09-16T12:37:36.203949Z", "shell.execute_reply": "2021-09-16T12:37:36.203512Z"}, "papermill": {"duration": 0.113791, "end_time": "2021-09-16T12:37:36.204058", "exception": false, "start_time": "2021-09-16T12:37:36.090267", "status": "completed"}, "tags": []}, "outputs": [], "source": ["def pathological_curve_loss(w1, w2):\n", " # Example of a pathological curvature. There are many more possible, feel free to experiment here!\n", " x1_loss = torch.tanh(w1) ** 2 + 0.01 * torch.abs(w1)\n", " x2_loss = torch.sigmoid(w2)\n", " return x1_loss + x2_loss"]}, {"cell_type": "code", "execution_count": 30, "id": "c8827bdc", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:36.439443Z", "iopub.status.busy": "2021-09-16T12:37:36.436044Z", "iopub.status.idle": "2021-09-16T12:37:37.788861Z", "shell.execute_reply": "2021-09-16T12:37:37.789291Z"}, "papermill": {"duration": 1.475969, "end_time": "2021-09-16T12:37:37.789440", "exception": false, "start_time": "2021-09-16T12:37:36.313471", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["/tmp/ipykernel_879/1102210584.py:5: MatplotlibDeprecationWarning: Calling gca() with keyword arguments was deprecated in Matplotlib 3.4. Starting two minor releases later, gca() will take no keyword arguments. The gca() function should only be used to get the current axes, or if no axes exist, create new axes with default keyword arguments. 
To create a new axes with non-default arguments, use plt.axes() or plt.subplot().\n", " ax = fig.gca(projection=\"3d\") if plot_3d else fig.gca()\n"]}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:36.725553\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", "\n"], "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["def plot_curve(\n", "    curve_fn, x_range=(-5, 5), y_range=(-5, 5), plot_3d=False, cmap=cm.viridis, title=\"Pathological curvature\"\n", "):\n", "    fig = plt.figure()\n", "    # fig.gca(projection=...) is deprecated since Matplotlib 3.4, so create the 3D axes explicitly\n", "    ax = fig.add_subplot(projection=\"3d\") if plot_3d else fig.gca()\n", "\n", "    x = torch.arange(x_range[0], x_range[1], (x_range[1] - x_range[0]) / 100.0)\n", "    y = torch.arange(y_range[0], y_range[1], (y_range[1] - y_range[0]) / 100.0)\n", "    x, y = torch.meshgrid([x, y])\n", "    z = curve_fn(x, y)\n", "    x, y, z = x.numpy(), y.numpy(), z.numpy()\n", "\n", "    if plot_3d:\n", "        ax.plot_surface(x, y, z, cmap=cmap, linewidth=1, color=\"#000\", antialiased=False)\n", "        ax.set_zlabel(\"loss\")\n", "    else:\n", "        ax.imshow(z.T[::-1], cmap=cmap, extent=(x_range[0], x_range[1], y_range[0], y_range[1]))\n", "    plt.title(title)\n", "    ax.set_xlabel(r\"$w_1$\")\n", "    ax.set_ylabel(r\"$w_2$\")\n", "    plt.tight_layout()\n", "    return ax\n", "\n", "\n", "sns.reset_orig()\n", "_ = plot_curve(pathological_curve_loss, plot_3d=True)\n", "plt.show()"]}, {"cell_type": "markdown", "id": "19a861f0", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.142508, "end_time": "2021-09-16T12:37:38.074173", "exception": false, "start_time": "2021-09-16T12:37:37.931665", "status": "completed"}, "tags": []}, "source": ["In terms of optimization, you can imagine that $w_1$ and $w_2$ are weight parameters, and the curvature represents the loss surface over the space of $w_1$ and $w_2$.\n", "Note that in typical networks, we have many, many more parameters than two, and such curvatures can occur in multi-dimensional spaces as well.\n", "\n", "Ideally, our optimization algorithm would find the center of the ravine and focus on optimizing the parameters in the direction of $w_2$.\n", "However, if we encounter a point along the ridges, the gradient is much greater in $w_1$ than in $w_2$, and we might end up jumping from one side to the other.\n", 
"Due to the large gradients, we would have to reduce our learning rate, slowing down learning significantly.\n", "\n", "To test our algorithms, we can implement a simple function to train two parameters on such a surface:"]}, {"cell_type": "code", "execution_count": 31, "id": "662b6ead", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:38.364889Z", "iopub.status.busy": "2021-09-16T12:37:38.364415Z", "iopub.status.idle": "2021-09-16T12:37:38.366481Z", "shell.execute_reply": "2021-09-16T12:37:38.366082Z"}, "papermill": {"duration": 0.148481, "end_time": "2021-09-16T12:37:38.366588", "exception": false, "start_time": "2021-09-16T12:37:38.218107", "status": "completed"}, "tags": []}, "outputs": [], "source": ["def train_curve(optimizer_func, curve_func=pathological_curve_loss, num_updates=100, init=[5, 5]):\n", "    \"\"\"\n", "    Args:\n", "        optimizer_func: Constructor of the optimizer to use. Should only take a parameter list\n", "        curve_func: Loss function (e.g. pathological curvature)\n", "        num_updates: Number of updates/steps to take when optimizing\n", "        init: Initial values of parameters. 
Must be a list/tuple with two elements representing w_1 and w_2\n", "    Returns:\n", "        Numpy array of shape [num_updates, 3] with [t,:2] being the parameter values at step t, and [t,2] the loss at t.\n", "    \"\"\"\n", "    weights = nn.Parameter(torch.FloatTensor(init), requires_grad=True)\n", "    optim = optimizer_func([weights])\n", "\n", "    list_points = []\n", "    for _ in range(num_updates):\n", "        loss = curve_func(weights[0], weights[1])\n", "        list_points.append(torch.cat([weights.data.detach(), loss.unsqueeze(dim=0).detach()], dim=0))\n", "        optim.zero_grad()\n", "        loss.backward()\n", "        optim.step()\n", "    points = torch.stack(list_points, dim=0).numpy()\n", "    return points"]}, {"cell_type": "markdown", "id": "41e1a6e8", "metadata": {"papermill": {"duration": 0.141299, "end_time": "2021-09-16T12:37:38.655636", "exception": false, "start_time": "2021-09-16T12:37:38.514337", "status": "completed"}, "tags": []}, "source": ["Next, let's apply the different optimizers on our curvature.\n", "Note that we set a much higher learning rate for the optimization algorithms than you would in a standard neural network.\n", "This is because we only have 2 parameters instead of tens of thousands or even millions."]}, {"cell_type": "code", "execution_count": 32, "id": "1a61ad62", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:38.942262Z", "iopub.status.busy": "2021-09-16T12:37:38.940315Z", "iopub.status.idle": "2021-09-16T12:37:39.029692Z", "shell.execute_reply": "2021-09-16T12:37:39.029221Z"}, "papermill": {"duration": 0.232447, "end_time": "2021-09-16T12:37:39.029820", "exception": false, "start_time": "2021-09-16T12:37:38.797373", "status": "completed"}, "tags": []}, "outputs": [], "source": ["SGD_points = train_curve(lambda params: SGD(params, lr=10))\n", "SGDMom_points = train_curve(lambda params: SGDMomentum(params, lr=10, momentum=0.9))\n", "Adam_points = train_curve(lambda params: Adam(params, lr=1))"]}, {"cell_type": "markdown", "id": "b4446416", "metadata": 
{"papermill": {"duration": 0.141585, "end_time": "2021-09-16T12:37:39.314123", "exception": false, "start_time": "2021-09-16T12:37:39.172538", "status": "completed"}, "tags": []}, "source": ["To best understand how the different algorithms behave, we visualize their update steps as a line plot over the loss surface.\n", "We will stick with a 2D representation for readability."]}, {"cell_type": "code", "execution_count": 33, "id": "975c9184", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:39.614947Z", "iopub.status.busy": "2021-09-16T12:37:39.612301Z", "iopub.status.idle": "2021-09-16T12:37:39.981201Z", "shell.execute_reply": "2021-09-16T12:37:39.981582Z"}, "papermill": {"duration": 0.52499, "end_time": "2021-09-16T12:37:39.981743", "exception": false, "start_time": "2021-09-16T12:37:39.456753", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", " 
\n", "\n"], "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["all_points = np.concatenate([SGD_points, SGDMom_points, Adam_points], axis=0)\n", "ax = plot_curve(\n", "    pathological_curve_loss,\n", "    x_range=(-np.absolute(all_points[:, 0]).max(), np.absolute(all_points[:, 0]).max()),\n", "    y_range=(all_points[:, 1].min(), all_points[:, 1].max()),\n", "    plot_3d=False,\n", ")\n", "ax.plot(SGD_points[:, 0], SGD_points[:, 1], color=\"red\", marker=\"o\", zorder=1, label=\"SGD\")\n", "ax.plot(SGDMom_points[:, 0], SGDMom_points[:, 1], color=\"blue\", marker=\"o\", zorder=2, label=\"SGDMom\")\n", "ax.plot(Adam_points[:, 0], Adam_points[:, 1], color=\"grey\", marker=\"o\", zorder=3, label=\"Adam\")\n", "plt.legend()\n", "plt.show()"]}, {"cell_type": "markdown", "id": "bb33cfd9", "metadata": {"papermill": {"duration": 0.144924, "end_time": "2021-09-16T12:37:40.273634", "exception": false, "start_time": "2021-09-16T12:37:40.128710", "status": "completed"}, "tags": []}, "source": ["We can clearly see that SGD is not able to find the center of the optimization curve and struggles to converge due to the steep gradients in $w_1$.\n", "In contrast, Adam and SGD with momentum converge nicely, as the alternating update directions in $w_1$ cancel each other out.\n", "On such surfaces, it is crucial to use momentum."]}, {"cell_type": "markdown", "id": "e016c2c2", "metadata": {"lines_to_next_cell": 2, "papermill": {"duration": 0.143536, "end_time": "2021-09-16T12:37:40.562572", "exception": false, "start_time": "2021-09-16T12:37:40.419036", "status": "completed"}, "tags": []}, "source": ["### Steep optima\n", "\n", "A second type of challenging loss surface is one with a steep optimum.\n", "Here, a large part of the surface has very small gradients, while the gradients around the optimum are very large.\n", "For instance, take the following loss surfaces:"]}, {"cell_type": "code", "execution_count": 34, "id": "cd6871bc", "metadata": {"execution": 
{"iopub.execute_input": "2021-09-16T12:37:40.872544Z", "iopub.status.busy": "2021-09-16T12:37:40.865830Z", "iopub.status.idle": "2021-09-16T12:37:42.121143Z", "shell.execute_reply": "2021-09-16T12:37:42.121523Z"}, "papermill": {"duration": 1.413284, "end_time": "2021-09-16T12:37:42.121671", "exception": false, "start_time": "2021-09-16T12:37:40.708387", "status": "completed"}, "tags": []}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["/tmp/ipykernel_879/1102210584.py:5: MatplotlibDeprecationWarning: Calling gca() with keyword arguments was deprecated in Matplotlib 3.4. Starting two minor releases later, gca() will take no keyword arguments. The gca() function should only be used to get the current axes, or if no axes exist, create new axes with default keyword arguments. To create a new axes with non-default arguments, use plt.axes() or plt.subplot().\n", " ax = fig.gca(projection=\"3d\") if plot_3d else fig.gca()\n"]}, {"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:41.115819\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n"], "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["def bivar_gaussian(w1, w2, x_mean=0.0, y_mean=0.0, x_sig=1.0, y_sig=1.0):\n", " norm = 1 / (2 * np.pi * x_sig * y_sig)\n", " x_exp = (-1 * (w1 - x_mean) ** 2) / (2 * x_sig ** 2)\n", " y_exp = (-1 * (w2 - y_mean) ** 2) / (2 * y_sig ** 2)\n", " return norm * torch.exp(x_exp + y_exp)\n", "\n", "\n", "def comb_func(w1, w2):\n", " z = -bivar_gaussian(w1, w2, x_mean=1.0, y_mean=-0.5, x_sig=0.2, y_sig=0.2)\n", " z -= bivar_gaussian(w1, w2, x_mean=-1.0, y_mean=0.5, x_sig=0.2, y_sig=0.2)\n", " z -= bivar_gaussian(w1, w2, x_mean=-0.5, y_mean=-0.8, x_sig=0.2, y_sig=0.2)\n", " return z\n", "\n", "\n", "_ = plot_curve(comb_func, x_range=(-2, 2), y_range=(-2, 2), plot_3d=True, title=\"Steep optima\")"]}, {"cell_type": "markdown", "id": "c8f39870", "metadata": {"papermill": {"duration": 0.178093, "end_time": "2021-09-16T12:37:42.477403", "exception": false, "start_time": "2021-09-16T12:37:42.299310", "status": "completed"}, "tags": []}, "source": ["Most of the loss surface has very little to no gradients.\n", "However, close to the optima, we have very steep gradients.\n", "To reach the minimum when starting in a region with lower gradients, we expect an adaptive learning rate to be crucial.\n", "To verify this hypothesis, we can run our three optimizers on the surface:"]}, {"cell_type": "code", "execution_count": 35, "id": "6c6e55f0", "metadata": {"execution": {"iopub.execute_input": "2021-09-16T12:37:42.839406Z", "iopub.status.busy": "2021-09-16T12:37:42.838932Z", "iopub.status.idle": "2021-09-16T12:37:43.565142Z", "shell.execute_reply": "2021-09-16T12:37:43.565521Z"}, "papermill": {"duration": 0.910446, "end_time": "2021-09-16T12:37:43.565669", "exception": false, "start_time": "2021-09-16T12:37:42.655223", "status": "completed"}, "tags": []}, "outputs": [{"data": {"application/pdf": "\n", "image/svg+xml": ["\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2021-09-16T14:37:43.302510\n", 
" image/svg+xml\n", " \n", " \n", " Matplotlib v3.4.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", "\n"], "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["# Train each optimizer from the same starting point (0, 0), which lies in the flat region\n", "SGD_points = train_curve(lambda params: SGD(params, lr=0.5), comb_func, init=[0, 0])\n", "SGDMom_points = train_curve(lambda params: SGDMomentum(params, lr=1, momentum=0.9), comb_func, init=[0, 0])\n", "Adam_points = train_curve(lambda params: Adam(params, lr=0.2), comb_func, init=[0, 0])\n", "\n", "all_points = np.concatenate([SGD_points, SGDMom_points, Adam_points], axis=0)\n", "# Overlay the optimization trajectories on the 2D view of the loss surface\n", "ax = plot_curve(comb_func, x_range=(-2, 2), y_range=(-2, 2), plot_3d=False, title=\"Steep optima\")\n", "ax.plot(SGD_points[:, 0], SGD_points[:, 1], color=\"red\", marker=\"o\", zorder=3, label=\"SGD\", alpha=0.7)\n", "ax.plot(SGDMom_points[:, 0], SGDMom_points[:, 1], color=\"blue\", marker=\"o\", zorder=2, label=\"SGDMom\", alpha=0.7)\n", "ax.plot(Adam_points[:, 0], Adam_points[:, 1], color=\"grey\", marker=\"o\", zorder=1, label=\"Adam\", alpha=0.7)\n", "ax.set_xlim(-2, 2)\n", "ax.set_ylim(-2, 2)\n", "plt.legend()\n", "plt.show()"]}, {"cell_type": "markdown", "id": "ab9847ca", "metadata": {"papermill": {"duration": 0.180096, "end_time": "2021-09-16T12:37:43.927376", "exception": false, "start_time": "2021-09-16T12:37:43.747280", "status": "completed"}, "tags": []}, "source": ["SGD first takes very small steps until it touches the border of an optimum.\n", "Once it reaches a point around $(-0.75,-0.5)$, the gradient direction changes and pushes the parameters to $(0.8,0.5)$, from which SGD can no longer recover (or only after a very large number of steps).\n", "SGD with momentum has a similar problem, except that it continues in the direction it had when it touched the optimum.\n", "The gradients at this time step are so much larger than at any other point that they overpower the momentum $m_t$.\n", "Finally, Adam is able to converge to the optimum, showing the importance of adaptive learning rates."]}, {"cell_type": "markdown", "id": "4fadc8da", "metadata": {"papermill": {"duration": 0.179249, "end_time": 
"2021-09-16T12:37:44.285856", "exception": false, "start_time": "2021-09-16T12:37:44.106607", "status": "completed"}, "tags": []}, "source": ["### What optimizer to take\n", "\n", "After seeing the results on optimization, what is our conclusion?\n", "Should we always use Adam and never look at SGD anymore?\n", "The short answer: no.\n", "There are many papers saying that in certain situations, SGD (with momentum) generalizes better where Adam often tends to overfit [5,6].\n", "This is related to the idea of finding wider optima.\n", "For instance, see the illustration of different optima below (credit: [Keskar et al., 2017](https://arxiv.org/pdf/1609.04836.pdf)):\n", "\n", "
\n", "\n", "The black line represents the training loss surface, while the dotted red line is the test loss.\n", "Finding sharp, narrow minima can be helpful for reaching the minimal training loss.\n", "However, this doesn't mean that it also minimizes the test loss, since flat minima in particular have been shown to generalize better.\n", "You can imagine that the test dataset has a slightly shifted loss surface because it contains different examples than the training set.\n", "A small shift can have a significant influence on the loss at sharp minima, while flat minima are generally more robust to it.\n", "\n", "In the next tutorial, we will see that some network types can still be better optimized with SGD and learning rate scheduling than with Adam.\n", "Nevertheless, Adam is the most commonly used optimizer in Deep Learning\n", "as it often performs better than other optimizers, especially for deep\n", "networks."]}, {"cell_type": "markdown", "id": "ea6d2081", "metadata": {"papermill": {"duration": 0.180966, "end_time": "2021-09-16T12:37:44.645609", "exception": false, "start_time": "2021-09-16T12:37:44.464643", "status": "completed"}, "tags": []}, "source": ["## Conclusion\n", "\n", "In this tutorial, we have looked at initialization and optimization techniques for neural networks.\n", "We have seen that a good initialization has to balance preserving the gradient variance with preserving the activation variance.\n", "This can be achieved with the Xavier initialization for tanh-based networks, and the Kaiming initialization for ReLU-based networks.\n", "In optimization, concepts like momentum and adaptive learning rates can help with challenging loss surfaces but don't guarantee an increase in performance for neural networks.\n", "\n", "\n", "## References\n", "\n", "[1] Glorot, Xavier, and Yoshua Bengio.\n", "\"Understanding the difficulty of training deep feedforward neural networks.\"\n", "Proceedings of the thirteenth international conference on artificial intelligence and 
statistics.\n", "2010.\n", "[link](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf)\n", "\n", "[2] He, Kaiming, et al.\n", "\"Delving deep into rectifiers: Surpassing human-level performance on imagenet classification.\"\n", "Proceedings of the IEEE international conference on computer vision.\n", "2015.\n", "[link](https://www.cv-foundation.org/openaccess/content_iccv_2015/html/He_Delving_Deep_into_ICCV_2015_paper.html)\n", "\n", "[3] Kingma, Diederik P. & Ba, Jimmy.\n", "\"Adam: A Method for Stochastic Optimization.\"\n", "Proceedings of the third international conference for learning representations (ICLR).\n", "2015.\n", "[link](https://arxiv.org/abs/1412.6980)\n", "\n", "[4] Keskar, Nitish Shirish, et al.\n", "\"On large-batch training for deep learning: Generalization gap and sharp minima.\"\n", "Proceedings of the fifth international conference for learning representations (ICLR).\n", "2017.\n", "[link](https://arxiv.org/abs/1609.04836)\n", "\n", "[5] Wilson, Ashia C., et al.\n", "\"The Marginal Value of Adaptive Gradient Methods in Machine Learning.\"\n", "Advances in neural information processing systems.\n", "2017.\n", "[link](https://papers.nips.cc/paper/7003-the-marginal-value-of-adaptive-gradient-methods-in-machine-learning.pdf)\n", "\n", "[6] Ruder, Sebastian.\n", "\"An overview of gradient descent optimization algorithms.\"\n", "arXiv preprint.\n", "2017.\n", "[link](https://arxiv.org/abs/1609.04747)"]}, {"cell_type": "markdown", "id": "89adecb4", "metadata": {"papermill": {"duration": 0.178588, "end_time": "2021-09-16T12:37:45.003498", "exception": false, "start_time": "2021-09-16T12:37:44.824910", "status": "completed"}, "tags": []}, "source": ["## Congratulations - Time to Join the Community!\n", "\n", "Congratulations on completing this notebook tutorial! 
If you enjoyed this and would like to join the Lightning\n", "movement, you can do so in the following ways!\n", "\n", "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool\n", "tools we're building.\n", "\n", "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-pw5v393p-qRaDgEk24~EjiZNBpSQFgQ)!\n", "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself\n", "and share your interests in the `#general` channel.\n", "\n", "\n", "### Contributions!\n", "The best way to contribute to our community is to become a code contributor! At any time you can go to the\n", "[Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolt](https://github.com/PyTorchLightning/lightning-bolts)\n", "GitHub Issues pages and filter for \"good first issue\".\n", "\n", "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", "* [Bolt good first issue](https://github.com/PyTorchLightning/lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", "* You can also contribute your own notebooks with useful examples!\n", "\n", "### Great thanks from the entire PyTorch Lightning Team for your interest!\n", "\n", "![Pytorch Lightning](){height=\"60px\" width=\"240px\"}"]}, {"cell_type": "raw", "metadata": {"raw_mimetype": "text/restructuredtext"}, "source": [".. customcarditem::\n", " :header: Tutorial 3: Initialization and Optimization\n", " :card_description: In this tutorial, we will review techniques for optimization and initialization of neural networks. 
When increasing the depth of neural networks, there are various challenges...\n", " :tags: Image,Initialization,Optimizers,GPU/TPU,UvA-DL-Course\n", " :image: _static/images/course_UvA-DL/03-initialization-and-optimization.jpg"]}], "metadata": {"jupytext": {"cell_metadata_filter": "colab_type,id,colab,-all", "formats": "ipynb,py:percent", "main_language": "python"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7"}, "papermill": {"default_parameters": {}, "duration": 101.670054, "end_time": "2021-09-16T12:37:45.790248", "environment_variables": {}, "exception": null, "input_path": "course_UvA-DL/03-initialization-and-optimization/Initialization_and_Optimization.ipynb", "output_path": ".notebooks/course_UvA-DL/03-initialization-and-optimization.ipynb", "parameters": {}, "start_time": "2021-09-16T12:36:04.120194", "version": "2.3.3"}}, "nbformat": 4, "nbformat_minor": 5}