# Lightning 2.1: Train Bigger, Better, Faster

*October 12, 2023 · JP Hennessy*

[Lightning AI](https://lightning.ai/) is excited to announce the release of Lightning 2.1 ⚡. The theme this time around is "bigger, better, faster".

**Bigger** because training large multi-billion-parameter models has gotten even more efficient thanks to FSDP, efficient initialization, and sharded-checkpointing improvements.

**Better** because it's easier than ever to scale models without making substantial code changes or installing third-party packages.

**Faster** because it leverages the latest hardware features to speed up training in low-bit precision, thanks to new precision plugins like [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) and [Transformer Engine](https://docs.nvidia.com/deeplearning/transformer-engine).

All of these goodies are available both in [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable) and [Lightning Fabric](https://lightning.ai/docs/fabric/stable). Don't know what Fabric is? It's the latest addition to Lightning's family of tools: a fast and lightweight way to scale PyTorch models without boilerplate.
You can convert PyTorch code to Fabric in just 5 lines and get access to SOTA distributed training features (DDP, FSDP, DeepSpeed, mixed precision, and more) while maintaining full control over your training loop.

## Upgrade to 2.1

Here is how you upgrade:

```bash
pip install -U lightning
```

Or, if you're using the older pytorch-lightning package:

```bash
pip install -U pytorch-lightning
```

Upgrading from 2.0 to 2.1 won't require any code changes. If you're upgrading from a version prior to 2.0, follow our [migration guide](https://lightning.ai/docs/pytorch/stable/upgrade/migration_guide.html).

Here are the big highlights we want you to try out in the new release.

## Improvements to Large-Scale Training With FSDP

With FSDP, you can train large billion-parameter models that aren't able to fit in a single GPU or even a single machine.
In this release, the FSDP strategy gets substantial improvements and new features: it is now more user-friendly to configure, has memory-management and speed improvements, and we have a brand-new end-to-end user guide with best practices ([Trainer](https://lightning.ai/docs/pytorch/latest/advanced/model_parallel/fsdp.html), [Fabric](https://lightning.ai/docs/fabric/latest/advanced/model_parallel/fsdp.html)).

### Efficient Saving and Loading of Large Checkpoints

When training large billion-parameter models with FSDP, saving and resuming training, or even just loading model parameters for finetuning, can be challenging: users are often plagued by out-of-memory errors and speed bottlenecks. Starting with saving checkpoints, we added support for distributed/sharded checkpoints, enabled through the `state_dict_type` setting in the strategy.

**Trainer:**

```python
import lightning as L
from lightning.pytorch.strategies import FSDPStrategy

# Default used by the strategy
strategy = FSDPStrategy(state_dict_type="full")

# Enable saving distributed checkpoints
strategy = FSDPStrategy(state_dict_type="sharded")

trainer = L.Trainer(strategy=strategy, ...)
```

**Fabric:**

```python
import lightning as L
from lightning.fabric.strategies import FSDPStrategy

# Saving distributed checkpoints is the default
strategy = FSDPStrategy(state_dict_type="sharded")

# Save consolidated (single-file) checkpoints
strategy = FSDPStrategy(state_dict_type="full")

fabric = L.Fabric(strategy=strategy, ...)
```
Distributed checkpoints are the fastest and most memory-efficient way to save the state of very large models. The distributed checkpoint format also makes it efficient to load these checkpoints back for resuming training in parallel, and it significantly reduces the impact on CPU memory usage. Furthermore, we've also introduced lazy loading for non-distributed checkpoints, which greatly reduces the impact on CPU memory usage when loading a consolidated (single-file) checkpoint (e.g., for finetuning). Learn more about these features in our FSDP guides ([Trainer](https://lightning.ai/docs/pytorch/latest/advanced/model_parallel/fsdp.html), [Fabric](https://lightning.ai/docs/fabric/latest/advanced/model_parallel/fsdp.html)).

### Fast and Memory-Optimized Initialization

A major challenge that users face when working with large models such as LLMs is dealing with their extreme memory requirements. Even something as simple as instantiating a model becomes non-trivial if the model is so large it won't fit in a single GPU or even a single machine.
In Lightning 2.1, we are introducing empty-weights initialization through the `Fabric.init_module()` and `Trainer.init_module()`/`LightningModule.configure_model()` methods.

**Trainer:**

```python
import lightning as L

class MyModel(L.LightningModule):
    def __init__(self):
        super().__init__()
        # Delay initialization of the model to `configure_model()`

    def configure_model(self):
        # Model initialized in the correct precision, with weights on the meta-device
        self.model = ...

    ...

model = MyModel()
trainer = L.Trainer(strategy="fsdp", ...)
trainer.fit(model)
```

**Fabric:**

```python
import lightning as L

fabric = L.Fabric(strategy="fsdp", ...)

# Model initialized in the correct precision, with weights on the meta-device
with fabric.init_module(empty_init=True):
    model = ...

# You can also initialize buffers and tensors directly on device and dtype
with fabric.init_tensor():
    model.mask.create()
    model.kv_cache.create()
    x = torch.randn(4, 128)

# Materialization and sharding of the model happen in here
model = fabric.setup(model)
```

Read more about this new feature and its other benefits in our docs ([Trainer](https://lightning.ai/docs/pytorch/latest/advanced/model_init.html), [Fabric](https://lightning.ai/docs/fabric/latest/advanced/model_init.html)).
### User-Friendly Configuration

We made it super easy to configure the sharding and activation-checkpointing policies when you want to auto-wrap particular layers of your model for advanced control.

**Before:**

```python
import lightning as L
from lightning.pytorch.strategies import FSDPStrategy
from torch.distributed.fsdp.wrap import ModuleWrapPolicy

strategy = FSDPStrategy(auto_wrap_policy=ModuleWrapPolicy({MyTransformerBlock}))
trainer = L.Trainer(strategy=strategy, ...)
```

**After:**

```python
import lightning as L
from lightning.pytorch.strategies import FSDPStrategy

strategy = FSDPStrategy(auto_wrap_policy={MyTransformerBlock})
trainer = L.Trainer(strategy=strategy, ...)
```

## True Half-Precision

Lightning now supports true half-precision for training and inference with all built-in strategies. With this setting, the memory required to store the model weights is only half of what is normally needed when running with float32.
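The 2× saving in weight memory is easy to verify in plain PyTorch (a quick sketch; the layer size is arbitrary):

```python
import torch

layer_fp32 = torch.nn.Linear(1024, 1024)
layer_bf16 = torch.nn.Linear(1024, 1024).to(torch.bfloat16)

def param_bytes(module):
    # Total bytes occupied by a module's parameters
    return sum(p.numel() * p.element_size() for p in module.parameters())

print(param_bytes(layer_fp32))  # 4198400 bytes (float32: 4 bytes per parameter)
print(param_bytes(layer_bf16))  # 2099200 bytes (bfloat16: 2 bytes per parameter)
```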
In addition, you get the same speed benefits as mixed-precision training (`precision="16-mixed"`):

```python
import lightning as L

# default
trainer = L.Trainer(precision="32-true")

# train with model weights in `torch.float16`
trainer = L.Trainer(precision="16-true")

# train with model weights in `torch.bfloat16`
# (if hardware supports it)
trainer = L.Trainer(precision="bf16-true")
```

The same settings are also available in Fabric! We recommend trying bfloat16 training (`precision="bf16-true"`), as it is often more numerically stable than regular 16-bit precision (`precision="16-true"`).

## Bitsandbytes Quantization

With the new [Bitsandbytes precision plugin](https://lightning.ai/docs/pytorch/latest/common/precision_intermediate.html#quantization-via-bitsandbytes), you can now quantize your model for significant memory savings during training, finetuning, or inference, with a selection of state-of-the-art quantization algorithms (int8, fp4, nf4, and more).
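To get a feel for the savings, here's a rough back-of-envelope for weight memory at different precisions, using a hypothetical 7-billion-parameter model and ignoring the small overhead of quantization constants that schemes like nf4-dq add (as well as activations and optimizer state):

```python
# Approximate weight memory for a hypothetical 7B-parameter model.
n_params = 7_000_000_000
bits_per_weight = {"float32": 32, "float16/bfloat16": 16, "int8": 8, "nf4/fp4": 4}

for name, bits in bits_per_weight.items():
    gib = n_params * bits / 8 / 1024**3
    print(f"{name:>17}: {gib:5.1f} GiB")
```

Going from float32 weights to nf4 is roughly an 8× reduction, which is what makes finetuning and inference of such models feasible on a single consumer GPU.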
For the first time, Trainer and Fabric make [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) easy to use for general models.

**Trainer:**

```python
import lightning as L
from lightning.pytorch.plugins import BitsandbytesPrecisionPlugin

# this will pick the compute dtype automatically, by default `bfloat16`
precision = BitsandbytesPrecisionPlugin("nf4-dq")
trainer = L.Trainer(plugins=precision)
```

**Fabric:**

```python
import lightning as L
from lightning.fabric.plugins import BitsandbytesPrecision

# this will pick the compute dtype automatically, by default `bfloat16`
precision = BitsandbytesPrecision("nf4-dq")
fabric = L.Fabric(plugins=precision)
```

## Transformer Engine

The [Transformer Engine by NVIDIA](https://docs.nvidia.com/deeplearning/transformer-engine) is a library for accelerating transformer layers on the new Hopper (H100) generation of GPUs.
With the integration in the Lightning Trainer and Fabric, you have easy access to 8-bit mixed precision for significant speed-ups:

**Trainer:**

```python
import lightning as L

# Select 8-bit mixed precision via TransformerEngine, with model weights in float16
trainer = L.Trainer(precision="transformer-engine-float16")
```

**Fabric:**

```python
import lightning as L

# Select 8-bit mixed precision via TransformerEngine, with model weights in float16
fabric = L.Fabric(precision="transformer-engine-float16")
```

More configuration options are available through the respective plugins in [Trainer](https://lightning.ai/docs/pytorch/latest/common/precision_intermediate.html#float8-mixed-precision-via-nvidia-s-transformerengine) and [Fabric](https://lightning.ai/docs/fabric/latest/fundamentals/precision.html#float8-mixed-precision-via-nvidia-s-transformerengine).

## Lightning on TPU Goes Brrr

Lightning 2.1 runs on the latest generation of TPU hardware on Google Cloud! TPU-v4 and TPU-v5 are now fully supported both in Fabric and the Trainer, and they run using the new PjRT runtime by default.
PjRT is the runtime used by JAX, and it has shown an average improvement of 35% on benchmarks.

**Trainer:**

```python
import lightning as L

trainer = L.Trainer(accelerator="tpu", devices=8)
model = MyModel()
trainer.fit(model)  # uses PjRT if available
```

**Fabric:**

```python
import lightning as L

def train(fabric):
    ...

fabric = L.Fabric(accelerator="tpu")
fabric.launch(train)  # uses PjRT if available
```

And what's even more exciting: you can now scale massive multi-billion-parameter models on TPUs using FSDP.

```python
import lightning as L
from lightning.fabric.strategies import XLAFSDPStrategy

strategy = XLAFSDPStrategy(
    # Most arguments from the PyTorch-native FSDP strategy are also available here!
    auto_wrap_policy={Block},
    activation_checkpointing_policy={Block},
    state_dict_type="full",
    sequential_save=True,
)

fabric = L.Fabric(devices=8, strategy=strategy)
fabric.launch(finetune)
```

You can find a full end-to-end finetuning example script in our [Lit-GPT repository](https://github.com/Lightning-AI/lit-gpt/blob/main/xla/finetune/adapter.py).
The new XLA-FSDP strategy is experimental and currently only available in Fabric. Support in the Trainer will follow in the future.

## Granular Control Over Checkpoints in Fabric

Several improvements for checkpoint saving and loading have landed in Fabric, enabling more fine-grained control over what is saved and loaded while reducing boilerplate code.

There is a new `Fabric.load_raw()` method with which you can load model or optimizer state-dicts saved externally by a non-Fabric application (e.g., raw PyTorch):

```python
import lightning as L

fabric = L.Fabric()
model = MyModel()

# A model weights file saved by your friend who doesn't use Fabric
fabric.load_raw("path/to/model.pt", model)

# Equivalent to this:
# model.load_state_dict(torch.load("path/to/model.pt"))
```

There is also a new `strict` parameter, `Fabric.load(..., strict=True|False)`, to disable strict loading:

```python
import lightning as L

fabric = L.Fabric()
model = MyModel()
state = {"model": model}

# strict loading is the default
fabric.load("path/to/checkpoint.ckpt", state, strict=True)

# disable strict loading
fabric.load("path/to/checkpoint.ckpt", state, strict=False)
```

Finally, a new `filter` parameter, `Fabric.save(..., filter=...)`, enables you to exclude certain parameters of your model without writing boilerplate code for it:

```python
import lightning as L

fabric = L.Fabric()
model, optimizer = ...

state = {"model": model, "optimizer": optimizer, "foo": 123}

# save only the weights whose name matches a pattern
filter = {"model": lambda k, v: "weight" in k}
fabric.save("path/to/checkpoint.ckpt", state, filter=filter)
```

You can read more about the new options in our [checkpoint guide](https://lightning.ai/docs/fabric/stable/guide/checkpoint/checkpoint.html).

## Conclusion

Our vision here at Lightning is to make deep learning accessible to everyone, enabling both beginners and experts to contribute to advancing the state of the art in AI research, or to just build cool stuff. With 2.1, we're putting new tools for making models large and efficient to train into the hands of our users, so that they can invest the time they used to spend debugging boilerplate code elsewhere.

Lightning is built by the community, for the community. We want to thank the 75+ developers who have contributed code to 2.1, and the hundreds of users who gave us feedback.

Make sure to join our [Discord](https://discord.gg/XncpTy7DSt) if you have any questions or want to chat about Lightning.