.. _remote_fs:

##################
Remote Filesystems
##################

PyTorch Lightning enables working with data from a variety of filesystems, including local filesystems and several cloud storage providers such as
`S3 <https://aws.amazon.com/s3/>`_ on `AWS <https://aws.amazon.com/>`_, `GCS <https://cloud.google.com/storage>`_ on `Google Cloud <https://cloud.google.com/>`_,
or `ADL <https://azure.microsoft.com/solutions/data-lake/>`_ on `Azure <https://azure.microsoft.com/>`_.

This applies to saving and loading checkpoints, as well as to logging.
Working with a different filesystem is a matter of prefixing file paths with a protocol such as ``s3://`` when reading and writing data.

.. code-block:: python

    from lightning.pytorch import Trainer

    # `default_root_dir` is the default path used for logs and checkpoints
    trainer = Trainer(default_root_dir="s3://my_bucket/data/")
    trainer.fit(model)

For logging, remote filesystem support depends on the particular logger integration being used.
Consult :ref:`the documentation of the individual logger ` for more details.

.. code-block:: python

    from lightning.pytorch.loggers import TensorBoardLogger

    logger = TensorBoardLogger(save_dir="s3://my_bucket/logs/")

    trainer = Trainer(logger=logger)
    trainer.fit(model)

Additionally, you can resume training from a checkpoint stored on a remote filesystem.

.. code-block:: python

    trainer = Trainer(default_root_dir="s3://my_bucket/data/", max_steps=3)
    trainer.fit(model, ckpt_path="s3://my_bucket/ckpts/classifier.ckpt")

PyTorch Lightning uses `fsspec <https://filesystem-spec.readthedocs.io/en/latest/>`_ internally to handle all filesystem operations.

The most common filesystems supported by Lightning are:

* Local filesystem: ``file://`` - It's the default and doesn't need any protocol to be used. It's installed by default in Lightning.
* Amazon S3: ``s3://`` - Amazon S3 remote binary store, using the library `s3fs <https://s3fs.readthedocs.io/en/latest/>`__. Run ``pip install fsspec[s3]`` to install it.
* Google Cloud Storage: ``gcs://`` or ``gs://`` - Google Cloud Storage, using `gcsfs <https://gcsfs.readthedocs.io/en/latest/>`__. Run ``pip install fsspec[gcs]`` to install it.
* Microsoft Azure Storage: ``adl://``, ``abfs://`` or ``az://`` - Microsoft Azure Storage, using `adlfs <https://github.com/fsspec/adlfs>`__. Run ``pip install fsspec[adl]`` to install it.
* Hadoop File System: ``hdfs://`` - Hadoop Distributed File System. This uses `PyArrow <https://arrow.apache.org/>`__ as the backend. Run ``pip install fsspec[hdfs]`` to install it.

You can learn more about the available filesystems with:

.. code-block:: python

    from fsspec.registry import known_implementations

    print(known_implementations)

You can also look into the :ref:`CheckpointIO Plugin ` for more details on how to customize saving and loading checkpoints.
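
Because Lightning delegates all of these operations to fsspec, you can also use fsspec directly to sanity-check that a remote path is reachable before launching a run. The snippet below is a minimal sketch rather than part of the Lightning API; it assumes ``pip install fsspec[s3]`` has been run, that AWS credentials are available through the usual mechanisms (environment variables or ``~/.aws/credentials``), and that ``my_bucket`` and the ``_connectivity_check.txt`` key are placeholders.

.. code-block:: python

    import fsspec

    # Resolve the URL to a concrete filesystem object and a path within it.
    fs, root = fsspec.core.url_to_fs("s3://my_bucket/data/")
    print(type(fs).__name__)  # e.g. S3FileSystem when s3fs is installed
    print(fs.exists(root))  # confirms the prefix is visible with your credentials

    # Round-trip a small file to confirm write access (placeholder key name).
    with fsspec.open("s3://my_bucket/data/_connectivity_check.txt", "w") as f:
        f.write("ok")

If this check fails, the ``Trainer`` will fail in the same way when writing logs or checkpoints, since both go through the same fsspec filesystem object.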