I’m new to pytorch-lightning, and this problem has tortured me for a long time without me finding a solution or a similar issue via Google:
When I don’t specify the `logger` argument of `Trainer` but do specify `default_root_dir`, pl behaves well, creating a new `version_XXX` directory for each new experiment. But ever since I specified a logger (for example, via the CLI yaml config file shown below):
```yaml
- class_path: CSVLogger
```
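For context, that line sits inside a larger LightningCLI config; a minimal sketch (the `save_dir` value and the surrounding keys are my assumption, not copied from my actual config) looks like:

```yaml
trainer:
  logger:
    - class_path: CSVLogger
      init_args:
        save_dir: logs/
```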
Every time I run a new experiment, pl continues logging into the previous `version_XXX` directory, and it CAN’T run unless I pass
I expected pl to create a new `version_XXX` directory when starting a new experiment, but it fails to do so. Does anyone have an idea of what’s wrong with my code and settings?
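For reference, the auto-versioning behavior I expected works roughly like this (a simplified sketch of the scheme with a hypothetical `next_version` helper, not Lightning’s actual implementation): scan the log directory for existing `version_N` folders and use the next free integer.

```python
import os
import re
import tempfile


def next_version(root_dir: str) -> int:
    """Return the next free version number under root_dir (version_0, version_1, ...)."""
    if not os.path.isdir(root_dir):
        return 0
    versions = []
    for entry in os.listdir(root_dir):
        match = re.fullmatch(r"version_(\d+)", entry)
        if match:
            versions.append(int(match.group(1)))
    return max(versions) + 1 if versions else 0


# Example: with version_0 and version_1 already present, the next run gets version_2.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "version_0"))
os.makedirs(os.path.join(root, "version_1"))
print(next_version(root))  # 2
```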
This was fixed recently in https://github.com/Lightning-AI/lightning/pull/17139
If you upgrade Lightning to the latest version (2.0.5), this should be resolved.
Alternatively, you can use the `TensorBoardLogger`.
Let me know if that resolves the issue.
Many thanks! Upgrading solves this issue! But I still have a little confusion about `trainer.log_dir`: it seems the experiment and version names are taken into consideration only if the first logger is a `TensorBoardLogger` (as shown by the following code). This confused me a little, since I used a `CSVLogger` as the first logger of my trainer and found the `config.yaml` file was saved directly into the logger’s `save_dir`.
```python
@property
def log_dir(self) -> Optional[str]:
    """The directory for the current experiment. Use this to save images to, etc...

    .. code-block:: python

        def training_step(self, batch, batch_idx):
            img = ...
            save_img(img, self.trainer.log_dir)
    """
    if len(self.loggers) > 0:
        if not isinstance(self.loggers[0], TensorBoardLogger):
            dirpath = self.loggers[0].save_dir
        else:
            dirpath = self.loggers[0].log_dir
    else:
        dirpath = self.default_root_dir

    dirpath = self.strategy.broadcast(dirpath)
    return dirpath
```
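To make the branching concrete, here is a minimal stand-in sketch (the stub classes and the `resolve_log_dir` helper are my own, not Lightning’s API): only a TensorBoard-style logger in first position contributes its versioned `log_dir`; any other first logger resolves to its bare `save_dir`.

```python
import os
from typing import Optional


class StubTensorBoardLogger:
    """Stand-in for TensorBoardLogger: its log_dir includes the name/version subpath."""

    def __init__(self, save_dir: str, name: str = "lightning_logs", version: int = 0):
        self.save_dir = save_dir
        self.log_dir = os.path.join(save_dir, name, f"version_{version}")


class StubCSVLogger:
    """Stand-in for CSVLogger: only its save_dir is consulted below."""

    def __init__(self, save_dir: str):
        self.save_dir = save_dir


def resolve_log_dir(loggers, default_root_dir: str = ".") -> Optional[str]:
    # Mirrors the property's branching: a non-TensorBoard first logger
    # yields save_dir, a TensorBoard first logger yields its versioned log_dir.
    if len(loggers) > 0:
        first = loggers[0]
        if not isinstance(first, StubTensorBoardLogger):
            return first.save_dir
        return first.log_dir
    return default_root_dir


print(resolve_log_dir([StubCSVLogger("logs")]))          # logs
print(resolve_log_dir([StubTensorBoardLogger("logs")]))  # logs/lightning_logs/version_0
print(resolve_log_dir([]))                               # .
```

This reproduces the surprise above: with `CSVLogger` first, the resolved directory has no experiment name or version component, which is why `config.yaml` lands directly in `save_dir`.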