I’m new to pytorch-lightning, and this problem has bothered me for a long time; I couldn’t find a solution or a similar report by searching Google.
When I don’t specify the logger argument of Trainer but do specify default_root_dir, pl behaves well and creates a new version_XXX directory for each new experiment. But once I specify a logger (for example, via the CLI YAML config file shown below):
logger:
  - class_path: CSVLogger
    init_args:
      save_dir: 'runs/lightningdemo/fashionmnist_1/'
Every time I run a new experiment, pl keeps logging into the previous version_XXX directory, and it CAN’T run unless I pass save_config_callback=None or save_config_kwargs={"overwrite": True}.
I expected pl to create a new version_XXX directory for each new experiment, but it fails to do so. Does anyone have an idea what’s wrong with my code or settings?
Hey @ChristLBUPT
This was fixed recently in https://github.com/Lightning-AI/lightning/pull/17139
If you upgrade Lightning to the latest version (2.0.5), this should be resolved.
Alternatively, you can also use the TensorBoardLogger.
Let me know if that resolves the issue.
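For readers unfamiliar with the versioning behaviour being discussed, here is a minimal sketch of how a "pick the next version_XXX directory" rule can work. It is an illustration under assumptions, not Lightning's actual implementation: `next_version` and `demo` are hypothetical helpers, and the rule (highest existing number plus one) is assumed.

```python
import os
import tempfile


def next_version(root_dir: str) -> int:
    """Return one more than the highest existing version_<n> directory, or 0.

    Hypothetical helper: assumes runs live in sub-directories named version_<n>.
    """
    if not os.path.isdir(root_dir):
        return 0
    versions = []
    for entry in os.listdir(root_dir):
        path = os.path.join(root_dir, entry)
        if os.path.isdir(path) and entry.startswith("version_"):
            suffix = entry[len("version_"):]
            if suffix.isdigit():
                versions.append(int(suffix))
    return max(versions) + 1 if versions else 0


def demo() -> list:
    """Simulate three consecutive runs and record the version each one picks."""
    with tempfile.TemporaryDirectory() as root:
        picked = []
        for _ in range(3):
            version = next_version(root)
            os.makedirs(os.path.join(root, f"version_{version}"))
            picked.append(version)
        return picked
```

Under this rule each run gets a fresh directory; the symptom described in the question corresponds to the version being resolved to an existing directory instead.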
Many thanks! Upgrading solved the issue! But I’m still a little confused:
When calling trainer.log_dir, it seems that the experiment and version names are only taken into account if the first logger is a TensorBoardLogger (as shown by the following code). This confused me a little, since I used CSVLogger as the first logger of my trainer and found that the config.yaml file was saved directly into the logger’s save_dir.
def log_dir(self) -> Optional[str]:
    """The directory for the current experiment. Use this to save images to, etc...

    .. code-block:: python

        def training_step(self, batch, batch_idx):
            img = ...
            save_img(img, self.trainer.log_dir)
    """
    if len(self.loggers) > 0:
        if not isinstance(self.loggers[0], TensorBoardLogger):
            dirpath = self.loggers[0].save_dir
        else:
            dirpath = self.loggers[0].log_dir
    else:
        dirpath = self.default_root_dir

    dirpath = self.strategy.broadcast(dirpath)
    return dirpath
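To make the branching in the quoted property easier to see, here is a small self-contained sketch that mirrors it without importing Lightning. The classes `FakeLogger`, `FakeCSVLogger`, and `FakeTensorBoardLogger` and the function `resolve_log_dir` are hypothetical stand-ins, the `<save_dir>/<name>/version_<n>` layout is an assumption, and the strategy broadcast step is left out.

```python
import os
from typing import List


class FakeLogger:
    """Hypothetical stand-in for a logger; not a Lightning class."""

    def __init__(self, save_dir: str, name: str = "lightning_logs", version: int = 0):
        self.save_dir = save_dir
        self.name = name
        self.version = version

    @property
    def log_dir(self) -> str:
        # Assumed versioned layout: <save_dir>/<name>/version_<n>
        return os.path.join(self.save_dir, self.name, f"version_{self.version}")


class FakeCSVLogger(FakeLogger):
    """Stand-in for a non-TensorBoard logger."""


class FakeTensorBoardLogger(FakeLogger):
    """Stand-in for TensorBoardLogger."""


def resolve_log_dir(loggers: List[FakeLogger], default_root_dir: str) -> str:
    """Mirror the quoted branching: only a TensorBoard first logger yields log_dir."""
    if len(loggers) > 0:
        if not isinstance(loggers[0], FakeTensorBoardLogger):
            return loggers[0].save_dir
        return loggers[0].log_dir
    return default_root_dir


csv_first = resolve_log_dir([FakeCSVLogger("runs/demo")], "root")
tb_first = resolve_log_dir([FakeTensorBoardLogger("runs/demo")], "root")
no_logger = resolve_log_dir([], "root")
```

With a CSV-style logger first, the result is the bare save_dir, which matches the observation that config.yaml landed directly in save_dir. Reading the first logger's own log_dir (for example trainer.loggers[0].log_dir) may be a way to get the versioned path, assuming the logger in your installed version exposes that attribute.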