I’m new to pytorch-lightning, and this problem has tortured me for a long time without me finding a solution or a similar issue via Google:
When I don’t specify the `logger` argument of `Trainer` but do specify `default_root_dir`, pl behaves well, creating a new `version_XXX` directory for each new experiment. But ever since I specified a logger (for example, via the CLI yaml config file shown below):
```yaml
- class_path: CSVLogger
```
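For context, that line sits inside a larger LightningCLI config; a minimal sketch (the `save_dir` value and the surrounding keys are my assumption, not copied from my actual config) looks like:

```yaml
trainer:
  logger:
    - class_path: CSVLogger
      init_args:
        save_dir: logs/
```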
Every time I run a new experiment, pl continues logging into the previous `version_XXX` directory, and it CAN’T run unless I pass
I expected pl to create a new `version_XXX` directory when starting a new experiment, but it fails to do so. Does anyone have an idea of what’s wrong with my code and settings?
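For reference, the auto-versioning behavior I expected works roughly like this (a simplified sketch of the scheme with a hypothetical `next_version` helper, not Lightning’s actual implementation): scan the log directory for existing `version_N` folders and use the next free integer.

```python
import os
import re
import tempfile


def next_version(root_dir: str) -> int:
    """Return the next free version number under root_dir (version_0, version_1, ...)."""
    if not os.path.isdir(root_dir):
        return 0
    versions = []
    for entry in os.listdir(root_dir):
        match = re.fullmatch(r"version_(\d+)", entry)
        if match:
            versions.append(int(match.group(1)))
    return max(versions) + 1 if versions else 0


# Example: with version_0 and version_1 already present, the next run gets version_2.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "version_0"))
os.makedirs(os.path.join(root, "version_1"))
print(next_version(root))  # 2
```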
This was fixed recently in https://github.com/Lightning-AI/lightning/pull/17139
If you upgrade Lightning to the latest version (2.0.5), this should be resolved.
Alternatively, you can use the `TensorBoardLogger`.
Let me know if that resolves the issue.
Many thanks! Upgrading solves this issue! But I still have a little confusion about `trainer.log_dir`: it seems the experiment and version names are taken into consideration only if the first logger is a `TensorBoardLogger` (as shown by the following code). This confused me a little, since I used a `CSVLogger` as the first logger of my trainer and found the `config.yaml` file was saved directly into the logger’s `save_dir`.
```python
@property
def log_dir(self) -> Optional[str]:
    """The directory for the current experiment. Use this to save images to, etc...

    .. code-block:: python

        def training_step(self, batch, batch_idx):
            img = ...
            save_img(img, self.trainer.log_dir)
    """
    if len(self.loggers) > 0:
        if not isinstance(self.loggers[0], TensorBoardLogger):
            dirpath = self.loggers[0].save_dir
        else:
            dirpath = self.loggers[0].log_dir
    else:
        dirpath = self.default_root_dir

    dirpath = self.strategy.broadcast(dirpath)
    return dirpath
```
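To make the branching concrete, here is a minimal stand-in sketch (the stub classes and the `resolve_log_dir` helper are my own, not Lightning’s API): only a TensorBoard-style logger in first position contributes its versioned `log_dir`; any other first logger resolves to its bare `save_dir`.

```python
import os
from typing import Optional


class StubTensorBoardLogger:
    """Stand-in for TensorBoardLogger: its log_dir includes the name/version subpath."""

    def __init__(self, save_dir: str, name: str = "lightning_logs", version: int = 0):
        self.save_dir = save_dir
        self.log_dir = os.path.join(save_dir, name, f"version_{version}")


class StubCSVLogger:
    """Stand-in for CSVLogger: only its save_dir is consulted below."""

    def __init__(self, save_dir: str):
        self.save_dir = save_dir


def resolve_log_dir(loggers, default_root_dir: str = ".") -> Optional[str]:
    # Mirrors the property's branching: a non-TensorBoard first logger
    # yields save_dir, a TensorBoard first logger yields its versioned log_dir.
    if len(loggers) > 0:
        first = loggers[0]
        if not isinstance(first, StubTensorBoardLogger):
            return first.save_dir
        return first.log_dir
    return default_root_dir


print(resolve_log_dir([StubCSVLogger("logs")]))          # logs
print(resolve_log_dir([StubTensorBoardLogger("logs")]))  # logs/lightning_logs/version_0
print(resolve_log_dir([]))                               # .
```

This reproduces the surprise above: with `CSVLogger` first, the resolved directory has no experiment name or version component, which is why `config.yaml` lands directly in `save_dir`.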