What does this _TunerExitException error mean?

ManieTadayon · September 20, 2023, 5:11pm

I have been using PyTorch Lightning successfully to set up a trainer and run training and validation. Now I have moved the class that defines my neural network into a notebook and set up the trainer for a single node (1 GPU, strategy is ‘auto’, etc.), but I get this error when running training and the learning-rate finder:

Can someone explain what this error means? Why would I get it all of a sudden? Which parameters can I change to get rid of it?
Thanks

awaelchli · September 20, 2023, 5:30pm

This exception should only be raised internally to stop the tuner. The fact that you saw it probably means it is a bug. Do you have a code example that reproduces this? I can’t say much at this point except that it shouldn’t happen.

ManieTadayon · September 20, 2023, 5:37pm

May I know which code you are interested in? Is this related to where I am running my code (inside the notebook or an IDE)? This is the code for the trainer part:

def create_trainer(
    self,
    epochs,
    log_name=None,
    log_model=True,
    loss_monitor='val_loss',
    min_delta=0.001,
    grad_clip=0.001,
    num_nodes=1,
    strategy: str = 'auto',
    devices='auto',
    enable_checkpointing=True,
    default_root_dir=None,
):
    '''
    Define the trainer and configure the logger, early stopping, and checkpointing.
    '''
    # Pick the logger backend; `path` is not defined in this snippet.
    if self.logger_device == 'TensorBoard':
        logger = TensorBoardLogger(path, version=1, name=log_name)
    else:
        logger = MLFlowLogger(experiment_name=log_name, log_model=log_model)

    # Stop once the monitored loss has not improved by min_delta for 2 checks.
    early_stop_callback = EarlyStopping(monitor=loss_monitor, min_delta=min_delta, patience=2, verbose=True, mode="min")

    device = 'gpu' if torch.cuda.is_available() else 'cpu'

    pl.seed_everything(22)
    self.trainer = pl.Trainer(
        max_epochs=epochs,
        accelerator=device,
        enable_model_summary=True,
        gradient_clip_val=grad_clip,
        callbacks=[early_stop_callback],
        logger=logger,
        num_nodes=num_nodes,
        strategy=strategy,
        devices=devices,
        enable_checkpointing=enable_checkpointing,
        default_root_dir=default_root_dir,
    )

Please let me know which other code you want to see. This is the lr_finder code:

def lr_optimizer(self, min_lr: float = 1e-5, max_lr: float = 1, stop_threshold: float = None):
    '''
    Automatically find the learning rate.
    '''
    # Run the learning-rate range test on the already-configured trainer.
    self.res = Tuner(self.trainer).lr_find(
        self.net,
        train_dataloaders=self.train_dataloader,
        val_dataloaders=self.val_dataloader,
        max_lr=max_lr,
        min_lr=min_lr,
        early_stop_threshold=stop_threshold,
    )
    # Store the suggestion once and reuse it for the message.
    self.learning_rate = self.res.suggestion()
    print(f"suggested learning rate: {self.learning_rate}")
    fig = self.res.plot(show=True, suggest=True)
    fig.show()

ManieTadayon · September 20, 2023, 6:31pm

May I know what you would check in the code? What could possibly cause this? Is it the type of data, the structure of the model, etc.?

awaelchli · September 20, 2023, 7:27pm

I’m not sure how this can happen. Your settings look good. Which version of Lightning are you using?

ManieTadayon · September 20, 2023, 9:12pm

It is the latest version, 2.0. This code was working in VSCode (with the class written there and tested from a notebook). The error appears when I move the whole class and its instantiation into the notebook, and I am wondering whether this triggers anything.

shirondrusinsky · March 6, 2024, 9:35pm

I also see this error in version 2.1.0. It appears that the lr_find method finishes, since I see a learning rate value selected in the message at the top of the image, but the exception is still raised for unclear reasons.

anhtuan23 · October 11, 2024, 4:52am

I got this error when mixing import pytorch_lightning with import lightning.pytorch.
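
A plausible mechanism, sketched here without Lightning itself: if the tuner and the trainer come from two different packages, each package has its own `_TunerExitException` class, and an `except` clause written against one class does not catch the other even though the names match. The `make_package`, `fit`, and `tuner_callback` names below are hypothetical stand-ins, not Lightning APIs, and the real internals may differ.

```python
def make_package(module_name):
    """Build a hypothetical stand-in for one Lightning distribution."""
    # Each "package" defines its own class, even though the name is identical.
    exit_exc = type("_TunerExitException", (Exception,), {"__module__": module_name})

    def fit(callback):
        # Trainer side: swallow only *this* package's exit exception.
        try:
            callback()
        except exit_exc:
            return "tuner stopped cleanly"
        return "fit finished"

    def tuner_callback():
        # Tuner side: signal the trainer to stop once the search is done.
        raise exit_exc()

    return exit_exc, fit, tuner_callback


exc_a, fit_a, tuner_a = make_package("pytorch_lightning")
exc_b, fit_b, tuner_b = make_package("lightning.pytorch")

# Tuner and trainer from the same package: the exception is handled internally.
same_package = fit_b(tuner_b)

# Tuner from one package, trainer from the other: the exception escapes.
try:
    fit_b(tuner_a)
    mixed = "caught"
except Exception as err:
    mixed = f"escaped: {type(err).__module__}.{type(err).__name__}"

print(same_package)
print(mixed)
```

This would also be consistent with the report above that a learning rate is selected before the exception surfaces: the search itself completes, and only the stop signal goes unhandled.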

lgdhyungonkim · December 23, 2024, 1:52am

Thank you, @anhtuan23.

When importing Tuner, write ‘from lightning.pytorch.tuner import Tuner’ instead of ‘from pytorch_lightning.tuner.tuning import Tuner’.
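
To check whether a notebook session has ended up with both namespaces loaded, a small diagnostic like the following may help. It only inspects `sys.modules`; the helper name `loaded_lightning_namespaces` is made up for this sketch, and finding both entries is a hint to review the imports, not proof of the cause.

```python
import sys


def loaded_lightning_namespaces(modules=None):
    """Return which of the two Lightning namespaces are currently imported."""
    modules = sys.modules if modules is None else modules
    candidates = ("lightning.pytorch", "pytorch_lightning")
    return [name for name in candidates if name in modules]


found = loaded_lightning_namespaces()
if len(found) > 1:
    print(f"Both namespaces are loaded: {found}. Consider importing from only one.")
else:
    print(f"Loaded Lightning namespaces: {found}")
```

Running this in the same kernel after the failing cell shows whether some earlier cell or helper module pulled in the other namespace.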
