What does this _TunerExitException error mean?

I have been using PyTorch Lightning successfully to set up a trainer and run training and validation. I have now moved the class that defines my neural network into a notebook and set up the trainer for a single node (1 GPU, strategy is ‘auto’, etc.), but I get this error when running training and the learning rate finder:

Can someone explain what this error means? Why would I get it all of a sudden? Which parameters can I change to get rid of it?
Thanks

This exception should only be raised internally to stop the tuner. The fact that you saw it probably means there is a bug. Do you have a code example that reproduces this? I can’t say much at this point except that it shouldn’t happen.

May I know which code you are interested in? Is this related to where I am running my code (for example, inside a notebook versus an IDE)? This is the code for the trainer part:

def create_trainer(self, epochs, log_name=None, log_model=True, loss_monitor='val_loss',
                   min_delta=0.001, grad_clip=0.001, num_nodes=1, strategy: str = 'auto',
                   devices='auto', enable_checkpointing=True, default_root_dir=None):
    '''
    Define the trainer and configure the logger, early stopping,
    gradient clipping, and checkpointing from the keyword arguments.
    '''
    if self.logger_device == 'TensorBoard':
        # Note: `path` is not defined anywhere in this snippet.
        logger = TensorBoardLogger(path, version=1, name=log_name)
    else:
        logger = MLFlowLogger(experiment_name=log_name, log_model=log_model)

    # Stop training once the monitored loss stops improving by at least min_delta.
    early_stop_callback = EarlyStopping(monitor=loss_monitor, min_delta=min_delta,
                                        patience=2, verbose=True, mode="min")

    device = 'gpu' if torch.cuda.is_available() else 'cpu'

    pl.seed_everything(22)
    self.trainer = pl.Trainer(
        max_epochs=epochs,
        accelerator=device,
        enable_model_summary=True,
        gradient_clip_val=grad_clip,
        callbacks=[early_stop_callback],
        logger=logger,
        num_nodes=num_nodes,
        strategy=strategy,
        devices=devices,
        enable_checkpointing=enable_checkpointing,
        default_root_dir=default_root_dir,
    )

Please let me know which code you want. This is the lr_finder code:

def lr_optimizer(self, min_lr: float = 1e-5, max_lr: float = 1, stop_threshold: float = None):
    '''
    Automatically find a learning rate with the tuner's learning rate finder.
    '''
    self.res = Tuner(self.trainer).lr_find(
        self.net,
        train_dataloaders=self.train_dataloader,
        val_dataloaders=self.val_dataloader,
        max_lr=max_lr,
        min_lr=min_lr,
        early_stop_threshold=stop_threshold,
    )
    # Store the suggestion once and reuse it for the message.
    self.learning_rate = self.res.suggestion()
    print(f"suggested learning rate: {self.learning_rate}")
    fig = self.res.plot(show=True, suggest=True)
    fig.show()

May I know what you are checking in the code? What could possibly cause this? Is it the type of data, the structure of the model, or something else?

I’m not sure how this can happen. Your settings look good. Which version of Lightning are you using?

It is the latest version, 2.0. This code was working in VSCode (with the class written there and tested from a notebook). The error appears when I move the whole class and its instantiation into the notebook, and I am wondering whether that triggers anything.

I also see this error in version 2.1.0. It appears that the lr_find method finishes, since I see a selected learning rate in the message at the top of the image, but the exception is still raised for unclear reasons.

I got this error when mixing import pytorch_lightning with import lightning.pytorch
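
A possible explanation for why mixing the two imports lets the exception escape, sketched without Lightning installed: if each package ships its own copy of the exception class, an `except` clause written against one copy will not catch an instance of the other, even though both have the same name. The stand-in modules and the `fit` / `lr_find_step` functions below are hypothetical and only mimic that situation; they are not Lightning's actual internals.

```python
import types


def make_package(name):
    """Build a stand-in module that owns its own _TunerExitException class."""
    mod = types.ModuleType(name)

    class _TunerExitException(Exception):
        """Internal signal used to stop a tuning run early."""

    def lr_find_step():
        # Hypothetical tuner step: signals "stop" by raising this package's class.
        raise _TunerExitException()

    def fit(step):
        # Hypothetical trainer loop: only catches this package's own class.
        try:
            step()
        except _TunerExitException:
            return "stopped cleanly"
        return "finished"

    mod._TunerExitException = _TunerExitException
    mod.lr_find_step = lr_find_step
    mod.fit = fit
    return mod


# Two independent copies, standing in for the two import paths.
pkg_a = make_package("pytorch_lightning_standin")
pkg_b = make_package("lightning_pytorch_standin")

# Trainer and tuner from the same package: the signal is caught internally.
same_package = pkg_a.fit(pkg_a.lr_find_step)

# Trainer from one package, tuner from the other: the signal escapes to the user.
try:
    mixed = pkg_a.fit(pkg_b.lr_find_step)
except Exception as exc:
    mixed = f"escaped: {type(exc).__name__}"

print(same_package)
print(mixed)
```

In this sketch the first call is handled quietly, while the second surfaces an exception with the same class name, which matches the symptom described in this thread: the learning rate search appears to finish, yet the exception is still shown.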


Thank you. @anhtuan23

When importing Tuner,
write ‘from lightning.pytorch.tuner import Tuner’
instead of ‘from pytorch_lightning.tuner.tuning import Tuner’, so that every import comes from the same package.
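
When a class is pasted into a notebook, it can be easy to miss that some cells import from one package and some from the other. A minimal sketch of a check for that, using only the standard library: the helper name `lightning_import_roots` is made up for this example, and it assumes the cell source is plain Python (notebook magics such as `%matplotlib` would need to be stripped first).

```python
import ast


def lightning_import_roots(source: str) -> set[str]:
    """Return which Lightning package roots a piece of source code imports from."""
    roots = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            if name == "pytorch_lightning" or name.startswith("pytorch_lightning."):
                roots.add("pytorch_lightning")
            elif name == "lightning" or name.startswith("lightning."):
                roots.add("lightning")
    return roots


# Example cell that mixes both import styles.
cell = """
import pytorch_lightning as pl
from lightning.pytorch.tuner import Tuner
"""

roots = lightning_import_roots(cell)
mixed = len(roots) > 1
print(sorted(roots), mixed)
```

If `mixed` is true, switching every import to a single root (for example all `lightning.pytorch`) is the change suggested in this thread.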