I have seen that others have posted a similar question, but there were no answers.
I am getting the error message:
pytorch_lightning.utilities.exceptions.MisconfigurationException: No training_step() method defined. Lightning Trainer expects as minimum a training_step(), train_dataloader() and configure_optimizers() to be defined.
but all of the methods it lists look implemented to me. Unfortunately, I cannot share a minimal working example due to the way the system is implemented, but I will try to describe what I did.
The LightningModule is defined as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.optim.lr_scheduler import ReduceLROnPlateau

class VGG(pl.LightningModule):
    def __init__(self, config, in_channels=1, n_classes=118, logger=None):
        super().__init__()
        self.config = config
        [...]
        self.conv_layers = self._create_conv_layes(spec)
        self.output = nn.Sequential(nn.Linear(512 * 7 * 7, 4096), nn.ReLU(), nn.Dropout(p=0.5),
                                    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(p=0.5),
                                    nn.Linear(4096, n_classes))

    def forward(self, x):
        x = self.conv_layers(x)
        x = x.reshape(x.shape[0], -1)
        x = self.output(x)
        return x

    def configure_optimizers(self):
        optimizer = torch.optim.SGD(self.model.parameters(), lr=self.config["learning_rate"],
                                    momentum=self.config["momentum"])
        lr_scheduler = ReduceLROnPlateau(optimizer, 'min', factor=0.05, patience=5, cooldown=0, verbose=True)
        return optimizer, lr_scheduler

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self.forward(x)
        loss = F.cross_entropy(y_hat, y)
        return {'loss': loss}

    [...]
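For comparison, this is my understanding of the minimum shape the Trainer checks for, written as a self-contained toy sketch (made-up layer sizes and a random-tensor dataset, not my actual code):

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class ToyModule(pl.LightningModule):
    # minimal module defining the hooks named in the error message
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(16, 4)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        return {'loss': loss}

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def train_dataloader(self):
        # random data only to keep the sketch self-contained; in my real
        # setup the train dataloader comes from a DataModule instead
        ds = TensorDataset(torch.randn(64, 16), torch.randint(0, 4, (64,)))
        return DataLoader(ds, batch_size=8)

My VGG class follows the same pattern for training_step and configure_optimizers; the training dataloader is provided by the DataModule shown below rather than by the module itself.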
I then use a Trainer and a DataModule:
trainer = pl.Trainer(gpus=1, max_epochs=config["n_epochs"], progress_bar_refresh_rate=5)
datamodule = MyDataModule(config, logger=log)
trainer.fit(model, datamodule)
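I cannot share MyDataModule either; it is our own LightningDataModule and the real dataset is internal, so the snippet below is only an illustrative sketch of its rough shape (placeholder dataset and hypothetical names, not the real implementation):

import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class SketchDataModule(pl.LightningDataModule):
    # hypothetical stand-in for MyDataModule, just to show the shape
    def __init__(self, config, logger=None):
        super().__init__()
        self.config = config

    def train_dataloader(self):
        # placeholder tensors matching the model's 1-channel input and 118 classes
        ds = TensorDataset(torch.randn(64, 1, 224, 224), torch.randint(0, 118, (64,)))
        return DataLoader(ds, batch_size=8)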
I know the code above is likely not enough, but maybe it is enough for you to suggest what I should be looking at to solve the problem.
I have seen that the error is caused by the check in the method is_overridden() in model_helpers.py:
def is_overridden(method_name: str, model: Union[LightningModule, LightningDataModule]) -> bool:
    # if you pass DataModule instead of None or a LightningModule, we use LightningDataModule as super
    # TODO - refector this function to accept model_name, instance, parent so it makes more sense
    super_object = LightningModule if not isinstance(model, LightningDataModule) else LightningDataModule

    if not hasattr(model, method_name) or not hasattr(super_object, method_name):
        # in case of calling deprecated method
        return False

    instance_attr = getattr(model, method_name)
    if not instance_attr:
        return False
    super_attr = getattr(super_object, method_name)

    # when code pointers are different, it was implemented
    if hasattr(instance_attr, 'patch_loader_code'):
        # cannot pickle __code__ so cannot verify if PatchDataloader
        # exists which shows dataloader methods have been overwritten.
        # so, we hack it by using the string representation
        is_overridden = instance_attr.patch_loader_code != str(super_attr.__code__)
    else:
        is_overridden = instance_attr.__code__ is not super_attr.__code__
    print(f'last is_overriden should be True, but it is {is_overridden}')
    return is_overridden
The last print is one I added for debugging; it fires and reports False, so, basically, Lightning thinks I did not override the method. Why?
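In case it helps narrow things down, my reading of the failing branch is essentially the comparison below, restated as a standalone sketch (this is just my understanding, and it assumes the VGG class above can be imported; the import path is hypothetical):

import pytorch_lightning as pl
# from my_project.models import VGG  # hypothetical import path for the class shown above

# training_step should have no patch_loader_code attribute, so the check should
# fall through to comparing the code objects of the subclass and the base class.
same_code = VGG.training_step.__code__ is pl.LightningModule.training_step.__code__
print('training_step overridden:', not same_code)  # I would expect True here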