I have seen that others have posted a similar question, but there were no answers.
I am getting the error message:
pytorch_lightning.utilities.exceptions.MisconfigurationException: No training_step() method defined. Lightning Trainer expects as minimum a training_step(), train_dataloader() and configure_optimizers() to be defined.
but all of the methods it lists look implemented to me. Unfortunately, I cannot share a minimal working example due to the way the system is implemented, but I will try to describe what I did.
The LightningModule is defined as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.optim.lr_scheduler import ReduceLROnPlateau

class VGG(pl.LightningModule):
    def __init__(self, config, in_channels=1, n_classes=118, logger=None):
        super().__init__()
        self.config = config
        [...]
        self.conv_layers = self._create_conv_layes(spec)
        self.output = nn.Sequential(nn.Linear(512 * 7 * 7, 4096), nn.ReLU(), nn.Dropout(p=0.5),
                                    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(p=0.5),
                                    nn.Linear(4096, n_classes))

    def forward(self, x):
        x = self.conv_layers(x)
        x = x.reshape(x.shape[0], -1)
        x = self.output(x)
        return x

    def configure_optimizers(self):
        optimizer = torch.optim.SGD(self.model.parameters(), lr=self.config["learning_rate"],
                                    momentum=self.config["momentum"])
        lr_scheduler = ReduceLROnPlateau(optimizer, 'min', factor=0.05, patience=5, cooldown=0, verbose=True)
        return optimizer, lr_scheduler

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self.forward(x)
        loss = F.cross_entropy(y_hat, y)
        return {'loss': loss}

    [...]
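For comparison, this is my understanding of the minimum shape the Trainer checks for, written as a self-contained toy sketch (made-up layer sizes and a random-tensor dataset, not my actual code):

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class ToyModule(pl.LightningModule):
    # minimal module defining the hooks named in the error message
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(16, 4)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        return {'loss': loss}

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def train_dataloader(self):
        # random data only to keep the sketch self-contained; in my real
        # setup the train dataloader comes from a DataModule instead
        ds = TensorDataset(torch.randn(64, 16), torch.randint(0, 4, (64,)))
        return DataLoader(ds, batch_size=8)

My VGG class follows the same pattern for training_step and configure_optimizers; the training dataloader is provided by the DataModule shown below rather than by the module itself.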
I then use a Trainer and a DataModule:
trainer = pl.Trainer(gpus=1, max_epochs=config["n_epochs"], progress_bar_refresh_rate=5)
datamodule = MyDataModule(config, logger=log)
trainer.fit(model, datamodule)
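I cannot share MyDataModule either; it is our own LightningDataModule and the real dataset is internal, so the snippet below is only an illustrative sketch of its rough shape (placeholder dataset and hypothetical names, not the real implementation):

import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class SketchDataModule(pl.LightningDataModule):
    # hypothetical stand-in for MyDataModule, just to show the shape
    def __init__(self, config, logger=None):
        super().__init__()
        self.config = config

    def train_dataloader(self):
        # placeholder tensors matching the model's 1-channel input and 118 classes
        ds = TensorDataset(torch.randn(64, 1, 224, 224), torch.randint(0, 118, (64,)))
        return DataLoader(ds, batch_size=8)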
I know the code above is likely not enough, but maybe it is enough for you to suggest what I should be looking at to solve the problem.
I have seen that the error is caused by the check in the method is_overridden() in model_helpers.py:
def is_overridden(method_name: str, model: Union[LightningModule, LightningDataModule]) -> bool:
    # if you pass DataModule instead of None or a LightningModule, we use LightningDataModule as super
    # TODO - refector this function to accept model_name, instance, parent so it makes more sense
    super_object = LightningModule if not isinstance(model, LightningDataModule) else LightningDataModule

    if not hasattr(model, method_name) or not hasattr(super_object, method_name):
        # in case of calling deprecated method
        return False

    instance_attr = getattr(model, method_name)
    if not instance_attr:
        return False
    super_attr = getattr(super_object, method_name)

    # when code pointers are different, it was implemented
    if hasattr(instance_attr, 'patch_loader_code'):
        # cannot pickle __code__ so cannot verify if PatchDataloader
        # exists which shows dataloader methods have been overwritten.
        # so, we hack it by using the string representation
        is_overridden = instance_attr.patch_loader_code != str(super_attr.__code__)
    else:
        is_overridden = instance_attr.__code__ is not super_attr.__code__
    print(f'last is_overriden should be True, but it is {is_overridden}')
    return is_overridden
The last print is one I added for debugging; it fires and reports False, so, basically, Lightning thinks I did not override the method. Why?
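In case it helps narrow things down, my reading of the failing branch is essentially the comparison below, restated as a standalone sketch (this is just my understanding, and it assumes the VGG class above can be imported; the import path is hypothetical):

import pytorch_lightning as pl
# from my_project.models import VGG  # hypothetical import path for the class shown above

# training_step should have no patch_loader_code attribute, so the check should
# fall through to comparing the code objects of the subclass and the base class.
same_code = VGG.training_step.__code__ is pl.LightningModule.training_step.__code__
print('training_step overridden:', not same_code)  # I would expect True here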