Converting PyTorch implementation to PyTorch Lightning for Graph Neural Networks

WKuro · October 26, 2023, 3:15pm

I have a Graph Neural Network that operates on a directed multigraph, where the Data class comes from torch_geometric. The data has the following form:

Data(x=[420, 13], edge_index=[2, 1248], edge_attr=[1248, 2, 718], y=[420], train_mask=[420], test_mask=[420], val_mask=[420])

where both nodes and edges have attributes. I tried to convert the following training script (taken from the GitHub repository of a paper) to PyTorch Lightning:

# weighted loss preparation
train_class_ratio = dataset.y[dataset.train_mask].sum().item()/dataset.y[dataset.train_mask].shape[0]
train_class_weights = torch.Tensor([train_class_ratio,1-train_class_ratio]).to(device)

# training loop
start = time.time()
for epoch in range(epochs):
    optimizer.zero_grad()
    loss = F.nll_loss(model(data)[data.train_mask], data.y[data.train_mask], weight=train_class_weights)
    loss.backward()
    optimizer.step() 

# calculate final accuracy
model.eval()
test_acc = (
    model(data).max(dim=1)[1][data.test_mask].eq(data.y[data.test_mask]).sum().item()
    / data.test_mask.sum().item()
)

From my understanding, the whole dataset is fed into the training loop for each epoch (I could be very wrong about this). And since this is graph-structured data, I do not know how to implement a proper DataLoader for this script. So far this is my attempt:

model.py

class Model(pl.LightningModule):
    # model implementation
    def forward(self, data):
        ...
    def training_step(self, batch, batch_idx):
        # batch is the full graph (a torch_geometric Data object); forward()
        # takes the whole object, and the masks are attributes of it, not of batch.x
        logits = self(batch)
        loss = F.nll_loss(logits[batch.train_mask], batch.y[batch.train_mask],
                          weight=self.params["train_class_weights"])
        self.log("train_loss", loss)
        return loss

    def test_step(self, batch, batch_idx):
        # same pattern as training_step, but restricted to the test nodes
        logits = self(batch)
        mask = batch.test_mask
        loss = F.nll_loss(logits[mask], batch.y[mask])
        acc = (logits[mask].max(dim=1)[1] == batch.y[mask]).sum().item() / mask.sum().item()
        self.log('test_loss', loss)
        self.log('test_acc', acc)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.params["lr"], weight_decay=self.params["weight_decay"])
        return optimizer

training_script.ipynb

# load full dataset, define parameters
...

model = Model(dataset, params)
trainer = pl.Trainer(max_epochs=params["epoch"])
trainer.fit(model)
trainer.test(model)

Could you help me implement the appropriate DataLoader for this case? At the moment, running the script raises:

MisconfigurationException: train_dataloader must be implemented to be used with the Lightning Trainer

aniketmaurya · October 29, 2023, 3:51pm

PyTorch Geometric provides LightningDataset, which creates a datamodule that can be used to train the LightningModule.

WKuro · October 29, 2023, 10:35pm

Can you show me how this is done for the dataset I described? The original training loop feeds the whole graph as input and selects the train/val/test parts with the boolean masks (train_mask, val_mask, test_mask). I’m not familiar with this approach, and I don’t see how to split it into three functions for the train/val/test dataloaders.