We are working on upgrading the MLflow PyTorch examples to be compatible with Lightning 2.0.
Previously, we had a trainer.global_rank check so that autologging ran only on the rank-0 GPU (to avoid multiple runs).
To be compatible with Lightning 2.0, I am upgrading the script to use LightningCLI - Script Link
When I run the script on 4 GPUs, multiple MLflow runs get created. Since the script uses LightningCLI, it no longer has access to the trainer object.
How can I invoke mlflow.pytorch.autolog on the rank-0 GPU alone (to avoid duplicate MLflow runs)?
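One possible workaround, sketched below: since the distributed launcher re-executes the script once per worker and exports rank information as environment variables before any Trainer exists, the rank can be checked from the environment instead of from trainer.global_rank. This is a minimal sketch, not a confirmed fix; the exact set of variables to check (RANK, LOCAL_RANK, NODE_RANK) is an assumption about the launcher being used, and the function names here are my own.

```python
import os


def is_global_rank_zero() -> bool:
    """Return True only on the main (global-rank-0) process.

    Assumption: the DDP launcher (Lightning's subprocess launcher or
    torchrun) exports rank info as environment variables on every
    spawned worker, so this works before a Trainer/LightningCLI exists.
    On a single-process run none of these variables are set, and the
    function returns True.
    """
    rank_vars = ("RANK", "LOCAL_RANK", "NODE_RANK")
    return all(int(os.environ.get(var, 0)) == 0 for var in rank_vars)


def setup_autolog() -> None:
    # Only the rank-0 process enables autologging and thus creates an
    # MLflow run; the other GPU workers skip it, avoiding duplicates.
    if is_global_rank_zero():
        import mlflow

        mlflow.pytorch.autolog()
```

Usage would be to call setup_autolog() at the top of the script, before instantiating LightningCLI. An alternative worth checking is Lightning's own rank_zero_only utility (lightning.pytorch.utilities.rank_zero_only), whose rank attribute is resolved the same way from the environment.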