Manual Optimization with DeepSpeed

kushalj001 · May 19, 2023, 6:16am

I previously used automatic optimization with DeepSpeed, and it worked very well. My training loop is now more complex (it involves RL), so I can no longer use automatic optimization. My script currently uses DeepSpeed via DeepSpeedStrategy, and my LightningModule does manual optimization. I initially hit errors about some parameters being on the CPU instead of CUDA, so I now move my inputs to self.device explicitly in my LightningModule (I am not sure whether Lightning still handles device placement under manual optimization). The script now runs, but it is extremely slow and GPU utilization is minimal. I cannot track down the exact problem because no error is thrown. Does Lightning support DeepSpeed with manual optimization? If not, do you have any recommendations that would require minimal code changes from my current state (for example, native torch + native DeepSpeed)?
Thank you!
