Is it possible to run part of the model in DeepSpeed/FSDP and the rest in DDP?

@andrasiani Yes, in FSDP you can wrap submodules manually and leave the rest unwrapped. Have you tried this?
https://lightning.ai/docs/pytorch/stable/advanced/model_parallel.html#manual-wrapping
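Roughly like this (a sketch following the pattern from the docs page above; the module names and sizes are made up). The `wrap()` call only takes effect inside the FSDP context that Lightning sets up around `configure_model`, so outside of it this code is a no-op:

```python
import torch
import lightning as L
from torch.distributed.fsdp.wrap import wrap


class MyModel(L.LightningModule):
    def configure_model(self):
        # Explicitly wrap only the heavy block with FSDP. Lightning's
        # FSDPStrategy provides the enable_wrap context inside this hook,
        # so wrap() actually applies FSDP here.
        self.heavy_block = wrap(
            torch.nn.Sequential(
                torch.nn.Linear(1024, 1024),
                torch.nn.ReLU(),
                torch.nn.Linear(1024, 1024),
            )
        )
        # This small head is left unwrapped; its parameters end up in the
        # root FSDP unit that wraps the whole LightningModule, so they are
        # replicated/synced more like plain data parallelism.
        self.head = torch.nn.Linear(1024, 10)
```

You'd then launch with the FSDP strategy as usual, e.g. `L.Trainer(strategy="fsdp", accelerator="gpu", devices=4)`.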

The same should be possible with the auto-wrap policy too, by only listing the layer types you want sharded (see the sketch below). And yes, as far as I know it is normal that the top-level module gets wrapped with FSDP.
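For example (a sketch; `TransformerBlock` is a placeholder for whatever layer type you want sharded):

```python
import torch
import lightning as L
from lightning.pytorch.strategies import FSDPStrategy


class TransformerBlock(torch.nn.Module):
    # Placeholder for the layer type you want FSDP to shard.
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(1024, 1024)

    def forward(self, x):
        return self.linear(x)


# Only modules of the listed classes become their own FSDP units;
# everything else falls into the root FSDP wrapper around the
# LightningModule.
strategy = FSDPStrategy(auto_wrap_policy={TransformerBlock})
trainer = L.Trainer(strategy=strategy, accelerator="gpu", devices=4)
```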

For DeepSpeed, I don't think it is possible to control this, but I haven't checked in detail.