Hello
I am trying to run inference with Llama 2, but I can't because my GPU isn't powerful enough. I have multiple computers, each with 32 GB of RAM. Is it possible to run the model in parallel across them?