What is the best practice to share a massive CPU tensor over multiple processes in pytorch-lightning DDP mode (read-only + single machine)? #8611
-
Hi everyone, I wonder what the best practice is to share a massive CPU tensor across multiple processes in pytorch-lightning DDP mode (read-only, single machine). I think `torch.Storage.from_file` might be a solution. I also tried copying the training data to /dev/shm (reference) and running DDP with 8 GPUs, but nothing changed: the memory usage with 8 GPUs is the same as before, even though (tested with a single process) loading the dataset occupies more than 1 GB of memory. Am I missing something here? Thank you.
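For context, a minimal sketch of the memory-mapped alternative I'm considering (the dataset class, file path, and sizes here are hypothetical, for illustration only): instead of loading the tensor in `__init__`, which duplicates it in every DDP process, the dataset maps a preprocessed float32 file with `torch.from_file(..., shared=True)`, so the OS page cache holds a single physical copy shared by all processes on the machine.

```python
import os
import tempfile

import torch
from torch.utils.data import Dataset


class MappedDataset(Dataset):
    """Hypothetical sketch: memory-map a preprocessed float32 file
    instead of loading it, so every DDP process on the machine shares
    one physical copy via the OS page cache."""

    def __init__(self, path, num_samples, sample_dim):
        # torch.from_file maps the file; with shared=True no private
        # per-process copy of the data is materialized.
        flat = torch.from_file(path, shared=True,
                               size=num_samples * sample_dim,
                               dtype=torch.float32)
        self.data = flat.view(num_samples, sample_dim)

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        return self.data[idx]


# One-time preprocessing step: write the data to a file.
# On Linux, a path under /dev/shm keeps the file purely in RAM.
path = os.path.join(tempfile.mkdtemp(), "features.bin")
writer = torch.from_file(path, shared=True, size=6, dtype=torch.float32)
writer.copy_(torch.arange(6, dtype=torch.float32))

ds = MappedDataset(path, num_samples=2, sample_dim=3)
print(len(ds), ds[1].tolist())  # → 2 [3.0, 4.0, 5.0]
```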
Replies: 2 comments 1 reply
-
Dear @siahuat0727, Lightning doesn't support shared tensors yet, but there is some work being done around it. Best,
-
I found that `torch.Storage.from_file` suits my needs and it can reduce the memory usage in my Lightning DDP program. For the way to create a storage file, see here.
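A minimal sketch of this pattern (the file path and element count are illustrative; `torch.UntypedStorage.from_file` is the current spelling of the storage-level API, and `torch.from_file` is the tensor-level wrapper): one process creates a file-backed storage and writes into it, and any other process on the same machine can map the same file without making a private copy.

```python
import os
import tempfile

import torch

path = os.path.join(tempfile.mkdtemp(), "storage.bin")
n = 4  # number of float32 elements; tiny here for illustration

# Create a file-backed storage. With shared=True the file is created
# if needed and all changes are written through to it.
storage = torch.UntypedStorage.from_file(path, shared=True, nbytes=n * 4)
t = torch.empty(0, dtype=torch.float32).set_(storage, 0, (n,))
t.copy_(torch.arange(n, dtype=torch.float32))

# A second mapping of the same file (as each DDP process would make)
# sees the same bytes; the data is shared, not duplicated.
same = torch.from_file(path, shared=True, size=n, dtype=torch.float32)
print(same.tolist())  # → [0.0, 1.0, 2.0, 3.0]
```

In a Lightning setup, the writing step would typically happen once before training (or guarded so only rank 0 does it), and every DDP worker would only perform the mapping step.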