Problem Description
The current implementation does not take advantage of servers with multiple GPUs. When a server has several cards, each with a relatively small amount of VRAM, running CTGAN fails with an out-of-memory error.
The trace below is from a run in which a job was triggered on a T4 GPU (common in cloud servers). The real dataset had 26 columns and 20k rows.
OutOfMemoryError: CUDA out of memory. Tried to allocate 3.46 GiB (GPU 0; 14.76 GiB total capacity; 10.49 GiB already allocated; 621.75 MiB free; 13.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Expected behavior
CTGAN should be able to use PyTorch's DataParallel module so that each batch is split across the available GPUs, which would make larger batch sizes feasible on servers with several smaller cards.
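To make the request concrete, here is a minimal sketch of the wrapping pattern, not CTGAN's actual code. The helper `wrap_for_multi_gpu` and the `toy_generator` network are hypothetical stand-ins invented for illustration; only `torch.nn.DataParallel` and the `torch.cuda` calls are real PyTorch APIs. The sketch falls back to a single GPU or the CPU when fewer than two CUDA devices are visible.

```python
import torch
from torch import nn


def wrap_for_multi_gpu(module: nn.Module) -> nn.Module:
    """Hypothetical helper: use nn.DataParallel only when 2+ GPUs are visible."""
    if torch.cuda.is_available() and torch.cuda.device_count() > 1:
        # DataParallel replicates the module on every visible GPU and
        # splits each input batch along dimension 0 across the replicas.
        return nn.DataParallel(module.cuda())
    if torch.cuda.is_available():
        return module.cuda()  # single GPU: no wrapper needed
    return module  # CPU fallback


# Toy stand-in for a generator network (assumption: NOT CTGAN's real architecture).
# The output width of 26 mirrors the 26 columns mentioned above.
toy_generator = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 26),
)
toy_generator = wrap_for_multi_gpu(toy_generator)

# Create the input on the same device as the model parameters.
device = next(toy_generator.parameters()).device
noise = torch.randn(500, 128, device=device)

# With DataParallel active, the 500-row batch is chunked across the GPUs
# and the outputs are gathered back onto the first device.
fake = toy_generator(noise)
print(tuple(fake.shape))
```

One caveat: DataParallel keeps a full copy of the model on every GPU, so it spreads the per-batch activations across cards rather than the model weights. It would help with out-of-memory errors driven by batch size, but not with a model that is itself too large for one card. The PyTorch documentation also recommends DistributedDataParallel over DataParallel for multi-GPU training, so that may be worth considering as an alternative.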