When I found myself working on multiple projects at the same time, I was given access to a workstation with 4 GPUs (GeForce 1080Ti, 11GB each). While digging into how to use these resources efficiently, I found that TensorFlow offers several options for assigning work to multiple GPUs. The sections below describe how to assign GPUs according to your intention, tasks, and models. Specifically, there are three cases: first, multiple jobs (scripts) on different GPUs; second, multiple models, one per GPU; finally, one big model distributed across GPUs.
1. Multi tasks –> Multi GPUs
When you have multiple projects (scripts) and each project needs its own GPU, you can assign one GPU to each project (script). Without declaring the list of visible devices, TensorFlow grabs every accessible GPU. To prevent this, you need to pass a configuration that names the allowed GPU to the Session. Suppose the model is already declared and training for a specified number of epochs follows.
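A minimal sketch of this, assuming TensorFlow 1.x and its Session/ConfigProto API, might look like the following; the toy model and the epoch count are stand-ins I made up for whatever the script already declares:

```python
import numpy as np
import tensorflow as tf

# Expose only GPU 0 to this script; a second script can set
# visible_device_list='1', and so on, so the jobs never compete for the same card.
# (Setting the CUDA_VISIBLE_DEVICES environment variable before TensorFlow
# touches the GPUs achieves the same effect.)
config = tf.ConfigProto()
config.gpu_options.visible_device_list = '0'
config.gpu_options.allow_growth = True  # allocate memory as needed, not all 11GB up front

# A toy graph standing in for the already-declared model.
x = tf.placeholder(tf.float32, shape=[None, 10])
w = tf.Variable(tf.random_normal([10, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(5):  # stand-in for the real number of epochs
        _, loss_val = sess.run([train_op, loss],
                               feed_dict={x: np.random.rand(32, 10)})
```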
2. Multi models –> Multi GPUs
If you need to run multiple models, one per GPU (e.g. an ensemble), each GPU can be allocated its own model (graph). Additionally, if the graphs share many common parts such as input, output, dropout, and cost, declaring each graph under its own "scope" lets us access each graph's components more conveniently during training. To show how to access components with shared names across the different graphs, I changed only the size of the nodes in the last layer of each graph and gave the components the same names.
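A sketch of this setup, again assuming TensorFlow 1.x, could look like the following; the scope names, layer sizes, and the random stand-in batch are illustrative choices of mine, not the original code:

```python
import numpy as np
import tensorflow as tf

def build_model(scope_name, gpu_id, hidden_size):
    """Build one copy of the graph on the given GPU. Component names are the
    same in every copy; only the scope prefix and the layer size differ."""
    with tf.device('/gpu:%d' % gpu_id), tf.variable_scope(scope_name):
        x = tf.placeholder(tf.float32, [None, 784], name='input')
        y = tf.placeholder(tf.float32, [None, 10], name='output')
        keep_prob = tf.placeholder(tf.float32, name='dropout')

        hidden = tf.layers.dense(x, hidden_size, activation=tf.nn.relu)
        hidden = tf.nn.dropout(hidden, keep_prob)
        logits = tf.layers.dense(hidden, 10)

        cost = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits),
            name='cost')
        train_op = tf.train.AdamOptimizer(1e-3).minimize(cost)
    return train_op

# Two ensemble members, one per GPU, differing only in the size of one layer.
train_ops = [build_model('model_0', gpu_id=0, hidden_size=256),
             build_model('model_1', gpu_id=1, hidden_size=512)]

# The shared component names let us fetch every copy's tensors by scope prefix.
graph = tf.get_default_graph()
costs = [graph.get_tensor_by_name('model_%d/cost:0' % i) for i in range(2)]

# Random stand-in batch (the post used MNIST with a batch size of 10).
batch_x = np.random.rand(10, 784)
batch_y = np.eye(10)[np.random.randint(0, 10, size=10)]

config = tf.ConfigProto(allow_soft_placement=True)  # fall back to CPU for GPU-less ops
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    # One run call drives both graphs, so the two GPUs train in parallel.
    feed = {}
    for i in range(2):
        feed[graph.get_tensor_by_name('model_%d/input:0' % i)] = batch_x
        feed[graph.get_tensor_by_name('model_%d/output:0' % i)] = batch_y
        feed[graph.get_tensor_by_name('model_%d/dropout:0' % i)] = 0.5
    _, _, cost_0, cost_1 = sess.run(train_ops + costs, feed_dict=feed)
```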
Although the runtime varies with batch size and data, training the graphs in parallel across GPUs shortened each iteration by about 1 second compared with training each graph sequentially (measured on MNIST with a batch size of 10).
3. One big model –> Distributed GPUs
Once the model you need no longer fits in a single GPU's memory, it can be divided into multiple parts, each of which is executed on a different GPU.
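A minimal sketch of this kind of model parallelism in TensorFlow 1.x could look like the following; the layer sizes and the split point are illustrative assumptions, not the original model:

```python
import numpy as np
import tensorflow as tf

# The first layers are pinned to GPU 0 and the rest to GPU 1,
# so no single card has to hold all of the parameters.
x = tf.placeholder(tf.float32, [None, 784])
labels = tf.placeholder(tf.int64, [None])

with tf.device('/gpu:0'):
    h1 = tf.layers.dense(x, 4096, activation=tf.nn.relu)
    h2 = tf.layers.dense(h1, 4096, activation=tf.nn.relu)

with tf.device('/gpu:1'):
    # h2 is copied from GPU 0 to GPU 1 automatically at this boundary.
    h3 = tf.layers.dense(h2, 4096, activation=tf.nn.relu)
    logits = tf.layers.dense(h3, 10)
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

# log_device_placement prints where each op actually runs, which helps verify the split.
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    _, loss_val = sess.run([train_op, loss],
                           feed_dict={x: np.random.rand(10, 784),
                                      labels: np.random.randint(0, 10, size=10)})
```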
Wrap up
I hope the methods described above help you improve the performance of your code by making proper use of multiple GPUs.
On the other hand, although GPUs are powerful for computation-intensive work, they can be slower than CPUs when running conditional statements. So assigning each task or op to the appropriate device can be another way to boost performance.
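As a rough illustration of that last point (again assuming TensorFlow 1.x), cheap bookkeeping and conditionals can be pinned to the CPU while the heavy matrix math stays on the GPU:

```python
import tensorflow as tf

with tf.device('/cpu:0'):
    step = tf.Variable(0, trainable=False)    # counters and conditions on the CPU
    early_phase = tf.less(step, 1000)

with tf.device('/gpu:0'):
    a = tf.random_normal([1024, 1024])
    b = tf.random_normal([1024, 1024])
    product = tf.matmul(a, b)                 # compute-heavy op on the GPU

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([early_phase, tf.reduce_sum(product)]))
```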