In this post, we explore some of the cutting-edge tools coming out of the RAPIDS group at Nvidia. This highlights yet another use case for the portability provided by the Gigantum Client: we're going to make it easy to try out an evolving code base that includes some fussy dependencies. This post revisits some skills we picked up in our previous post on Dask dashboards, so be sure to check out that post if you're interested in parallel computing!
Today, we'll add another set of dashboards to our toolkit that give us insight into our GPU utilization:
We'll use the Jupyter Lab extension from the RAPIDS group to watch GPU utilization while training a model in PyTorch. We'll also use the rapidly improving cudf and dask-cudf libraries to perform blazing-fast Pandas-style CSV parsing and DataFrame computations.
The Gigantum Quick-start Script makes it easy to get started with the Gigantum Client on most cloud providers (or your personal GPU workstation). The instructions also show how to use SSH to access the remote client securely.
In the video below, we quickly walk through starting an AWS p2.xlarge instance. Whatever provider you use, be sure to provision 40GB+ for the drive! RAPIDS installation via conda currently downloads and unpacks a LOT of library files. Moreover, we've included two sets of data in this project to support exploration of both a machine vision task and parsing and computation on large CSV data files.
If you're not familiar with launching cloud GPU instances, please consult the official documentation for any of the major cloud providers! That said, the hardest part of using EC2 (as we do in the video) is navigating the incredible variety of options in the web console - so our video may give you enough guidance to get going.
The video also includes the steps for importing the Gigantum Project in the Client. For reference, the Project we're using is here: https://gigantum.com/tinydav/gpu-dashboards
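If you want to confirm the drive size from the instance itself before importing the Project, a minimal sketch using only the Python standard library might look like the following. The path "/" is an assumption; on your machine, Docker and the Gigantum working directory may live on a different volume, so point the check at whichever drive actually holds them.

```python
import shutil

RECOMMENDED_GB = 40  # drive size suggested in this post


def drive_report(path="/"):
    """Return (total_gb, free_gb) for the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.total / 1e9, usage.free / 1e9


# "/" is an assumption; use the volume that holds your Docker data.
total_gb, free_gb = drive_report()
print(f"total: {total_gb:.1f} GB, free: {free_gb:.1f} GB")
if total_gb < RECOMMENDED_GB:
    print(f"Drive is smaller than the suggested {RECOMMENDED_GB} GB.")
```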
Basic GPU Dashboards with PyTorch
The first dataset and notebook we'll explore is transfer learning for identification of bees vs. ants. We work this example out in more detail in our PyTorch transfer learning Project. Here, we use it as a fast but non-trivial way to drive our GPU and experiment with dashboards:
As you can see in the video, we're not coming anywhere near the throughput of the card I was using! We can use this information to determine if we have the memory available to increase batch sizes, or if there is unused compute bandwidth that could be used for other needs.
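The same numbers the dashboards plot can also be pulled from the command line, which is handy for logging them next to a training run. The sketch below is independent of the Jupyter Lab extension: it assumes the `nvidia-smi` tool is on your path and parses its CSV query output. The helper names and the sample line are made up for illustration.

```python
import subprocess

QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one GPU per line."""
    stats = []
    for line in text.strip().splitlines():
        # Fields arrive as "util, used, total"; float() tolerates the spaces.
        # Cards that report "[N/A]" for a field would need extra handling.
        util, used, total = (float(field) for field in line.split(","))
        stats.append({
            "util_pct": util,
            "mem_used_mib": used,
            "mem_total_mib": total,
            "mem_free_frac": 1 - used / total,
        })
    return stats


def query_gpus():
    """Run nvidia-smi once and return parsed stats (needs an Nvidia driver)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_stats(result.stdout)


# Illustrative sample line, not a measurement from this post.
sample = "37, 2048, 11441\n"
for gpu in parse_gpu_stats(sample):
    print(gpu)
```

A large `mem_free_frac` alongside low utilization is the kind of signal that suggests there is room to try a bigger batch size.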
Parsing CSV files
CSV files are a compromise format. We accept the performance penalty of parsing the data in exchange for the ability for humans to easily read the files and correct issues that might arise. We saw in last week's post on Dask Dashboards that the bulk of processing time for our chosen tasks was in fact reading and parsing the CSV files! Happily, GPUs aren't just for machine learning (and video games) anymore: the RAPIDS group has done great work implementing fast parsers in CUDA. But first, let's quickly revisit how well we can do with parallel computing.
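To get a feel for how parsing can dominate a simple computation, here is a small standard-library sketch. It is not the benchmark from our notebooks: the data is synthetic, the column names are invented, and the timings will vary by machine. It only separates the time spent turning text into numbers from the time spent summing them.

```python
import csv
import io
import time


def make_csv(n_rows):
    """Build an in-memory CSV with an integer id and a float value column."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "value"])
    for i in range(n_rows):
        writer.writerow([i, i * 0.5])
    return buf.getvalue()


def parse_values(text):
    """Parse the CSV text and return the 'value' column as floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [float(row["value"]) for row in reader]


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


text = make_csv(100_000)
values, parse_s = timed(parse_values, text)
total, sum_s = timed(sum, values)
print(f"parse: {parse_s:.3f}s  sum: {sum_s:.3f}s")
```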
Recap: parsing CSV files in parallel with Dask
We've included the same dataset and libraries from our Dask Project in our GPU Project this week. You can see that CSV reading and parsing is happening in parallel. On my machine, execution of a simple Dask graph takes about 1.7 seconds:
You should experiment with the dashboard elements that are the most useful for you, but it's not a bad exercise to try to arrange your Jupyter Lab tabs the way I have. As you can see, you can mix and match dashboard widgets for GPU and Dask and anything else you care to include! I'm running the above on a local workstation, so the GPU is somewhat busy managing my graphics display.
Speedups with Nvidia RAPIDS cuDF
With only one GPU, we actually get a small performance penalty from layering Dask on top of cudf. Even so, the equivalent task completes in about one third of the time of the CPU-only Dask run above, and it is still dominated by CSV parsing.
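If you'd like to try a comparison like this on your own files, a minimal timing sketch might look like the following. It compares `pandas.read_csv` with `cudf.read_csv` directly rather than going through Dask, the helper names and the path are hypothetical, and `compare_readers` will only run on a machine with RAPIDS and a CUDA GPU (such as the Project environment).

```python
import time


def time_read_csv(read_csv, path):
    """Call read_csv(path) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    frame = read_csv(path)
    return frame, time.perf_counter() - start


def compare_readers(path):
    """Time pandas against cudf on one file. Needs RAPIDS and a CUDA GPU."""
    import pandas as pd
    import cudf  # imported here so the helper above works without a GPU

    # Untimed warm-up read so any one-time GPU setup is not counted below.
    cudf.read_csv(path)

    _, cpu_s = time_read_csv(pd.read_csv, path)
    _, gpu_s = time_read_csv(cudf.read_csv, path)
    return cpu_s, gpu_s


# Example call (hypothetical path):
# cpu_s, gpu_s = compare_readers("data/example.csv")
# print(f"pandas: {cpu_s:.2f}s  cudf: {gpu_s:.2f}s")
```

Because cudf follows the Pandas API closely, the two readers can be swapped by passing a different function, which keeps the comparison honest: the same call, the same file, only the backend changes.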
The whole point of this post is to motivate you to kick the tires on the GPU dashboards and exciting new core technologies from the RAPIDS group. We made this easy in our Project at https://gigantum.com/tinydav/gpu-dashboards. Check it out and let us know what you think!
Taking a step back, we've highlighted the fact that parsing your data files can dominate processing time in two successive blog posts. If you're going to work with the same files in multiple analyses, maybe you should only pay this cost once and save the files in a more performant binary format? But then you need to make sure everyone has the same libraries to read those files... Fortunately, the Gigantum Client makes it easy to get everyone on the same page with Code, Data, and Environment.