Gigantum automates tracking of your code and data with Git / Git-LFS, and reproduces your environment on different machines using Docker. Because Gigantum runs in Docker, you can use it on pretty much any machine, including Windows. However, Docker has some performance penalties on pre-WSL2 Windows, and Gigantum inherited them. (While you'll rarely see it written out, WSL2 stands for Windows Subsystem for Linux 2.)
Most importantly, compared to running on Mac or Ubuntu, Gigantum on Windows paid a performance penalty for file access. With WSL2, that is no longer true!
With WSL2, Docker can now run Containers on a real Linux kernel with minimal overhead. For Gigantum, this means that files can now live on a proper Linux filesystem, with no need for translation or "network" sharing to access files from inside a Container. So, as of today we say goodbye to one of the final pain points in Docker for Windows - reading and writing files is slow no more!
In addition, using a native Linux filesystem eliminates pain points like mismatched permission systems and gives you first-class support for things like fast notification of file changes. On top of that, Docker shares resources like RAM and CPU directly with Windows, so you no longer have to guess about how to split up your resources.
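To make the difference concrete, here's a minimal sketch (paths and image name are hypothetical) showing that the `docker run` line looks the same either way; what changes is where the mounted directory lives:

```shell
# Pre-WSL2, a bind mount from a Windows path crossed a Windows->Linux
# translation layer, which made file I/O slow:
#   docker run --rm -v "C:\Users\you\project:/project" alpine ls /project
# With the WSL2 backend, the same mount from a path inside the distro
# stays on a native ext4 filesystem, so file I/O is near-native speed:
#   docker run --rm -v "$HOME/project:/project" alpine ls /project
MSG="Mount source decides the speed: Windows path (slow) vs WSL2 ext4 path (fast)"
echo "$MSG"
```

In other words, you don't have to change how you use Docker; keeping your working files inside the WSL2 distro is what unlocks the speedup.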
The rest of this post demonstrates some WSL2-related speedups and provides instructions to get you started on WSL2.
Benchmarks were run using fio on the same laptop, using Docker for Windows in both the "classic" Hyper-V and WSL2 configurations, as well as Docker running on Ubuntu 18.04. There are several measures, but here's one for read bandwidth:
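The charts above were produced with fio, which gives far more reliable numbers, but if you just want a quick, rough sanity check of sequential read speed on your own setup, a simple dd stand-in (sketched below, assuming GNU or busybox dd in a Linux shell) will do:

```shell
# Rough stand-in for a real benchmark: time a 64 MiB sequential read with dd.
# Note: without dropping caches first, this mostly measures the page cache,
# so treat the number as an upper bound, not a disk benchmark.
TESTFILE=$(mktemp)
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 2>/dev/null
# dd prints its throughput summary on stderr; capture the last line of it.
READ_SUMMARY=$(dd if="$TESTFILE" of=/dev/null bs=1M 2>&1 | tail -n 1)
echo "$READ_SUMMARY"
rm -f "$TESTFILE"
```

Run this once from a bind-mounted Windows path and once from a directory inside your WSL2 distro to feel the difference yourself.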
Note that the y-axis is log-scaled - these are BIG differences. You can also see that Ubuntu 18.04 still wins on speed, but working in a WSL2 distro is much faster than any approach that bind-mounts the Windows 10 filesystem.
Gigantum on WSL2
Running Gigantum on WSL2 is relatively easy because Docker worked closely with Microsoft to ensure a nearly Linux-native experience leading up to the general release of WSL2. Below we show you how to take advantage of these improvements.
You will need to hop on the command line to install WSL2 yourself, and then enable WSL2 in the Docker for Windows settings:
If you installed a WSL2 distribution before enabling WSL2 in Docker, then the first distribution you installed is probably your default. You can check this with `wsl --list` (don't include the backticks) and change the default with `wsl --set-default <distro name>`. You'll need to enable either the default WSL2 distro or the specific distro you want to use with Gigantum (it's OK to do both), again in the Docker for Windows settings:
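Putting those commands together, a check-and-set session might look like the sketch below (`Ubuntu-20.04` is a placeholder for whatever distro you installed; the guard just keeps the snippet harmless outside Windows):

```shell
# List installed distros and their WSL versions, then pick a default.
# wsl.exe only exists on Windows (or inside a WSL distro), so guard for it.
if command -v wsl.exe >/dev/null 2>&1; then
  WSL_OUT=$(wsl.exe --list --verbose)
  wsl.exe --set-default Ubuntu-20.04   # placeholder distro name
else
  WSL_OUT="wsl.exe not found: run this from Windows or a WSL distro"
fi
echo "$WSL_OUT"
```

The `--verbose` listing also shows each distro's WSL version, so you can confirm you're on version 2 before pointing Docker at it.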
Once you've done this, open a terminal with your preferred WSL2 distro (if you're not sure, Ubuntu is a fine choice). Then, you can install our command line tool as usual:
- `pip install gigantum` (or you can use pipx)
- `gigantum update`
- `gigantum start`
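Before running the steps above, it's worth confirming your terminal really is inside WSL2 and that Docker is reachable; here's a small (hypothetical) sanity check:

```shell
# Sanity checks before installing the Gigantum CLI inside a WSL2 distro.
# 1) A WSL kernel reports "microsoft" in its release string.
KERNEL=$(uname -r)
case "$KERNEL" in
  *[Mm]icrosoft*) echo "Looks like WSL: $KERNEL" ;;
  *)              echo "Not WSL (kernel: $KERNEL)" ;;
esac
# 2) Docker Desktop's WSL2 integration puts the docker CLI on the distro's PATH.
if command -v docker >/dev/null 2>&1; then
  echo "docker CLI found"
else
  echo "docker CLI not found: enable WSL integration in Docker Desktop settings"
fi
```

If the second check fails, revisit the WSL integration toggles in the Docker for Windows settings shown earlier.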
That's it! You can use the Gigantum Client as usual, and your files will be available in `~/gigantum` from the command line of your WSL2 distro. Those files are also accessible from the Windows side at the `\\wsl$` network location:
Should you avoid WSL2?
To be clear, we have only been using Gigantum on WSL2 for the last few weeks, and as this approach sees wider use, there may be unexpected problems that disrupt your work or even result in data loss. Remember, we are in beta territory and your mileage may vary with WSL2!
Similarly, if you love Gigantum because it helps you avoid command-line terminals, then you should wait until we have WSL2 support in the Gigantum GUI. In the meantime, you'll have no trouble sharing projects with your early adopter colleagues who are willing to beta-test our WSL2 support.
Looking ahead to GPUs on WSL2
Even though WSL2 + Docker is still feeling pretty fresh, Microsoft, Nvidia, and Canonical have all published instructions on getting CUDA working on WSL2, including via Docker. We've already blogged about the ways that GPU support can radically boost the speed of not only machine learning, but also more common tasks like parsing large CSV files. Gaming laptops are about to become a lot more fun!
In the near future, we'll be evaluating support for GPUs on Windows in Gigantum. But if you figure it out before we post about it, be sure to share your progress with the community on our forum! And in the meantime, enjoy the already improved performance of using Gigantum on WSL2!