This bite-sized post is the first in a series that digs into using Git effectively from within Gigantum. We start with the most basic task: importing an external Git repository (or "repo") that contains some data. Gigantum does a lot of Git automation under the hood. While that automation provides nice features like version control by default and the Activity Feed, naively including a Git repo in your project can lead to some hiccups! So how can we use a dataset that's published on GitHub?
This post is an overview for reviewers who are using Gigantum to inspect code for a manuscript.
Gigantum is a browser-based application that integrates with Jupyter & RStudio to streamline the creation and sharing of reproducible work in Python & R.
Below, we’ll sketch out a smart approach for using lots of CPU cores without breaking the bank: using your laptop when feasible, along with a DIY approach to working on bigger cloud resources as needed. We’ll use Gigantum to automate Git and Docker, along with most details of our cloud environment. With the following approach, you can be up and running with Dask on 32 CPU cores on DigitalOcean in about 10 minutes - look at those tasks fly in parallel!
At Gigantum, we are building an open-source tool for developing, executing, and sharing data science projects that automates the creation of versioned and containerized code. This way your work is always accessible, reproducible, and transparent. Our ultimate goal is to make science and data science more efficient and reproducible, and we want people to directly access and build on each other’s work without all of the technical hassles. You can learn more about Gigantum, try the Client in the cloud, or download and install it locally at our website: https://gigantum.com