Data from an External Git Repo in Gigantum

Posted by Dav Clark on Aug 7, 2020 4:45:35 PM

This bite-sized post is the first in a series that digs into using Git effectively from within Gigantum. We start with the most basic thing, which is importing an external Git repository (or "repo") with some data. Gigantum does a lot of Git automation under the hood. While that automation provides nice features like version control by default and the Activity Feed, naive inclusion of Git repos in your project can lead to some hiccups! So how can we use a dataset that's published on GitHub?


Topics: Data Science, Open Science, Git

Peer Review via Gigantum

Posted by The Gigantum Team on May 18, 2020 5:16:07 PM

This post is an overview for reviewers who are using Gigantum to inspect code for a manuscript.

Gigantum is a browser-based application that integrates with Jupyter & RStudio to streamline the creation and sharing of reproducible work in Python & R.


Topics: Reproducibility, Open Science, Peer Review

Submitting Code via Gigantum

Posted by The Gigantum Team on May 18, 2020 4:18:42 PM

This post is an overview of how to use Gigantum to create and submit reproducible code.


Topics: Science, Reproducibility, Open Science

Scaling On the Cheap with Dask, Gigantum, and DigitalOcean

Posted by Dav Clark on May 14, 2020 7:13:44 AM

Below, we’ll sketch out a smart approach for using lots of CPU cores without breaking the bank: using your laptop when feasible, along with a DIY approach to working on bigger cloud resources as needed. We’ll use Gigantum to automate Git and Docker, along with most details of our cloud environment. With the following approach, you can be up and running with Dask on 32 CPU cores on DigitalOcean in about 10 minutes. Look at those tasks fly in parallel!


Topics: Reproducibility, Data Science, Open Science

Extending Git Commit Metadata In Gigantum

Posted by Dean Kleissas - Co-founder and CTO at Gigantum on Jul 20, 2018 12:27:00 PM

At Gigantum, we are building an open-source tool for developing, executing, and sharing data science projects that automates the creation of versioned and containerized code. This way your work is always accessible, reproducible, and transparent. Our ultimate goal is to make science and data science more efficient and reproducible, and we want people to directly access and build on each other’s work without all of the technical hassles. You can learn more about Gigantum, try the Client in the cloud, or download and install it locally at our website:


Topics: Open Science, Git, Software