Blog

Data from an External Git Repo in Gigantum

Posted by Dav Clark on Aug 7, 2020 4:45:35 PM

This bite-sized post is the first in a series that digs into using Git effectively from within Gigantum. We start with the most basic task: importing an external Git repository (or "repo") that contains some data. Gigantum does a lot of Git automation under the hood. While that automation provides nice features like version control by default and the Activity Feed, naively including a Git repo in your project can lead to some hiccups! So how can we use a dataset that's published on GitHub?

Topics: Data Science, Open Science, Git

Webinar Recap: Data Science 2.0 and Scaling Distributed Teams

Posted by Tyler Whitehouse on Jun 30, 2020 3:09:15 PM

We held our first webinar on June 23, 2020, and we wanted to follow up with a brief post recapping the topics covered and giving access to a recording of the webinar.

In the webinar, Tyler Whitehouse (CEO) and Dean Kleissas (CTO) presented some slides and gave a product demo. The intent was to explain why decentralization is the best way to scale collaboration and productivity for teams in hybrid and multi-cloud environments.

Broadly speaking, decentralization is the attempt to enable data scientists to work across a variety of devices and resources in a self-service fashion. It is a flexible approach that, if done properly, can eliminate the cost and practical problems of centralized approaches. The problem is that decentralization requires a lot of technical skill and diligence.

We have found that the key to scaling a decentralized approach is to provide a lot of automation at the local level, not just in a managed cloud. Local automation drastically reduces the skill burden and the amount of time required to make decentralized approaches feasible.

Topics: Data Science, Containers, Git, Jupyter, RStudio

Extending Git Commit Metadata In Gigantum

Posted by Dean Kleissas - Co-founder and CTO at Gigantum on Jul 20, 2018 12:27:00 PM

At Gigantum, we are building an open-source tool for developing, executing, and sharing data science projects that automates the creation of versioned and containerized code. This way your work is always accessible, reproducible, and transparent. Our ultimate goal is to make science and data science more efficient and reproducible, and we want people to directly access and build on each other’s work without all of the technical hassles. You can learn more about Gigantum, try the Client in the cloud, or download and install it locally at our website: https://gigantum.com

Topics: Open Science, Git, Software