Webinar Recap: Data Science 2.0 and Scaling Distributed Teams

Posted by Tyler Whitehouse on Jun 30, 2020 3:09:15 PM

We held our first webinar on June 23, 2020, and we wanted to follow up with a brief post recapping the topics covered and giving access to a recording of the webinar.

In the webinar, Tyler Whitehouse (CEO) and Dean Kleissas (CTO) presented some slides and gave a product demo. The intent was to explain a bit about why decentralization is the best way to scale collaboration and productivity for teams in hybrid and multi-cloud environments.

Broadly speaking, decentralization is the attempt to enable data scientists to work across a variety of devices and resources in a self-service fashion. It is a flexible approach that, if done properly, can eliminate the cost and practical problems of centralized approaches. The problem is that decentralization requires a lot of technical skill and diligence.

We have found that the key to scaling a decentralized approach is to provide a lot of automation at the local level, not just in a managed cloud. Local automation drastically reduces the skill burden and the amount of time required to make decentralized approaches feasible.


Topics: Data Science, Containers, Git, Jupyter, RStudio

Making Reproducibility Reproducible


Reproducibility doesn’t have to be magic anymore. Image provided by Abstruse Goose under a Creative Commons license.

TL;DR - We believe the following:

  • Approaches to the transmission of scientific knowledge are currently broken, mainly due to the criticality of software in modern research.
  • Calling re-execution of static results “reproducibility” isn’t enough. Reproducibility should be functionally equivalent to collaboration.
  • Academic emphasis on best practices is ineffective and should switch to a product-based approach that minimizes effort rather than maximizes it.
  • By focusing on the needs of the end user, people can actually improve how scientific knowledge is communicated and shared.

Topics: Science, Reproducibility, Data Science, Containers, Jupyter