Blog

Peer Review via Gigantum

Posted by The Gigantum Team on May 18, 2020 5:16:07 PM

This post is an overview for reviewers who are using Gigantum to inspect code for a manuscript.

Gigantum is a browser-based application that integrates with Jupyter & RStudio to streamline the creation and sharing of reproducible work in Python & R.

Topics: Reproducibility, Open Science, Peer Review

Submitting Code via Gigantum

Posted by The Gigantum Team on May 18, 2020 4:18:42 PM

This post is an overview of how to use Gigantum to create and submit reproducible code.

Topics: Science, Reproducibility, Open Science

Portable & Reproducible, Across Virtual & Bare Metal

Posted by Dav Clark on May 14, 2020 7:13:44 AM

Working exclusively in a single cloud isn't possible for most people, and not just because it is expensive. Real work requires significant flexibility around deployment.

For example, sensitive data typically can't go in the cloud. Or maybe each of your three clients uses a different cloud, or maybe you spend significant time on a laptop. 

It would be nice if things would "just work" wherever you want them to, but the barriers are many and large. Git & Docker skills are table stakes. Typos & hard-coded variables rule the day. No matter how careful you are, stuff goes wrong. And your collaborators may not have the same level of care and technical skill you do.

Who knows? The possibilities are endless.

Well, it used to be hard. There is a new container-native system that moves reproducible work between machines (virtual or bare metal) with a few clicks.

No need to know Docker or Git. No need to be obsessive about best practices. No need to worry who is on what machine. 

We will demo it here using Dask and DigitalOcean for context. In the demo we:

  1. Create a 32-core Droplet (i.e., an instance) on DigitalOcean
  2. Install the open source Gigantum Client on the Droplet
  3. Import a Dask Project from Gigantum Hub and run it
  4. Sync the work to Gigantum Hub to save it for later

Topics: Reproducibility, Data Science, Open Science

Rebooting reproducibility: From re-execution to replication

Posted by Tyler Whitehouse, Dav Clark and Emmy Tsang on Jul 12, 2019 12:26:00 PM

Computational reproducibility should be trivial but it is not. Though code and data are increasingly shared, the community has realised that many other factors affect reproducibility, a typical example of which is the difficulty in reconstructing a work’s original library dependencies and software versions. The required level of detail documenting such aspects scales with the complexity of the problem, making the creation of user-friendly solutions very challenging.
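To make the dependency problem concrete, here is a minimal sketch of recording the interpreter version and installed package versions alongside a piece of work. It uses only the Python standard library. The function name `snapshot_environment` is a hypothetical helper chosen for illustration; it is not part of Gigantum or of any tool discussed in the post.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Return a dict describing the interpreter and installed packages.

    Hypothetical helper for illustration only: it records package names
    and versions, not system libraries or compiler toolchains.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing or malformed metadata
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Write the snapshot as JSON so it can be stored next to the results.
    json.dump(snapshot_environment(), sys.stdout, indent=2)
```

A snapshot like this captures only the Python layer. Operating system packages, drivers, and data versions still go unrecorded, which is part of why the required level of detail grows with the complexity of the problem.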

Topics: Reproducibility, Data Science

Making Reproducibility Reproducible

Reproducibility doesn’t have to be magic anymore. This image is provided by Abstruse Goose under the Creative Commons License.

TL;DR - We believe the following

  • Approaches to the transmission of scientific knowledge are currently broken, mainly due to the criticality of software in modern research.
  • Calling re-execution of static results “reproducibility” isn’t enough. Reproducibility should be functionally equivalent to collaboration.
  • Academic emphasis on best practices is ineffective and should switch to a product-based approach that minimizes effort rather than maximizes it.
  • By focusing on the needs of the end user, people can actually improve how scientific knowledge is communicated and shared.

Topics: Science, Reproducibility, Data Science, Containers, Jupyter