Making Reproducibility Reproducible


Reproducibility doesn’t have to be magic anymore. Image provided by Abstruse Goose under a Creative Commons license.

TL;DR - We believe the following

  • Approaches to the transmission of scientific knowledge are currently broken, largely because software has become critical to modern research.
  • Calling re-execution of static results “reproducibility” isn’t enough. Reproducibility should be functionally equivalent to collaboration.
  • Academic emphasis on best practices is ineffective; it should shift to a product-based approach that minimizes effort rather than maximizing it.
  • By focusing on the needs of the end user, people can actually improve how scientific knowledge is communicated and shared.

Reproducible Work

The transmission of scientific knowledge and techniques is broken, but you already know this because otherwise the title wouldn’t have caught your eye. Reproducibility is the poster child for this breakdown because it is a serious problem, but also perhaps because it is tangible. What is actually wrong with the transmission of modern science is much harder to put your finger on.

We don’t want to go all Walter Benjamin on you, but scientific interpretation in the age of digital reproduction needs a deep re-think. Since that is more appropriate for a dissertation, let’s just focus on reproducibility.

Historically, reproducibility was the manual re-execution and validation of a single result, something fundamental to science going back to the 17th century. Typically the job was to recreate physical effects or observations, and manual re-execution fit well with the written descriptions of scientific results.

As digital data collection emerged in the latter half of the 20th century, expectations for reproducibility added an archival and computational component. Theoretically, data and methods were enough to understand & re-execute a work, and the ease of digital transmission made reproducibility feasible.

But did it?

Science needs functional tools that promote the collective effort, not a bunch of busy work. The image is from the Detroit Industry Murals by Diego Rivera.

Reproducible Work Environments

When we began Gigantum, we wanted reproducibility to be part of the daily experience, not just the result of a post hoc process. Reproducible work and reproducible environments are both good, but we wanted reproducible work environments, i.e. for the fruits of daily work to be reproducible without extra effort or thought. This seemed the best way to make reproducibility reproducible.

Pushing reproducibility down into the daily experience moves it closer to real collaboration. Basically, if I can reproduce your work in my environment and you can reproduce mine in yours, then we can work together. It also means that I can interrogate your work and your process in a way that is natural to me.

So, we focused on the connection between collaboration and reproduction, and we formulated some basic requirements that we think functional reproducibility should provide:

  1. Integrated versioning of code, data and environment;
  2. Decentralized work on various resources, from laptops to the cloud;
  3. Editing and execution in customized environments built by the user.
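As a rough illustration of the first requirement (this is a hypothetical sketch, not Gigantum's actual implementation), integrated versioning of code, data and environment can be thought of as a manifest that pins all three: the current commit for code, a content hash for data, and the installed package list for the environment.

```python
import hashlib
import json
import subprocess
import sys


def capture_manifest(data_path=None):
    """Snapshot code, data and environment state so a collaborator
    can recreate the same working context. Illustrative only."""
    manifest = {"python": sys.version.split()[0]}

    # Code: record the current git commit, if run inside a repo.
    try:
        manifest["commit"] = subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        manifest["commit"] = None

    # Environment: record installed packages via `pip freeze`.
    try:
        manifest["packages"] = subprocess.check_output(
            [sys.executable, "-m", "pip", "freeze"], text=True
        ).splitlines()
    except subprocess.CalledProcessError:
        manifest["packages"] = []

    # Data: record a content hash so the exact input file is pinned.
    if data_path is not None:
        with open(data_path, "rb") as f:
            manifest["data_sha256"] = hashlib.sha256(f.read()).hexdigest()

    return manifest


if __name__ == "__main__":
    # Print the manifest; in practice it would be versioned alongside the work.
    print(json.dumps(capture_manifest(), indent=2))
```

A platform that captures something like this automatically, on every change, is what turns "reproducible work" into a reproducible work environment.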

Basically, we think that the line between collaboration and reproducibility should be access & permissions, and that’s all. They should be functionally equivalent. The existence proof for this is that they already are functionally equivalent for two highly skilled users with enough time.

To go a step further, we wanted to eliminate the effort & skill needed to create reproducible work environments and to make this capacity broadly and automatically available. So we created something that does the following.

  1. Focuses on the user experience and tries not to change how people work;
  2. Runs locally on a laptop but provides cloud access for sharing and scale;
  3. Automates best practices and admin tasks to save labor and add skill.

The first requirement is important for adoption because steep learning curves and disruptions to working habits discourage use. The second respects users’ daily lives, budgets and the occasional need for scale. The third forces the platform to promote efficiency & level out asymmetries.



Topics: Science, Reproducibility, Data Science, Containers, Jupyter