As just an “end-user” of data analysis software, I sometimes find it difficult to keep up with what’s available. I often discover new tools by browsing the web for something specific and finding something else instead… that was the case for Docker a couple of years ago.
Today (Dec 30, 2019) I discovered a new tool: “Binder.”
When I checked online, it turned out not to be new: I found 2 YouTube videos, one from 2016 and one from 2018 (see below).
The gist of Binder is to provide a platform for reproducible research with a high level of “practical reproducibility,” i.e. end-users like me should be able to use the system.
One slide (see link below) from the Binder 2.0 presentation summarizes it well:
- technical reproducibility: making reproducible scientific results possible at all [From voice over: i.e. professionals in computer worlds can do it.]
- practical reproducibility: enabling others (and yourself) to reproduce results without difficulty [From voice over: i.e. scientists not particularly trained in computer methods.]
(See slides from the Binder 2.0 presentation at bit.ly/scipy-2018-binder – original is on Google Doc.)
This system is also very useful for creating tutorials – see below.
Documentation can be found at: mybinder.readthedocs.io
Creating a binder from a GitHub repository: mybinder.org
(Note: some experience with Jupyter Notebooks is more than useful.)
Some examples in the documentation: mybinder.readthedocs.io/en/latest/examples.html
A practical example that (mostly) works:
- GitHub repository: github.com/sofroniewn/tactile-coding (scroll down the page and click on the “launch binder” button)
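As far as I can tell, the “launch binder” button is simply a link to a mybinder.org address built from the repository owner, the repository name, and a Git reference (branch, tag, or commit). Here is a minimal Python sketch of how such a link could be put together. The function names are my own (hypothetical), and the default reference "HEAD" is an assumption that may need to be replaced by the actual branch name of the repository.

```python
from urllib.parse import quote

# Public Binder service discussed in this post.
BINDER_HOST = "https://mybinder.org"


def binder_launch_url(owner, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a GitHub repository.

    `ref` is a branch, tag, or commit; "HEAD" is an assumed default.
    """
    # Percent-encode each part so that, e.g., a branch name containing "/"
    # stays a single path segment.
    parts = [quote(part, safe="") for part in (owner, repo, ref)]
    return f"{BINDER_HOST}/v2/gh/" + "/".join(parts)


def binder_badge_markdown(owner, repo, ref="HEAD"):
    """Return Markdown for a clickable badge to paste into a README."""
    url = binder_launch_url(owner, repo, ref)
    return f"[![Binder]({BINDER_HOST}/badge_logo.svg)]({url})"


if __name__ == "__main__":
    # The example repository mentioned above.
    print(binder_launch_url("sofroniewn", "tactile-coding"))
    print(binder_badge_markdown("sofroniewn", "tactile-coding"))
```

Pasting the printed link into a browser should start the same build as clicking the button, and the Markdown line can be added to a repository README to get a badge of your own.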
Example for a Tutorial – (Pandas on Python)
Gleaned from this discussion page at news.ycombinator.com/item?id=18964007:
Binder is really amazing for Python/data science tutorial authors: I have a pandas tutorial on github, and instead of requiring everyone to install a bunch of Python libraries / set up a Docker container, now I can just link to
and people can try out the tutorial right away! Before Binder, running workshops always involved a TON of installation problems and it was a huge amount of work in advance to figure out how people on windows / mac / linux could all get the tools installed.
In the 2016 video the speaker also mentioned dat-data.com, which now redirects to dat.foundation, for peer-to-peer sharing of data.
This is an important step toward an actually useful public hub, but for now the files are not permanent unless “refreshed” by “pinning.”
- Sharing Reproducible Environments with Binder | SciPy 2016 | Andrew Osheroff
- Binder 2.0: Next Gen of Reproducible Scientific Environments w/ repo2docker & BinderH | SciPy 2018