Python for me has always been easy to learn, but hard to get working. There are a number of ways to install and manage packages – pip, anaconda, virtual environments, poetry – unfortunately, each code project has spent different amounts of effort tuning their install approach to suit one, maybe two of these paradigms. This means that the success of installing a complex python codebase depends heavily on the approach you took to install it, and it may or may not work. Anaconda, for example, which has been my tried-and-true approach to a uniform package management experience across operating systems for many years, has itself fallen victim to the increasing complexity of managing dependencies, or “dependency hell”.
This only gets harder as the connections between the operating system and programming language grow. As code projects start to depend on specific compiled binaries, environment flags, or specific network setups, the challenge of getting a working environment grows from a Python problem to a system problem. So, to support courses and to provide a uniform experience, I’ve been looking into how to use containers for sharing working systems. There are a number of good reasons to use Docker, including:
There are also some significant downsides to using it, including:
To mitigate some of these issues we can use interactive web apps like Jupyter that can facilitate interaction with containers. Docker’s compose file has also solved some of the configuration issues associated with running traditional docker images and containers, because you can put almost everything you need in a self-documented yaml file.
Finally, while tools like Google Colab may use similar approaches to provide a jupyter front-end to python over the web, there is no guarantee that those tools are going to be available and free in the longer term. I thought it would be a good idea to see what the state of Jupyter over Docker is, to see if I can commonize the Python experience for students and take some of the installation guesswork out of the process, while enabling as much of the traditional python programming experience as I can.
So to provide a starting point, here is a tutorial for setting up a docker container complete with Python, a web-based interface, and some starter packages. This can be augmented from here with your own packages and then shared with others. But getting a working jupyter interface is a good starting point for the tutorial. I will share other containers based on this in the future.