I wrote a blog post on how to isolate your Python code, and briefly mentioned Docker at the end. My comment was that I hadn’t really used Docker and wasn’t sure when to use it.
Aaron wrote a great comment, in which he called Docker “VirtualEnv on crack”. This amused and intrigued me, so I asked him if he could expand on his comment.
Note: Docker is not just limited to Python, but can be used to isolate anything on your system, from databases to webapps to multiple versions of the same programming language.
What is docker?
So if you’ve spent time hanging out on some of the super trendy internet message boards lately (I’m looking at you HN and Reddit) you’ll probably have heard about docker, and how it’s awesome, and how it’s going to change the world of programming and put all the sys-admins out of a job. But maybe you’ve been too busy playing with the latest bleeding edge version of Netflix’s latest open source product to find out more about docker.
So what is docker again?
Docker is a way of putting your apps in lightweight containers, built on top of some nifty tech in the Linux kernel, namely namespaces and cgroups (sorry Windows, you can’t play for now). If you’ve ever built applications for OS X’s sandbox then this is a similar idea. You can control what your application has access to in terms of networking, resources, filesystems and so on.
So you can have your MySQL database in one container, your Django app in another, and Redis in a third. The containers can talk to each other, and you can always make sure you have the exact version of MySQL/Django/Whatever you need.
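To make that a bit more concrete, here is a rough Python sketch that only assembles the `docker run` command lines for such a setup; it doesn’t execute anything. The container names, image tags, network name and password are all made up for illustration, and `my-django-app` is a hypothetical image you would have built yourself. It assumes the containers share a user-defined network, which is one common way to let them reach each other by name.

```python
import shlex


def docker_run_args(name, image, network, env=None, ports=None):
    """Build the argv list for a detached `docker run` on a named network."""
    args = ["docker", "run", "-d", "--name", name, "--network", network]
    for key, value in (env or {}).items():
        args += ["-e", f"{key}={value}"]
    for host_port, container_port in (ports or {}).items():
        args += ["-p", f"{host_port}:{container_port}"]
    args.append(image)
    return args


# Assumed network name; it would be created first with `docker network create appnet`.
NETWORK = "appnet"

commands = [
    # One container per service, each pinned to an exact (illustrative) version.
    docker_run_args("db", "mysql:8.0", NETWORK, env={"MYSQL_ROOT_PASSWORD": "example"}),
    docker_run_args("cache", "redis:7", NETWORK),
    docker_run_args("web", "my-django-app:1.0", NETWORK, ports={8000: 8000}),
]

for argv in commands:
    print(shlex.join(argv))
```

Pinning the tag on every image is the point here: swapping `mysql:8.0` for another version is a one-word change that touches nothing else on the machine.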
Everything your app needs (above the kernel itself) is contained in your docker container. That includes the OS distro itself, all your libraries, environment variables, EVERYTHING.
Docker images are built on a layered, copy-on-write filesystem. This means that several different containers can share the same base layers, and when you add different dependencies, libraries and so on, only those differences are stored in a new layer on top for your app.
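If the copy-on-write idea is new to you, a toy model in Python may help. This is only an analogy using `collections.ChainMap`, not how Docker’s storage is actually implemented, and the file paths and versions are invented: each app gets its own thin writable layer stacked on a shared base that is never modified.

```python
from collections import ChainMap

# Shared base layer (toy data standing in for a distro's files).
base_layer = {"/etc/os-release": "ubuntu", "/usr/bin/python": "python 3.x"}

# Each app stacks its own empty writable layer on top of the same base.
app_a = ChainMap({}, base_layer)
app_b = ChainMap({}, base_layer)

# Writes land in the top layer only; the base is left untouched.
app_a["/usr/lib/libx.so"] = "libx 1.0"
app_b["/usr/lib/libx.so"] = "libx 2.0"
app_b["/usr/bin/python"] = "python 2.7"  # shadows the base copy for app_b only

print(app_a["/usr/bin/python"])       # falls through to the shared base
print(app_b["/usr/bin/python"])       # sees its own overriding copy
print(base_layer["/usr/bin/python"])  # the base itself never changed
```

Both apps see a complete filesystem, but the only extra storage each one costs is whatever it changed.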
Great, but what the hell IS docker?
Docker runs as a daemon (on Linux, a daemon is a background process) on your workstation and coordinates building docker images, running the images as containers, pulling/pushing images from a docker repo and keeping track of which containers are currently running on your machine.
Docker Hub is a public repository (think GitHub for Docker images) where you’ll find all sorts of useful pre-built images and can store your own. You can also host your own private Docker repository to use within your organisation.
Why should I care?
So this all sounds great, but you’re probably thinking: how does this apply to me?
Contain all your dependencies (Not just the python ones)
Have you ever longed to have that other version of libx installed just for your app inside VirtualEnv without doing all that messy symlinking? With Docker you can! Need to run your app on Ubuntu but you run Red Hat in production? Dockerise it. Need to install a different version of Python for each of your apps but you can’t mess with the production environment? Docker to the rescue. Have you written one of those really awesome tools for making sure every external application has its environment variables and libs managed as well as keeping everything up to date for your own internal applications? Well good on you, but Docker is probably a lot less work.
Docker isn’t a VM. Your containers run directly on the host’s kernel, so there’s no hardware virtualisation layer. You don’t have to wait for a container to boot; it’s just there, waiting for you.
Immutable, (really) portable, apps and environments
When you create an application in a docker image, that app, all its dependencies and associated environment variables are now fixed in stone. It will run the same way on your machine as it does on a production machine, on a customer machine or on your laptop at home. (I’ll let that sink in for a second.)
The application you’ve created is fixed: if you change the code, you build another image and deploy that. Doesn’t work? Immediately roll back to your previous image. There are ways to mount code from the host inside a container for testing and developing, so you can see how your latest changes will perform in production too, before deploying.
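Here is a small sketch of what that workflow can look like, again in Python that only builds the command lines rather than running them. The image name `myapp`, the version tags and the host path are all hypothetical; the flags themselves (`docker build -t`, `docker run --rm`, `-v host:container`) are standard Docker CLI options.

```python
import shlex

IMAGE = "myapp"  # hypothetical image name


def build(tag, context="."):
    """argv for building the given directory into IMAGE:tag."""
    return ["docker", "build", "-t", f"{IMAGE}:{tag}", context]


def run(tag, mount=None):
    """argv for running IMAGE:tag, optionally bind-mounting host code."""
    args = ["docker", "run", "--rm"]
    if mount is not None:
        host_dir, container_dir = mount
        args += ["-v", f"{host_dir}:{container_dir}"]
    args.append(f"{IMAGE}:{tag}")
    return args


steps = {
    "build new release": build("1.1"),
    "deploy new release": run("1.1"),
    # Rolling back is just running the previous tag; that image never changed.
    "roll back": run("1.0"),
    # For development, mount the working copy over the code baked into the image.
    "develop against live code": run("1.1", mount=("/home/me/myapp", "/app")),
}

for label, argv in steps.items():
    print(f"{label}: {shlex.join(argv)}")
```

Because every release is its own tagged image, "rolling back" needs no special tooling: the old tag is still sitting there exactly as it was built.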
With Docker you don’t need a complicated deployment process: you just download an image from your repo and run it. Anywhere. And it’ll actually work, the same way it worked on your laptop.
Docker allows you to write Dockerfiles to define your image builds. Think of them as something between a Makefile and a bash script; they define everything about your application image. So you’ve probably just written a basic Dockerfile and built the image, maybe you even put some code inside it. Now... commit that Dockerfile to git. Congratulations! You just created a completely reproducible, versioned environment that defines everything from your operating system up to your test code. Again, I’ll let that sink in for a second.
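For a feel of what goes into one, here is a minimal sketch of a Dockerfile for a small Python app, wrapped in a few lines of Python that write it to disk. The base image tag, `requirements.txt`, `app.py` and the `APP_ENV` variable are assumptions for illustration; your own project will differ.

```python
import tempfile
from pathlib import Path

DOCKERFILE = """\
# Base image: a distro userland plus a pinned Python (tag is an assumption).
FROM python:3.12-slim

# Later instructions run relative to /app inside the image.
WORKDIR /app

# Install dependencies first so this layer is reused when only code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Environment variables get baked into the image too.
ENV APP_ENV=production

# Finally, add the application code and say how to start it.
COPY . .
CMD ["python", "app.py"]
"""


def write_dockerfile(directory):
    """Write the Dockerfile into the given project directory and return its path."""
    path = Path(directory) / "Dockerfile"
    path.write_text(DOCKERFILE)
    return path


# Demo in a throwaway directory; in a real project this would be your repo root.
with tempfile.TemporaryDirectory() as project_dir:
    dockerfile = write_dockerfile(project_dir)
    print(dockerfile.read_text())
```

With that file in your repo, `docker build -t myapp:1.0 .` would turn it into an image (the name and tag are placeholders), and the file itself is what you commit to git.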
Containers are isolated
When you start building with containers, you have to start thinking modularly. I mean REALLY modularly. If you want to put something in a container closed off from the outside, it better have everything it needs in there. If your applications are already fantastically written, decoupled and modular then great, give yourself a pat on the back, download Docker and carry on. However, if some of your applications are coupled, and rely on a bunch of other external factors being true in order to function, then Docker is a good cure. It’ll probably be hard at first to mentally break apart your systems and start thinking modularly, but it’s definitely a Good Thing. Remember the time you read those programming books about clean code and decoupled architectures and you thought ‘Damn this is brilliant, all our code should be like this’? Well, working in containers encourages you to create pluggable, modular applications (or if you want to be hip: microservices). Embrace it... and your QA team will thank you.
Ok I’m sold, now what?
So hopefully you’re starting to see the power of Docker, and that it might be useful. The best place to start is the Docker website, which has great documentation and getting-started guides. The Docker mailing list is also helpful and welcoming if you have questions or problems.
And if you’re really feeling adventurous, have a look at some Cool New Things like CoreOS, Mesos and Kubernetes: soon your whole infrastructure could be decoupled from your hardware and running in distributed multi cloud hybrid datacentres across the globe (+10 points if they’re connected via inflatable WiFi balloons).
Aaron is a Python/C++ developer with several years’ experience working in R&D for the VFX industry. Find him on Twitter @rncry if you want to heckle him or talk about Star Wars.
Next Time: We will look at a practical example on how to get started with Docker, especially to replace VirtualEnv / Virtual Machines. Keep watching this space.