From b814c2f67c7966ef68ed401772856a295069d480 Mon Sep 17 00:00:00 2001
From: Danny Berger
Date: Fri, 28 Feb 2014 14:36:45 -0700
Subject: [PATCH] distributed-docker-containers

---
 ...014-02-28-distributed-docker-containers.md | 240 ++++++++++++++++++
 1 file changed, 240 insertions(+)
 create mode 100644 blog/_posts/2014-02-28-distributed-docker-containers.md

diff --git a/blog/_posts/2014-02-28-distributed-docker-containers.md b/blog/_posts/2014-02-28-distributed-docker-containers.md
new file mode 100644
index 0000000..a51bb20
--- /dev/null
+++ b/blog/_posts/2014-02-28-distributed-docker-containers.md
@@ -0,0 +1,240 @@
---
title: Distributed Docker Containers
layout: post
tags: aws-ec2 docker nodejs scs-utils
description: A strategy for integrating Docker services across multiple hosts and data centers.
---

One thing I've been working with lately is [Docker][1]. You've probably seen it referenced in various tech articles as
the next greatest thing for cloud computing. Docker runs "containers" from base "images", which essentially allows
running many lightweight virtual machines on any recent, Linux-based system. Internally, the magic behind it is
[lxc][2], although Docker adds quite a bit on top to make it more usable.

For a long time now I've used virtual machines for development - they allow me to better simulate how software runs out
on production servers. Historically, [Vagrant][3] + [VirtualBox][4]/[VMWare Fusion][5]/[EC2][6] have been great tools
for that, but they have limitations and they tend to drift a bit from production architecture.


## The Problem

When trying to duplicate production environments, it's typically not feasible for me to run more than one virtual
machine on my laptop. I could split my single local virtual machine across multiple EC2 instances, but then it becomes
more difficult to manage IP addresses for the various service dependencies as the instances get stopped and started
between working sessions (in addition to the extra costs). VPCs with private IP addresses do help with that a lot, as
long as there's a sane way to manage those resources.

Another issue that comes up when combining services on a single host is dependency overlap. One example of this is
shared libraries: some newer features of nginx require a newer version of the openssl libraries, but PHP doesn't
necessarily support the newer version of openssl without upgrading quite a few other components. While there may be
workarounds, the inconvenience of it all typically just prompts me to avoid working on that particular feature,
unfortunately.

Ultimately, I want to have the same software and network stack that I use in a production environment, but in a
development environment and, if possible, locally on my laptop.


## The Alternatives

This problem is certainly not unique, but a practical solution has been difficult for me to find. I've been
experimenting with a few different technologies over the years trying to solve this sort of thing.

Vagrant is obviously the first practical solution. For me, it has been a functional solution for quite a while, but not
an optimal one. Like I mentioned before, it's a bit bulky when attempting to mimic non-trivial architectures on a
standard laptop. For a while now, I've been trying to find the motivation and time to migrate to a better setup.

With the advent of Docker, many of my software requirements become much simpler.
Each piece of software can run in its
own container, and I don't have to worry about dependency overlap. Multiple containers are *significantly* cheaper than
trying to run multiple virtual machines. I could even reuse containers built on my development machine out in
production. One thing Docker doesn't effectively solve is service dependency: it can support dependencies on a single
host with links, but not across multiple hosts.

I've been keeping an eye out for other tools which may help solve these problems. Some of them are:

 * [decking][7] - seems to primarily build on top of Docker's built-in link functionality for service dependency within
   a single host
 * [etcd][15] - an excellent distributed, hierarchical key-value store; very useful for monitoring configuration values
   and being notified when they change (related: [confd][22])
 * [fig][8] - seems like [Foreman][21], but geared for Docker containers
 * [flynn][11] - originally I was very excited about this; however, it still seems underdeveloped for the purposes of
   service discovery of arbitrary services; I'm still very hopeful
 * [serf][9] - a very new tool for distributing data across a cluster and taking action on it. To me it seems like
   more of a management tool (like half of the [mcollective][10] utility)

Recently, I've become more acquainted with [bosh][12], an interesting tool for managing large deployments along
with all their dependencies. To me, bosh has always seemed overly complicated for what I'd want to accomplish, and it
has quite a few bosh-specific practices to learn. Its resource and service management is very thorough, although it
takes a while to get comfortable with it. It seems more like an infrastructure management tool than a service
management tool, and I was hoping to keep those responsibilities separate and simpler. Ultimately, I think bosh could
be made to work... but I was still hoping for something different, lighter, and built on more common open source tools
that I was already familiar with.


## The Ideas

I had a simple application in mind to roughly define my "[minimum viable product][13]":

 0. run a WordPress web application, a MySQL server, and a backup MySQL server as separate services
 0. runtime parity (between development and production)
    1. configure services the exact same way
    1. run services the exact same way
    1. depend on other services the exact same way
 0. architecture flexibility
    1. in production, run the services on three separate hosts across two separate data centers
    1. in development, run all services on a single virtual machine on my laptop
 0. service flexibility - be able to dynamically relocate services without manual reconfiguration and with minimal
    downtime
    * combine services onto one or two hosts during quiet hours
    * move a service to a more powerful instance during high load
 0. self-provisioning - when a container requires a particular volume or network, make sure it can be automatically
    provisioned and de-provisioned

First off, I knew I wanted to run the services inside of Docker containers. I can only imagine Docker's ubiquity will
continue to grow, and the ability to run completely arbitrary software anywhere with minimal host dependencies seemed
like a perfect, lightweight solution.

I've used [Puppet][14] to configure servers and applications for a long time. While I dislike the overhead it requires
for smaller use cases, I really like the consistency and declarative nature that it provides. Since I'll continue to use
it for host server configuration, it's a small stretch to also use it for configuring the service runtimes.

When it comes down to it, I think there are two main questions that a service must answer:

 * How should I work? and
 * How do I connect with the rest of the world?

The first question can be managed and configured via Puppet. Once a service is configured and compiled to run as
requested, it never needs to go through that process again. This approach lets compiled Docker images be consistently
reused across time and servers.

The second question deals with pointing WordPress to the MySQL server, or pointing the MySQL server to its data
directory, or running the MySQL backup server on a specific network segment. These decisions and connections have
nothing to do with how the service should work, so they can be changed as needed. So far, I have four main types of
dependencies describing how these containers get connected:

 0. volumes - giving containers a place to write persistent data (e.g. the WordPress `wp-content/uploads` directory)
 0. provided services - a service that the container is running (e.g. `http` on `80/tcp`)
 0. required services - a service that the container needs (e.g. `mysql`)
 0. network - how the container is attached to the network

I think these basic aspects effectively describe everything needed to manage a self-contained service; a rough sketch
of what declaring them might look like follows below.
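To make those four connection types a bit more concrete, here's a hypothetical manifest fragment for the WordPress
container. To be clear, the field names here are made up for illustration and are not the actual `scs-utils` schema
(the real manifests are linked further below):

```yaml
# Hypothetical container manifest fragment - illustrative field names only,
# not the actual scs-utils schema.
name: wordpress
volumes:
  uploads:
    path: /var/www/wp-content/uploads  # persistent data survives container rebuilds
provides:
  http:
    port: 80
    protocol: tcp
requires:
  mysql: {}  # concrete endpoint gets resolved at runtime via service discovery
network:
  strategy: host-default  # could be swapped for, e.g., an aws-ec2-eni strategy
```

The point of splitting the manifest this way is that everything above stays stable while the runtime answers (which
host, which port, which volume) can keep changing underneath it.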

## The Implementation

The next step of an idea is to prototype it, and that's where I am today. There are several pieces that I've been
working on, but they fall under three general topics...


### Service Discovery

One of the most interesting concepts is service discovery. I wanted containers to be able to connect with each other
across multiple hosts and data centers. I've been using DNS for host discovery and, while it works great, it doesn't
seem entirely appropriate for "containerized" discovery. Through [`A`][23] records, DNS easily picks up on hosts
changing, but it's not so good with dynamic ports. DNS [`SRV`][24] records seem *much* more appropriate, with attributes
for both hostname and port, but `SRV` records are rarely used by internal APIs.

Originally I was using etcd to register and discover services, but I found it to be inefficient for filtering services
and propagating changes. Instead, I created a specialized client/server protocol to handle the registration and
discovery process. In technical terms, the protocol works like the following...

WordPress needs a database, so before starting the container, the container manager connects with the disco server:

 > **container**: Hi, I need a `mysql` service to talk to - who's available?
 > **disco**: You should talk with `192.0.2.11:39313` - I'll keep you posted if it changes, but let me know if you no
 > longer need it

The results are injected as environment variables when the container is started, and the container can use them however
it likes. WordPress obviously runs a web server, so, once the container is started, the container manager connects with
disco again:

 > **container**: Hi, I'm `wordpress` and I have an `http` service available at `192.0.2.12` on port `39212`
 > **disco**: Nice to meet you; let me know if you no longer provide it

Then things are running happily, and you could ask the disco server where to find `wordpress/http` to pull it up in your
web browser. If the database server crashes and recovers elsewhere, a few things will happen. First, when disco realizes
MySQL is no longer available (whether by a clean disconnect, heartbeat timeout, or socket error), it notifies
everyone who is subscribed that the endpoint has been dropped:

 > **disco**: Looks like you were using `mysql`, but I'm sorry to tell you it's no longer available
 > **container**: Thanks for letting me know

The container manager then attaches to the container to run an update command letting it know about the change. The
command can take care of updating the runtime configuration and restarting the application server.

Eventually the new MySQL server will come back online and register itself. Once registered, disco realizes that
WordPress is subscribed, so it lets it know:

 > **disco**: Great news, I have a new `mysql` endpoint for you at `192.0.2.14:39414`
 > **container**: Excellent, thanks

And the manager again runs the live update command, updating the environment and restarting the application server.

The disco protocol has a few more features (like using a single server for more than one WordPress/MySQL setup, or
filtering services by arbitrary tags like availability zones to improve load balancing), but that's the general idea.
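For a sense of what that conversation looks like in code, here's a minimal sketch of a disco-style client in
Node.js/TypeScript. The wire format is an assumption on my part - a JSON-lines rendering of the dialogue above rather
than the actual `scs-utils` protocol - and the server address, port, and message fields are all illustrative:

```typescript
// Hypothetical JSON-lines rendering of the disco exchange described above.
// The real scs-utils wire format may differ; names here are illustrative.
import * as net from "net";

const disco = net.connect(9640, "disco.internal"); // assumed server address/port

// Ask for a mysql endpoint and subscribe to changes.
disco.write(JSON.stringify({ action: "require", service: "mysql" }) + "\n");

// Announce our own http endpoint.
disco.write(JSON.stringify({
  action: "provide",
  name: "wordpress",
  service: "http",
  address: "192.0.2.12",
  port: 39212,
}) + "\n");

// React to endpoint announcements and drops pushed by the server.
let buffer = "";
disco.on("data", (chunk) => {
  buffer += chunk.toString();
  let newline: number;
  while ((newline = buffer.indexOf("\n")) >= 0) {
    const message = JSON.parse(buffer.slice(0, newline));
    buffer = buffer.slice(newline + 1);
    if (message.event === "endpoint-added") {
      console.log(`use ${message.service} at ${message.address}:${message.port}`);
      // e.g. rewrite the runtime config and restart the application server
    } else if (message.event === "endpoint-dropped") {
      console.log(`${message.service} is gone; waiting for a replacement`);
    }
  }
});
```

The important part is the subscription: the client doesn't poll. Disco pushes the add/drop events as they happen, and
the container manager reacts by running the container's live update command.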

### Configuration Files

I'm using YAML files to describe images and containers. They get compiled to a static version and then cached based on
the image configuration. For example, take a look at this example [scs-wordpress][16] image manifest. It describes the
various connection points, Docker details, and how it's configured. Next, take a look at the [Puppet manifests][17],
which enumerate all the configuration options affecting how the service will run. Finally, take a look at the
[sample config][18], which ties together what kind of image it needs to be able to run (configuration) and how that
image will be connected to the world.


### Self-Provisioning

For each of the four dependency/connection types (volumes, provided services, required services, network), I'm trying to
make them suitable for both local development and AWS EC2 deployment. For example:

 * AWS EC2 volumes can be auto-created, mounted, and attached to hosts for use by Docker containers. This allows
   services to drift across instances
 * Likewise, I can also just use a local path for a volume and avoid an official network mount
 * Various other strategies can be added for each dependency:
   * nfs-volume: to attach a Docker mount point to an external NFS mount
   * aws-ec2-eni: to attach an ENI as the network interface for a Docker container

My goal is to provide a manifest configuration file to a machine and know that it will load up whatever it needs to run,
including recompiling the image from scratch if it's not available in any caches.
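As a sketch of what the aws-ec2 volume strategy might boil down to, here's the create/wait/attach cycle using the
Node.js AWS SDK in TypeScript. The region, availability zone, size, device, and instance id are placeholder
assumptions, and the real `scs-utils` implementation may structure this differently:

```typescript
// A rough sketch of the aws-ec2 volume strategy: create an EBS volume when the
// manifest asks for one, wait for it, and attach it to this instance so Docker
// can bind-mount it. Region, zone, size, device, and instance id are assumptions.
import * as AWS from "aws-sdk";

const ec2 = new AWS.EC2({ region: "us-east-1" });

ec2.createVolume({ AvailabilityZone: "us-east-1a", Size: 8 }, (err, volume) => {
  if (err) throw err;
  const volumeId = volume.VolumeId!;
  // Block until the volume leaves the "creating" state.
  ec2.waitFor("volumeAvailable", { VolumeIds: [volumeId] }, (err) => {
    if (err) throw err;
    ec2.attachVolume(
      { VolumeId: volumeId, InstanceId: "i-12345678", Device: "/dev/xvdf" },
      (err) => {
        if (err) throw err;
        // Next steps: mkfs on first use, mount, and hand the path to `docker run -v`.
        console.log(`attached ${volumeId}; ready to mount and hand to docker`);
      },
    );
  });
});
```

Tearing a volume down reverses the same steps, which is what makes it practical for a service to drift between
instances while keeping its data.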

## The Prototype

So, all those ideas are currently under development in my [`scs-utils`][20] repository. I've created a repository called
[`scs-example-blog`][19] which is a functional implementation of my original MVP. It provides a `Vagrantfile` so you can
easily try it out yourself, and it goes through the process of getting the containers running on a single virtual
machine, accessing the services from the host, and then splitting them up across multiple virtual machines. It's more of
a tutorial describing the steps - typically the service deployment would be managed by Puppet.


## The Conclusion

All these ideas are absolutely a work in progress and I'm still actively tweaking the implementation, but it's in a
functional enough state to discuss briefly. So far it has been an excellent learning opportunity for Docker, custom
network protocols, and splitting some of the services I've previously been running into more reusable components. Even
if `scs-utils` isn't what I'm still using in two years, the refactoring it has motivated will make it significantly
easier to port my services to whatever more valuable tool surfaces further down the road.


 [1]: https://www.docker.io/
 [2]: http://linuxcontainers.org/
 [3]: http://www.vagrantup.com/
 [4]: https://www.virtualbox.org/
 [5]: http://www.vmware.com/products/fusion
 [6]: http://aws.amazon.com/ec2/
 [7]: http://decking.io/
 [8]: http://orchardup.github.io/fig/
 [9]: http://www.serfdom.io/
 [10]: http://puppetlabs.com/mcollective
 [11]: https://flynn.io/
 [12]: http://docs.cloudfoundry.org/bosh/
 [13]: http://en.wikipedia.org/wiki/Minimum_viable_product
 [14]: http://puppetlabs.com/puppet/puppet-open-source
 [15]: https://github.com/coreos/etcd
 [16]: https://github.com/dpb587/scs-wordpress/blob/3ba391d4f82da5c9642d88962e0bce32eb692add/scs/image.yaml
 [17]: https://github.com/dpb587/scs-wordpress/tree/3ba391d4f82da5c9642d88962e0bce32eb692add/scs/puppet/scs/manifests
 [18]: https://github.com/dpb587/scs-example-blog/blob/master/wordpress/manifest.yaml
 [19]: https://github.com/dpb587/scs-example-blog
 [20]: https://github.com/dpb587/scs-utils
 [21]: http://ddollar.github.io/foreman/
 [22]: https://github.com/kelseyhightower/confd
 [23]: http://en.wikipedia.org/wiki/A_record#A
 [24]: http://en.wikipedia.org/wiki/SRV_record