Thursday, September 19, 2013

What are containers and how did they come about?

Today's announced collaboration between Red Hat and dotCloud, the company behind Docker, is exciting for a lot of reasons. As the press release notes: "Docker and OpenShift currently leverage the same building blocks to implement containers, such as Linux kernel namespaces and resource management with Control Groups (cGroups). Red Hat Enterprise Linux Gears in OpenShift use Security-Enhanced Linux (SELinux) access control policies to provide secure multi-tenancy and reduce the risk of malicious applications or kernel exploits."

Among other areas of collaboration, we'll be working with dotCloud "to integrate Docker with OpenShift’s cartridge model for application orchestration. This integration will combine the power of Docker containers with OpenShift's ability to describe and manage multi-container applications, enabling customers to build more sophisticated applications with enhanced portability."

(See also the blog post from Docker here.)

But what are containers, exactly? I'm down at LinuxCon/CloudOpen in New Orleans this week, and I've seen a lot of interest in the various sessions that touch on containers. I've also seen a fair bit of confusion and vagueness over what they are and what function they serve. In a way, I find this a bit surprising, as the concept--and products based on that concept--has been around for almost a decade, and its progenitors go back even further. But I think it reflects just how thoroughly hypervisor-based virtualization has come to dominate discussions about partitioning physical systems into smaller chunks. Today is a far cry from the mid-2000s, when I was writing research notes as an industry analyst about the "partitioning bazaar," which saw all manner of hardware-based, software-based, and hybrid techniques on offer--many implemented on large Unix servers.

See: The Partitioning Bazaar (2002), New Containments for New Times (2005), and The Server Virtualization Bazaar, Circa 2007. Some of this blog post is adapted from material in those earlier notes. The original notes get into more of the historical background and context for those who are interested.

Hypervisor-based Virtualization

First let's consider hypervisor-based virtualization, aka hardware virtualization. Feel free to skip or skim this part. But I think the context will be useful for some.

Virtual machines (VMs) are software abstractions--a way to fool operating systems and their applications into thinking that they have access to a real (i.e., physical) server when, in fact, they have access to only a virtualized portion of one. Each VM then has its own independent OS and applications, and isn't even aware of any other VMs that may be running on the same box, other than through the usual network interactions that systems have (and certain other communication mechanisms that have evolved over time). Thus, the operating systems and applications in VMs are isolated from each other in (almost) the same manner as if they were running on separate physical servers. VMs are created by a virtual machine monitor (VMM), often called a "hypervisor," which sits on top of the hardware, creates and manages one or more VMs, and interfaces between those VMs and the underlying hardware. (The hypervisor thereby provides many of the services an operating system does, and in the case of KVM, for example, that operating system can even be a general-purpose OS like Linux. For the purposes of this discussion, we're only considering native virtualization, as opposed to host-based virtualization such as that provided by VirtualBox, which isn't really relevant in the server space.)

The result is multiple independent operating system instances running on a single physical server, each of which communicates with the rest of the world, including other instances on the same physical server, through the hypervisor. Historically, the reason server virtualization was so interesting was that it enabled server consolidation--the amalgamation of several underutilized physical servers into one virtualized one while keeping workloads isolated from each other. This last point was important because Windows workloads in particular often couldn't just be crammed together into a single operating system instance because of DLL hell, version dependencies, and other issues. The VM approach was also just a good fit for enterprises, which often maintained lots of different operating systems and versions thereof. VMs let them largely keep doing things the way they were used to--just with virtual servers instead of physical ones.

Over time, server virtualization also introduced features such as live migration, which allowed moving running instances from machine to machine, as well as a variety of other features going beyond consolidation. These features, too, generally reflected enterprise needs and stateful "system of record" workloads.

Where did containers come from?

Containers build on the basic *nix process model, which forms the basis for their separation from each other. Although a process is not truly an independent environment, it does provide basic isolation and consistent interfaces. For example, each process has its own identity and security attributes, address space, copies of registers, and independent references to common system resources. These various features standardize communications between processes and help reduce the degree to which wayward processes and applications can affect the system as a whole.
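
To make that per-process isolation concrete, here's a minimal Python sketch (the code is mine, for illustration only) showing that a forked child gets its own private copy of the parent's address space:

    import os

    value = "set by parent"

    pid = os.fork()
    if pid == 0:
        # Child: this assignment changes only the child's private copy
        # of the address space; the parent never sees it.
        value = "set by child"
        os._exit(0)

    os.waitpid(pid, 0)
    print(value)  # still "set by parent" -- the processes are isolated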

*nix also builds in some basic resource management at the process level, including priority-based scheduling, augmented by mechanisms like ulimit (a shell built-in backed by the setrlimit() system call), which can be used to cap resources such as CPU time, file descriptors, and locked memory used by a process and its descendants. More recently, Control Groups (cgroups) have significantly extended the resource management built into Linux--providing controls over CPU, memory, disk, and I/O use of the sort often offered through add-on management products in the past.
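
For instance, the same rlimit machinery behind ulimit can be driven programmatically. A minimal sketch using Python's standard resource module on Linux (the specific limit values are just for illustration):

    import resource

    # Read the current soft/hard limits on open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("file descriptors: soft=%d, hard=%d" % (soft, hard))

    # Lower the soft limit for this process and its descendants;
    # opening more files than this now fails.
    resource.setrlimit(resource.RLIMIT_NOFILE, (64, hard))

    # Cap CPU time at 10 seconds; the kernel delivers SIGXCPU
    # to the process when the limit is exceeded.
    resource.setrlimit(resource.RLIMIT_CPU, (10, 10))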

The first example of isolating resource groups from each other probably dates to 1999, when the FreeBSD jail(2) system call reused the chroot implementation but blocked off the normal routes to escape chroot confinement. However, two different implementations garnered a fair bit of attention in the 2000s: one in SWsoft's Virtuozzo (the company is now called Parallels) and another in Sun's Solaris. The Solaris 10 implementation is probably what most popularized the "containers" term, which was Sun's marketing name for OS virtualization. (Their technical docs used "zones" for the same thing.) IBM also introduced containers in AIX, which were unique in that they allowed running containers to be moved between systems.

Despite periods of significant interest, though, containers never had a broad-based impact. SWsoft came with a history in the hosting space, where its flavor of containers (Virtual Private Servers) had a fair degree of success. That's because hosting providers are typically highly cost sensitive and, as we'll see, containers enable very high density relative to hypervisor-style virtualization (among other differences). However, the push behind containers never moved them into new markets to any appreciable degree. In part, this was because of an imperfect technical match with enterprise requirements. But it was probably as much because of an industry tendency to standardize on particular approaches--and, for server consolidation during the 2000s, that ended up being VMware.

What are containers?

Before getting into what's happening today, let's talk about what containers are from a technical perspective. I've tried to make this description relatively generic as opposed to getting into specific implementations such as OpenShift's.

Like partitions, a container presents the appearance of being a separate and independent OS image—a full system, really. But, like the workload groups that containers extend, there’s only one actual copy of an operating system running on a physical server. Are containers lightweight partitions or reinforced workload groups? That’s really a matter of definition and interpretation, because they have characteristics of each. It may help to think of them as “enhanced resource partitions” that effectively bridge the two categories.

Containers virtualize an OS; the applications running in each container believe that they have full, unshared access to their very own copy of that OS. This is analogous to what VMs do when they virtualize at a lower level, the hardware. In the case of containers, it’s the OS that does the virtualization and maintains the illusion.
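
Here's a hedged sketch of how the kernel namespaces mentioned earlier maintain part of that illusion, in Python via ctypes (assumes Linux and root privileges; "container-demo" is just an illustrative value). After unsharing the UTS namespace, the process gets a private hostname while the rest of the system keeps the real one:

    import ctypes
    import os
    import socket

    CLONE_NEWUTS = 0x04000000  # flag for a new UTS (hostname) namespace

    libc = ctypes.CDLL(None, use_errno=True)

    # Detach this process into its own UTS namespace; needs CAP_SYS_ADMIN.
    if libc.unshare(CLONE_NEWUTS) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # The hostname is now private to this process and its children;
    # the host and any other containers are unaffected.
    socket.sethostname("container-demo")
    print(socket.gethostname())  # -> container-demo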

Containers can be very low-overhead. Because they run atop a single copy of the operating system, they consume very few system resources such as memory and CPU cycles. In particular, they require far fewer resources than workload management approaches that require a full OS copy for each isolated instance. 

Containers tend to have lower management overhead, given that there’s but a single OS to be patched and kept current with security and bug fixes. Once a set of patches is applied and the system restarted, all containers automatically and immediately benefit. With other forms of partitioning such as hypervisor-based virtualization, each OS instance needs to be patched and updated separately, just as they would if they were on independent, physical servers. This is a critical benefit in hosting environments but it has often been seen as a negative in much more heterogeneous enterprise environments.

What containers don't do is provide much if any additional fault isolation for problems arising outside the process or group of processes being contained. If the operating system or underlying hardware goes, so go its containers—that is, every container running on the system. However, it's worth noting that, over the past decade, an enormous amount of work has gone into hardening the Linux kernel and its various subsystems. Furthermore, SELinux can be used to provide additional security isolation between processes. 

So, here are a few statements about containers that are (generally) true:

  • The containers on a single physical server (or virtual machine) run on a single OS kernel. The degree to which the contents of a given container can be customized is somewhat implementation dependent.
  • As a result, many patches will apply across the containers associated with an OS instance.
  • Management of resource allocation between containers is fast and low overhead because it's done by a single kernel managing its own process threads (see the sketch after this list).
  • Similarly the creation (and destruction) of containers is faster and lower overhead than booting a virtual machine.
  • Today, running containers cannot generally be moved from one running system to another.
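
As a hedged illustration of the resource-allocation point above, this is roughly what placing a process under a memory cap looks like through the cgroup filesystem interface (assumes the cgroup v1 memory controller mounted at /sys/fs/cgroup/memory, root privileges, and a made-up group name "demo"):

    import os

    # Creating a directory under the controller's mount point creates
    # the cgroup itself; "demo" is a made-up name for this sketch.
    group = "/sys/fs/cgroup/memory/demo"
    os.makedirs(group)

    # Cap the group at 256 MB of RAM. The kernel enforces this directly;
    # no hypervisor or second OS instance is involved.
    with open(os.path.join(group, "memory.limit_in_bytes"), "w") as f:
        f.write(str(256 * 1024 * 1024))

    # Move the current process into the group; it and its descendants
    # are now subject to the limit.
    with open(os.path.join(group, "tasks"), "w") as f:
        f.write(str(os.getpid()))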

Why today's interest?

In a nutshell: Because cloud computing--Platform-as-a-Service (PaaS) in particular--more closely resembles hosting providers than traditional enterprise IT.

A lot has to do with the nature of cloud workloads. I explored the differences between traditional "systems of record" and cloud-style "systems of engagement" in a new whitepaper. In brief, though, cloud-style workloads tend to be scale-out, stateless, and loosely coupled. They also tend to run in more homogeneous environments (alongside existing applications under hybrid cloud management) and use languages (Java, Python, Ruby, etc.) that are largely abstracted from the underlying operating system.

One implication of these characteristics is that you don't generally need to protect the state of individual instances (using clustering or live migration). Nor do you typically have, or want to have, a highly disparate set of underlying OS images, given that such variety makes management harder. You also tend to have a large number of smaller and shorter-lived application instances. These are all a good match for containers.

PaaS amplifies all this in that it explicitly abstracts away the underlying infrastructure and enables the rapid creation and deployment of applications with auto-scaling. This is a great match for containers, both because of the high densities and because of the rapid resource reallocation they enable (and, indeed, require).

In short, it's no coincidence that containers have re-entered the conversation so strongly. They're a match for cloud broadly and PaaS in particular. This isn't to say that they're going to replace virtual machines. I'd argue rather that they're a great complement that needn't (and probably shouldn't) try to replicate the VM capabilities that were designed with different use cases in mind. 
