Tuesday, October 20, 2015

How open source is increasingly about ecosystems

Fish ecosystem by nerdqt87

When we talk about the innovation that communities bring to open source software, we often focus on how open source enables contributions and collaboration within communities. More contributors, collaborating with less friction.

However, as new computing architectures and approaches rapidly evolve for cloud computing, for big data, for the Internet-of-Things, it’s also becoming evident that the open source development model is extremely powerful because of the manner in which it allows innovations from multiple sources to be recombined and remixed in powerful ways. Consider the following examples. 

Containers are fundamentally enabled by Linux. All the security hardening, performance tuning, reliability engineering, and certifications that apply to a bare metal or virtualized world still apply in the containerized one. And, in fact, the operating system arguably shoulders an even greater responsibility for tasks such as resource or security isolation than when individual operating system instances provided a degree of inherent isolation. (Take a look at the fabulous Containers coloring book by Dan Walsh and Máirín Duffy for more info on container isolation.)

What’s made containers so interesting in their current incarnation—the basic concept dates back over a decade—is that they bring together work from communities such as Docker that are focused on packaging applications for containers and generally making containers easier to use with complementary innovations in the Linux kernel. It’s Linux security features and resource control such as Control Groups that provide the infrastructure foundation needed to safely take advantage of container application packaging and deployment flexibility. Project Atomic then brings together the tools and patterns of container-based application and service deployment. 

We see similar cross-pollination in the management and orchestration of containers across multiple physical hosts; Docker is mostly just concerned with management within a single operating system instance/host. One of the projects you’re starting to hear a lot about in the orchestration space is Kubernetes, which came out of Google’s internal container work. It aims to provide features such as high availability and replication, service discovery, and service aggregation. However, the complete orchestration, resource placement, and policy-based management of a complete containerized environment will inevitably draw from many different communities.

For example, a number of projects are working on ways to potentially complement Kubernetes by providing frameworks and ways for applications to interact with a scheduler. One such current project is Apache Mesos, which provides a higher level of abstraction  with APIs for resource management and scheduling across cloud environments. Other related projects include Apache Aurora, which Twitter employs as a service scheduler to schedule jobs onto Mesos. At a still higher level, cloud management platforms such as ManageIQ extend management across hybrid cloud environments and provide policy controls to control workload placement based on business rules as opposed to just technical considerations.

We see analogous mixing, matching, and remixing in storage and data. “Big Data” platforms increasingly combine a wide range of technologies from Hadoop MapReduce to Apache Spark to distributed storage projects such as Gluster and Ceph. Ceph is also the typical storage back-end for OpenStack—having first been integrated in OpenStack’s Folsom release to provide unified object and block storage. 

In general, OpenStack is a great example of how different, perhaps only somewhat-related open source communities can integrate and combine in powerful ways. I previously mentioned the software-defined storage aspect of OpenStack but OpenStack also embeds software-defined compute and software-defined networking (SDN). Networking’s an interesting case because it brings together a number of different communities including Open Daylight (a collaborative SDN project under the Linux Foundation), Open vSwitch (which can be used as a node for Open Daylight), and network function virtualization (NFV) projects that can then sit on top of Open Daylight—to create software-based firewalls, for example. 

It’s evident that, interesting as individual projects may be taken in isolation, what’s really accelerating today’s pace of change in software is the combinations of these many parts building on and amplifying each other. It’s a dynamic that just isn’t possible with proprietary software.

Links for 10-20-2015

Wednesday, October 14, 2015

VMs, Containers, and Microservices with Red Hat's Mark Lamourine


In this podcast, my Red Hat engineering colleague Mark Lamourine and I discuss where VMs fit in a containerized world and whether microservices are really the future of application architecture and design. Will organizations migrate existing applications to new containerized infrastructures and, if so, how might they go about doing so?

Listen to MP3 (0:17:50)

Listen to OGG (0:17:50)


Gordon Haff:  Hi, everyone. This is Gordon Haff in Cloud Product Strategy at Red Hat. I'm here with my much more technical colleague, Mark Lamourine. Today we're going to talk about containers and VMs.

I'm not going to give away the punch line here, but a lot of our discussion is, I think, going to ultimately circle around the idea of "How purist do you want to be in this discussion?"

Just as a level set, Mark, how do you think of containers and VMs? I hesitate to say containers versus VMs. But how do you think about their relationship?

Mark Lamourine:  It's actually characterized pretty nicely in the name. A virtual machine is just that, you're emulating a whole computer. It has all the parts, you get to behave as if it's a real computer. You can treat it in kind of the conventional way with an operating system and set up and configuration management.

A container is something where it's just much more limited. You don't expect to live in a container. It's something that serves your needs, has just enough of what you need for a period of time and then maybe you're done with it. When you're done, you set it aside and get another one.

Gordon:  They're both abstractions, containers and VMs are both abstractions at some level.

Mark:  There are situations now, at least, where you might want to choose one over the other. The most obvious one is a situation where you have a long lived process or a long‑lived server.

In the past, you would buy hardware and you'd set up these servers. More recently, you would set up a VM system, whether it's OpenStack or whatever. You'd put those machines there.

They tend to have a fairly long life. You apply configuration management and you update them periodically, and they probably have uptimes on the order of hundreds of days. If you've been in a really good shop, most places have one with many hundreds of days uptime for services like that, for very stable, unitary, monolithic services.

Those are still out there. Those kinds of services are still out there.

Containers are more suited, at this point, for more transient systems, situations where you've actually got good, either where you have a short term question, some service you want to set up for a brief period of time and tear down. Because that service is really going to calculate the answer to some query or pull out some big data and then you're going to shut it down and replace it.

Or other situations where you have a scaling problem. This is purely speculation, but I can imagine NASA, when they have their next Pluto flyby or whatever, needing to scale out and scale back. In that case, you know that those things are transient, so putting up full VMs for a web server that's just passing data through may not make sense.

On the other hand, the databases on the back end, those may need either real hardware or a virtual machine, because they're going, the data is going to stay. But the web servers may come and go based on the load.
I see containers and VMs still as both having a purpose.

Gordon:  Now, you used "At this point" a couple of times. What are some of the things, at least hypothetically, containers would need to do to get to the point where they could more completely, at least in principle, replace VMs?

Mark:  One of the things I'm uncomfortable at this point putting out about containers is that people talk about containers being old technology. While that's true in a strict sense, we've had a container like things even back as far as IBM mainframes and MVS.
It's just recently, in the last three or four years, become possible to use them everywhere, and to use them in ways we've never tried before and to build them up quickly and to build aggregations and combinations.

We're still learning how to do that. We're still learning how to take what, on a traditional host or VM, would be, you put several different related services inside one box or inside one VM and you configure them all to work together and they share the resources there.
In the container model, the tendency is to decompose those into parts. We're still not really good yet at providing the infrastructure to quickly, reliably and flexibly set up moderate to large complex containerized services.

There are exceptions, obviously, people like Google and others. There are applications that work now at a large scale. But not, I think, in kind of the generalized way that I envision.

When that becomes possible, when we learn the patterns for how to create complex applications to take a database container off the shelf and apply three parameters and connect it to a bunch of others and have it be HA automatically, then I think there might be a place for that.

The other area is the HA part, where you can afford to have, you could create a long‑lived service from transient containers. When you've got HA mechanisms well enough worked out that when you do an update, if you need to do an update to a single piece, you kill off a little bit, and you start up another bit with the more recent version. You gradually do a rolling update of the components and no one ever sees the service go down.

In that case, the service becomes long‑lived. The containers themselves are transient, but no one notices. When that pattern becomes established when we learn how to do that well, it's possible that more and more hardware or VM services will migrate to containers. But we're not there yet.

Gordon:  I'm going to come back to that point in just a moment. I think one other thing that's worth observing, and we certainly see this in terms of some of the patterns around OpenStack is, we very glibly talk about this idea of having cattle workloads. You just kill off one workload, doesn't really matter and so forth.

In fact, there's a fairly strong push to build in, for instance, enterprise virtualization types of functionality into something like OpenStack, for example, so you can do things like "Live migration." Because, in fact, it's easy to talk about cattle versus pets, workloads. But like many things, the real world is more complicated than simple metaphors.

Mark:  Yes, and I think that the difference is still knowledge. We talk about cattle, we talk about having these large independent, I don't care, parts.

Currently, it's still hard to build for a small company, perhaps. It's hard to build something that has the redundancy necessary to make that possible. In a fairly small company, you're either going to go to the cloud or if you're moderate‑sized, you're going to have something in‑house.

The effort of making something a distributed HA style service for your mail system or for whatever your core business is, it's still hard. It's easier to do it as a monolith, and as long as the costs associated with the monolith are lower than the costs associated with starting up a distributed service, an HA service, people are going to keep doing it.

When the patterns become well enough established that the HA part disappears down into the technology, that's when I think more of this kind of, it might really be cattle underneath.

Gordon:  Right. We see a lot of parallels here with things like parallel programming and so forth, is that when these patterns really have become well established, one of the key reasons they have been able to become well established is that the plumbing, so to speak, or the rocket science needed to really do these things is being submerged in underlying technology layers.

Mark:  That's actually what Docker is. That's really what Docker and Rocket both have done. They've taken all of the work of LXC or Solaris containers and they've figured out the patterns. They've made appropriate assumptions and then they've pushed them down below the level where an ordinary developer has to think about them.

Gordon:  Coming back to what we were talking about a few minutes ago, we were talking a little bit about distributed systems and monoliths versus distributed services and so forth.

From a practical standpoint, let's take your typical enterprise today where they're starting to migrate some applications or designing new applications, which may be the more common use case here, and they want to use containers. There's actually some debate over how we start to develop for these distributed systems.

Mark:  You hit on two different things and I want to go back to it. One is the tendency to migrate things to containers, and the other one to develop new services in containers.

It seems to me there's a lot of push for migration as opposed to, there are people who are developing new things. It seems like there's an awful lot of push to migration. The people, people want to jump into the cloud. They want to jump into containers. They're like, "Well, containerize this. Our thing, it needs to be containerized," without really understanding what that means in some cases.

That's the case where, which direction do you go? Do you start by pulling it apart and putting each piece into containers? Or do you stuff the whole thing in and then see what parts you can tease out?

I think it really depends on the commitment of the people and their comfort with either approach. If you've got people who are comfortable taking your application and disassembling it and finding out the interfaces up front and then rebuilding each of those parts, great, go for it.

If, and this actually makes me uncomfortable, because I prefer to decompose it. But if you've got something where you can get it as a monolith or small pieces into one or a small number of containers and it does your job, I can't really argue against that. If it moves you forward and you learn from it, and it gets the job done, go ahead. I'm not going to tell you that one way is right or wrong.

Gordon:  To your earlier point, in many cases, it may not even make sense to move to a container, it's working fine in a VM environment. You don't really get brownie points by containerizing it.

Mark:  I think that it seems like there's a lot of demand for it. Whether or not the demand is justified, again, is an open question.

Gordon:  A lot of people, they use different terms, but use Gartner terminology as its mode 1 IT and this mode 2 IT. Really, the idea with mode 1 IT is, you very much want to modernize where appropriate, for example, replacing legacy Unix with Linux and bringing more DevOps principles into your software development and so forth. But you don't need to wholesale, replace it or wholesale migrate it.

Whereas, your new applications are going to be more developed in kind of mode 2 infrastructure with mode 2 techniques.
We've kind of been talking about how you migrate or move, assuming that you do. How about for new applications? There are actually even seems to be some controversy with various folks in the DevOps "movement" or in Microservices and so forth over, what's the best way to approach developing for, new IT infrastructures.

Mark:  Microservices is an interesting term because it has a series of implications about modularity and tight boundaries and connections between the infrastructure that...

To me, Microservices almost seem like an artificial, it's an artificial term. It's something that represents strictly decomposed, very, very short‑term components. I find that to be an artificial distinction. Maybe it's a scale issue, that I see services as a set of cooperating communicating parts.

Microservices is just an extreme of that, where you take the tiniest parts and set a boundary there and you then you build up something with kind of a swarm of these things.

Again, I think that we're still learning how this stuff works. People are still exploring Microservices, and they'll look back and say, "Oh yeah, we've done stuff like this with," I think it's SOA applications and SOAP and things like that.

But if you really look at it, there are comparisons, but there are also significant differences. I think the differences are sometimes overlooked.

Gordon:  One of the examples that I like to bring up is, there's a lot of attention paid to Netflix, for example, for which is their famously this super Microservices type of architecture.

But the reality is, there's other web companies out there, like, Etsy, for example, who are also very well known for being very DevOpsy. They speak at a lot of conferences and the like. They basically have this big monolithic PHP application. Having a strict Microservices architecture isn't necessary to do all this other stuff.

Mark:  It shifts your knowledge and your priorities. The Netflix model lends itself well to these little transient services. When a customer asks for something, I haven't watched their talks, but I'm assuming what that trigger is the cascade of lots of little apps that start up and serve them what they have. When they're done, those little services get torn down and they're ready for the next one.

There are other businesses where that isn't necessarily the right model. Certainly, as your examples show, you can do it either way. I guess each business needs to decide for themselves where the tipping points are for migration from one to the other.

Gordon:  Yeah. I think if I had to kind of summarize our talk here, and maybe it's a good way to close things out is, there are a lot of interesting new approaches here which certainly, at least there is some unicorns using very effectively. But it's still sort of an open question over the broader mainstream, the majority, late majority, slower adopters, how this plays out across a wider swath of organizations.

Mark:  I think what we have now is a set of bespoke, hand‑crafted, they're doing things at scale. At Netflix, they're doing things at a large scale. They had to develop a lot of that for themselves.
Now, it means that a lot of what used to be human intensive operations are now done automatically. That doesn't necessarily generalize.

That's where I think there's still a lot of work to be done, to look at the Netflix's, to look at the other companies that are strongly adopting Microservices, especially for, well, both for inside and outside services. Because you could say the same thing for inside a company.
I think over the next four or five years, we'll see those patterns emerge. We'll see the generalization happen. We'll see the cases where people identify, "This is an appropriate way and we've documented it, and someone's bottled it so that you can download it and run it and it will work."

But I think we're still a few years out from generic, containerized services, new ones, at the push of a button. Still requires a lot of custom work to make them happen.

Thursday, October 08, 2015

Containers: Don't Skeu Them Up (LinuxCon Europe 2015)

Skeuomorphism usually means retaining existing design cues in something new that doesn't actually need them. But the basic idea is far broader. For example, containers aren't legacy virtualization with a new spin. They're part and parcel of a new platform for cloud apps including containerized operating systems like Project Atomic, container packaging systems like Docker, container orchestration like Kubernetes and Mesos, DevOps continuous integration and deployment practices, microservices architectures, "cattle" workloads, software-defined everything, management across hybrid infrastructures, and pervasive open source.

This session discusses how containers can be most effectively deployed together with these new technologies and approaches -- including the resource management of large clusters with diverse workloads -- rather than mimicking legacy sever virtualization workflows and architectures.

This version of the presentation is significantly reworked from earlier. It excises much of the container background while adding discussion on microservices alternatives and services orchestration.

Links for 10-08-2015