Thursday, January 19, 2017

Podcast: Why sysadmins hate containers with Mark Lamourine

Two hats

In this podcast, Red Hat's Mark Lamourine brings the perspective of a former sysadmin to explain why containers can seem like a lot of work and potential risk without corresponding benefit. We also discuss the OpenShift Container Platform as well as a couple of Fedora projects aimed at removing barriers to sysadmin container adoption.

Show notes:

Audio:

Link to MP3 (0:29:20)

Link to OGG (0:29:20)

[Transcript]

Gordon Haff:  Hi, everyone. Welcome to another edition of the "Cloudy Chat" podcast. I have my former partner in crime, Mark Lamourine, back with me today. He's been off doing some other things.

Mark came to me with a fantastic title for a podcast, and I knew I just had to sit down with him, "Why Do Sysadmins Hate Containers?"

Mark, you've been a sysadmin.

Mark Lamourine:  That's my background. My background is as a system administrator. I have a computer science degree, but I've spent most of my time either as a system administrator or as a developer advocating for the system administrators who have to manage the software that the people I'm working with produce.

Gordon:  I go to an event, like Amazon re:Invent, for example, and there are a lot of sysadmins there, maybe more new-age sysadmins, DevOps engineers, whatever the popular term is this week. They seem to love containers. So where does that statement come from?

Mark:  There are actually two different groups. What brought this up for me was being at the LISA conference, LISA16, in Boston this fall. I noticed that there were only a couple of talks, one tutorial, and a couple of books on containers. There was a lot of the other traditional sysadmin material: new tools, people learning different areas.

I was there because I assumed that sysadmins were still going to think containers were growing and that this would be a big thing coming. There was some of that, but I got an awful lot of, "Yeah, we don't do containers. We tried containers. It didn't work. That's old hat." There was a whole range of reactions among that group of people, from disinterest to disdain for containers.

The difference between that group and the group at re:Invent is that re:Invent is aimed specifically at the technology. It's aimed at that company, at Amazon. All the people who come are self-selected: they're interested in cloud, in Amazon, in its products, and in its tools.

At LISA, the self‑selection is I am a professional system administrator without regard to the technology I use. There were a bunch of people there who use Amazon. They use virtual machines. They use cloud. They didn't find containers to be a compelling thing to follow.

Gordon:  Why don't they find containers a compelling thing to follow when everyone says they're so great?

Mark:  There were a number of different reasons that I heard. Some of them were just misinformation. There were people who said, "Yeah, we knew about that with jails." Unix had chroot back in the late 1970s, and BSD systems later built jails on top of that idea. I'm not going to go into it, but there's an answer to that. Containers are not that. That was a very old thing.

I liken that to saying, "Well, this guy, Jameson," or, whatever his name was, in France, "discovered inoculation back in the 1800s. Why do we need flu vaccines and monoclonal antibodies?"

Gordon:  It's like, "Oh, what's this cloud thing? We had time-sharing." "Oh, virtualization. That was invented by IBM in the 1960s. Why do we need this new thing?" It's this idea of, "Oh, everything's been done before."

Mark:  There are a number of things like that, a number of flavors. The "Oh, Solaris had Zones. We know about that. See where that went." There were a number of responses like that. There was also a lot of "Oh, it's hype," and those people aren't wrong.

It also is an incomplete answer. I agree that it's hype, but I also think it's important, because while the hype may be way out in front of reality, reality is way in front of where it was three or four years ago.

Gordon:  They just don't see any benefit for themselves?

Mark:  That's really the sense that I got. When I got past the people who were just naysayers for whatever reason, and I started bringing up, "Here are these tools. Here are these things I've used. Here's what I've done with it," the response was, "Well, but how does that help me?"

They're getting their developers and their managers coming to them saying, "Oh, we need, well, cloud in some form." Some of them are OpenStack, some of them are Kubernetes, some of them are OpenShift, but their managers and their developers are saying, "Hey, there's this cool thing," and the sysadmins respond in two fairly predictable ways.

One is, "Yeah, OK. I'm going to build a service for you that's work for me." The second one is, "This doesn't really help me. It gives me a lot more work. I've got to build new containers, I've got to build all of this stuff," or they would say, "Let's put our app into containers," and everyone's first response is, "Let's shove the entire application suite into one container and treat it like it's a virtual machine lite."

Everybody finds quickly that's not productive. It requires a lot more work to do refactoring. Somewhere in that process, many of them have said, "Our engineers got tired of it," or, "We got tired of it, and we just went back to the old way of doing things, because it doesn't buy us anything right now."

Gordon:  I'll get back to doing things the new way versus the old way in a moment, because I think it's an important point. There's something that those of us who promote technology often forget. Without saying these sysadmins are Luddites, they have a job to do. That job is to keep systems up, and the idea of "Let's do this new stuff that's going to put me out on the bleeding edge and probably get me paged in the middle of the night when I'm trying to sleep" just doesn't sound very appealing.

Mark:  As a sysadmin, I think sysadmins are a slightly different breed from many other geeks and technophiles. Sysadmins are, by their nature, conservative. They are probably the least bling-attracted technophiles you'll find. In large part, they're the ones responsible for making sure that it works, and so they're going to tend to be conservative.

They'll explore a little bit, but their goal really is to make things work and to go home and not get paged.

Anything that you introduce to them that is both a lot of day-to-day work and an opportunity to get paged, they're going to greet with a certain amount of skepticism.

Gordon:  That's absolutely fair. I don't think developers, and certainly not the "move fast and break things" crowd, really appreciate that aspect of sysadmins. I think there's also an element, though, of, "This new stuff is going to abstract more. I already understand how the system works. It's going to create a new point of failure. It's going to complicate things. It is something else that I'm going to need to learn." That's not necessarily always the right point of view either.

Mark:  No, but they're human. Their goals are to make their own lives easier. One of the other characteristics of sysadmins is that they will spend a lot of time avoiding doing a tedious task twice. They'll spend a lot of time creating a script to do something that only takes them 10 or 15 seconds to type, because when they've typed it the 100th time, they get tired of it. Those are the things they know how to do.

When you impose something new, because it's a requirement for other things, they're going to be resistant to that until it helps them because they're inherently lazy people. I mean that in the best sense.

Gordon:  Actually, that was pretty much the topic of a talk by a Google presenter at a recent conference. I don't remember the details, or who it was exactly, but he went through Google's infrastructure and how Google approaches problems.

It was at CloudNativeCon/KubeCon, actually, and they talked about how their approach was, "Oh, I've done this three times. It must be time to automate it."

Mark:  Sysadmins, I find, and I know this is true of me, are pathologically lazy. Again, I use that in the nicest sense, in that there are times when I have spent an hour or more understanding and encoding an automated solution to a problem that literally took me a minute a day.

It sounds like, "Well, that was a waste of an hour in a day," except that after a while, it saves me an hour.

Gordon:  We're doing a podcast right now, and there are a lot of fairly repeatable manual processes associated with it. I put some intro on, I put some outro on, I do some transcoding to different formats.

There's manual editing, of course, and you can't really automate that. But I spent a couple of days at some point writing a Python script, and now the rest is super quick and not nearly as error-prone. No more "Oh, I forgot to make that file public on AWS."
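A minimal sketch of that kind of script, assuming ffmpeg for the transcoding and boto3 for the AWS upload; the filenames and bucket name here are placeholders, not the actual script being described:

```python
#!/usr/bin/env python3
"""Hypothetical sketch of podcast post-processing automation: transcode the
edited master to MP3 and OGG, then upload both to S3 and make them public.
Filenames and the bucket name are placeholders, not the real ones."""
import subprocess

import boto3  # pip install boto3


def transcode(master_wav: str, episode: str) -> list[str]:
    """Produce MP3 and OGG renditions of the edited master with ffmpeg."""
    outputs = []
    for ext, codec in (("mp3", "libmp3lame"), ("ogg", "libvorbis")):
        out = f"{episode}.{ext}"
        subprocess.run(
            ["ffmpeg", "-y", "-i", master_wav, "-codec:a", codec, out],
            check=True,
        )
        outputs.append(out)
    return outputs


def publish(files: list[str], bucket: str = "example-podcast-bucket") -> None:
    """Upload each file and mark it public so listeners can download it."""
    s3 = boto3.client("s3")
    for path in files:
        s3.upload_file(path, bucket, path, ExtraArgs={"ACL": "public-read"})


if __name__ == "__main__":
    publish(transcode("episode-master.wav", "cloudy-chat-episode"))
```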

Mark:  This is a characteristic of sysadmins: they do want to automate, but they are always going to reach for the tool they know first. Containers certainly present a really long learning curve before sysadmins start seeing the benefit. That's where a lot of the resistance really comes from.

Gordon:  It's probably worth contrasting containers with virtual machines in that regard, because of the way virtual machines came in. We were in the dot-com-bust era, nobody had any money, and there were all these underutilized servers.

People just wanted to improve the utilization of those servers. They didn't want to change their operational procedures in major ways to do it. That's one of the reasons virtual machines became so popular as the approach at the time. They solved a specific problem, getting more utilization out of physical servers when no one had money to buy more, without requiring a lot of changes.

Of course, virtual machines did evolve over time, with things like live migration and storage pooling. But fundamentally, virtual machines didn't present a big barrier to sysadmins from an operational point of view.

Mark:  All it really added was the virtual machine barrier. Once you got your virtual machine going, you could hand it off to another sysadmin or to a Puppet script and say, "That's a computer. Acts just like your other computers."

Containers are not VMs. They are processes with blinders on, and building those blinders is a new way of working. You'll still see people who try to treat them as VM lite; there are people who will try to stuff what is essentially a virtual machine inside.

They very quickly find that there's either no benefit or that it actually incurs some cost. Those people will often go back to using virtual machines because they don't want to change their paradigm. The argument of the people who actually are advocates of containers is that there is some benefit to this model.

It doesn't appear quickly and it requires a lot more retooling, both of the real tools and of the mindset of the administrators before they see the benefit.

Gordon:  We've gone into configuration management in a lot more detail in another podcast, but there are some analogs there. You can certainly do automation using scripts, but, for example, as VMs came in and you were suddenly multiplying your number of "servers" by 10 or 20 or whatever the number was, scripting didn't work so well any longer.

You really needed some of these newer types of tools that did things in more declarative ways, or that did things in other ways that were more suitable for very large and complex app configurations.

Mark:  That actually brings up a psychological, or behavioral, analog that I wanted to highlight: we've had situations like this before, where sysadmins or other parts of the community were resistant to changes that some of us saw early on as important.

I remember trying to convince people that they should use sudo. They're like, "What do you mean? I can't code without root. How can I possibly work as a non‑root user? Why should I use this sudo thing? I just always log in as root anyway."

It took about a decade from when I first saw sudo, to the point where it was just accepted common practice that everyone did. There were other analogs like that. Configuration management was one of them. I remember having conversations with people who would say, "I run a shop with 10 or 20 hosts. Do I need configuration management?"

Those of us in the room were going, "Yes, you absolutely do." They go, "But, I don't have time for that." I'm like, "That's why you need it." It took a while for that to catch on. I think virtual machines were actually a big influence in the adoption of configuration management in the small shops.

They went from having four or five machines to having 20, 30, or 50 VMs. Even though they had the same hardware, they saw the multiplication of the manageable units, and configuration management made sense to them. Again, that's become a common part of the sysadmin toolbox: knowing Puppet or Chef, and not CFEngine so much anymore.

Ansible now is a big one. You're talking about the familiar-tools model: Ansible's largest appeal to a lot of people is that it looks like shell scripting, or like these other scripting languages, where Puppet and Chef were more traditional configuration management.

That's a bit of insight but the analog still holds. In containers, you've got people going, "Do I need that? Why do I need that?" Those of us who have worked with them a lot are going, "Yes. Yes, you need this." It's going to take something to induce them to see it.

Gordon:  You're seeing the cycle continuing with security around containers, for example. I was just reading yesterday about a minor possible exploit related to containers. The details don't really matter here, but there was a discussion going on on "Hacker News" where people were basically going, "Well, you could change this and that would fix it," and so forth.

Somebody wrote, "Or you can use SELinux with setenforce 1, and this exploit can't happen."

Mark:  That's another tool that's still in that adoption phase. There are still people who go, "SELinux is too hard to use." They'll disable it just as a matter of course. It still is hard to use, but most people don't need to interact with it that way most of the time.
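As a concrete footnote to that exchange, here is a minimal sketch of the kind of check being talked about. It assumes a Linux host with SELinux available; the remediation step is the standard setenforce command and requires root.

```python
"""Minimal sketch: confirm SELinux is enforcing on a container host, and flip
it on if it isn't. Assumes SELinux is available on the host; this is an
illustration of the point above, not a hardened tool."""
import pathlib
import subprocess

ENFORCE_FLAG = pathlib.Path("/sys/fs/selinux/enforce")


def selinux_enforcing() -> bool:
    """True when SELinux is present and currently in enforcing mode."""
    try:
        return ENFORCE_FLAG.read_text().strip() == "1"
    except FileNotFoundError:
        return False  # SELinux not enabled on this host


if __name__ == "__main__":
    if selinux_enforcing():
        print("SELinux is enforcing.")
    else:
        print("SELinux is not enforcing; running setenforce 1 (requires root)...")
        subprocess.run(["setenforce", "1"], check=True)
```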

Another barrier to entry for people creating containers is that they have to think about resources in a way they didn't have to when everything was on the host. They would just install a new thing on the host, and the resource would be there.

Now, they have to make sure that all the resources they need are inside the container. For a sysadmin who's just working on the box and trying to run some tool, the idea of putting that tool in a container really doesn't make sense for them, until they can treat the container the same way they treat the installed package.

Gordon:  Do sysadmins need to just suck it up, or is there something we can do for them?

Mark:  First thing is that, no, they don't have to suck it up. They are the ones doing the job. It's up to them to decide when to use this stuff. We can advocate and we can try and give them the tools, and we can try and make a point and help people understand.

We have the responsibility of understanding too. They are our users. If you present software to a user and it doesn't help the user, they are not going to use it, no matter who they are. We do have a responsibility within the container community to look at their usage, look at their needs, and find ways to help them.

There are a couple of projects I know about right now that are trying to address that. One is the Fedora Modules Project; a guy here, Langdon White, is one of the leads on it.

I'm going to do a little aside here. One of the objections that people had to containers was they said, "Well, they're just packages. We've done this before. We know how packages work."

I would disagree that that is a sufficient answer, because they're packages with other stuff in them that can work in different ways, but they had a certain point. If you're going to use containers to do sysadmin tasks on a host, you need to be able to treat them like a package. Sysadmins need to be able to use a model similar to what they're used to.

Fedora Modules is an attempt to take packages which are commonly installed on hosts and put them into containers that can be used in a manner similar to packages. The best examples right now are things like web servers and certain system services. The packages they're trying to address early are things that sysadmins would use on a regular basis.

They would commonly install a web server, Nginx or Apache. They'd install it as an RPM, and then they'd configure a bunch of root-owned files. The Modules Project is trying to produce a container image which can be used the way an Nginx RPM would be used, in that you say, "Container install this thing" and "Container enable," instead of "Package install, package enable."

Now, you have Nginx running on your box. It's running in a container instead of as a host process. The benefit, if you're using containers, is that you can run multiple Nginx instances on the same host without having to worry about separating the configuration files. They can run as independent things.

The Modules Project is just trying to create these containers which are analogues to host RPMs, for things where that's appropriate. Chrony was one I heard someone talk about, so NTP services, web services, and other system services like that which don't necessarily have to be bound to the host, especially ones that are for users, like an Apache server or a MySQL server.

There's no good reason for those to be installed directly on the host, and this would give you a way to run them without doing that.
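The Fedora tooling being described here is the eventual goal. As a purely illustrative sketch of the underlying point, though, the plain docker CLI can already run two independent Nginx instances on one host, each with its own configuration directory and port; the names, ports, and paths below are made up.

```python
"""Illustration of the multiple-Nginx point above using the plain docker CLI
(not the Fedora Modules tooling itself). Container names, host ports, and
config directories are placeholder assumptions."""
import subprocess


def run_nginx(name: str, host_port: int, conf_dir: str) -> None:
    """Start an isolated Nginx container with its own configuration directory."""
    subprocess.run(
        [
            "docker", "run", "-d",
            "--name", name,
            "-p", f"{host_port}:80",
            "-v", f"{conf_dir}:/etc/nginx/conf.d:ro",
            "nginx",
        ],
        check=True,
    )


if __name__ == "__main__":
    # Two instances on one host, with no shared /etc/nginx to fight over.
    run_nginx("web-internal", 8080, "/srv/web-internal/conf.d")
    run_nginx("web-public", 8081, "/srv/web-public/conf.d")
```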

The second project that is trying to address it is working from the other side, from the developer side. That's the Fedora Layered Docker Image Build Service. The people in the Fedora Project have looked at the model, the build model for RPMs, and said, "Why don't we apply this to containers?"

They've created a container build service which is an analogue to the RPM submission and build service, so that instead of submitting an RPM spec which points to a GitHub repo somewhere for the software, you submit a Docker container spec. It might be a Dockerfile, but it might also be some other Docker build mechanism.

What you get is a professionally packaged container in a well-known repo, signed by the Fedora Project. Unlike Docker Hub, where anyone can put anything out there, in the Fedora Project it has to go through a vetting process. It has a set of standards, in the same way that RPM specs have standards. They have maintainers. They have a certain kind of tracking.

These two projects, the Modules Project, which works from the sysadmin's side, and the container build project, which works from the developer, package-maintainer side, are working toward a middle ground where a sysadmin could choose, instead of "dnf install" or "yum install nginx," to do "container install" or "module install nginx."

For them, the behavior is very analogous.

Gordon:  Let's bring this home. I'm going to steal something that you mentioned to me a little bit ago: we talk about having green-field and brown-field environments and having to deal with both of those, but from a skills level and, perhaps more importantly, from an experience level, there really is no green-field as far as sysadmins are concerned.

You can't just assume that they're a blank slate and forget all about the processes and experience that they've acquired over decades in some cases.

Mark:  Yeah. When I said it, that was the first time I had really thought of it that way: "Yeah, you can't treat them the same way."

In the container world, we have done an awful lot of work which assumed green-field, or on which green-field was imposed, because, like I said, people started trying to stuff whole applications with multiple processes into a container, and they very quickly found that was a real problem.

They would try to tease apart the components of an existing application and refactor it, and they'd go, "Oh my God, this is such a pain. It's not really working," and they would inevitably either quit or go back to green-field and say, "Look, let's redevelop these as containers, as microservices, from the start," because of the lack of a way to migrate easily.

I think part of the reason that lack is there is that we're still treating containers as something we build, not something we use. We've talked about this before: we're still in the position of learning the patterns of how containers are used well. As we learn those patterns, we're going to start eliminating variables.

We're going to start eliminating parameters. We're going to start adding defaults and assumptions, and we're going to start addressing these real-world use cases in ways that hide the minutiae that doesn't matter anymore. Someone who installs a package doesn't care about how it was built, doesn't care where that source code came from, doesn't care...

I mean, they can go find out, but as far as a user goes, they just Yum install a package, and they're done. That's all they should have to do, and I think we'll get there with containers. I think we'll get there with containers for sysadmins. It will be longer before we get there for container services.

I've heard somebody in a position of advocacy say recently, "Well, Kubernetes, OpenShift, or whatever will be the new operating system." While I think some kind of container cloud infrastructure is in the offing over the long term, I think we have a ways to go before we get there, and I think there are a whole lot of transitional states we're going to go through.

That's kind of where I work: "What works now?" I want to look ahead, but I need to remember that there are people doing the work now.

Gordon:  I'm actually giving a presentation at MonkiGras in a couple of weeks, and hopefully I'll get this podcast posted before then, about packaging. Not really software packaging, but the grand history of packaging going back to pottery. One of the things I bring forward in it is that, yeah, a lot of early packaging was pretty much functional.

You put your wine in a clay jar of some sort. But where we've really progressed to with packaging is a much more experiential model, where you make it easier to consume, easier to use, easier to have confidence in the elements that are part of that package. Those are many of the things we're trying to do with OpenShift for both developers and operations, for example.

You're absolutely right. Containers are probably ultimately going to end up being one of those components that most people don't need to think about very much, whereas the actual packaging and the actual user experience, whether for a sysadmin or a developer, sits at some higher-level platform, for the most part.

Mark:  That comes back to something I said earlier: the people who object to containers, who brush them aside on the grounds that "they're just packages and we've been there and done that," are ignoring a very significant part of the advancement of containers. A container is the contents, plus some structure for the contents, which an RPM would also have.

It's got the spec for building it, and yeah, we've done those things before. What containers have that traditional packages don't is this: traditional packages are static. They have information about how to install themselves, which is an advance over tarballs, but they don't have the metadata about how they're expected to be used. That's the significance of container packaging.

We don't have those semantics yet; we're still developing container semantics for that packaging metadata, which will say, "Here's how I expect this software to be used. Here are the inputs it expects. Here are the parameters it expects. Here are the other packages or the other containers it expects to interact with." I think that's what the people who think containers are just packages are overlooking.

It's an area of research that we still don't know all the answers to.
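To make that idea a little more concrete, here is one entirely hypothetical sketch of what such usage metadata could look like. None of these field names come from a real specification; as Mark says, the semantics are still being worked out, and existing mechanisms like image labels or orchestration specs carry only parts of this today.

```python
"""Hypothetical usage metadata for a container image. The field names are
invented for illustration; they are not part of any real specification."""

nginx_usage_metadata = {
    "image": "example/nginx:1.10",
    "expected_inputs": {
        "config_volume": "/etc/nginx/conf.d",       # mount point the image expects
        "content_volume": "/usr/share/nginx/html",
    },
    "parameters": {
        "WORKER_PROCESSES": "auto",                 # tunable passed in as an env var
    },
    "exposes": [80, 443],                           # ports other services connect to
    "expects_to_interact_with": ["example/php-fpm"],
}

# A package-manager-like tool for containers could read this and wire things
# up automatically, much as RPM metadata drives a host install today.
for port in nginx_usage_metadata["exposes"]:
    print(f"would publish port {port} for {nginx_usage_metadata['image']}")
```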

Wednesday, January 04, 2017

Optimizing the Ops in DevOps

This post is based on my recent presentation at DevOps Summit Silicon Valley in November 2016. You can see the entire presentation here.


We call it DevOps but much of the time there’s a lot more discussion about the needs and concerns of developers than there is about other groups. 

There’s a focus on improved and less isolated developer workflows. There are many discussions around collaboration, continuous integration and delivery, issue tracking, source code control, code review, IDEs, and xPaaS—and all the tools that enable those things. Changes in developer practices may come up—such as developers taking ownership of code and pulling pager duty.

We also talk about culture a great deal in the context of developers and DevOps. About touchy-feely topics like empathy, trust, learning, cooperation, and responsibility. It can all be a bit kumbaya.


What about the Ops in DevOps? Or, really, about the other constituencies who should be part of the DevOps process, workflow, and even culture? Indeed, DevSecOps is gaining some traction as a term. DevOps purists may chafe at "DevSecOps" given that security and other important practices are supposed to already be an integral part of routine DevOps workflows. But the reality is that security often gets more lip service than thoughtful and systematic integration.

But what’s really going on here is that we need to get away from thinking about Ops-as-usual (or Security-as-usual) in the DevOps context at all. This is really what Adrian Cockcroft was getting at with the NoOps term; he didn’t coin it but his post about NoOps while he was at Netflix kicked off something of an online kerfuffle. Netflix is something of a special case because they are so all-in on Amazon Web Services, but Adrian was getting at something that’s more broadly applicable. Namely that, in evolved or mature DevOps, a lot of what Ops does is put core services in place and get out of the way.


Ironically, this runs somewhat counter to the initial image of breaking down the wall between Dev and Ops. Yes, DevOps does involve greater transparency, collaboration, and so forth to break down siloed behaviors, but there's perhaps an even stronger element of putting infrastructure, processes, and tools in place so that Devs don't need to interact with Ops as much while being (even more) effective. One of the analogies I like to use is that I don't want to streamline my interactions with a bank teller. For routine and even not-so-routine transactions, I just want to use an ATM or even my smartphone.

It’s up to Ops to build and operate the infrastructure supporting those streamlined transactions. Provide core services through a modern container platform. Enable effective automated developer workflows. Mitigate risk and automate security. But largely stay behind the scenes. Of course, you still want to have good communication flows between developers and operations teams; you just want to make those communications unnecessary much of the time.

(At the same time, it’s important for Dev and Ops teams to understand how they can mutually benefit by using appropriate container management and other toolchains. There’s still too much of a tendency by both groups to think of something as an “ops tool” or a “dev tool.” But that’s a topic for another day.)

Let’s look at each of those three areas.

Modern container platform


A DevOps approach can be applied just about anywhere. But optimizing the process and optimizing cloud-native applications is best done on a modern platform. Take it as read that most IT is going to be brownfield and DevOps may even be a good bridge between existing systems, applications, and development processes and new ones. But here I’m focusing on what’s optimized for new apps and new approaches.

You need scale-out architectures to meet highly elastic service requirements. Application designs with significant scale-up components simply aren’t able to accommodate shifting capacity needs. 

Everything is software-defined because software functions, such as network function virtualization and software-defined storage, are much more flexible than when the same functions are embedded in hardware.

The focus is on applications composed of loosely-coupled services because large monolithic applications can be fragile and can’t be updated quickly.

A modern container platform enables lightweight iterative software development and deployment in part because modern applications are often short-lived and require frequent refreshes and replacements. 

As I wrote about in The State of Platform-as-a-Service 2016, a PaaS like Red Hat's OpenShift has evolved to be this modern container platform, embracing and integrating Docker-format containers and Kubernetes for orchestration, and using Red Hat CloudForms (based on the ManageIQ upstream project) for open source hybrid cloud management.

Automated developer workflows


When thinking about the toolchain associated with DevOps, a good place to start is the automation of the continuous integration/continuous delivery (CICD) pipeline. The end goal is to make automation pervasive and consistent using a common language across both classic and cloud-native IT. For example, Ansible allows configurations to be expressed as “playbooks” in a data format that can be read by both humans and machines. This makes them easy to audit with other programs, and easy for nondevelopers to read and understand.
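As a small illustration of that "readable by both humans and machines" point, the sketch below embeds a minimal playbook-style YAML document and audits it from Python. The playbook content and the audit rule (every task must be named) are illustrative assumptions, not a real policy or a playbook from any particular project.

```python
"""Sketch: because playbooks are plain YAML data, other programs can read and
audit them. The playbook text and the "every task must be named" rule are
illustrative assumptions only."""
import yaml  # PyYAML: pip install pyyaml

PLAYBOOK = """
- hosts: webservers
  become: true
  tasks:
    - name: Install nginx
      package:
        name: nginx
        state: present
    - name: Start and enable nginx
      service:
        name: nginx
        state: started
        enabled: true
"""

plays = yaml.safe_load(PLAYBOOK)
for play in plays:
    for task in play.get("tasks", []):
        assert "name" in task, f"unnamed task found: {task}"
print("audit passed: every task is named")
```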

A typical automated workflow begins with operations provisioning a containerized development environment, whether a PaaS or more customized environment. This provides an example of how a mature DevOps process separates operations and developer concerns; by providing developers with a dynamic self-service environment, operations can focus on deploying and running stable, scalable infrastructure while developers focus on writing code.

Automation then ensures that the application can be repeatedly deployed. Many different types of tools are integrated into the DevOps workflow at this point. For example:

  • Code repositories, like Git
  • Container development tools to convert code in a repository into a portable containerized image that includes any required dependencies
  • Vagrant for creating and configuring lightweight, reproducible, and portable development environments
  • IDEs like Eclipse
  • CICD software like Jenkins

Mature DevOps systems may even push the code directly to production once it has passed automated testing. But this isn’t about removing Ops from its role of ensuring stable and robust production systems. Rather, it’s about automating the processes to ensure that deployed code meets set criteria without ops needing to be directly involved with each deployment.
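A toy sketch of that kind of gate follows: code is promoted only when the automated checks pass, with no operator in the loop. The image name, test command, and "promotion" step are placeholders rather than a description of any particular pipeline.

```python
"""Toy promotion gate: build a candidate image, run its tests, and push it
only if everything passes. Image name, test command, and registry are
placeholder assumptions, not a real pipeline definition."""
import subprocess
import sys

IMAGE = "registry.example.com/myapp:candidate"

steps = [
    ["docker", "build", "-t", IMAGE, "."],             # build the candidate image
    ["docker", "run", "--rm", IMAGE, "pytest", "-q"],  # run the test suite inside it
]
for step in steps:
    if subprocess.run(step).returncode != 0:
        sys.exit("gate failed: not promoting this build")

# Reached only when every automated check passed.
subprocess.run(["docker", "push", IMAGE], check=True)
print("candidate promoted")
```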

Mitigate risk and automate security


Ops is also ultimately chartered with protecting the business. This doesn't mean eliminating all risk, which can't be done. But it does mean mitigating risk, which is accomplished in part by managing the software supply chain and by automating away sources of manual error.

That’s not to say that security is purely an ops concern—hence the aforementioned DevSecOps term. Creating a mindset in which everyone is responsible for security is key as is the practice of building security into development processes. Security must change from a defensive to an offensive posture that is both automated and constant.

Among the practices to be followed in such a proactive environment are:

  • Components built from source code using a secure, stable, reproducible build environment
  • Careful selection, configuration, and security tracking of packages
  • Automated analysis and enforcement of security practices
  • Active participation and involvement in upstream communities
  • Thoroughly validated vulnerability management process

I think this quote from Gartner captures the required dynamic well: "Our goal as information security architects must be to automatically incorporate security controls without manual configuration throughout this cycle in a way that is as transparent as possible to DevOps teams and doesn't impede DevOps agility, but fulfills our legal and regulatory compliance requirements as well as manages risk.” (Gartner. DevSecOps: How to Seamlessly Integrate Security Into DevOps. September 2016. G00315283)

----

Credits

Dev: Nelson Pavlosky/flickr under CC http://www.flickr.com/photos/skyfaller/113796919/

Ops: Leonardo Rizzi/flickr under CC http://www.flickr.com/photos/stars6/4381851322/

Piggy bank: https://www.flickr.com/photos/marcmos/3644751092

Stop: https://www.flickr.com/photos/r_grandmorin/6922697037

DevOps wall: Cisco