Patching The Cloud?
Just to confuse you, as a lead-in on this topic, please first read my recent rant titled “Will You All Please Shut-Up About Securing THE Cloud…NO SUCH THING…”
Let’s say the grand vision comes to fruition where enterprises begin to outsource to a cloud operator the hosting of previously internal complex mission-critical enterprise applications.
After all, that’s what we’re being told is the Next Big Thing™
In this version of the universe, the enterprise no longer owns the operational elements involved in making the infrastructure tick: the lights blink, packets get delivered, data is served up, and it all costs less for what is advertised as the same, if not better, reliability, performance and resilience.
Oh yes, “Security” is magically provided as an integrated functional delivery of service.
Tastes great, less datacenter filling.
So, as a corner-case example, what does a boundary condition like the out-of-cycle patch release of MS08-067 mean when your infrastructure and applications are no longer yours to manage, and the ownership of the “stack” disintermediates you from controlling how, when or even if vulnerability remediation anywhere in the stack (from the network on up to the app) is assessed, tested or deployed?
Your application is sitting atop an operating system and underlying infrastructure that is managed by the cloud operator. This “datacenter OS” may not be virtualized or could actually be sitting atop a hypervisor which is integrated into the operating system (Xen, Hyper-V, KVM) or perhaps reliant upon a third party solution such as VMware. The notion of cloud implies shared infrastructure and hosting platforms, although it does not imply virtualization.
A patch affecting any one of the infrastructure elements could cause a ripple effect on your hosted applications. Without understanding the underlying infrastructure dependencies in this model, how does one assess risk and determine what any patch might do up or down the stack? How does an enterprise that has no insight into the “black box” model of the cloud operator set up a dev/test/staging environment that acceptably mimics the operating environment?
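To make the ripple-effect question concrete, here is a minimal sketch of the analysis you could run if you actually had a map of what depends on what, which is exactly the insight the black box denies you. Everything in it is an assumption made up for illustration: the component names, the `DEPENDS_ON` map and the `affected_by` helper do not describe any real provider's stack.

```python
from collections import deque

# Hypothetical stack: each component maps to the components it depends on.
DEPENDS_ON = {
    "billing-app": ["guest-os"],
    "crm-app": ["guest-os", "database"],
    "database": ["guest-os"],
    "guest-os": ["hypervisor"],
    "hypervisor": ["host-hardware"],
    "host-hardware": [],
}


def affected_by(patched, depends_on):
    """Return every component that directly or transitively depends on `patched`."""
    # Invert the edges so we can walk "up" the stack from the patched component.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk over everything sitting above the patched component.
    seen = set()
    queue = deque([patched])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(affected_by("hypervisor", DEPENDS_ON)))
```

Patching the hypothetical hypervisor here touches every application above it, while patching the database touches only the one app that uses it. The hard part in the cloud model is not the traversal; it is getting the operator to tell you what the map looks like.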
What happens when the underlying CloudOS gets patched (or needs to be) and blows your applications/VMs sky-high (in the PaaS/IaaS models)?
How does one negotiate the process for determining when and how a patch is deployed? Where does the cloud operator draw the line? If the cloud fabric is democratized across constituent enterprise customers, however isolated, how does a cloud provider ensure consistent distributed service? If an application can be dynamically provisioned anywhere in the fabric, consistency of the platform is critical.
I hate to get all “Star Trek II: The Wrath of Khan” on you, but as Spock said, “The needs of the many outweigh the needs of the few.” How, when and if a provider might roll a patch has a broad impact across the entire customer base, as it has had in the hosting markets for years, but again the types of applications we are talking about here are far different from what we’re used to today, where the applications and the infrastructure are inextricably joined at the hip.
Hosting/SaaS providers today can scale because of one thing: standardization. Certainly COTS applications can be easily built on standardized tiered models for compute, storage and networking, but again, we’re being told that enterprises will move all their applications to the cloud, and that includes bespoke creations.
If that’s not the case, and we still end up having to host some apps internally and some apps in the cloud, we’ve gained nothing (from a cost reduction perspective) because we won’t be able to eliminate the infrastructure needed to support either.
Taking it one step further, what happens if there is standardization on the underlying Cloud platform (CloudOS?) and one provider “patches” or updates their Cloud offering but another does not or cannot? If we ultimately talk about VM portability between providers running the “same” platform, what will this mean? Will things break horribly or be instantiated in an insecure manner?
What about it? Do you see cloud computing as just an extension of SaaS and hosting of today? Do you see dramatically different issues arise based upon the types of information and applications that are being described in this model? We’ve seen issues such as data ownership, privacy and portability bubble up, but these are much more basic operational questions.
This is obviously a loaded set of questions for which I have much to say — some of which is obvious — but I’d like to start a discussion, not a rant.
/Hoff
*This little ditty was inspired by a Twitter exchange with Bob Rudis who was complaining that Amazon’s EC2 service did not have the MS08-067 patch built into the AMI…Check out this forum entry from Amazon, however, as it’s rather apropos regarding the very subject of this blog…
A worm in the cloud. That would be interesting to watch.
For a self-hosted shop like ours, we pull our security guy, a network guy and a couple of Windows sysadmins into a conference call and make a quick analysis of risk/reward for an expedited patch deployment. We can do that because we know where every server is, what it does and what firewall rules surround it. And best of all, it's our (my) call. We (I) call it wrong and we (I) get to write the long e-mails to customers and people above me in the org chart. I'm OK with that.
If it's a cloud, what do you do – sit back and wait? Open up a ticket with tier 1? Have the cloud vendor dump a load of fuzzy mumbo-jumbo back on you telling you how secure they are (without actually telling you how secure they are)?
For our purchased applications, we are adding language to RFPs and contracts that spells out time frames for full support of new OS and database versions (18 months from RTM), and for vendor security patches (30 days). We are really tired of vendors that still don't support Win 2003/SQL2005 and take 6 months to approve a monthly security patch. I suspect that either the contract with the cloud will have to have some language related to the topic, or the customer will simply trust the cloud provider to do the right thing, as we do with SaaS.
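A contract term like the 30-day security patch window is easy to check mechanically. The following is a minimal sketch under stated assumptions: the names `SECURITY_PATCH_WINDOW`, `patch_deadline` and `within_window` are hypothetical, the window is counted in calendar days from the patch release date, and the 18-month OS/database term is left out for brevity.

```python
from datetime import date, timedelta

# Assumption: the contract counts 30 calendar days from the vendor's release date.
SECURITY_PATCH_WINDOW = timedelta(days=30)


def patch_deadline(released):
    """Last day on which approving the patch still meets the contract term."""
    return released + SECURITY_PATCH_WINDOW


def within_window(released, approved):
    """True if the patch was approved on or before the contractual deadline."""
    return approved <= patch_deadline(released)


if __name__ == "__main__":
    # MS08-067 was released out-of-cycle on October 23, 2008.
    ms08_067 = date(2008, 10, 23)
    print(patch_deadline(ms08_067))
    print(within_window(ms08_067, date(2008, 11, 10)))
```

The check itself is trivial; the value is in having a date both sides agreed to in writing before the out-of-cycle patch shows up.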
My employer faced this very issue this week. Our data center operator was calling us to tell us when our system would be patched but … there were systems critical to our operations that the business wanted assurances (aka "guarantees") would not fail. In the end we took the recommendation of the vendor and left the decision about the when and where of patching non-critical systems up to them. Measuring risk in this scenario is indeed very difficult.