AWS DevOps

Next Generation Managed Services in the cloud

Part 2

By Terence White - Specialist Lead, Consulting 

In this installment of a series of posts about Next Generation Managed Services (NGMS) in the cloud, I will be talking about the topic on everyone's mind recently - security patching.

“What’s the deal with security patching?” you ask. “I regularly click on the Windows Update button on my desktop and apply updates, and it very rarely breaks anything.” Well, things are different in server land. There, “very rarely breaks” might translate into thousands or millions of dollars of downtime, lost productivity, data loss or worse. Updating one server by hand is quick, but asking your staff to update one thousand servers by hand is going to cause a revolt.

There has been a sharp upturn in highly visible security vulnerabilities over the last two years, and nobody wants to be the reason their company makes the front page of the newspapers for all the wrong reasons. Regular security patching of operating systems, application servers and applications is an absolute must. But with patching comes risk, downtime and late nights for operations staff.

Let’s take a look at what a traditional managed services vendor will do around security patching.

Once a month the staff will grab the list of vulnerabilities and patches issued by Microsoft, Ubuntu, SUSE, Red Hat, Apple and other vendors. They will look at the seriousness of the issues and the effort required by the fixes. They may try some of the patches on some tame test servers first. They will need to know the risk of not patching each server and then make a judgement call on whether the server needs to have the fixes applied or not. Patching itself is an effort, as the patches must first be tested in a non-production environment (which hopefully reflects production), and then changes may need to be raised through the customer’s change process. It can take anywhere from several days to weeks before even urgent patches are rolled out. Sometimes a patch breaks a production server, and that server has to be recovered and patching stopped for the remaining servers until the cause is determined and fixed. Even with scripted automation this is a long and drawn-out process.

In an NGMS environment this is quite different. Our NGMS team uses AWS Systems Manager to maintain large farms of servers. Systems Manager allows us to automate operational actions such as security patching, deployments and other maintenance tasks across thousands of servers from a single console, without having to log into the servers individually. We manage servers by applying policies to groups rather than by treating each server as an individual. There are pre-defined automation documents for things like patching and server restarts, which are maintained by AWS, and we have created some of our own for our regular tasks. There are built-in safety controls to ensure we don't break all the servers at once. We treat Windows and Linux servers the same at the Systems Manager level and can patch both at the same time automatically by setting up a patching policy for server groups, so that our staff don't have to get up at 3am on Sunday mornings. We also get regular compliance reports against the patching baselines, along with other operational data, to report back to the customer, who can be assured that their environments are up to date with security patches.
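To make the idea of patching by policy a little more concrete, here is a minimal sketch of what a group-wide patch run might look like through the AWS SDK for Python (boto3). It is an illustration only, not our team's actual automation. The function names, the "web-servers" group, the "PatchGroup" tag key and the concurrency and error limits are all assumptions chosen for the example. The AWS-maintained document AWS-RunPatchBaseline and the MaxConcurrency and MaxErrors settings are the kind of pre-defined document and safety controls described above.

```python
from typing import Any, Dict


def build_patch_command(patch_group: str,
                        operation: str = "Install",
                        max_concurrency: str = "10%",
                        max_errors: str = "1") -> Dict[str, Any]:
    """Build the keyword arguments for an SSM SendCommand request.

    Hypothetical helper: the tag key and default limits are assumptions.
    """
    if operation not in ("Scan", "Install"):
        raise ValueError("operation must be 'Scan' or 'Install'")
    return {
        # AWS-maintained document that scans for or installs patches
        "DocumentName": "AWS-RunPatchBaseline",
        # Target every managed server carrying this tag, not named hosts
        "Targets": [{"Key": "tag:PatchGroup", "Values": [patch_group]}],
        "Parameters": {"Operation": [operation]},
        # Safety controls: patch a slice of the group at a time and
        # stop sending to further servers once the error limit is hit
        "MaxConcurrency": max_concurrency,
        "MaxErrors": max_errors,
        "Comment": f"Patch run for group {patch_group}",
    }


def run_patch_command(patch_group: str) -> str:
    """Send the command and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without it

    ssm = boto3.client("ssm")
    response = ssm.send_command(**build_patch_command(patch_group))
    return response["Command"]["CommandId"]
```

Calling build_patch_command("web-servers") only assembles the request, so it can be inspected or reviewed before anything is sent. In practice a run like this would more likely be scheduled through a maintenance window than triggered by hand.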

If you are ready to move to the Next Generation way of doing things, reach out to us - we are always open to a discussion about how our tools and techniques can help your business.
