Monthly Archives: April 2016

Release Notes & Patches

It’s been a while since I’ve posted to the blog. I had (and have) aspirations of writing here on a regular basis, not every day but certainly more often than I have been lately. I don’t have time to post every day (or multiple times a day) about news happening in the Sysadmin part of the world. There are better sites out there for that type of thing, and this site doesn’t need to replicate work that is already being done better elsewhere.

I want to focus more on longer/better but less frequent articles. I want to continue writing posts more like the Unifi post. This one is about the importance of reading release notes for all the bits of software sysadmins are responsible for in a modern datacenter.

I just finished a major software upgrade for my company’s production VMware cluster. It was running vSphere 5.5 xxxx and needed to be upgraded to 5.5 Update 3, both to address a bug we were experiencing on the version we were running and to get the wide range of security fixes that had been released between the two builds. Seems simple enough, right? I mean, just log in to the vSphere client, connect to the vSphere Update Manager, and go to town.

Not so much. I’ve got an approved maintenance window of 3 hours a week, same 3 hours every Thursday. The business knows that’s the time upgrades happen, but everything needs to be back in a running state before 10 PM. I can’t get all of this done in one 3 hour block, so things need to be kept happy and running between maintenance windows.

Besides vSphere, I also needed to account for the following:

  • Trend Micro Deep Security
    • Has various hooks into each host in order to be able to inspect and protect the guest VMs. Needs to support both the existing ESXi build as well as Update 3. Also needed to confirm that the new version of DSM would work with the existing appliances, since they could only be upgraded as each host was upgraded in turn.
  • vShield Networking and Security
    • Needed to be upgraded to address bugs, etc., but also needs to be upgraded to a version that is supported by Deep Security, the version of ESXi I was currently running, as well as the version of ESXi I would be going to.
  • Nutanix Controller VMs (NOS)
    • Although there were no known issues at the time of Update 3a’s release, I waited approximately 2 weeks for Nutanix to do internal QA with their code and Update 3a to ensure there were no tricky gotchas waiting for me. That’s great because that’s one less thing I need to worry about, and it isn’t like I didn’t have a couple of maintenance windows’ worth of other updates that needed to be applied prior to rolling out the updated hypervisor anyway.
  • Horizon View Desktops
    • Needed to upgrade to a version of Horizon View that supported both the current build of ESXi I was on as well as Update 3a. The VMware Product Interoperability Matrix listed no such version. I had to open a ticket with VMware support to verify which build of View I should go to. The matrix has since been updated to show version 6.1.1 was the magic build for me.
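The check I kept repeating for each product boils down to "find a version that supports both the build I’m on and the build I’m going to." Here is a minimal Python sketch of that idea. The matrix contents, the version labels, and the function name are all made up for illustration; the real data has to come from each vendor’s interoperability documentation or support.

```python
from typing import Dict, List, Set


def versions_supporting_both(
    matrix: Dict[str, Set[str]], current: str, target: str
) -> List[str]:
    """Return product versions that list BOTH hypervisor builds as supported."""
    needed = {current, target}
    return sorted(v for v, builds in matrix.items() if needed <= builds)


# Hypothetical data for illustration only; NOT real interoperability data.
example_matrix = {
    "product-v1": {"esxi-old"},
    "product-v2": {"esxi-old", "esxi-new"},
    "product-v3": {"esxi-new"},
}

bridge = versions_supporting_both(example_matrix, "esxi-old", "esxi-new")
print(bridge)  # ['product-v2']
```

If the list comes back empty, no single version bridges the two builds, which is the situation that forces a multi-step upgrade (or a support ticket).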

After a lot of checking, double checking, and note taking, I had a comprehensive set of steps in OmniFocus that would result in an updated cluster and could be completed in chunks spread across several weeks, with no downtime outside of the Thursday night maintenance window.

That process was:

  • Upgrade vShield Network
  • Update Deep Security Manager
  • Upgrade vCenter Server Appliance
  • Upgrade Horizon View Connection Server
  • Upgrade Nutanix Controller software
  • Begin updating the hypervisor on each host, one at a time.
    • Pick first host
      • Put host in maintenance mode
        • Upgrade vShield Endpoint Driver
        • Upgrade Trend Micro Filter Driver
        • Upgrade physical NIC drivers for ESXi (update needed)
        • Reboot
        • Remove old Trend Micro appliance
        • Provision new Trend Micro appliance
        • Apply vSphere updates
        • Reboot
        • Exit maintenance mode
    • Verify Nutanix Controller services restarted and rejoined the cluster
  • Repeat for additional hosts
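For anyone who would rather keep a checklist like this in a script than in a task manager, the list above can be sketched as plain data plus a small function that expands it per host. This is only an illustration: the host names are placeholders, I’m treating the Nutanix verification as the last per-host step, and nothing here talks to vCenter or any other real system.

```python
from typing import List

# Cluster-wide upgrades, done once before any host is touched.
CLUSTER_STEPS = [
    "Upgrade vShield Network",
    "Update Deep Security Manager",
    "Upgrade vCenter Server Appliance",
    "Upgrade Horizon View Connection Server",
    "Upgrade Nutanix Controller software",
]

# Steps repeated for each host, one host at a time.
HOST_STEPS = [
    "Put host in maintenance mode",
    "Upgrade vShield Endpoint Driver",
    "Upgrade Trend Micro Filter Driver",
    "Upgrade physical NIC drivers for ESXi",
    "Reboot",
    "Remove old Trend Micro appliance",
    "Provision new Trend Micro appliance",
    "Apply vSphere updates",
    "Reboot",
    "Exit maintenance mode",
    "Verify Nutanix Controller services restarted and rejoined the cluster",
]


def build_plan(hosts: List[str]) -> List[str]:
    """Expand the checklist into one ordered list of steps for all hosts."""
    plan = list(CLUSTER_STEPS)
    for host in hosts:
        plan.extend(f"{host}: {step}" for step in HOST_STEPS)
    return plan


# Placeholder host names, purely for illustration.
for number, step in enumerate(build_plan(["host-a", "host-b"]), start=1):
    print(f"{number:2d}. {step}")
```

Keeping the order in one place makes it harder to skip a driver upgrade on the third host at 9:30 PM.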

I was lucky. I managed to just barely squeak by without needing to do multiple updates of a single product to get up to date. If I had waited much longer, I’d have had to upgrade vSphere partway, upgrade View, then upgrade vSphere the rest of the way, then finish updating View.

I’ve got resources in the cluster such that we can continue to run at 100% load with one host out of the cluster. I could power off test VMs and other non-critical servers to free up resources so that more than one host could be down at a time. But at the end of the day, I decided that jumping through all the hoops to be able to reboot multiple hosts at once would likely take about as long as just taking down one host at a time and vMotion’ing everything around. In the end, I just did it one host at a time. To get everything updated and make it through two reboots of a physical server (rebooting a VM has us all so spoiled, such a fast reboot cycle versus booting a physical server) took about an hour per host. I ended up doing two hosts (back to back) in a maintenance window, so it took a few weeks to get everything done.
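The scheduling math is simple enough to sketch in a few lines of Python. The 3-hour window, roughly an hour per host, and two hosts per window come from above; the host count of 8 is a made-up number for illustration, and the function names are my own.

```python
import math


def max_hosts_per_window(window_hours: float, hours_per_host: float) -> int:
    """Upper bound on hosts that fit in one window, done back to back."""
    return math.floor(window_hours / hours_per_host)


def windows_needed(host_count: int, hosts_per_window: int) -> int:
    """Number of weekly windows needed to get through every host."""
    return math.ceil(host_count / hosts_per_window)


print(max_hosts_per_window(3, 1))  # 3 (the theoretical ceiling, with zero slack)
print(windows_needed(8, 2))        # 4 (hypothetical 8 hosts, two per window)
```

Doing two hosts instead of the theoretical three leaves about an hour of slack per window, which is what keeps everything back up before 10 PM when a reboot drags.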

In news that will come as a shock to absolutely no one who reads a Sysadmin blog, before I got all my hosts upgraded to the latest and greatest build… a new round of patches was released. Don’t get me wrong, bugs need to be fixed and security holes need to be patched. I’m glad to receive improvements and updates. I just need to not let it go so long between update cycles. It makes it a real pain to get it all sorted out.

Website Hosting

So, if anyone has visited the site recently, you likely noticed two things immediately: a distinct lack of posts and an ever-changing roulette wheel of Content Management Systems (and if you dug a little deeper, an ever-changing hosting provider as well). Why? Why would I do that?

I wanted to try a few different setups. I started out on Squarespace, moved to WordPress on Linode, and then moved from there to Ghost on Digital Ocean. Now I’m back on Squarespace. Between each jump I had to export and convert the posts of the blog. I also had to figure out themes, the look and feel of the site, and whether to set up an SSL cert (or not). I actually wrote blog posts about each jump: why I was moving from Squarespace to WordPress, why I was moving from WordPress to Ghost, and now why I ended up right back where I started. I mean that literally. I logged into Squarespace to fire up a two-week trial and found my old site was just there, just inactive. I re-upped with Squarespace, manually copied over the pitiful 3 or 4 posts I’ve made in the few months since I moved to WordPress, and was good to go.

Part of the reason I moved from Squarespace to WordPress was because this is a Systems Administration blog. It seemed (and in some ways still does) a little lame for me to not roll up my sleeves and ssh into my server and keep things running in tip top shape. But I do that all day every day at my 9 to 5 job. Do I really want to sign up to do that in my off hours as well? If I’m honest with myself, no… I really don’t. I like certain aspects of it, sure. But all in all, I get more than enough of that at work. After being on WordPress for a while I realized that all I did with my “free” time was screw around with the underlying OS of the blog and tweak bits of WordPress instead of actually writing for the blog. I also realized that at $10 a month, Linode is at the very high end of what a small site like this would cost to host. So between my $10 a month Linode instance, worrying about WordPress exploits, and in general feeling a bit “bleh” about the whole thing, I moved to Ghost on Digital Ocean.

Ghost doesn’t need a separate database server the way WordPress does; by default it stores its data in SQLite. Without MySQL, I didn’t really need a VPS with 1 GB of RAM. The smallest Droplet at Digital Ocean would work fine (cutting the hosting cost by a whopping $5/mo or a much more impressive 50%). So I set up Nginx and Ghost (actually I used the Digital Ocean Ghost template) and configured it to host multiple separate instances of Ghost: one for this site and one for my personal site. My thinking was that the Droplet costs the same no matter how I use it, and both sites will be very low traffic, so why not. The personal site never got a single piece of content written for it or posted. I spent an evening or two making it all work together and be happy with the free SSL cert from Let’s Encrypt. I got that set up and working, and the only blog post I ever wrote was a brief post explaining that I moved the site to Digital Ocean and Ghost and to stay tuned for new awesome posts!

Eventually what I realized is that once you pay for a whole year of Squarespace at once to get the 10% discount and then apply another 10% discount from your favorite podcast, it’s less than the $10 a month for Linode (and my uber awesome oh-so-cheap DO Droplet… was saving me literally $2.50 a month). I decided it was time to just admit it. I love screwing around with servers just a little too much. I can’t help myself. I’d rather do that than write blog posts. Plus none of the themes and tweaks I did to either Ghost or WordPress made it look half as good as this theme from Squarespace. So why not use Squarespace for my blog? It’s cheap. It looks great. And on the occasion I get mentioned by someone with a few thousand Twitter followers, I don’t need to worry about my site crumbling under the load.

In addition to that stunning realization, I discovered something incredible.

Migrating content between different web sites really sucks. Like really. Yeah, import/export features get you 95% of the way there. But man, that last 5% is awful. If only there was a way to write and save blog content in plain text while keeping the formatting, etc. intact. That’s right! There’s this thing called Markdown and I’m an idiot for not using it sooner! Actually I started using it back when I wrote that one post while on Ghost. But YES. Starting with this post, all posts going forward will be saved as Markdown-formatted files on my computer, where they can be easily backed up and easily manipulated if I ever move away from Squarespace (not anytime soon).

So here I am, and here I’m staying. Maybe once Google reindexes my site this post will save some other sysadmin from thinking “I wonder where I should host my blog? I know! I’ll spin up an instance of WordPress on a VPS!” Trust me. It costs just as much to host on Squarespace once you factor in your time. If you are like me (not an artistic person) the site will look better for it, and on the chance someone famous links to your site, you don’t need to worry about the server falling over.