Improving the Processes – Journey

Gartner is holding its Infrastructure and Operations Management Conference this week.  As valuable as this conference would be to my career, I did not choose to go this year, but I did attend Gartner’s Data Center Conference last year in Vegas.  This was my consolation for not going to VMworld, and it turned out to be a great show.  It was eye-opening, as all conferences are, but this one took me further into the IT manager’s mindset.  Attendees were by and large managers, directors or above, most of them from large companies.  My eyes were opened to many technologies and processes that were foreign to our rather small shop at CFF.  There were none of the usual awesome technical demos put together by senior engineers and product specialists – this was more about the theory of how to run IT processes, technology and even people.

Why am I writing about this now?  Since Gartner IOM is going on as I write, I happened to see some tweets from the show from an ITIL guru whom I admire, George Spafford.  George is one of the Gartner analysts with whom I had the great pleasure of sitting down for a one-on-one session.  At the time, I had just read Visible Ops, the book he co-authored with Gene Kim and Kevin Behr.  It is a great short read full of practical advice, as the subtitle suggests.  George was very personable and encouraging as he shared his views on how to ramp up ITIL efforts in an organization.  I had also just earned my ITIL V3 certification, so I was especially excited about the topic.  We discussed the importance of finding and stopping processes that stand in the way of efficient IT, the baby steps needed to start ITIL, the importance of getting management on board and the critical nature of performing a post-mortem after each major incident in order to continually improve.  I also reflected on the major points of his book, the first of which cuts to the heart:  “Stabilize the Patient”.  In a nutshell, the authors explain the need to get absolute control over all changes.  That means all changes must stop until a proper change management process can be implemented.  The reasoning is based on their statistic that 80% of systems downtime in an organization results from a change that was made.

This statistic spoke loud and clear to me, as we had seen it play out time and again: scheduled changes that created downtime extending beyond maintenance hours, unauthorized changes made without coordinating with the impacted department or even within IT, and changes that were not fully tested.  Last year, this problem was magnified as we moved all of our production systems to a new data center.  We endured a great deal of business disruption in the process, most of which was expected given the massive amount of change an effort like this requires.  But we realized that a sometimes ineffective change management process led to some unexpected downtime – we had not fully mapped out all the configurations to be changed, and so those components went untested.

Obviously we had some work to do within configuration management as well.  This speaks to the second tenet of Visible Ops, “Catch and Release”: Learn what you have and document it.  Unfortunately, our environment had been built in various aspects through various stages by various individuals who have gone various ways – with limited documentation.  One of my goals within my organization has been documenting our infrastructure, from the bottom up.  How does a world-class organization get by without knowing all the nuances of its environment?  It doesn’t.  Our environment has become increasingly complex over recent years, as most organizations’ have.  And the only way we can become world-class is by getting a handle on all of its components.

Thankfully, the last year was a huge learning opportunity, both from a process standpoint and a technological one.  Today, we’re taking great strides toward solid change management and configuration management.  We’ve also welcomed a new senior-level member of our staff who, among other things, has brought real-world experience in these areas and has guided our efforts.  We are continually tweaking the processes to make them more efficient, steadily moving forward with our eyes on becoming a world-class organization.

Installing Dell OpenManage on ESXi 4.1

I know this topic has been written about in the past, but I figured it would be a good one for my first technical post.  Just yesterday I had to remind myself how to install Dell OpenManage Server Administrator on a VMware ESXi 4.1 host server.  Since ESXi does not include the Service Console (such old news, isn’t it?!) there is no way to install the OMSA client on the host.  Instead, one simply installs the OpenManage Offline Bundle and VIB on the host, enables the CIM providers and then connects to the host using a locally installed OMSA client.  Dell’s documentation can be found here:  http://support.dell.com/support/edocs/software/smsom/6.3/en/omsa_ig/html/instesxi.htm

Here are the quick and dirty steps using the vSphere CLI.

1.  Download and install the latest version of vSphere CLI:  http://www.vmware.com/support/developer/vcli/  (Requires a MyVMware account.)

2.  Download the latest Dell OpenManage Offline Bundle and VIB for ESXi from http://support.dell.com.  So far the latest I’ve found for ESXi 4.1 is OM-SrvAdmin-Dell-Web-6.5.0-2247.VIB-ESX41i_A01.zip.

3.  Shut down all VMs on the host and place the host in maintenance mode.

4.  Navigate to the working directory of the vSphere CLI:  C:\Program Files (x86)\VMware\VMware vSphere CLI\bin

5.  Install the OpenManage Bundle to the ESXi host server using the following syntax:  vihostupdate.pl --server <IP address of ESXi host> -i -b <path to Dell OpenManage file>

6.  Restart the ESXi host server after confirmation of successful installation.
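If you do this on more than one host, it can help to build the command line in a small script instead of retyping it.  Here is a minimal Python sketch that only assembles the arguments for the install command from step 5, plus a follow-up query to list what is installed.  The function names, the example host address and the use of the --query option as a verification step are my own additions for illustration, not part of Dell’s procedure.

```python
from typing import List

# Script name as used in step 5; run from the vSphere CLI bin directory.
VIHOSTUPDATE = "vihostupdate.pl"


def build_install_args(host: str, bundle_path: str) -> List[str]:
    """Assemble the arguments that install an offline bundle on one host.

    Mirrors step 5: vihostupdate.pl --server <host> -i -b <bundle>.
    Nothing is executed here; the list is only returned.
    """
    if not host or not bundle_path:
        raise ValueError("host and bundle_path are both required")
    return [VIHOSTUPDATE, "--server", host, "-i", "-b", bundle_path]


def build_query_args(host: str) -> List[str]:
    """Assemble the arguments that list the bulletins installed on a host.

    Assumption: used here as a post-install check; not in the original steps.
    """
    if not host:
        raise ValueError("host is required")
    return [VIHOSTUPDATE, "--server", host, "--query"]


if __name__ == "__main__":
    # Hypothetical host address; the bundle name is the one from step 2.
    host = "192.0.2.10"
    bundle = "OM-SrvAdmin-Dell-Web-6.5.0-2247.VIB-ESX41i_A01.zip"
    print(build_install_args(host, bundle))
    print(build_query_args(host))
```

Each list can be handed to subprocess.run from the vSphere CLI bin directory.  Depending on how Perl is set up on your workstation, you may need to put "perl" in front of the script name, so treat this as a starting point.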

Not quite done yet… In order for the newly installed Server Administrator Web Server to communicate with the CIM providers, these providers must be enabled.  To do this through the vSphere Client, follow these next steps:

1.  Log on to the ESXi host using the vSphere Client.

2.  Go to the Configuration tab of the respective host.

3.  Under the Software section, click on Advanced Settings.

4.  In the Advanced Settings, find UserVars.  Then change the value of the CIMOEMProviderEnabled field to 1.  Click OK.  Note: The Dell documentation points to the wrong variable name.  The correct one is CIMOEMProviderEnabled (singular).

5.  Run Restart Management Agents from the Direct Console User Interface (DCUI) of the ESXi host.  This will hopefully enable the CIM providers without rebooting the host.  If it does not, the host will have to be rebooted.

Bonus:  The above setting can also be modified using the vSphere CLI.  Use the following command:  vicfg-advcfg.pl --server <ip_address of ESXi host> --username <user_name> --password <password> --set 1 UserVars.CIMOEMProviderEnabled
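The same idea works for the advanced setting.  This is a rough Python sketch, with function names and the example host of my own choosing, that assembles the vicfg-advcfg.pl arguments to set the value and to read it back with --get.  It deliberately leaves out --password, on the assumption that the vSphere CLI will prompt for it, so the password does not end up in a script or in shell history.

```python
from typing import List

VICFG_ADVCFG = "vicfg-advcfg.pl"
CIM_OPTION = "UserVars.CIMOEMProviderEnabled"


def _connection_args(host: str, username: str) -> List[str]:
    """Common connection arguments; the password is intentionally omitted."""
    if not host or not username:
        raise ValueError("host and username are both required")
    return [VICFG_ADVCFG, "--server", host, "--username", username]


def build_set_args(host: str, username: str, value: int = 1,
                   option: str = CIM_OPTION) -> List[str]:
    """Arguments that set an advanced option, as in the Bonus command."""
    return _connection_args(host, username) + ["--set", str(value), option]


def build_get_args(host: str, username: str,
                   option: str = CIM_OPTION) -> List[str]:
    """Arguments that read the option back (a check added for illustration)."""
    return _connection_args(host, username) + ["--get", option]


if __name__ == "__main__":
    # Hypothetical host and user, for illustration only.
    print(build_set_args("192.0.2.10", "root"))
    print(build_get_args("192.0.2.10", "root"))
```

Reading the value back before restarting the management agents is a cheap way to confirm that the change landed on the host you intended.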

In order to pull up OpenManage, the OMSA client must be installed on a local desktop or server.

1.  From http://support.dell.com, download and install the latest OpenManage Server Administrator Managed Node for your operating system version.

2.  Launch Server Administrator and, at the login screen, type the Hostname/IP Address, Username and Password for the selected host server.  Be sure to check the “Ignore certificate warnings” checkbox.

That’s it!  A little more involved than the old way of installing the Server Administrator on the old ESX servers with the Service Console.  This still allows for full functionality with the smaller footprint afforded by ESXi.

VCP5 in the Bag!

Whew!  It was a close one, but I managed to pass VCP-510 to earn my latest VCP certification.  And this was a tough one, especially since all my hands-on experience was 100% lab.  I created a lab environment on my laptop and just went to town.  Thankfully, my laptop is robust enough to handle 3 host servers, a couple of nested VMs, another VM as a domain controller and an OpenFiler iSCSI virtual storage array.

When I studied for my VCP4 exam back in Nov-Dec 2009, I was already using vSphere 4 in a production environment at the office.  We had just recently upgraded from ESX 3.5.  I couldn’t quite try concepts out and break things as in a real lab, but I at least had real-world experience, which came in handy on exam day.  I also read through all of VMware’s documentation and used many of the common study aids posted at the time by popular bloggers.  This time, I tried the same approach and was a little overwhelmed by the sheer volume of documentation to work through.  Of course, I targeted the Exam Blueprint, but it was still a massive amount of ground to cover.  Good thing there was some overlap with vSphere 4.

I pulled guides from many of the great bloggers out there, all of which have been listed by others out there so there’s nothing new on this list.  This just gives me a good reference point to come back to when I start studying for the VCAP exams!

VCP Exam Blueprint

Forbes Guthrie’s vReference notes – Amazing!

Andrea Mauro’s VCP5 notes – Great stuff!

Even picked up TrainSignal’s vSphere 5 Training DVD

And of course, all of VMware’s vSphere 5 documentation!

A lot of study, but it paid off.  Here on the last day that the class requirement was waived, I squeaked by with a modest passing score!  Hooray!  Feels good to add VCP5 to my collection of certifications!

Brian Trainor, VCP 3/4/5

Welcome World!

Or shall I say, “Welcome, Brian” to the world of blogging.  This is the beginning of my journey that so many others have taken before me.  I am starting a blog.  I am an IT professional in the DC area, specializing in data center operations and virtual infrastructure (and just about everything else that sys admins do!).  Many times I have thought about starting a blog focusing on the challenges and observations I come across during my days as an IT admin, especially those ideas having to do with virtual infrastructure.  I am a huge VMware fan, have used their enterprise products since Dec 2007, architected and built the current virtual infrastructure for my employer, helped them virtualize close to 70% of our environment within 3 years, achieved both VCP 3 and 4 certifications, and am now missing out on the biggest virtual event of the year and of the universe right now – VMworld 2011!  I will confess that my missing this event is acting as the catalyst that finally motivated me to start this blog.  I needed to channel my “missing VMworld blues” into something productive for myself and that maybe one day can be productive for the greater virtual and cloud community.  As for now, this will be my own little repository to let my own thoughts flow, to use as a reference and to help me articulate the concepts that I come across on a daily basis.

On a very personal and almost completely separate note, the reason I am missing VMworld this year is that my brother-in-law, my wife’s brother, died unexpectedly and very tragically one week before the conference.  This has been a terribly rough time for my family, a time for grieving and seeking comfort in each other’s presence.  My brother-in-law, Matt Griffith, was an amazing husband and father.  He left behind a strong wife and two wonderful boys, ages 10 and 15, and he was so devoted to his family and to raising his boys to be responsible young men.  I was so touched by all the photos that we sifted through preparing for his memorial service.  The togetherness he was so committed to in his family was very obvious.  Between family trips, birthdays, sports leagues, and just hanging around the house, he had no greater love than time spent with each of his boys and his wife.  What a tremendous loss for them!  My heart weeps for each of his boys, for his wife, for his parents who have lost their oldest son, and for my wife who lost her big brother.  Everything hurts right now and will for a while.  I hope to be able to honor my brother-in-law in some special way one day.