Monday, June 7, 2010

Keep your Hardware updated

It’s very important to keep your hardware updated. Why? Well, every now and then, the hardware integrator will release updates to the firmware and to the drivers for the OS.

Some would say, “Why fix it if it’s not broken?”, but in general the updates fix problems that can bring down your Data Center. Let’s check what happened to me this weekend.

Scenario: 4 DELL R610 (Intel 5520s) running “the latest” DELL build of VMware ESXi 4, all connected to a DELL MD3000i.

The problem: Every 30 minutes (give or take), one of the Virtual Machines (VMs) would lose connectivity for a few seconds.

All Friday night and through the entire weekend, I got alerts every 30 minutes that this VM was offline. This was driving me nuts. Once I connected to the office, I could ping the VM, I could log in via Remote Desktop, I could view the logs. All normal… so what was wrong?
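
Alerts arriving at a steady interval are a useful clue: they point at something scheduled rather than a random failure. A minimal sketch of that check in Python, assuming the alert times have been copied out of the monitoring tool by hand (the timestamps and the function names `alert_gaps_minutes` and `looks_periodic` are made up for illustration, not taken from my actual alerts):

```python
from datetime import datetime

def alert_gaps_minutes(timestamps):
    """Return the gaps, in minutes, between consecutive alert times."""
    times = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M") for t in timestamps)
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]

def looks_periodic(gaps, tolerance=5):
    """True if every gap is within `tolerance` minutes of the average gap."""
    if not gaps:
        return False
    average = sum(gaps) / len(gaps)
    return all(abs(gap - average) <= tolerance for gap in gaps)

# Hypothetical alert times, for illustration only.
alerts = ["2010-06-05 01:02", "2010-06-05 01:31",
          "2010-06-05 02:03", "2010-06-05 02:32"]

gaps = alert_gaps_minutes(alerts)
print("gaps in minutes:", gaps)
print("periodic:", looks_periodic(gaps))
```

If the gaps cluster around one value, it is worth looking for a timer or a scheduled check on the host before blaming the VM itself.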

A bug in the “latest” build of ESXi. If you delete a LUN from the MD3000i and you do not refresh the Storage Adapter in ESXi, you will get a strange error. Every 30 minutes, ESXi checks the LUNs, and if it can’t find one, it checks the other path (if there is one) for the LUN. Because the LUN is not there either, it brings down the server for a few seconds, and this makes some VMs disconnect from the NIC.

The solution: Update ESXi to the real new build from DELL. This update keeps the NICs from disconnecting.

So the real question is how to deal with this dilemma. How do you keep up to date with your updates? To be honest? No idea…
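
One possible starting point is a simple inventory that compares what is installed against the latest version you know about. This is only a sketch in Python: the component names and version numbers below are invented, the helper names `parse_version` and `find_outdated` are my own, and the "latest" list would still have to be filled in by hand from the vendor's download pages.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))

def find_outdated(installed, latest):
    """Return {component: (installed, latest)} for components that are behind."""
    outdated = {}
    for component, current in installed.items():
        newest = latest.get(component)
        if newest is not None and parse_version(current) < parse_version(newest):
            outdated[component] = (current, newest)
    return outdated

# Hypothetical inventory, for illustration only.
installed = {"esxi-build": "4.0.1", "nic-firmware": "5.0.9", "bios": "1.3.6"}
latest = {"esxi-build": "4.0.2", "nic-firmware": "5.0.9", "bios": "1.4.0"}

for component, (current, newest) in sorted(find_outdated(installed, latest).items()):
    print(f"{component}: installed {current}, latest {newest}")
```

Comparing versions as tuples of integers rather than as plain strings avoids the trap where "4.0.10" sorts before "4.0.9".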

Today was a “fix the problem” day. Tomorrow will be “keep this from happening again” day.

George.
The Captain.
