One of the big reasons you spend big money on server-quality hardware is to get cool stuff to make administration easier. Each hardware vendor provides their own software tools – Dell includes OpenManage, IBM includes Director, and HP includes their System Management Homepage.
To illustrate how it works without violating anybody’s NDAs, I picked up a used HP DL380 off eBay to use as a demo.
When you remote desktop into your HP server, you’ll see HP System Management Homepage on the desktop. Launch it, and it’ll ask for your authentication information:
It integrates with Windows authentication, so as long as you’re a relatively powerful user on the local machine, you can use your regular Windows credentials and get in.
If HP System Management Homepage is Blank or Gives Errors
If it hasn’t been configured before, the home page will be completely blank:
You may also get a popup warning saying something like:
A timeout occurred while loading data for the HP System Management Homepage which may result in missing or incomplete information. Please ensure that the various agents configuration is correct. One common error is around setting SNMP community strings and having at least one read/write string specified. For additional information on how to discover which components may be causing the timeout, see the HP System Management Homepage log and the HP System Management Homepage User Guide Troubleshooting Section.
That’s technically true – but if you’re dealing with a system that’s never been configured before, and your sysadmins aren’t actively using HP SMH for inventory and management, don’t bother with SNMP. There’s an easier way to fetch hardware data from Windows agents – WBEM.
Click Settings at the top, and under “Select SMH Data Source”, click Select:
From there, change the data source to WBEM, and click the Select button:
After you pick WBEM, *if* the provider agents have already been installed, the hardware details will populate within a minute or two. If not, you’ll need to install HP’s free WBEM providers. These are safe to run on SQL Server, and they give you a lot of good data that we’ll discuss below.
The Working Home Page
Once WBEM is up and running (or SNMP is configured correctly, but good luck with that), here’s what your System Management Homepage will look like:
From here, we can drill down into details to see what kinds of processors, memory, and power supplies we’ve got.
Note that in mine, it’s showing an error for HP NC373i Multifunction Gigabit Server Adapter. That’s because I’ve got a network cable unplugged. If that’s normal for me, I can click on that network card and change the Link Down Status Included to “Ignore Status” – that way I don’t get a red X on the dashboard:
You might have bigger problems, too. To simulate a drive failure, I yanked a hard drive out of the front of my running server. Here’s how the home page looks – note the yellow exclamation point next to the RAID controller:
This isn’t a red-level warning because I’ve already got a hot spare drive in my server. If I click on the Smart Array for details, I get:
On the left, in the Physical Drives list, the red X over one of the drives shows me which one has failed. The Logical Drive is the RAID array, and it’s in a degraded state because it’s in the process of rebuilding. During a rebuild, I can expect slower storage performance, but my server won’t go down altogether. Later, my sysadmin (okay, actually there’s just one of us here) can pull the failed drive and replace it with a good one. I won’t get another rebuild at that point – that newly inserted drive can just become my new hot spare, depending on my settings.
HP System Management Homepage isn’t the only place to see storage problems. When the HP drivers are installed correctly, you’ll also see it in the Windows event logs:
Get Even More Storage Details with the ACU
If my sysadmins are really good, they’ve even installed the HP Array Configuration Utility, which lets me drill down into RAID options in Windows. Check your Start menu to see if it’s installed:
After launching the ACU, you get buttons for the controller settings, caching settings, and more:
From here (as well as from the System Management Homepage), you can get details about whether your array cache is optimized for reads or writes:
If you’re really serious about performance, and you’ve got time before you go live with a server, you can do benchmark testing to determine the right cache settings for your database server’s data files, log files, and TempDB files. For example, if you know your log files are 100% write (except for the log backups), maybe it makes sense to use 100% write cache on those.
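If you want a starting point before you benchmark, here's a rough Python sketch that turns measured read and write counts into the nearest cache split. The list of allowed splits and the function name are my assumptions for illustration, not something read from the controller, so check what your own controller actually offers. It's only a first guess to test, not a replacement for benchmarking.

```python
# Assumption: the controller offers these (read %, write %) cache splits.
# Check your own controller's options; this list is illustrative only.
ALLOWED_SPLITS = [(100, 0), (75, 25), (50, 50), (25, 75), (0, 100)]


def suggest_cache_split(reads: int, writes: int) -> tuple[int, int]:
    """Return the allowed (read %, write %) split closest to the measured mix."""
    total = reads + writes
    if total <= 0:
        raise ValueError("need at least one read or write to suggest a split")
    read_pct = 100 * reads / total
    # Pick the split whose read percentage is nearest to what we measured.
    return min(ALLOWED_SPLITS, key=lambda split: abs(split[0] - read_pct))


if __name__ == "__main__":
    # A log volume that is almost entirely writes.
    print(suggest_cache_split(reads=10, writes=5000))
```

Feed it I/O counts gathered during a representative workload, then benchmark the suggested split against its neighbors before you commit.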
Getting Alert Emails When Things Break
We can’t be going into HP SMH all the time just to check on things – ain’t nobody got time for that. Sysadmins can also install HP Event Notifier, which works in combination with the drivers to send us emails when something goes wrong. Check your Start menu to see if it’s already been installed:
The configuration is a simple wizard:
You start by configuring the mail server and the “from” address it will use:
I like my “from” address to be the name of the server – in this case, Bigmouth, because my current lab boxes are named after Smiths songs. I like my reply address to be a distribution list for the IT team. That way, when an email alert comes in, one of my admins can just hit reply and say “I’m on it” or “Ignore this, I’m replacing a drive” or whatever.
Next, you’ll configure the destination emails, which again should be distribution lists, not individual employees.
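To make the addressing scheme concrete, here's a small Python sketch of what an alert laid out this way looks like as an email. This is not how HP Event Notifier builds its messages; the function name, the example.com domain, and the it-team list are all made up for illustration.

```python
from email.message import EmailMessage


def build_alert(server_name: str, domain: str, team_list: str,
                subject: str, body: str) -> EmailMessage:
    """Build an alert whose From names the server and whose replies go to the team."""
    msg = EmailMessage()
    msg["From"] = f"{server_name}@{domain}"  # sender identifies the server
    msg["Reply-To"] = team_list              # hitting reply reaches the whole team
    msg["To"] = team_list                    # distribution list, not one person
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


if __name__ == "__main__":
    # Hypothetical addresses; substitute your own domain and list.
    alert = build_alert("bigmouth", "example.com", "it-team@example.com",
                        "Drive failure", "Physical drive failed; rebuild started.")
    print(alert)
```

Because Reply-To points at the list, an "I'm on it" reply lands in front of everyone who got the original alert.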
Finally, you get to pick the events that will trigger emails. You want all of them:
I absolutely love this because it catches all kinds of things that regular OS & application alerts can’t. For example, one fine summer Sunday in South Florida, our data center air conditioners struggled to keep up with demand. I received email alerts from my servers saying that their RAID controllers were too hot – and within 15 minutes, the first air conditioner failed outright, followed shortly thereafter by the second (supposedly redundant) AC unit. Those early warnings from the RAID controller temps gave us extra time to get into the office, and every little bit helps. (Especially when you have to shut down hundreds of servers and shared storage devices.)
Managing Firmware and Software with the Version Control Repository Manager
VCRM is like Windows Update for your hardware. If your System Management Homepage has Version Control on the front page, click on it, and you’ll see something like this:
In my example above, I’ve got a list of hardware and drivers going down the left, and then on the right it shows the installed version. That’s missing an important piece – what you really want to know is the most recent version for each one.
Unfortunately, that part is a lot more work. It requires installing a repository server somewhere in your data center, and then pointing each of your servers to that repository. When it’s done, it’s amazing:
Now, I’ve got both Installed Version and Latest Version. When an update comes out, I’ll see it here, and I can simply check boxes and install the updates. Your servers probably aren’t going to have that – but that’s okay! That’s an advanced power tool.
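If you don't have a repository server but can get both lists some other way, the Installed-versus-Latest comparison is easy to sketch in Python. This is not how VCRM works internally. The component names are hypothetical, and I'm assuming plain dotted numeric versions. Real version strings sometimes carry letters or suffixes, which would need extra parsing.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '5.10.0.0' into (5, 10, 0, 0) so 5.10 sorts after 5.2."""
    return tuple(int(part) for part in text.split("."))


def find_outdated(installed: dict[str, str],
                  latest: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {component: (installed, latest)} for everything that is behind."""
    outdated = {}
    for name, have in installed.items():
        want = latest.get(name)
        # Skip components we have no "latest" information for.
        if want is not None and parse_version(have) < parse_version(want):
            outdated[name] = (have, want)
    return outdated


if __name__ == "__main__":
    # Hypothetical component names and versions, for illustration only.
    installed = {"nic-driver": "5.2.0.0", "array-firmware": "7.24"}
    latest = {"nic-driver": "5.10.0.0", "array-firmware": "7.24"}
    print(find_outdated(installed, latest))
```

Comparing integer tuples instead of raw strings matters: as plain text, "5.10.0.0" sorts before "5.2.0.0", which would hide a real update.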
Accessing Your Servers Remotely Via the HP iLO
HP’s Integrated Lights Out (iLO) gives you access to the server’s keyboard, mouse, and monitor remotely over the network. You can get the iLO IP address from your sysadmin or from System Management Homepage. Then go to that IP address with your browser:
You may get SSL certificate warnings – by default, the iLO ships with its own self-signed certificate, and your browser doesn’t trust that.
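If clicking through that warning makes you nervous, one option is to record the certificate's fingerprint once and compare it on later visits. Here's a minimal Python sketch using only the standard library. The function names and the address in the example are mine, and older iLO firmware may only offer TLS versions that a current Python build refuses to negotiate, so treat this as a sketch rather than a guaranteed recipe.

```python
import hashlib
import ssl


def sha256_fingerprint(der_bytes: bytes) -> str:
    """Format the SHA-256 digest of a DER certificate as colon-separated hex pairs."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fetch_fingerprint(host: str, port: int = 443) -> str:
    """Download the server's certificate (without validating it) and fingerprint it."""
    pem = ssl.get_server_certificate((host, port))
    return sha256_fingerprint(ssl.PEM_cert_to_DER_cert(pem))


if __name__ == "__main__":
    # Hypothetical iLO address; replace with the one from your sysadmin.
    print(fetch_fingerprint("192.0.2.10"))
```

Compare the output with the SHA-256 fingerprint your browser shows in the certificate details. If they match what you recorded earlier, you're talking to the same iLO.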
After logging in, you’ll get the iLO dashboard, which has some pretty nifty buttons:
The “Server Power” button does just what you think it does: you can either do a momentary press, or a press-and-hold to force the server to restart. However, before you do that, you should probably take a look at the console to make sure you’re rebooting the right server (or double-check to see that it’s actually frozen).
Click on the Remote Console tab, and you can actually take control of the server. It’s just like standing in front of the server in the rack. You can even click the Virtual Media tab to map local CDs to the server just as if they were plugged into the server. This is great for emergency diagnostics, but not so great for installing software – it’s much slower than using a network share.
Not all iLO features are available by default – some advanced stuff like remote console while the server’s up requires advanced licensing packs. But who cares? You don’t really need that part – you can get an amazing start with the rest of the features I’ve mentioned.
Did I Mention This Stuff is Free?
When you buy and install a new server from HP, Dell, or IBM, they all include an insane amount of really cool management tools. They’re installed by default, but if you decide to pave & reinstall everything, make sure you get these goodies and install them.
They help you dig into your hardware’s capabilities, see how much free space your motherboard has for additional memory, learn how your storage cache is configured, and more. Knowing this stuff makes you a better systems administrator and database administrator.