Saturday work

Yesterday I spent the day at work testing our installation of APCUPSd and tidying up the goram rat's nest of network and electrical cables my predecessor left me.

APCUPSd worked with only a few hitches:

  1. I had one machine polling a UPS, and told it to shut down when there was 30% charge left. The other machines, which poll the master, were set to shut down 30 seconds after the power went out. They shut down, but that bumped up the charge reading on the battery because the load was that much lower. So the charge never got down to 30% while I was watching, and I didn't get to test the automatic shutdown of the master.
  2. The other three machines were all set to shut down after 30 seconds; however, NFS cross-mounting made for problems with one of them. I'll need to stagger those three machines, whether they're looking at the charge or just shutting down n minutes after the power goes out.
  3. The Solaris 10 box shut down just fine, but when it restarted it did not let me log in, even on the console. Since Solaris 10's boot sequence is dead silent by default (thank you, Sun), it was hard to be sure what was happening. The last time I was patching this machine, reboots took 10 minutes; I gave it 20 minutes this time before giving up and going to single-user mode. The problem appears to be /etc/nologin, stuck there from the shutdown. This prevented a login prompt from coming up even on the console, without any sort of warning. Arghh.
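
For the staggering in item 2, here is a rough sketch of how the per-machine delays might be worked out. The host names, the 30-second first delay, and the 60-second gap are all made up for illustration, and the helper `staggered_timeouts` is hypothetical. The printed lines assume the `TIMEOUT` directive in apcupsd.conf (seconds on battery before shutdown), which is worth checking against the manual before relying on it.

```python
def staggered_timeouts(hosts_in_shutdown_order, first_delay=30, gap=60):
    """Map each host to a shutdown delay in seconds after power loss.

    The first host in the list gets first_delay; each later host waits
    gap seconds longer than the one before it. The order is whatever
    the caller decides is safe (for example, NFS clients before the
    machine whose exports they mount).
    """
    return {
        host: first_delay + index * gap
        for index, host in enumerate(hosts_in_shutdown_order)
    }


if __name__ == "__main__":
    # Hypothetical host names; the real ordering depends on who mounts what.
    order = ["nfs-client-a", "nfs-client-b", "nfs-server"]
    for host, seconds in staggered_timeouts(order).items():
        # One value per machine, to go in that machine's own apcupsd.conf.
        print(f"{host}: TIMEOUT {seconds}")
```

With true cross-mounting there may be no perfectly clean order, so the gap has to be long enough for each machine to finish unmounting before the next one goes down.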

As for the cleanup: satisfying. I'm no longer quite so ashamed of the server room.