I had a new VM host at $WORK I was trying to set up, and I kept running into problems when installing a VM with koan/cobbler: it would complain that "at most 2047 MB can be simulated" for a VM. This was on a 64-bit machine, with a 64-bit kernel (CentOS 5.8), with 128 GB of memory installed (and detected), with a 64-bit guest being installed.
In the end, I had neglected to install the kvm package from CentOS, which among other things includes the kvm-kmod module. Once I got that installed and rebooted, everything went fine.
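For next time, a quick sanity check before kicking off koan would have saved me some head-scratching. This is only a sketch: the function name `kvm_device_present` is made up for illustration, and it assumes that a loaded kvm module exposes a character device at /dev/kvm.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given path (default /dev/kvm)
# is a character device. Assumption: the kvm module creates /dev/kvm
# when it is loaded.
kvm_device_present() {
    [ -c "${1:-/dev/kvm}" ]
}

if kvm_device_present; then
    echo "kvm looks usable"
else
    echo "no /dev/kvm: is the kvm package installed and the module loaded?"
fi
```

If the check fails, `lsmod | grep kvm` is the next thing I'd look at.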
At $WORK I'm trying to install tmap on 64-bit CentOS 5. Here's how it goes:
Built RPM for tmap, and it works -- but not using tcmalloc.
What's tcmalloc? Part of Google-perftools; a faster malloc. We really want tcmalloc.
Found an RPM for google-perftools and installed it, but it includes only the 32-bit version of tcmalloc, because other parts of perftools depend on libunwind.
Installing libunwind on 64-bit CentOS 5 is a big PITA, so I decide to try working around it.
Conveniently, tcmalloc can be compiled on 64-bit platform; produces libtcmalloc_minimal, which documentation says is perfectly valid malloc.
tmap does not come, out of the box, configured to look for tcmalloc_minimal (in its configure script), but there is a commented-out option to do so. You can remove the comment and run autogen.sh, and then configure will look for libtcmalloc_minimal.
...but this fails because the way I compiled/built the RPM for tcmalloc does not include libtcmalloc_minimal.so; it includes libtcmalloc_minimal.so.4 instead.
And so my half-assed RPM/devel skillz come back to bite me in the ass again.
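The fix, roughly: when configure tries -ltcmalloc_minimal, the linker wants an unversioned libtcmalloc_minimal.so, and that is normally just a symlink to the versioned file. Here is a sketch of making that link in a scratch directory; `add_dev_symlink` is a made-up helper, not anything from tmap or perftools.

```shell
#!/bin/sh
# Hypothetical helper: given a directory and a versioned library name
# (e.g. libfoo.so.4), create the unversioned libfoo.so symlink next to it.
add_dev_symlink() {
    dir=$1
    versioned=$2
    unversioned=${versioned%.so.*}.so    # strip the version after ".so"
    ( cd "$dir" && ln -sf "$versioned" "$unversioned" )
}

# Demonstrate in a scratch directory rather than touching the real libdir.
scratch=$(mktemp -d)
touch "$scratch/libtcmalloc_minimal.so.4"
add_dev_symlink "$scratch" libtcmalloc_minimal.so.4
ls -l "$scratch"
```

In a spec file the same ln would presumably go in %install, relative to $RPM_BUILD_ROOT, with the unversioned link shipped in the -devel package.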
Random rpmbuild stuff:
Here's how to create symlinks when building an RPM:
    pushd $RPM_BUILD_ROOT/opt/bin
    ln -sf ../package/bin/foo .
    popd
If you're packaging a statically-built binary (I know, I know) and it suddenly craps out with "unexpected reloc type in static binary", put this in your spec file:
    %define __os_install_post %{nil}
To see rpm's current configuration and macro definitions:
    rpm --showrc
Today I had to compile a program that needed a newer version of the Autoconf suite than is available on CentOS 5. I got around this like so:
Downloaded and rebuilt the SRPM for autoconf2.6 from the good folks at pkgs.org
Installed it on the build machine (dang! shoulda been using Vagrant for that! Or at least Mock...)
This gives you /usr/bin/auto[whatever]2.6x -- which is good! don't overwrite stuff! But you'll still get complaints about not new enough versions.
Symlink all the 2.6 binaries to ~/bin/:
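mkdir -p ~/bin
for i in /usr/bin/auto*2.6x ; do
    # Link into ~/bin under the unsuffixed name, e.g. autoconf2.6x -> ~/bin/autoconf
    ln -s "$i" ~/bin/"$(basename "$i" | sed -e 's/2\.6x$//')"
done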
Make sure your PATH has ~/bin first. Walla.
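For the record, the PATH part looks something like this. It's a sketch only: `prepend_path` is a name I made up, and it affects just the current shell unless you put it in your shell startup file.

```shell
#!/bin/sh
# Hypothetical helper: put a directory at the front of PATH,
# unless it is already the first entry.
prepend_path() {
    case "$PATH" in
        "$1":*|"$1") ;;                  # already first: nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_path "$HOME/bin"
# Show which autoconf the shell will now pick up (if any).
command -v autoconf || echo "autoconf not found in PATH"
```

If the last line prints something under ~/bin, the build should find the newer tools before the stock ones.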
A nice thing about working at a university is that you get all this time off at Xmas; however, it's also the best possible time to do all the stuff you've been saving up. Last year my time was split between this job and my last; now, the time's all mine, baby.
Today will be my last of three days in a row where the machines have been all mine to play with^W^Wupgrade. I've been able to twiddle the firewall's NIC settings, upgrade CentOS using Cfengine, and set up a new LDAP server using Cobbler and CentOS Directory Server. I've tested our UPS' ATS, but discovered that NUT is different from APCUPSD in one important way: it doesn't easily allow you to say "shut down now, even though there's 95% battery left". I may have to leave testing of that for another day.
It hasn't all gone smoothly, but I've accomplished almost all the important things. This is a nice surprise; I'm always hesitant when I estimate how long something will take, because I feel like I have no way of knowing in advance (interruptions, unexpected obstacles...you know the drill). In this case, the time estimates for individual tasks were, in fact, 'way paranoid, but that gave me the buffer that I needed.
One example: after upgrading CentOS, two of our three servers attached to StorageTek 2500 disk arrays reported problems with the disks. Upon closer inspection, they were reporting problems with half of the LUNs that the array was presenting to them -- and they were reporting them in different ways. It had been a year or longer since I'd set them up, and my documentation was pretty damn slim, so it took me a while to figure it out. (Had to sleep on it, even.)
The servers have dual paths to the arrays. In Linux, the multipath drivers don't work so well with these, so we used the Sun drivers instead. But:
cfservd had refused its connection because I had the MaxConnections parameter too low.
I got it fixed in the end, and I expanded the documentation considerably. (49,000 words and counting in the wiki. Damn right I'm bragging!)
Putting off 'til next time, tempted though I am: reinstalling CentOS on the monitoring machine, which due to a mix of EPEL and Dag repos and operator error appears to be stuck in a corner, unable to upgrade without ripping out (say) Cacti. I moved the web server to a backup machine on Tuesday, and I'll be moving it back today; this is not the time to fiddle with the thing that's going to tell me I've moved everything back correctly.
(Incidentally, thanks to Matt for the rubber duck, who successfully talked me down off the roof when I was mulling this over. Man, that duck is so wise...)
Last day today. (Like, ever!) If I remember correctly I'm going to test the water leak detector...and I forget the rest; it's all in my daytimer and I'm too lazy to get up and look right now. Wish me luck.
And best of 2010 to all of you!