12 Jun 2009
If you have space for two PDUs and you put one on each side of the
rack, you will have no separate space for network cables and you'll
get interference. If you put those two PDUs on one side of the rack,
you'll put them on the wrong side and your power cords will interfere
with your network cables. If you put those two PDUs on the correct
side of the rack, you'll find that racking new items is a pain because
the cords block the post holes on that side.
Tags:
serverroom
11 Jun 2009
Gave a tour of the new server room today to about 30-odd people in the
department. Ended on a bit of a low note ("and that's the end! Any
questions?") but other than that it went well. Even got an ounce of
champagne at the end of it.
Oh, and yesterday I found out that our SL-500 has three fibre
channel interfaces, compared to the one interface in the server we
bought. I think the sales folks assumed we had a fibre switch, and I
didn't realize it all (data + control) wouldn't go over one
cable. Arghh.
Just saw a character named Terence on "Entourage" who was not
Terence Stamp. Now I want to see "Bowfinger" and "The Limey", in that
order.
Tags:
hardware
serverroom
backups
10 Jun 2009
Given the recent hoo-ha about abandoned blogs, and my own tendency to
lose interest in writing about something the longer I put it off (I
haven't graphed it, but I suspect it's a nice exponential decay), I
figured I should finally write up what I've been doing the last week:
the move at $WORK to our new server room.
So: construction finally got finished on our new server room. Our UPS
was installed, our racks set up, and the keys handed over (though they
were to be changed again twice). Our new netblock was assigned, the
Internet access at the new location was in place, and movers were
booked.
Things I did in advance which helped immensely:
- Checklist in Org mode, plus printed copies; the ability to constantly edit a nice todo list, complete with checkboxes and statistics, was wonderful.
- Printed copies of the spreadsheet showing rack assignment, cabling requirements, VLAN changes, etc
- Tested new firewall with VMs (thus pointing out that "antispoof quick" is not a good thing to do with a bridging OpenBSD firewall)
- Cardboard for the floor of the new server room to lay the servers on (since we weren't going to be able to rack the machines as quickly as they came from the movers)
Last Thursday morning, it all started. I got the machines shut down
(thank you, SSH and ubiquitous wireless access at UBC) before the two
volunteers who were helping me showed up. We started getting machines
unracked; since it was only about 20 machines, I figured it wouldn't
take too long. While that was true, I had not counted on the rat's
nest of power cables (our power requirements were such that we had to
connect machines to PDUs in adjacent racks), or the fact that we
wouldn't be able to disassemble that 'til we'd got the machines out.
There was one heartstopping moment: a 1U server, while extended on its
rails, came off one of the rails while no one was supporting
it. Amazingly the other rail held on while it rotated quickly through
90 degrees to bang loudly against the rack. "You swear quickly," the
movers remarked. (Doubly amazingly, the machine seems to be fine,
though the rails for the thing are shot.)
The movers were big and burly, which was wonderful when it came to
moving the Thumper. I weigh more than it does, but not by much,
and I'd had the bad fortune to screw up my back a week before the
move. It was tricky trying to figure out how to remove it from the
rails, but the movers' trick of supporting it with a couple of big
blankets, while fully extended from the rack, made such considerations
less urgent. Eventually we got it figured out. I don't know how that
could have gone smoother, since we'd got Sun to rack the thing and,
frankly, it's not like you spend a lot of time un- and re-racking
something like that. Anyhow, a minor point.
The new location was right around the corner, which was handy. The
movers had put the servers in these big laundry-like carts on wheels;
in the end, we only had four of 'em. We got the machines unloaded,
racked the Thumper with the movers' help, signed the paper, then went
off for lunch where we picked up two more volunteers.
After that, we started racking servers. Having only one sysadmin
around (me) proved to be a bottleneck; the volunteers had not worked
with rackmounted machines before, and I kept having to stop what I was
doing to explain something to them. It would have been a great help to
have another admin around; in fact, I think this is the biggest move
I'd want to make without some other admin around.
Problems we ran into:
- Cage nut pullers are small and get lost easily. (Moral: designate one place for tools, just like it sez here)
- Mounting brackets didn't work. One of 'em, I just figured out today, we had in backwards. The other wasn't threaded for the bolts from APC, and I had only the right bolts — no cage nuts to fit. (Moral: photograph the racks for anything non-standard; if you have to ask, it's non-standard)
- One of the things we couldn't mount was a Very Important Disk Array. Fortunately it held a database which had been mirrored on another Very Important Disk Array, which also couldn't be mounted in its brackets. Instead, we used a rack shelf I happened to have around, and that worked well….but its advertised capacity wasn't enough to hold all four trays (2 trays per array), so we made do with one. (Moral: have a spare rack shelf or two on hand)
- The bolts from APC had these enormous heads, which would end up impinging on the rack unit above/below. This got to be a pain. Only today did I discover that there were plenty of bolts and cage nuts provided by the contractor who installed the racks. (Moral: dress rehearsal includes putting cage nuts and bolts in adjacent holes to see how they fit)
- We had to re-hang the PDUs so they'd reach the power supplies. There were two in each rack, and both were on the right; the power supplies were all on the left, and I'd bought a bunch of 2' power cords to help with cable management. (Moral: Think about cable management for power, not just network)
- Another thing about the PDUs: The outlets don't stretch throughout the length of the bar, but instead are clustered such that there's a dead space at the bottom/top 8" or so. The power cables had to be chained together sometimes to reach the extremes. (Moral: dress rehearsal includes plugging things in)
- My plan to mount the switch in the middle of the rack with all the equipment has the advantages of shorter network cables (no running back to front, and no running top to bottom). But I should have noticed the middle empty spot in the PDUs and mounted it there; as it is, there's a block of outlets in the PDUs I can't use because the power cables will get too close to the network cables. (Moral: think about cable management for network, not just power)
- Underestimated the amount of time it'd take to get things racked. I suppose this can only be bettered with experience.
- Underestimated the amount of time it'd take to get cables dressed; did not realize how important this was for working with things.
- Did not bring warm shirt for when the cooling was turned on. Mistake!
- Did not have lots of water on hand; did not figure out in advance where bathroom was (important in a building where you only have access to one room)
- Really could have used a phone in advance in the room; cell coverage was spotty
- Ratchet set very handy when tightening screws in awkward places (i.e., behind the power bar); last resort: hold bit in jaws of pliers/Leatherman. (Moral: dress rehearsal includes looking for tight corners and figuring out how you're going to work in them)
- Preserve all bits and label them; carry masking tape/removable labels and sharpies; label anything and everything you haven't already; use ziplock bags for stuff and tape them to the machines they're associated with
- Firewall not modified to allow LDAPS to LDAP server from new netblock
- Monitoring machine came up with no ethernet interfaces; modprobe tg3 gave "probe of 0000:04:04.0 failed with error -22". (Moral: figure out how you're going to get information off a machine with no network)
- Anyone else notice that C13-C14 power cords are just plain wobbly in the PDU sockets? I had more than one pop out on me while moving cords around. (Moral: Andy Rooney lives!)
- Coulda used more printouts of the rack assignments.
- One cable was flaky: it worked for a while, then didn't. This was the cable that connected our firewall to the ILOMs for the servers, which meant I was unable to work from home on getting them up and running. This was probably for the best; I sorely underestimated just how wired I was when I went home. (Moral: you're more tired than you think)
- One of the racks was designated as the networking rack; however, since we didn't have that many switches to mount, I figured I'd use it for other stuff too. This turned out not to work: the distance between the front and back rails had been shortened to make room for network cables, and that meant the rack rails for the equipment I wanted to mount didn't fit.
Things that went well:
- Ripwrap is awesome. So are cordless drills that come with two batteries.
- The rack rails from Sun that just clip in are also awesome. Man, that makes things fast.
- There was good beer in the fridge when I got home. Thanks, Pre.
- Frankly, all the prep meant that things went pretty well overall. This was good.
I'm going to post this now because if I don't, it'll never get done. I
may come back and revise it later, but better this than nothing at
all.
Tags:
serverroom
emacs
work
hardware
28 Apr 2009
This has been one of those days where all I've done is stare at monitors too closely.
I know, I'm a sysadmin, what do I expect? But some days I get up, move around; I'm sedentary (and introverted) by nature but I try to talk to people, stare off into the distance, get away from my desk. Going to the server room is always a good break.
Not today, though. My carefully-chosen ATI video card (the Radeon 4550) is giving me headaches, metaphorical and real:
- the proprietary fglrx drivers work if you want a cloned display, but enabling Xinerama makes X segfault
- or, interestingly, the fglrx driver will show the desktop on one monitor, and an "uninitialized" (X checker pattern, chunky X cursor) screen on the other
- the radeonhd drivers work perfectly for VGA out, but the DVI out is flickery and "noisy"
Dual monitors are important. My own damn fault for not getting something old enough...
Tags:
work
hardware
linux
24 Apr 2009
I'm testing Bacula 3; the new release has just come out, and I'm very
much looking forward to rolling it out here.
One of the things I've been doing is trying to get TLS working, which
I utterly failed at in my last job. I must've failed to see these
pages, which a) point out that the otherwise-excellent Bacula
manual is (ahem) sparing when it comes to TLS, and b) you need to put
the cert files in places that strike me as unexpected.
Thus, in bacula-dir.conf you put the directives listing the
director's cert/key in the client section — IOW, you say "and
use this key/cert combo when connecting to client foo." Meanwhile, on
client foo, you add the client's cert/key directives in the
director section ("and use this key/cert when talking to the
director"), along with things like the CA cert and required CNs.
Oh, and did you know that you can debug SSL handshakes with
openssl? True story.
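One way to do that, as a rough sketch: the entry doesn't say which openssl subcommands were involved, so the use of s_server and s_client here is my assumption, and the function names, port, and file paths are placeholders. This exercises the cert, key, and CA files against each other away from Bacula; it doesn't talk to a Bacula daemon.

```sh
# serve_handshake PORT CERT KEY CAFILE
# Run in one terminal: accept TLS connections and insist on a client cert
# (-Verify makes a missing client certificate an error).
serve_handshake() {
    openssl s_server -accept "$1" -cert "$2" -key "$3" -CAfile "$4" -Verify 1
}

# probe_handshake HOST:PORT CERT KEY CAFILE
# Run in another terminal: connect, present a client cert, and print the
# handshake details, including the verify result for the server's cert.
probe_handshake() {
    openssl s_client -connect "$1" -cert "$2" -key "$3" -CAfile "$4" </dev/null
}

# Example (hypothetical paths):
#   serve_handshake 4433 director.pem director.key ca.pem
#   probe_handshake localhost:4433 client-foo.pem client-foo.key ca.pem
```

If the certs are fine, the handshake completes on both sides and s_client reports a verify return code of 0; if not, the error usually names the cert or CA it choked on.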
Tags:
backups
toptip
30 Mar 2009
I really dug Charlie Stross' Halting State (link throws the author a few shekels). But now he's declared it obsolete.
Tags:
reading
25 Mar 2009
Thanks to Undeadly and ossowicki for the pointer to
wmname, which fixes the grey java windows problem when using
Awesome or other tiling window managers. No more starting up
Gnome or IceWM to use NetID or Strangebrew, hurrah!
Tags:
25 Mar 2009

Tags:
politics
25 Mar 2009
...that TCP Offload Engines (TOE) were so detested by Linux kernel
folks. The arguments here make interesting reading and seem convincing
to me.
(From Andy Grover's blog.)
Tags:
networking
linux
reading
23 Mar 2009
NetSNMP uses 32-bit counters for disk sizes. Guess what happens when you've got one of these?
Due to be fixed in the next release, so at least that's something.
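For a sense of scale, here's a quick sketch of the arithmetic. It assumes (my assumption, not something I've checked against the NetSNMP source) that the size is counted in 1 KiB units and stored in an unsigned 32-bit field; the 5 TiB filesystem is made up.

```sh
# wrap32 KIB: the value a 32-bit unsigned field would hold for a size in KiB.
# Relies on the shell doing 64-bit arithmetic, as bash and dash do on
# 64-bit systems.
wrap32() {
    echo $(( $1 % 4294967296 ))
}

# A hypothetical 5 TiB filesystem, expressed in KiB.
five_tib_kib=$(( 5 * 1024 * 1024 * 1024 ))
echo "actual:   $five_tib_kib KiB"
echo "reported: $(wrap32 "$five_tib_kib") KiB"
```

Under those assumptions anything past 4 TiB wraps around, so 5 TiB would be reported as 1 TiB. If the field is signed, the ceiling is 2 TiB instead.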
Tags:
networking
hardware
20 Mar 2009
Actually for a whole office. Excellent reading. Wish I'd known about this at $JOB-2...
Tags:
hardware
20 Mar 2009
With the move to the server room coming up in a couple months, I've
been spending some time trying to lay out the racks we'll have
there. My current layout is in an OpenOffice spreadsheet; I thought
I'd try some other tools and see how they shape up.
- APC Configurator - Windows only. I did try this a while ago and found it wasn't bad, but no power calculations — one of the reasons I went to a spreadsheet in the first place.
- AusrackID - Flash-based, so works on Firefox on Linux. Very nice, but a limited number of hardware choices, so there are Apple Xserves but no Thumpers. There's no way to add hardware and no real generic choices ("4U server", "12-disk 8U array", etc). Also no power calculations.
- RackTables - GPL'd LAMP app, so dead easy to install. Not bad at all, but it's an early version (0.16.6) so the interface is a bit clumsy and there are lots of features planned RSN. Aims to be a cross between a server room planner and an asset tracker, so that might not fit in with my planned use of GLPI. No power calculations, though there is a request to add SNMP monitoring of APC PDUs.
Still sticking with a spreadsheet for now; it's not the best, but it
is flexible and quick. Any other tools I missed?
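The power calculation itself is simple enough to do outside a spreadsheet. A minimal sketch, assuming the layout is exported as CSV with rack, machine, and watts columns; the function name, column order, and all the numbers are made up for illustration:

```sh
# rack_watts: read "rack,machine,watts" lines on stdin and print the total
# watts per rack, sorted by rack name.
rack_watts() {
    awk -F, '{ total[$1] += $3 } END { for (r in total) print r, total[r] }' | sort
}

# Hypothetical inventory; the wattages are not real measurements.
printf '%s\n' \
    'rack1,web01,350' \
    'rack1,db01,750' \
    'rack2,array01,1200' | rack_watts
```

Comparing each total against what the PDUs and circuit in that rack can deliver is the part that still needs a human.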
Tags:
hardware
19 Mar 2009
We've got a new server room being built right now; it should be done in about six weeks, so I'm putting together an order for bits and pieces that I'll need.
I've mentioned before that cable management is one thing I get obsessed about, so this site is like porn for me. I'm not shilling for them; haven't ordered from them, no idea if they kill puppies in their spare time or what, but holy CRAP this is all the stuff I've ever wanted: RipWrap (so that's what it's called!), label printers, 87 varieties of zap straps, and I don't know what all.
Wow. Just wow.
Edit: Okay, seriously. There's some really good stuff in here among the advertisements.
Tags:
hardware
19 Mar 2009
I just noticed that the Free Software Foundation is putting
together what they call a "book sprint" — kind of like the 3-day novel
writing contest — to write an intro to the command line for
newbies. They're hoping to get it done by next Monday (!).
I like the idea of this project a lot; if I can get some spare time
this weekend, I'll definitely be dropping by.
Tags:
reading
18 Mar 2009
Okay, I feel like a bit of a tool for never realizing how cool
suspend-to-ram is in a laptop. My new laptop for work is a Dell D630,
which I'd got 'cos its hardware is pretty much completely compatible
w/Linux. However, I've also figured out that a) Ubuntu does
suspend-to-ram quite nicely (aside from a couple times when the
keyboard doesn't work, but closing/reopening the lid makes it work),
and b) it just sips — sips, I tell you! — from the battery.
Now to try and make it work on my own laptop, which is currently
sitting at the shop waiting for me to pick it up.
Today's agenda:
- Install new 48-port switch in server room
- Update Fedora Directory Server wiki page on building RPMs for/on CentOS
- Set up mail server to accept mail for older, semi-deprecated domain
- Drink coffee, catch up on sleep
See? I am still a sysadmin.
Tags:
linux
hardware
networking
ldap
17 Mar 2009
Just spent an hour trying to debug why a simple Nagios check script
was not working. It basically ran lynx -dump | grep desired
string, but for some reason was utterly failing to work.
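For the record, a check of that shape looks something like the sketch below. This is not the original script; the function name, URL, and search string are placeholders. The exit codes are the usual Nagios plugin convention: 0 for OK, 2 for CRITICAL.

```sh
# check_text PATTERN: read page text on stdin, print a one-line status,
# and return a Nagios-style exit code (0 = OK, 2 = CRITICAL).
check_text() {
    if grep -q -- "$1"; then
        echo "OK - found '$1'"
        return 0
    fi
    echo "CRITICAL - '$1' not found"
    return 2
}

# Typical use (placeholder URL and string):
#   lynx -dump http://www.example.com/ | check_text 'desired string'
```

Keeping the fetch and the match separate means the matching half can be tried by hand with any text piped in.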
Eventually I thought to get the script to print out its
environment. It turned out that my own environment variables had
leaked to the nagios program itself; as a result, lynx was trying (and
failing) to open /home/hugh. /etc/init.d/nagios did not (properly?
perhaps) clean the environment as I assumed it had. I changed my
Makefile to run env -i /etc/init.d/nagios restart, and now it works
just fine.
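A small demonstration of what env -i does, using a throwaway HOME value: the command starts from an empty environment and sees only the variables named on the env command line. The explicit PATH is my addition so the inner command can be found.

```sh
# Without -i the child would inherit HOME; with -i it gets only what is listed.
scrubbed=$(HOME=/home/hugh env -i PATH=/usr/bin:/bin env)
echo "$scrubbed"

# The same idea applied to the init script (PATH value is an assumption):
#   env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin /etc/init.d/nagios restart
```

The only line printed is the PATH assignment; HOME never reaches the child.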
(Incidentally, I love Makefiles as a way of scripting stuff you run
over and over and over again. Yeah, they're clumsy and I'm not doing
anything I couldn't do with a simple script -- but it's a timesaver to
just run "make" and be done with it.)
Tags:
25 Feb 2009
After this entry about the difference between push and pull for
Mercurial, and how that doesn't fit with the way I instinctively want
to use a repository, it's interesting to read Ted Tso responding
to a similar complaint from a git user. Tso explains the discrepancy
well:
Part of the problem here is that for most git workflows, most people
don't actually use "git push". ....in most large projects, the number
of people [who] need to use the "scm push" command is a very small
percentage of the developer population, just as very few developers
have commit privileges...
Ah, but in a distributed SCM world, things are more
democratic....While this is true, the number of people who need to be
able to publish their own branch is small....
There is one exception to this, of course, and this is a developer
who wants to get started using git for a new project which he or she
is starting and is the author/maintainer, or someone who is
interested in converting their project to git.
The whole entry, plus the comments, is worth reading.
Tags:
revisioncontrol
24 Feb 2009
As mentioned on Undeadly.org and openbsd-misc, OpenBSD is
asking for donations for BGP routers and a new CVS server. I've
donated, since I wouldn't be able to do half my job without them; if
you feel the same and can spare some money, I urge you to do the same.
Tags:
bsd
wontyoupleaselendahand
24 Feb 2009
Last week was reading week here at UBC. Monday I was off sick. Tuesday
we got an email from the folks at the building where we've got guest
access to one of their server rooms: the cooling was being shut down
from 7am on Wednesday to 3pm on Thursday, so we'd have to turn off our
servers. We're guests, so it's not like we've got a lot of say in the
matter.
Natch, Thursday 3pm came and went. We got an email at 3:45pm from a
manager there, saying that unexpected problems had arisen; they were
hoping to have things back up by the weekend. That night I pointed our
website at a backup server; it was not serving my boss' big web app,
as there was no way to make that tiny little box serve a nearly 1TB
database.
Friday I obsessed over the ambient temperature on our firewall (which
I'd left turned on); it hovered around 35C. Around 10am we were told
that they were hoping to have it on later that day, but that another
shutdown might need to be scheduled for the next week (this week). At
noon we were told that things were looking hopeful, but they couldn't
guarantee cooling over the weekend.
At 2pm I found a local A/C rental agency who told us they'd be out to
look at the room on Monday. 4pm I emailed my contact at the other
department, plus his manager, to ask for updates and whether any
further shutdowns could be scheduled after we'd arranged for cooling.
Over the weekend I obsessed over the temperature some more; it had
dropped to 21C and stayed there, but without feedback from the
facilities people I was reluctant to trust it.
Monday (yesterday; wow, time flies) we were told that the cooling
system should perform well; however, a part still needed to be
replaced. It was on order and would be coming in late this week or
early next, and would require a four-hour outage.
This morning the cooling guy visited (he was at a funeral yesterday,
so fair enough) and said that, yep, we could get a nice portable unit
in for around $400 for a week.
I'm not writing this down because I'm proud of how I handled this. I'm
writing this down so that someone else can maybe learn the things I
should've known:
- If the cooling is going to be down, arrange for backup. This can be cheap if it's a small room, and it's a hell of a lot nicer than being at other people's mercy.
- Outage times are estimates, and you should treat them as such.
- 4pm on a Friday afternoon is not the time to bring up questions that should have been raised on Tuesday.
I have a habit of thinking "There's not much that can be done about
that." Actually, it goes even further than that; it doesn't occur to
me sometimes to think about what can be done. I'm not sure if this is
lack of confidence, or trying too hard to get along, or just sheer
laziness, but I'm trying hard to stop doing that.
Tags:
hardware
warstory
19 Feb 2009
Nicks' post on customizing your home was interesting. Over the
last year or so, I've been slowly improving the way I do this. My
results have been mixed, probably because of the way I use
Mercurial.
So I've got a repo to keep my dotfiles. There's a truly awful script
that will symlink the real files to the repo, and doesn't clobber the
originals more than one time out of three. I clone to work, or to a
laptop, and start customizing. Overall, I feel like this should
work…but it's decidedly awkward.
Let's take the case of bash init files. I've got mine divided into
.bashrc and .bashrc_local. The latter, as you'd expect, is
machine/situation-specific (ssh aliases, commands for work,
etc.). .bashrc sets various aliases and functions that are unlikely to
change. Just before exporting all the environment variables,
.bashrc_local is sourced, which gives me a chance to override
anything.
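Roughly, the tail end of .bashrc looks like the sketch below. The variable names and defaults are stand-ins rather than my real settings, and the load_local helper exists only to keep the example easy to try.

```sh
# Defaults that should be the same on every machine.
EDITOR=vi
PAGER=less

# load_local FILE: source FILE if it is readable; a missing file is not an error.
load_local() {
    [ -r "$1" ] && . "$1"
    return 0
}

# Machine-specific overrides go in last, just before the exports.
load_local "$HOME/.bashrc_local"

export EDITOR PAGER
```

Because the local file is sourced after the defaults and before the export, anything it sets simply wins.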
.bashrc should be in the repo; no question about that. But
.bashrc_local should be there too, since I may clone my repo at work
(say) to another filesystem. Since Mercurial is distributed, there's
no problem with this, except when it comes to merging things back
home. Since I think about home as The One True Repo, I want to keep
everything there. But usually I've run hg push ssh://home, which
promptly clobbers .bashrc_local there (at least when I do an hg
update). Or if I merge from home, I end up creating new heads in my
repo, and a multi-headed repo can't be pushed. (I'm fuzzy on the
details; usually when this happens I bang away at it randomly until
merges happen, and swear until I'm blind.)
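For what it's worth, the order Mercurial seems to want is pull, merge locally, then push. A sketch under some assumptions: a clean working directory, a single named branch, and at most two heads after the pull. The function name is made up, and ssh://home is just the remote from the paragraph above.

```sh
# sync_with_home [REMOTE]: pull first, merge if the pull left extra heads,
# then push the merged result back.
sync_with_home() {
    remote=${1:-ssh://home}
    hg pull "$remote" || return 1
    if [ $(hg heads -q | wc -l) -gt 1 ]; then
        # Two heads: merge them here so the push does not create a new
        # head on the remote.
        hg merge && hg commit -m "Merge with $remote" || return 1
    else
        hg update || return 1
    fi
    hg push "$remote"
}
```

This doesn't settle whose .bashrc_local wins; the merge step is where that gets decided, by hand if the two sides conflict.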
As outlined here, the difficulty is probably in the way I use
Mercurial and the way I've become used to SVN's (and CVS's) idea of
branches that look like directories (and are thus very, very visible
and easy for me to think about). xyld says, "I'm fed up with having to
do hg merge and not actually merge anything, but just to satisfy the
Mercurial internals." That's pretty much how I'm starting to
feel. There's the option of doing pull, rather than push, to
cherrypick the changes I want, but it's still a bit awkward for me to
think about.
I understand SVN; it fits well with my brain, which is not a
developer's. I understand hg, and I like the idea of distributed repos
for certain things. But xyld's comments about switching to git
resonate with me, and I may start trying that out.
Tags:
revisioncontrol