The Life of a Sysadmin

Carousel is a lie!

Entries tagged "backups".

bsdstats.org
Wed Sep 5 06:01:54 PDT 2007

Came across a mention of BSDstats.org on the Dragonfly BSD Digest, and I've set it up on my home machine. There are a ton of FreeBSD machines, and only 64 OpenBSD clients reported…time to change that!

I'm reading the documentation for Bacula right now, and it's amazing. Clearly written, thorough and extensive — almost 800 pages long. I'm very impressed.

Tags: backups, bsd.
Quickbooks, Bacula
Fri Sep 14 12:21:50 PDT 2007

I was able to get Quickbooks 2007 working with a non-admin account today…woot! Here's what I did:

This isn't ideal — the explorer process in QB is still running privileged — but at least that's the only IE process running as admin.

And Bacula: tripped over a small thing. I'm running the btape utility to make sure our tape drive works with it. I ran bfill, rather than fill, then wondered why I got errors at the end. Turns out to be an old command that probably shouldn't be around anymore.

Now to run fill…another couple hours to go.

Tags: backups, windows.
That's a mighty big catchup I got goin' there
Thu Sep 25 06:14:13 PDT 2008

Work...hell, life is busy these days.

At work, our (only) tape drive failed a couple of weeks ago; Bacula asked for a new tape, I put it in, and suddenly the "Drive Error" LED started blinking and the drive would not eject the tape. No combination of power cycling, paperclips or pleading would help. Fortunately, $UNIVERSITY_VENDOR had an external HP Ultrium 960 tape drive + 24 tapes in a local warehouse. Hurray for expedited shipping from Richmond!

Not only that, the Ultrium 3 drive can still read/write our Ultrium 2 media. By this I mean that a) I'd forgotten that the LTO standard calls for R/W for the last generation, not R/O, and b) the few tests I've been able to do with reading random old backups and reading/writing random new backups seem to go just fine.

Question for the peanut gallery: Has anyone had an Ultrium tape written by one drive that couldn't be read by another? I've read about tapes not being readable by drives other than the one that wrote it, but haven't heard any accounts first-hand for modern stuff.

Another question for the peanut gallery: I ended up finding instructions from HP that showed how to take apart a tape drive and manually eject a stuck tape. I did it for the old Ultrium 2. (No, it wasn't an HP drive, but they're all made in Hungary...so how many companies can be making these things, really?) The question is, do I trust this thing or not? My instinct is "not as far as I can throw it", but the instructions didn't mention anything one way or the other.

In other news, $NEW_ASSIGNMENT is looking to build a machine room in the basement of a building across the way, and I'm (natch) involved in that. Unfortunately, I've never been involved in one before. Fortunately, I got training on this when I went to LISA in 2006, and there's also Limoncelli, Hogan and Chalup to help out. (That link sends the author a few pennies, BTW; if you haven't bought it yet, get your boss to buy it for you.)

As part of the movement of servers from one data centre across town to new, temporary space here (in advance of this new machine room), another chunk of $UNIVERSITY has volunteered to help out with backups by sucking data over the ether with Tivoli. Nice, neighbourly think of them to do!

I met with the two sysadmins today and got a tour of their server room. (Not strictly necessary when arranging for backups, but was I gonna turn down the chance to tour a 1500-node cluster? No, I was not.) And oh, it was nice. Proper cable management...I just about cried. :-) Big racks full of blades, batteries, fibre everywhere, and a big-ass robotic Ultrium 2 tape cabinet. (I was surprised that it was 2, and not U3 or U4, but they pointed out that this had all been bought about four or five years ago…and like I've heard about other government-funded efforts, there's millions for capital and little for maintenance or upgrades.)

They told me about assembling most of it from scratch...partly for the experience, partly because they weren't happy with the way the vendor was doing it ("learning as they went along" was how they described it). I urged them to think about presenting at LISA, and was surprised that they hadn't heard of the conference or considered writing up their efforts.

Similarly, I was arranging for MX service for the new place with the university IT department, and the guy I was speaking to mentioned using Postfix. That surprised me, as I'd been under the impression that they used Sendmail, and I said so. He said that they had, but they switched to Postfix a year ago and were quite happy with it: excellent performance as an MTA (I think he said millions of emails per day, which I think is higher than my entire career total :-) and much better Milter performance than Sendmail. I told him he should make a presentation to the university sysadmin group, and he said he'd never considered it.

Oh, and I've completely passed over the A/C leak in my main job's server room…or the buttload of new servers we're gonna be getting at the new job…or adding the Sieve plugin for Dovecot on a CentOS box...or OpenBSD on a Dell R300 (completely fine; the only thing I've got to figure out is how it'll handle the onboard RAID if a drive fails). I've just been busy busy busy: two work places, still a 90-minute commute by transit, and two kids, one of whom is about to wake up right now.

Not that I'm complaining. Things are going great, and they're only getting better.

Last note: I'm seriously considering moving to Steve Kemp's Chronicle engine. Chris Siebenmann's note about the attraction of file-based systems for techies is quite true, as is his note about it being hard to do well. I haven't done it well, and I don't think I've got the time to make it good. Chronicle looks damn nice, even if it does mean opening up comments via the web again…which might mean actually getting comments every now and then. Anyhow, another project for the pile.

Tags: backups, hardware, lisa, meta, networking, work.
Blogger.com hates me. I mean, +1 for Bacula
Thu Oct 16 16:25:16 PDT 2008

Matt asked how Amanda worked for people, and whether they'd recommend anything else. I tried to leave a comment, but Blogger's CAPTCHA (god, I hate that acronym) never seems to work for me. So here goes. (Irony of a man w/an email-based comment system complaining about someone else's left as exercise f/t reader.)

Amanda: Nice, but: At my last job (2.5 years ago now), we started running into problems when backing up a 1TB RAID5 array...simple Promise disk array, nothing special or terribly fast. Amanda would take hours to do an estimate of the backups…which, since Amanda tries to pack tapes as full as it can, it does all the time. This got to be a huge pain, and we didn't find a solution to this problem before I left. (We were using GNU tar for Amanda; not sure if that had anything to do with it, and I can't remember what the alternatives were…maybe dump? Dunno.) Not sure what the current state is.

Bacula: +1 on the nice. Very, very good at my current job; absolutely no problems with it at all. And the documentation is enough to cry for, it's so complete and wonderful and thorough and accurate and well done. Clients for Unix, Windows, and Mac. Total filesystmes here are…uh…less than 1TB, definitely, although it's creeping up there. So the smaller size may have something to do with it.

Tags: backups.
Sleep!
Wed Feb 4 20:43:54 PST 2009

I can't believe it...my youngest son, after nearly three weeks of being up four or five times each night, slept nearly all the way through without a break: he only woke up at 1am and 5:15am, which is close enough to my usual wakeup time as makes no difference. It was wonderful to have a bit of sleep.

This comes after staying up late (11pm!) on Sunday bottling the latest batch of beer, a Grapefruit Bitter recipe from the local homebrew shop. You know, it really does taste like grapefruit, and even this early I'm really looking forward to this beer.

My laptop has a broken hinge, dammit. I carry it around in my backpack without any padding, so I guess I'm lucky it's lasted this long. Fortunately the monitor still works and mostly stays upright. I've had a look at some directions on how to replace it; it looks fiddly, but spending $20 on a new set of hinges from eBay is a lot more attractive than spending $100. Of course, the other consideration is whether I can get three hours to work on it….But in the meantime, I've got it on the SkyTrain for the first time in a week; it's been hard to want to do anything but sleep lately.

Work is still busy:

Update: turned out to be an MTU problem:

I had no idea there were GigE NICs that did not support Jumbo frames. Though maybe that's just the OpenBSD driver for it. Hm.

Tags: backups, beer, geekdad, hardware, networking, web.
Bacula over TLS at last!
Fri Apr 24 11:17:08 PDT 2009

I'm testing Bacula 3; the new release has just come out, and I'm very much looking forward to rolling it out here.

One of the things I've been doing is trying to get TLS working, which I utterly failed at in my last job. I must've failed to see these pages, which a) point out that the otherwise-excellent Bacula manual is (ahem) sparing when it comes to TLS, and b) you need to put the cert files in places that strike me as unexpected.

Thus, in bacula-dir.conf you put the directives listing the director's cert/key in the client section — IOW, you say "and use this key/cert combo when connecting to client foo." Meanwhile, on client foo, you add the client's cert/key directives in the director section ("and use this key/cert when talking to the director"), along with things like the CA cert and required CNs.

Oh, and did you know that you can debug SSL handshakes with openssl? True story.

Tags: backups, toptip.
Bacula, gossip, advice
Thu Jul 2 16:31:35 PDT 2009
This sounds like when I was at my previous employer and they asked if
I could develop a web-based system to take surveys.  I nearly said,
"yes" because, well, I know perl, I know CGI, and I could do it.
However, I was smart enough to say "no, but surveymonkey.com will do
it for cheap."  Best of all it was self-service and the HR person was
able to do it entirely without me.  If I had said I could write such a
program, it would have been days of back-and-forth changes which would
have driven me crazy.  Instead, she was happy to be empowered to do it
herself.  In fact, doing it herself without any help became a feather
in her cap.

The lesson I learned is that "can I do it?" includes "do I want to do
it?".  If I can do something but don't want to, the answer is, "No, I
don't know how" not "I know how but don't want to".  The first makes
you look like you know your limits.  The latter sounds like you are
just being difficult.
Tags: backups, cfengine, reading.
GPT and MBR
Fri Jul 3 12:17:25 PDT 2009

I've run into an interesting problem with the new backup machine.

It's a Sun X4240 with 10 x 15k disks in it: 2 x 73GB (mirrored for the OS) and 8 x, um, a bunch (250GB?), RAID0 for Bacula spooling. (I want fast disk access, so RAID0 it is.) RAID is taken care of by an onboard RAID card, so these look like regular disks to Linux.

Now the spool disk works out to about 2.2TB or so — which is big enough to make baby fdisk cry:

WARNING: The size of this disk is 2.4 TB (2391994793984 bytes).
DOS partition table format can not be used on drives for volumes
larger than 2.2 TB (2199023255040 bytes). Use parted(1) and GUID
partition table format (GPT).

Well, okay, haven't used parted before but that's no reason to hold back. I follow directions and eventually figure out that mkpart gpt ext3 0 2392G will do what I want. GPT? Piece of cake! And then I rebooted, and I couldn't boot up again. Blank screen after the POST. Crap!

The first time this happened, the reboot also coincided with some additional problems during the POST where too many cards were trying to shove their ROM into the BIOS memory (or some such); I thought the two were connected. But then I did it again today, and I finally started digging.

The problem is that parted overwrites the MBR when setting up a GPT disklabel. This has been noted and argued over. My understanding of the two sides of the debate is:

Meanwhile, the parted camp has a number of bugs dealing with this very issue, two opened a year ago, and none have any response in them.

This enterprising soul submitted a patch back in December 2008, which appears to have fallen to the floor.

As for me, I was able to convince the BIOS to boot from the smaller disk, and then get a rescue CentOS image going via PXE booting, and then reinstall grub on the smaller disk. Sorted. All I had to do was change root (hd1,0) to `root (hd0,0) in grub.conf.

A touch anti-climactic after all that, perhaps. But it was interesting a) to learn about all this (I hadn't really thought about successors to the DOS partition format before), and b) to see what a slender thread we (okay, I) hang our hopes on sometimes. It's a necessary, sobering thing to realize how much of what I use, depend on, believe in is created by volunteers who are smart, hard-working people — they argue and and focus and forget just like real people, not inhabitants of some shining city on a hill I sometimes take them for ("Next beer in Jerusalem!").

Tags: backups, hardware, linux.
Zombie bacula-sd and open port
Mon Jul 6 10:23:04 PDT 2009

Weird...Just ran into a problem with restarting bacula-sd. For some reason, the previous instance had died badly and left a zombie process. I restarted bacula-sd but was left with an open port:

# sudo netstat -tupan | grep 9103
tcp        0      0 0.0.0.0:9103                0.0.0.0:* LISTEN      -

which meant that bconsole hung every time it tried to get the status of bacula-sd. Unsure what to do, I tried telnetting to it for fun and then quit; after that the port was freed up and grabbed by the already-running storage daemon:

tcp        0      0 0.0.0.0:9103                0.0.0.0:* LISTEN      16254/bacula-sd

and bconsole was able to see it just fine:

Connecting to Storage daemon tape at bacula.example.com:9103

example-sd Version: 3.0.1 (30 April 2009) x86_64-example-linux-gnu example
Daemon started 06-Jul-09 10:18, 0 Jobs run since started.
 Heap: heap=180,224 smbytes=25,009 max_bytes=122,270 bufs=94 max_bufs=96
Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8
Tags: backups, networking.
Two x "aha!" re: Bacula
Tue Jul 21 14:49:50 PDT 2009

First, it occurred to me today that the problems I've been having with bacula-sd dying or becoming unresponsive may be because of the way Nagios has monitored it. I've been using the check_tcp plugin, and when I looked on the backup machine there were, at one point, 21 connections to the sd port. Half were from the monitoring machine and were in the CLOSE_WAIT state. The max concurrent jobs for -sd is set to 20. I've turned off Nagios monitoring for now; we'll see how that does.

Second -- edit: sorry, stupid error. I withdraw the point.

Tags: backups.
Eject, *then* reboot
Sat Oct 3 13:56:22 PDT 2009

Ran into a little problem this week when I tried to do a restore from a backup at work. Bacula loaded the tape, then said it couldn't read the label. Wha?

After much investigation, during which I completely neglected to cut-n-paste the error messages, I think I've figured out what happened:

Ack. Needless to say, this was not good. Fortunately, the file in question was not a terribly important one; unfortunately, that's about the last 2 weeks of incrementals gone. Lesson learned: don't assume your backup program knows what's going on when hardware reboots from under it.

In other news: on Thursday I got 5 new Dell servers. Woot! One of 'em will be our new LDAP/web/email/FTP server (Xen ftw!); the rest are going to be running protein search engines for various researchers across BC. They're racked and I'm stoked, except that it turns out the difference between the DRAC6 Express and Enterprise, besides a few hundred dollars, is that the Enterprise does console redirection and the Express doesn't. Dammit.

I'm going to see if there's any trickery that can be done, but I'm not holding out hope. I have got a 32-port console server, but it's two racks away...might have to run a small batch o' cables up and over to make this work.

2 comments. Tags: backups, dell, hardware, oops, virtualization.
Randomized Updates
Tue Feb 23 05:55:06 PST 2010

Backups: Bacula has been giving me problems the last week or so. I've got this file server I'm trying to back up; it's got a 2TB partition, and I've been naively trying to just grab it all in one go. Partly that's because it hasn't been backed up before, and I figured this'd be the quickest, simplest way to get going.

What's happened is that after slurping 2 TB over a 100 Mbit connection (no, there's no way to make that quicker), which takes 53 hours, the writing to tape fails for reasons I've yet to figure out. Bacula doesn't say "Oh, the first bit worked so I can just grab that next time...." (To be fair, that's probably a much harder problem than I imagine.) And in the meantime, despite having two drives and two pools of tapes, backups for other stuff pile up behind this big backup and then don't work: they get put on spool space, but then despooling to tape fails.

Contact manglement: I've been looking for a contact management program for $WORK. Requirements:

This turns out to be surprisingly hard to find, and not just because Freshmeat's interface is terrible. Applications appear to fall into n categories:

So now I'm trying to decide between using Dadabik, which'll let me make a frontend w/o much work as long as I can come up with a schema, or modifying one of the complete-but-bletcherous apps and getting a prettier page. (I'm always paranoid about people refusing to use a web-based tool because it isn't pretty enough; I don't know how to make it prettier and it's not something I personally care about enough to do something about, so I'm caught between don't care and don't know how to fix it if I do care. As a result I panic.)

Family: Son #2 went to the hospital Sunday night with his mom; he's fine, but I was up 'til they got back at midnight. Still got up at 5:30am as usual, thinking I'd catch up last night. Then Son #1 had a bad nightmare last night and it took a while to get him calmed down. Spent a couple hours after that staring at the ceiling, trying to get myself calmed down. Still up at 5:30am as usual.

Dentist: Root canal didn't work. My former dentist, who is the second most graceless dentist I've ever seen, couldn't get through and referred me to an endodontist (someone who does root canals; thank you, Wikipedia). My appointment for them is on April 1st.

And that is that.

2 comments. Tags: backups, geekdad.
It's a race to the finish
Thu Feb 25 10:04:52 PST 2010

I mentioned that I've been having problems with Bacula recently. These have been aggravated by the fact that the trigger seems to be a job that takes 53 hours to finish.

Well, I think I've got a handle on one part of the problem. See, when Bacula is doing this big job, other jobs stack up behind it -- despite having two tape drives, and two separate pools of tapes, and concurrent jobs set up, the daily jobs don't finish. The director says this:

9279 Full    BackupCatalog.2010-02-20_21.10.00_10 is waiting for higher priority jobs to finish
9496 Full    BackupCatalog.2010-02-23_21.10.00_13 is waiting execution
9498 Full    bigass_server-d_drive.2010-02-24_03.05.01_15 is running
9520 Increme  little_server-var.2010-02-24_21.05.00_38 is waiting on Storage tape
9521 Increme  little_server-opt.2010-02-24_21.05.00_39 is waiting on max Storage jobs

but storage says this:

Running Jobs:
Writing: Full Backup job bigass_server-d_drive JobId=9498
Volume="000031"
pool="Monthly" device="Drive-0" (/dev/nst1)
spooling=1 despooling=0 despool_wait=0
Files=708,555 Bytes=1,052,080,331,191 Bytes/sec=11,195,559
FDReadSeqNo=22,294,829 in_msg=20170567 out_msg=5 fd=16
Writing: Incremental Backup job little_server-var JobId=9508 Volume="000017"
pool="Daily" device="Drive-1" (/dev/nst0)
spooling=0 despooling=0 despool_wait=1
Files=156 Bytes=3,403,527,093 Bytes/sec=72,415
FDReadSeqNo=53,041 in_msg=52667 out_msg=9 fd=9
Writing: Incremental Backup job little_server-etc JobId=9519 Volume="000017"
pool="Daily" device="Drive-1" (/dev/nst0)
spooling=0 despooling=0 despool_wait=0
Files=9 Bytes=183,606 Bytes/sec=3
FDReadSeqNo=72 in_msg=50 out_msg=9 fd=10
Writing: Incremental Backup job other_little_server-etc JobId=9522 Volume="000017"
pool="Daily" device="Drive-1" (/dev/nst0)
spooling=0 despooling=0 despool_wait=1
Files=5 Bytes=182,029 Bytes/sec=3
FDReadSeqNo=45 in_msg=32 out_msg=9 fd=19
Writing: Incremental Backup job other_little_server-var JobId=9525 Volume="000017"
pool="Daily" device="Drive-1" (/dev/nst0)
spooling=0 despooling=0 despool_wait=0
Files=0 Bytes=0 Bytes/sec=0
FDSocket closed

Out of desperation I tried running "unmount" for the drive holding the daily tape, thinking that might reset things somehow...but the console just sat there, and never returned a prompt or an error message. Meanwhile, storage was logging this:

cbs-01-sd: dircmd.c:218-0 <dird: unmount SL-500 drive=1
cbs-01-sd: dircmd.c:232-0 Do command: unmount
cbs-01-sd: dircmd.c:596-0 Try changer device Drive-0
cbs-01-sd: dircmd.c:617-0 Device SL-500 drive wrong: want=1 got=0 skipping
cbs-01-sd: dircmd.c:596-0 Try changer device Drive-1
cbs-01-sd: dircmd.c:612-0 Found changer device Drive-1
cbs-01-sd: dircmd.c:625-0 Found device Drive-1
cbs-01-sd: block.c:133-0 Returning new block=39cee10
cbs-01-sd: acquire.c:647-0 JobId=0 enter attach_dcr_to_dev

...and then just hung there. "Aha, race condition!" I thought, and sure enough a bit of searching found this commit in November: "Fix SD DCR race condition that causes seg faults". No, I don't have a segfault, but the commit touches the last routine I see logged (along with a buncha others).

This commit is in the 5.0.1 release; I wasn't planning to upgrade to this just yet, but I think I may have to. But I'm going on vacation week after next, and I'm reluctant to do this right before I'm away for a week. What to do, what to do...

1 comments. Tags: backups, bug, debugging, upgrades.
Come from
Tue Mar 16 21:37:23 PDT 2010

I've been off on vacation for a week while my parents have been visiting. It's been fun, relaxing, and generally a good time. But today it was back to work.

I found out that backups have not been running since the day after I went on vacation. There were something like 650 jobs in Bacula stacked up, waiting to run; the storage daemon was stuck trying to do something, and nothing was getting done.

Near as I can tell, the storage daemon was stuck in a deadlock. I've got some backtraces, I've posted to the devel mailing list, and it looks there's a bug, recently fixed, that addresses the problem. Inna meantime I've put in Nagios monitoring for the size of the run queue, and I'm figuring out how to watch for the extra bacula-sd process that showed up.

Tonight, at least, the same problem happened again. This is good, because now I have a chance to repeat it. Got another backtrace, killed the job, and things are moving on. Once things are finished here, I think I'm going to try running that job again and seeing if I can trigger the problem.

Sigh. This was going to be a good day and a good post. But it's late and I've got a lot to do.

1 comments. Tags: backups, bacula, debugging.
What I've been banging my head against
Tue Mar 23 11:30:48 PDT 2010

I think I've finally figured out what's going on with my bacula-sd hangs. At the risk of repeating myself, this post is adapted from the bug report I've just filed.

Here's the situation: with the latest Bacula release (5.0.1), I regularly see bacula-sd hang when running jobs; it happens often and at seemingly random times; and when this happens, I see two bacula processes, a parent and a child. (Since bacula is multi-threaded, I usually just see one storage daemon process.) This came to my attention when I came back from vacation to find a week's worth of backups stalled (sigh).

When bacula-sd hangs, the traceback of the relevant thread in the parent process's looks like this:

Thread 10 (Thread 0x466c0940 (LWP 12926)):
#0  0x00000035aa4c5f3b in read () from /lib64/libc.so.6
#1  0x00000035aa46cc07 in _IO_new_file_underflow (fp=<value optimized out>) at fileops.c:590
#2  0x00000035aa46d5ce in _IO_default_uflow (fp=<value optimized out>) at genops.c:435
#3  0x00000035aa468e8b in _IO_getc (fp=<value optimized out>) at getc.c:41
#4  0x00002b76479565c0 in bfgets (s=0x466bf710 "", size=<value optimized out>, fd=0x60080a0) at bsys.c:617
#5  0x000000000040d432 in release_device (dcr=0x60988b8) at acquire.c:533
[snip]

Here, bacula's storage daemon has just finished running a job, and before it releases the tape drive to someone else runs the "Alert" command. This is specified in the config file for the storage daemon, and is meant to see if the drive has, say, run out of magnetism during the last job. Here's the source code in stored/acquire.c:

  alert = get_pool_memory(PM_FNAME);
  alert = edit_device_codes(dcr, alert, dcr->device->alert_command, "");
  bpipe = open_bpipe(alert, 0, "r");
  if (bpipe) {
     while (fgets(line, sizeof(line), bpipe->rfd)) {  /* AardvarkNote: This is where the parent hangs */
        Jmsg(jcr, M_ALERT, 0, _("Alert: %s"), line);
     }
     status = close_bpipe(bpipe);
  }

Meanwhile, the child process stack looks like this:

Thread 1 (Thread 0x466c0940 (LWP 13000)):
#0  0x00000035aa4df9ee in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00000035aa4d06a5 in _L_lock_1206 () from /lib64/libc.so.6
#2  0x00000035aa4d05be in closelog () at ../misc/syslog.c:419
#3  0x00002b764795cd35 in open_bpipe (prog=<value optimized out>, wait=0, mode=<value optimized out>) at bpipe.c:138
#4  0x000000000040d3f1 in release_device (dcr=0x60988b8) at acquire.c:531
[snip]

open_bpipe() can be found in lib/bpipe.c; it's a routine for forking a child process (FORESHADOWING: not another thread!) and sets up a pipe between parent and child. The relevant bits look like this:

/* Start worker process */
switch (bpipe->worker_pid = fork()) {
[snip]
case 0:                            /* child */
   if (mode_write) {
      close(writep[1]);
      dup2(writep[0], 0);          /* Dup our write to his stdin */
   }
   if (mode_read) {
      close(readp[0]);             /* Close unused child fds */
      dup2(readp[1], 1);           /* dup our read to his stdout */
      dup2(readp[1], 2);           /*   and his stderr */
   }
   /* AardvarkNote: This is where the child hangs: */
   closelog();                     /* close syslog if open */
   for (i=3; i<=32; i++) {         /* close any open file descriptors */
      close(i);
   }
   execvp(bargv[0], bargv);        /* call the program */

closelog() itself is simple: it "closes the current Syslog connection, if there is one." But running strace on the child process just shows a lot of futex calls...nothing very useful there at all. So what the hell is going on, and why it hanging at closelog()?

Some background info on threads: In Linux at least, they're user-land things, and the kernel doesn't know about them. To the kernel, it's just another process with a PID. Implementing threads is left as an exercise to the reader...or in this case, to glibc and NPTL.

Since threads are part of the same process, they share memory. (A fork(), by contrast, copies over parent memory to the child -- and then the child has its own copy of everything, separate from the parent.) glibc/NPTL implements locks for certain things to make sure that one thread doesn't stomp all over another thread's memory willy-nilly. And those locks are done with futexes, which are provided by the kernel...which explains why it would show up in strace, which tracks system calls.

Why is this relevant? Because in the glibc code, closelog() looks like this:

void
closelog ()
{
  /* Protect against multiple users and cancellation.  */
  __libc_cleanup_push (cancel_handler, NULL);
  __libc_lock_lock (syslog_lock);        /* AardvarkNote:  This is where things hang */

  closelog_internal ();
  LogTag = NULL;
  LogType = SOCK_DGRAM; /* this is the default */

  /* Free the lock.  */
  __libc_cleanup_pop (1);
}

That __libc_lock_lock (syslog_lock) call is there to prevent two threads trying to mess with the syslog file handle at one time. Sure enough, the info for frame #2 of the child process shows that the process is trying to get the syslog_lock mutex:

#2  0x00000035aa4d05be in closelog () at ../misc/syslog.c:419
419      __libc_lock_lock (syslog_lock);
ignore1 = <value optimized out>
ignore2 = <value optimized out>
ignore3 = <value optimized out>

As noted in the mailing list discussion, closelog() should be a noop if there's no descriptor open to syslog. However, in my case there is such a file descriptor, because I've got bacula-sd configured to log to syslog.

Well, as the Bible notes, mixing fork() and threading is problematic:

There are at least two serious problems with the semantics of fork() in a multi-threaded program. One problem has to do with state (for example, memory) covered by mutexes. Consider the case where one thread has a mutex locked and the state covered by that mutex is inconsistent while another thread calls fork(). In the child, the mutex is in the locked state (locked by a nonexistent thread and thus can never be unlocked). Having the child simply reinitialize the mutex is unsatisfactory since this approach does not resolve the question about how to correct or otherwise deal with the inconsistent state in the child.

And hey, doesn't that sound familiar?

Now that I had an idea of what was going on, I was able to find a number of similar problems that people have encountered:

So: what seems to be happening is that, when the stars align, the child process is being created at the same moment that the parent process is logging to syslog. The child is created with a locked syslog_lock mutex, but without the thread that had been holding it...and thus, without anything that can release it. The child blocks waiting for the mutex, the parent blocks on the child, and backup jobs halt (well, at least the spooling of jobs to tape) until I kill the child manually.

This was complicated to find for a number of reasons:

So what to do?

For now, I'm removing the log-to-syslog option from bacula-sd.conf. When I do that, I see no problems at all running jobs.

On the programming side -- and keep in mind I'm no programmer -- it looks like there are a number of options here:

  1. Don't call closelog() after fork() in open_bpipe(). This may leave an open descriptor to syslog, which may be a problem. (Or maybe it's just ugly. Don't know.)

  2. Don't fork() a child process when running the alert command, but start another thread instead. I have no idea why a fork() is preferred over a new thread in this scenario, but I presume there's a good reason.

  3. Use pthread_atfork() to set things up for a fork(). That's what The Bible says to do, but I don't know what Bacula would actually need to do with it in order to make this deadlock go away.

Good lord, I'm closing in on 1700 words here. If you've stuck with me this long, I hope it was interesting; once I got over the panic it was a fascinating problem, and it's taught me a lot about processes, threads and deadlocks. And how can that be a bad thing?

Tags: backups, debugging.
Config file parsing
Wed May 26 06:14:40 PDT 2010

I've been setting up some new VMs for a separate project at work. I've realized that this is painful for two reasons: Bacula and Nagios.

Both are important...can't have a service without monitoring, and can't have a machine without backups. But each of these are configured by vast files; Bacula's is monolithic (the director's, anyhow, which is where you add new jobs) and Nagios' is legion. And they're hard to configure automagically with sed/awk/perl or cfengine; their stanzas span lines, and whitespace is important.

I've recently added a short script to my Nagios config; it regenerates a file that monitors all the Bacula jobs and makes sure they happen often enough. This is good...and I want more.

I found pynag, a Python module to parse and configure Nagios files. This is a start. I've had problems getting its head around my config files, because it didn't understand recursion in hostgroups (which I think is a recent feature of Nagios) or a hostname equal to "*". I've got the first working, and I'm banging my head against the second. The three books I got recently on Python should help (wow, IronPython looks nice).

There are a lot of example scripts with pynag. None do exactly what I want, but it looks like it should be possible to generate Nagios config files from some kind of list of hosts and services. This would be a big improvement.

But then there's Augeas, which does bi-directional parsing of config files. Have a look at the walk-through...it's pretty astounding. I realized that I've been looking for something like this for a long time: an easier way of managing all sorts of config files. Cfengine (v2 to be sure) just isn't cutting it anymore for me.

Now, the problem with Augeas for my present task is that there isn't anything in the current tree that does what I want, either. There is a commit for parsing nagios.cfg -- not sure if it's been released, or if it will parse everything in a Nagios config_dir. There's nothing for Bacula, either. This will mean a lot more work to get my ideal configuration management tool.

(On a side note, my wife said something to me the other day that was quite striking: I need tasks that can be divvied up into 45-minute chunks. That's how much free time I've got in the morning, bus rides to and from work, and the evening. Commute + kids != long blocks of free time.)

And I've got a congenital weakness for grand overarching syntheses of all existing knowledge...or at least big tasks like managing config files. So I'm trying to be aware of my brain.

...and there's son #2 waking up. Time to post.

1 comments. Tags: backups, monitoring, python.
Rule #3 of sysadmin club
Tue Aug 17 06:16:55 PDT 2010

I'm trying to get Bacula to make a separate copy of monthly full backups that can be kept off-site. To do this, I'm experimenting with its "Copy" directive. I was hoping to get a complete set of tapes ready to keep offsite before I left, but it was taking much longer than anticipated (2 days to copy 2 tapes). So I cancelled the jobs, typed unmount at bconsole, and went home thinking Bacula would just grab the right tape from the autochanger when backups came.

What I should have typed was release. release lets Bacula grab whatever tape it needs. unmount leaves Bacula unwilling to do anything on its own, and it waits for the operator (ie, me) to do something.

Result: 3 weeks of no backups. Welcome back, chump.

There are a number of things I can do to make sure this doesn't happen again. There's a thread on the Bacula-users mailing list (came up in my absence, even) detailing how to make sure something's mounted. I can use release the way Kern intended. I can set up a separate check that goes to my cel phone directly, and not through Nagios. I can run a small backup job manually on Fridays just to make sure it's going to work. And on it goes.

I knew enough not to make changes as root on Friday before going on vacation. But now I know that includes backups.

Tags: backups, fail, work.

RSS Feed