The Life of a Sysadmin

Carousel is a lie!

Entries from February 2006.

How to extract audio from a movie with mplayer
2006-02-03 07:49:42

Just a little tip for Google:

mplayer -ao pcm /path/to/movie

will produce a WAV file called audiodump.wav.
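
For what it's worth, here is a slightly more explicit variant as a shell sketch. It assumes an MPlayer build that accepts -vo null (so no video window opens) and the pcm:file= suboption for naming the output. extract_audio and DRY_RUN are names made up for this example, not part of MPlayer.

```shell
# extract_audio MOVIE [WAVFILE]
# Hypothetical helper: dumps the audio track of MOVIE to WAVFILE
# (default: audiodump.wav, the same name plain "-ao pcm" uses).
# Set DRY_RUN=1 to print the command instead of running it.
extract_audio() {
    movie=$1
    wav=${2:-audiodump.wav}
    # Build the command line as positional parameters so quoting survives.
    set -- mplayer -vo null -ao "pcm:file=$wav" "$movie"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling extract_audio /path/to/movie soundtrack.wav would write soundtrack.wav rather than audiodump.wav; with DRY_RUN=1 it only prints the command it would have run.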

No tags
One week 'til I'm 34!
2006-02-04 13:08:51

It feels like I've been slacking with my entries lately, so it's time to do some catchup.

First, the NWR04B: I've not been very active on this lately, but there has been a little progress. When last I wrote I was trying to figure out why the kernel was hanging at rtnl_lock, when I used the ADM5120 driver for the switch. It turned out that I was calling register_netdev, which in turn calls rtnl_lock, from within another routine that calls register_netdev itself. That's a problem right there. I fixed this (it was due to some blind cut-n-paste from the old driver), and now it's getting further: it initializes eth0 through eth6...though still doesn't actually send or receive traffic, near as I can tell. I need to spend some time sprinkling more printks throughout the code to figure out where it's failing.

Next, I'm doing some work on Thornhill, my web server. Amanda has been installed; I want to back up stuff a little more intelligently than I'm doing now (tar up everything and dump it on my desktop, which gets backed up by Amanda running on my desktop). Running into a few firewall problems, but nothing unexpected or too difficult.

I'm also trying out Xen again, with an eye to upgrading Thornhill. A while back Alioth answered some questions I had about Xen and servers, and it seemed worth trying. So I've got VMware Player running on the fastest machine I have (Hunsacker, a 2.4GHz P4 MythTV backend) while I practice getting things right. I've put Gentoo in both dom0 and a guest domain (FristDomain (I kill myself)), and I'm populating FristDomain with the usual LAMP environment. This is all pretty preliminary; I'm pretty much just trying to get familiar with how it all fits together.

I'm considering moving to NetBSD for dom0...stateful IPv6 filtering (though Linux has that now), pf, and just the chance to try something new. For the web server OS, though, I think I'll stick with Linux, and probably with Gentoo. I want something easily upgradeable, and for that it's Gentoo or Debian. I think Gentoo will be a little more up-to-date than Debian, and I want to give portage a try...Hunsacker runs Gentoo, but I rarely touch it.

At work, we had a problem last week with the Subversion repository when, against my advice, someone acting under their manager's direction tried checking in the contents of a SuSE DVD. They weren't trying to check in the ISO itself, at least, but rather, all the contents: whole lotta binary RPMs, mostly. This borked the repository, probably because of a default 2GB limit for Apache. The user saw this error:

svn: MERGE request failed on '/svn'
svn:  Revision file lacks trailing newline

So did everyone else who tried to work with the repository afterward.

I tried svnadmin recover like the good book says, but ze goggles, zey did nossing! Well, crap. We were running hotbackup.py every night, and a quick look showed that last night's copy had everything up to revision 1538 -- 14 revisions ago. (It was revision 1553 that failed.) So I could try moving that in place and losing a bunch of work, or look for something else.

In the end, I was able to get things working by taking a copy of the hotbackup, dumping everything since then, and then applying that dump to the backup. To wit:

$ cp -Rp /path/to/hotbackup /path/to/recovered_repository
$ svnadmin dump /path/to/repository --revision 1539:1552 --incremental > dumpfile
$ svnadmin load /path/to/recovered_repository < dumpfile
$ svnadmin verify /path/to/recovered_repository
$ # move the damaged repository aside first (the .broken name is arbitrary)
$ mv /path/to/repository /path/to/repository.broken
$ mv /path/to/recovered_repository /path/to/repository
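
The revision range in the dump step is just arithmetic on two numbers: the last revision in the backup and the revision that failed. A tiny shell sketch of that calculation follows. dump_range is a made-up helper name; running svnlook youngest against the backup copy is one way to get the first number.

```shell
# dump_range LAST_BACKED_UP FAILED_REV
# Prints FIRST:LAST, the revisions that exist only in the damaged
# repository and committed cleanly: everything after the backup, up to
# but not including the revision that failed.
dump_range() {
    last_backed_up=$1
    failed_rev=$2
    echo "$(( last_backed_up + 1 )):$(( failed_rev - 1 ))"
}

# With the numbers from this incident:
dump_range 1538 1553    # prints 1539:1552
```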

I may up the limit for Apache, but I'm very much inclined not to do so. I really don't think we'll need to check in 2GB at one time, and I still think checking in a DVD is a stupid thing to do.

No tags
Saint Aardvark's Axiom of Self-Righteous Anger
2006-02-04 13:14:28

A user at work wanted to move from a desktop machine to a laptop. The Windows profile moved over just fine, so all that was left to do was copy over his outlook.pst. Only it turns out his desktop's hard drive has been quietly failing for a while, and there's some corruption right in his 1.2GB Outlook file. Well, fuck.

The Inbox Repair Tool is meant to help with this sort of thing. It took me a while to find a mention of that, longer to realize that it was actually called scanpst.exe, and even longer to decide that the Windows search tool wasn't going to find it in C:\Program Files\Common Files\MAPI\1033 -- a fact that is fucking buried in Microsoft's Office support section. (Why 1033? It's the Windows locale ID for US English.) Of course, it didn't work.

So okay, what about getting Outlook to export to another file? Good idea! Only it fails about 700MB through, and there's no indication what worked and what didn't -- so no chance for the user to decide if that's enough or not.

So what about exporting a subset of the folders, seeing what fails, and then repeating the process without the failing folder? A little tedious, sure, but it'll work, right? Wrong: you can export one folder, or you can export one folder and its subfolders, but you cannot export more than one folder at one time. Jesus fucking Christ!

Workaround for that was to copy folders (one at a fucking time, natch) to another folder (call it Backup) and try exporting that -- and then see what fails, yadda yadda. But natch, that doesn't work either. You have to watch closely to see what folders are being exported, and anyway a folder may be displayed as being exported more than once, so you still don't know whether a given folder may have worked.

Plus, there was the failing hard drive (remember that?); I suspect that this new backup folder was just getting thrown on the same crappy chunk of hard drive, making the export of the Backup folder fail in interestingly inconsistent ways. And of course, the whole process takes fifteen minutes to fail, during which time I can't do anything else and neither can the user.

And in the middle of my frustration and rage, an even greater rage welled up in me when I realized that Outlook had totally ruined this guy's email.

Think about it! Here's all this plain text email -- even attachments are encoded in ASCII -- and it has been completely fucking borked by being irretrievably (well, in this case anyway) converted to some proprietary binary format that is completely opaque to me, without at least the saving grace of having good tools for its manipulation available. Redundancy, ease of recovery and ease of manipulation have been thrown away for the sake of (let's be generous here) speed and functionality (indexing, correlation, etc.). It's completely ridiculous.

This led to the formation of Saint Aardvark's Axiom of Information Utility:

Any sufficiently important information must be indistinguishable from plain text.

Plain text is redundant, easily (though not necessarily speedily) recognized by the human brain, and has many automated tools to deal with it (think of Unix). All these things make it very, very recoverable. If the information is that important, you need to be able to get at it even if there's a hardware failure. Binary formats throw that away, and that is simply wrong.

But what's a self-important axiom without an equally self-important corollary?

Any gains in the functionality or speed of information access must be obtained from derived versions of the original information, leaving the original in its plain text form.

I'm perfectly willing to give Outlook the benefit of the doubt in this case; having used a PDA for all of two weeks, I feel uniquely qualified to recognize the utility of having cross-referenced contacts, to-do lists, email, and so on. But this must not come at the expense of recovery!

Think of source code. It's possible to hack on a binary with a hex editor or a disassembler. You can even fix bugs or change the way a program works in this way. But you would never maintain a program in this way: it's hard to understand, it's easy to make a mistake, and it's hard to (say) port to a new language or hardware platform. That's what source code is for: it's easy to understand (assuming you're a programmer), and even if some of it gets garbled it's easy to recover. Plus, you can use tools like indent to change how it looks, or grep to pick out interesting bits, or tags to cross-reference function calls with their definitions.

Of course, you wouldn't try to run source code -- that's what a compiler is for. You gain speed by transforming the source code while still leaving that source code intact: nothing is lost in the process. And that's what Outlook should have done: compiled the plain text email into whatever database (I'm assuming) format Outlook likes, that allows Outlook to do Outlook stuff quickly, while still leaving the original source code -- the email -- intact.

Of course, you don't have to imagine recompiling Outlook's PST file each time; this'd be an incremental thing. And really, it shouldn't be that much different from what it does now...same speed, just a little more disk space taken up. And if the PST file gets borked, no matter -- the recovery tool is nothing more than a compiler that regenerates it from the original email.
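
To make the idea concrete, here is a minimal sketch in shell, assuming mail is kept as one plain-text message per file in a directory. The layout and the rebuild_index name are assumptions for illustration only, and this is nowhere near what Outlook would need; it just shows the shape of an index that is a derived, disposable artifact.

```shell
# rebuild_index MAILDIR INDEXFILE
# Hypothetical helper: regenerates a tab-separated "path<TAB>subject"
# index from plain-text messages stored one per file. The messages are
# only ever read, so a corrupted index can simply be deleted and rebuilt.
# INDEXFILE should live outside MAILDIR. Folded (multi-line) Subject
# headers are not handled in this sketch.
rebuild_index() {
    maildir=$1
    index=$2
    : > "$index"                      # start from an empty index
    for msg in "$maildir"/*; do
        [ -f "$msg" ] || continue     # skip subdirectories and empty globs
        # Print the Subject header; quit at the blank line ending the headers.
        subject=$(sed -n -e '/^$/q' -e 's/^Subject: //p' "$msg" | head -n 1)
        printf '%s\t%s\n' "$msg" "$subject" >> "$index"
    done
}
```

Fast lookups would go against the index (grep, cut, or something fancier), while the messages themselves stay untouched and greppable. If the index gets borked, running rebuild_index again is the whole recovery procedure.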

As much as I'm picking on Outlook though, this isn't Outlook's problem alone. I've written before about how PHPWiki obscures the information it stores in MySQL. And I did a similar thing to myself years ago by compressing email, since I was running out of disk space. Somewhere along the way the files got corrupted, and I can't get that email back because gzip barfs on it.

And of course, this is just my opinion, formed in the heat of anger. It's almost certainly not a new idea, and might even be wrong. I'd love to hear some feedback on this.

9 comments. No tags
Holy crap
2006-02-12 22:02:39

Our offer for this townhome has been accepted. We have until the end of the week to lose our nerve. If we don't, we move in April 1st.

Holy crap.

No tags
Far too much Windows
2006-02-18 16:59:05

Saturday after Patch Tuesday, and I spent far too much time today dealing with it. KB 911564 (aka Vulnerability in Windows Media Player Plug-in with Non-Microsoft Internet Browsers Could Allow Remote Code Execution) simply would not install: not remotely, not interactively, and not even interactively through the Windows Update website. In the end, we had to go around booting machines into fucking safe mode (thank you, the posters of this thread, for the tip) in order to get the damned things to apply.

On one problem machine we checked, Sysinternals' handle showed that WinLogon.exe, for some reason, had C:\Program Files\Windows Media Player open. No idea why, but it's about the only thing we could find that might be causing problems.

However, the news wasn't entirely bad...Windflower, the Perl-based rewrite of Ivy, actually patched a few machines today over an SSH session. Version 0.2 is available here. Hurray!


No tags
Weekend Update
2006-02-24 20:37:13

So we bought the townhome...something I keep forgetting that folks don't know about, since my normally on-the-ball wife has not yet written about this. (We have a good division of labour: I write about computers, and she writes about everything else.) So far our biggest screw-up has been asking for a possession date of April 1st, when we have to be out of our apartment the day before. Oops. Oh well, we'll make it work.

I'll be storing Thornhill at a friend's place (thanks, John!) for a few days around the move. I've got, what, 8 domains on it at the moment for friends and family. Amazing what you can get a poor, underworked Sempron to do these days. :-)

Still working on getting Xen working, but it's going slowly with all the house stuff. I gotta say, I'm pretty impressed; it's very, very neat to just fire up a new machine and have at it.

1 comment. No tags
