A user at work wanted to move from a desktop machine to a laptop. The
Windows profile moved over just fine, so all that was left to do was
copy over his outlook.pst. Only it turns out his desktop's hard
drive has been quietly failing for a while, and there's some
corruption right in his 1.2GB Outlook file. Well, fuck.
The Inbox Repair Tool is meant to help with this sort of
thing. It took me a while to find a mention of that, longer to realize
that it was actually called scanpst.exe, and even longer to decide
that the Windows search tool wasn't going to find
C:\Program Files\Common Files\MAPI\1033 -- a fact that is fucking buried in
Microsoft's Office support section. (Why 1033? It's the Windows locale
ID for US English.) Of course, it didn't work.
So okay, what about getting Outlook to export to another file? Good idea! Only it fails about 700MB through, and there's no indication what worked and what didn't -- so no chance for the user to decide if that's enough or not.
So what about exporting a subset of the folders, seeing what fails, and then repeating the process without the failing folder? A little tedious, sure, but it'll work, right? Wrong: you can export one folder, or you can export one folder and its subfolders, but you cannot export more than one folder at one time. Jesus fucking Christ!
Workaround for that was to copy folders (one at a fucking time, natch) to another folder (call it Backup) and try exporting that -- and then see what fails, yadda yadda. But natch, that doesn't work either. You have to watch closely to see what folders are being exported, and anyway a folder may be displayed as being exported more than once, so you still don't know whether a given folder may have worked.
Plus, there was the failing hard drive (remember that?); I suspect that this new backup folder was just getting thrown on the same crappy chunk of hard drive, making the export of the Backup folder fail in interestingly inconsistent ways. And of course, the whole process takes fifteen minutes to fail, during which time I can't do anything else and neither can the user.
And in the middle of my frustration and rage, an even greater rage welled up in me when I realized that Outlook had totally ruined this guy's email.
Think about it! Here's all this plain text email -- even attachments are encoded in ASCII -- and it has been completely fucking borked by being irretrievably (well, in this case anyway) converted to some proprietary binary format that is completely opaque to me, without at least the saving grace of having good tools for its manipulation available. Redundancy, ease of recovery and ease of manipulation have been thrown away for the sake of (let's be generous here) speed and functionality (indexing, correlation, etc). It's completely ridiculous.
This led to the formation of Saint Aardvark's Axiom of Information Utility:
Any sufficiently important information must be indistinguishable from plain text.
Plain text is redundant, easily (though not necessarily speedily) recognized by the human brain, and has many automated tools to deal with it (think of Unix). All these things make it very, very recoverable. If the information is that important, you need to be able to get at it even if there's a hardware failure. Binary formats throw that away, and that is simply wrong.
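To make the "very recoverable" bit concrete, here's a minimal sketch of the kind of crude tool that plain text makes possible: something like the Unix strings command, pulling the readable runs out of a damaged blob. The function name printable_runs and the four-character threshold are my own inventions for illustration, and this is nowhere near a real recovery tool.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return every run of printable ASCII (plus tab/CR/LF) of at least min_len bytes.

    Hypothetical helper, roughly what the Unix `strings` command does.
    """
    pattern = re.compile(rb"[\x20-\x7e\t\r\n]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # A plain-text message with a few bytes trashed in the middle.
    damaged = b"Subject: lunch\n\x00\xff\x00See you at noon.\n"
    for run in printable_runs(damaged):
        print(repr(run))
```

The damage costs you the bytes that were damaged and not much else; everything on either side is still sitting there, readable.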
But what's a self-important axiom without an equally self-important corollary?
Any gains in the functionality or speed of information access must be obtained from derived versions of the original information, leaving the original in its plain text form.
I'm perfectly willing to give Outlook the benefit of the doubt in this case; having used a PDA for all of two weeks, I feel uniquely qualified to recognize the utility of having cross-referenced contacts, to-do lists, email, and so on. But this must not come at the expense of recovery!
Think of source code. It's possible to hack on a binary with a hex
editor or a disassembler. You can even fix bugs or change the way a
program works in this way. But you would never maintain a program in
this way: it's hard to understand, it's easy to make a mistake, and
it's hard to (say) port to a new language or hardware platform. That's
what source code is for: it's easy to understand (assuming you're a
programmer), and even if some of it gets garbled it's easy to
recover. Plus, you can use tools like indent to change how it looks,
or grep to pick out interesting bits, or tags to cross-reference
function calls with their definitions.
Of course, you wouldn't try to run source code -- that's what a compiler is for. You gain speed by transforming the source code while still leaving that source code intact: nothing is lost in the process. And that's what Outlook should have done: compiled the plain text email into whatever database (I'm assuming) format Outlook likes, that allows Outlook to do Outlook stuff quickly, while still leaving the original source code -- the email -- intact.
Of course, you don't have to imagine recompiling Outlook's PST file each time; this'd be an incremental thing. And really, it shouldn't be that much different from what it does now...same speed, just a little more disk space taken up. And if the PST file gets borked, no matter -- the recovery tool is nothing more than a compiler that regenerates it from the original email.
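Here's a rough sketch of what I mean by "the recovery tool is just a compiler". Everything in it is an assumption for the sake of illustration: one message per .eml file in a directory, SQLite standing in for whatever format Outlook would actually want, and the made-up names rebuild_index and find_by_sender. It does the full rebuild only; the incremental part is left out.

```python
import sqlite3
from email import message_from_bytes
from pathlib import Path


def rebuild_index(mail_dir: Path, index_path: Path) -> int:
    """Regenerate the derived index from the plain-text originals.

    The index is disposable: any existing file is deleted and rebuilt.
    The message files are only ever read, never modified.
    Returns the number of messages indexed.
    """
    if index_path.exists():
        index_path.unlink()  # a borked index is simply thrown away
    db = sqlite3.connect(str(index_path))
    db.execute(
        "CREATE TABLE messages (path TEXT PRIMARY KEY, sender TEXT, subject TEXT)"
    )
    count = 0
    for path in sorted(mail_dir.glob("*.eml")):
        msg = message_from_bytes(path.read_bytes())
        db.execute(
            "INSERT INTO messages VALUES (?, ?, ?)",
            (str(path), str(msg.get("From", "")), str(msg.get("Subject", ""))),
        )
        count += 1
    db.commit()
    db.close()
    return count


def find_by_sender(index_path: Path, sender: str) -> list:
    """Fast lookup against the derived index; returns paths of matching messages."""
    db = sqlite3.connect(str(index_path))
    rows = db.execute(
        "SELECT path FROM messages WHERE sender LIKE ? ORDER BY path",
        (f"%{sender}%",),
    ).fetchall()
    db.close()
    return [row[0] for row in rows]
```

If the index file gets scribbled on by a dying disk, calling rebuild_index again is the whole recovery procedure. The mail itself never left its plain text files.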
As much as I'm picking on Outlook though, this isn't Outlook's problem alone. I've written before about how PHPWiki obscures the information it stores in MySQL. And I did a similar thing to myself years ago by compressing email, since I was running out of disk space. Somewhere along the way the files got corrupted, and I can't get that email back because gzip barfs on it.
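For what it's worth, the best you can usually hope for with a damaged gzip file is whatever came before the damage. A minimal sketch, using Python's zlib module and a made-up function name (salvage_gzip_prefix); I haven't tried this on my own corrupted mail, and it will get nothing back if the damage is right at the start of the file.

```python
import zlib


def salvage_gzip_prefix(data: bytes, chunk_size: int = 1024) -> bytes:
    """Decompress as much of a damaged gzip stream as possible.

    Hypothetical helper: feeds the stream to zlib in small chunks and
    stops at the first chunk zlib refuses to decode.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    recovered = bytearray()
    for start in range(0, len(data), chunk_size):
        try:
            recovered += decoder.decompress(data[start:start + chunk_size])
        except zlib.error:
            break  # damage reached; keep what was decoded before it
    return bytes(recovered)
```

One caveat: gzip only checks its checksum at the very end of the stream, so with a flipped byte (as opposed to a truncated file) the last bit of output before the error may itself be garbage. Which is rather the point: with plain text, the damage stays where it landed.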
And of course, this is just my opinion, formed in the heat of anger. It's almost certainly not a new idea, and might even be wrong. I'd love to hear some feedback on this.
Problem: Outlook 2003 User gets a message from System Administrator saying that his message to a coworker is undeliverable -- something about relaying denied -- and asks me why this happens.
Pretty simple, right? Just get him to forward the message and then check the logs. Only no, it's not simple: despite twiddling all the bits you're supposed to, I keep getting the message attachments in MS' TNEF format. I use Mutt. I give up and decide to look at Outlook itself. (Yes, I know about the decoder scripts you can get, but I was being bullheaded.)
Now we have the problem of getting the proper Internet headers in the email. (I've given up trying to persuade Outlook that I am ritually pure enough to look upon the shining glory that is The Message Source without melting like some kind of Nazi-collaborating French archaeologist; it doesn't work.)
A quick Google turns up three suggestions: a $24.95 VB plugin, giving up entirely, or right-clicking and selecting Options, then looking at the box that sez Internet Headers. I'm game, so I try right-clicking and choosing Options. There's the Internet Headers box, all right, but it's empty. WTF? I look around, but there's nowhere else to clickyclick.
I try right-clicking on another message, and sure enough it shows the headers. I try selecting the From address in the problem message (helpfully labelled as "System Administrator", which I'm pretty sure is a bald-faced lie), then Properties. It just says it's from System Administrator, and shows no actual, real email address. You remember...email, one of the things Outlook is meant to do?
Then it dawns on me: the mentioned-in-passing comment from the user that the message is probably from Outlook itself is true. I'd thought this was just a friendly gloss on an unfriendly message, but it wasn't. This fucking message is from Outlook. And it's not until I skip ahead and tell you the exciting conclusion -- it was our mail server refusing to relay and saying so, something never not once mentioned in the offending message -- that you're going to realize the full horror of the situation.
For Outlook does not only mangle email, and hide attachments in weird files called "winmail.dat", and shake its baloney all over the place like a drugged-out Hula girl in the "Before" picture in all those rehab clinic advertisements. No. That is not all.
Outlook -- the mail client -- also takes error messages from mail servers and disguises them as email messages that have just arrived, rather than showing the user the fucking error as an error when and as it occurs! It hides the origin of the error by pretending to be some nonexistent sysadmin when it sends this message! And it does nothing to indicate that this false email is any different from the other messages from Bill and Bob and Ted littering your inbox about horizontal opportunity mission statements, complete with an animated surfing guy for Bob shouting "Whoah!" to differentiate his mangling of the English language from Ted's, leaving me to wonder what the fuck kind of congenital brain damage must've been at work to make this seem like a good idea to anyone.
Fuck me, I hate Outlook.