The Life of a Sysadmin

Carousel is a lie!

Entries from August 2004.

The Johnstown Arp Flood
2004-08-10 12:45:17

So last night, in the midst of other network problems, I noticed lots of messages like this on our FreeBSD machines:

    /kernel: arplookup 1.2.3.4 failed: host is not on local network

WTF? But as this was not the network problem I was looking for, I decided I did not need to see its identification and left it.

This morning, an SSH session to a test box became unresponsive after being idle for a few minutes; pings didn't work either. I ran over and checked that no one had disconnected it -- no one had -- and was able to ping my machine and others from it just fine. As I walked back to my desk, someone came up to me and asked if the network was having problems right now, since he was no longer able to reach the network from his machine. I asked him to give it a try again, and went back to my desk to try getting to the test machine again. It was fine at first, then stopped responding again -- no SSH, no ping. I called the other guy and asked him if his computer was fine -- it was. I tested connectivity from my computer, but everything responded just fine except for the test box. I walked over again, logged in and tried pinging my desktop machine. It took a good five or six seconds to respond, but then the responses were seen starting at packet 0.

What the...I checked out other machines and saw that the arplookup message was turning up again. Time to check it out. Well, first clue is that the address was almost one I'd assigned to a developer for his User-mode Linux sandbox: 3.4.1.2. I logged in and checked it out, but there was no indication it was using the address -- ifconfig turned up nothing, nor did tcpdump or the arp table. I checked the other machines' arp tables, but they had no entry either. Then I remembered that this guy had been complaining about intermittent slow network access yesterday and today. At the time I figured it was related to the original problems I'd been checking out, but maybe it wasn't. I decided to grit my teeth and talk to him.

First clue: the arp table on his W2K box (poor bastard) had an entry for 1.2.3.4. Aha! I tried running tcpdump on a laptop running FreeBSD hooked up to the same switch, but no luck. I was kind of hoping for a stray arp who-has or something, but I guess not. Second clue: he mentioned in passing that the test box of his own by his desk, a single-board computer running Linux, was being used to test an ethernet driver that he was developing. He had the kernel set up to ping his User-mode Linux every five seconds. He also had it set up to watch for incoming traffic, swap bytes around, and then send it to his User-mode Linux. A-ha! We agreed:

My suspicion at this point is that the bogus traffic was wreaking havoc with ARP tables, possibly those belonging to the (cheap, unmanaged, hopefully soon-to-be-upgraded-to-expensive-Cisco-Catalyst) switches that connect our whole network, or possibly just several Very Important Servers. I realize that it only caused problems with a few machines, not the whole network, but I'm wondering if this might be similar to what happened in February. Gotta say, though...the intense debugging activity almost fulfilled my deep-seated sysadmin fantasy of debugging raw Ethernet frames on the fly, so I'm happy.
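For anyone chasing something similar, here's a rough sketch of the two checks from this entry, written as small shell functions. The function names are made up for illustration, and the sample arp line in the comment is invented, not taken from our machines. The second function is only a guess at the pattern: the two addresses above differ by having their 16-bit halves swapped, which would fit a test driver that swaps bytes around, but nothing here confirms that's exactly what the driver did.

```shell
# Hypothetical helper: succeed if `arp -an`-style output on stdin has an
# entry for the given IPv4 address. Matching on "(addr)" with the
# parentheses keeps 1.2.3.4 from also matching 1.2.3.40.
arp_has_entry() {
    grep -qF "($1)"
}

# Hypothetical helper: swap the two 16-bit halves of a dotted quad,
# e.g. a.b.c.d becomes c.d.a.b.
swap_halves() {
    echo "$1" | awk -F. '{ print $3 "." $4 "." $1 "." $2 }'
}

# Typical use on a live box (not run here):
#   arp -an | arp_has_entry 1.2.3.4 && echo "bogus entry present"
#   swap_halves 3.4.1.2
```

Run the arp check on every machine that logs the arplookup message; the one box that actually has an entry is a good place to start asking questions.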

No tags
dd if=/dev/zero of=/dev/brain
2004-08-12 09:36:46

Am I the only one who has to put something like this in their crontab?

    # 0 or 7 - Sunday
    # 1 - Monday
    # 2 - Tuesday
    # 3 - Wednesday
    # 4 - Thursday
    # 5 - Friday
    # 6 - Saturday

For some reason, trying to figure this out by hand just makes my brain hurt.
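If the comment block isn't enough, the same table can live in a tiny shell function. This is just a sketch; the name cron_dow_name is made up. It relies on the numbering in the comment above, which happens to match what date +%w prints for 0 through 6.

```shell
# Map a cron day-of-week number to its name. 0 and 7 both mean Sunday.
cron_dow_name() {
    case "$1" in
        0|7) echo Sunday ;;
        1)   echo Monday ;;
        2)   echo Tuesday ;;
        3)   echo Wednesday ;;
        4)   echo Thursday ;;
        5)   echo Friday ;;
        6)   echo Saturday ;;
        *)   echo "not a cron day-of-week: $1" >&2; return 1 ;;
    esac
}

# Example: name of today's day, using the number date(1) reports.
#   cron_dow_name "$(date +%w)"
```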

No tags
thebulkclub.com
2004-08-13 07:13:34

So a while back, Slashdot posted a story about TheBulkClub.com, an online forum for heathen cowfucking spammer scum ("Suppose you were a lying, sociopathic thief. And suppose you were a spammer. But I repeat myself." -- Mark Twain) that, sadly, left its membership list and other goodies exposed. Being the good citizen that I am, I posted a reply that, I flatter myself, was both informative and helpful: it pointed the way to several mirrors of the information, including one on my own site. Well, what do I receive the other day but this charming email:

Date: Wed, 11 Aug 2004 10:23:03 -0700 (PDT)
From: EmailSupplyNET <emailsupplynet@yahoo.com>
Subject: Question about website
To: aardvark@example.com

Hey, I like (part) of your website, http://saintaardvarkthecarpeted.com It's informative. There was something on your site about "thebulkclub.com" Did you create that site for them or something? I run an email list site and am trying to contact them for advertising on their forums/boards... Any ideas/help?

Thanks in advance,
Thanks,
www.EmailSupply.net
EmailSupplyNet@Yahoo.Com
877.426.6636

---------------------------------
Do you Yahoo!? Yahoo! Mail is new and improved - Check it out!

It's quite the site. They offer a sample list -- 4MB of email addresses, meant to be a sample of the up to 14 million you can buy. I must warn you, it would be wrong to run this command:

    while true ; do wget http://www.emailsupply.net/sample.txt -O /dev/null ; done

So don't do that. But my question is, what should I do? I'm open to ideas, suggestions, thoughts, plans and dicta.

No tags
_That's_ weird
2004-08-13 20:44:23

So it was a busy day at work: I had to do some juggling with home directories on our file server for Windows people, and set up a new Linux server for people running User-Mode Linux. Which, BTW, rocks...but be sure to read this link. I came across this problem today (freezing at "Initializing stdio console driver"), but managed to get around it by installing a new version of uml-utilities. Admittedly, I'm only trying the 2.4 series.

But that didn't mean I wasn't able to find a weird thing... So most of our workstations run FreeBSD. Our main NFS server runs FreeBSD. But we've got a couple workstations running Redhat Linux, and this problem was on one of them. It was very weird: Every time he ran ls in a particular NFS-mounted directory, ls would segfault and dump core. It was just this particular directory. And after some investigation, it turned out to be dependent on being this particular user.

I tried going to that directory. I could run ls just fine. I was running bash, so I tried running /bin/csh (most of the developers here run csh...poor bastards)...everything worked. I tried getting him to run bash; if he ran it in the problematic directory it dumped core, and if I got him to cd somewhere else and then come back and run ls it dumped core. If I su'd to root and then to him, it dumped core. I tried, as him, going into another directory, very nearby in the tree with the same number of characters in the path. It was fine. I got desperate and got him to try rebooting his machine. It still dumped core. WTF?

I'm curious enough at this point that I'm seriously considering digging up the source and compiling a debug version, then running it under GDB. This is all far enough beyond my experience that it's ridiculous. Still, I have to know or it's gonna kill me.

No tags
Stay on target IV...
2004-08-14 20:06:29

Getting closer to getting MySQL working. I came across this post today which seemed to be nearly identical to what was happening to me. I followed the suggestion and took out the --enable-static option I'd been putting into configure. Result: much happier, with hardly any crashing at all. Now if I can just get it to find the user.frm table, I'll be a happy monkey. All this to pick up a copy of libmysqlclient.so. I must be on crack.

No tags
Gloria!
2004-08-16 14:38:56

My wife and I kinda made an impulse purchase on the weekend: a new 12" iBook G4. It was weird: I made a joke about buying a laptop. Then I explained that I was only joking, but if we were going to buy one it should be an iBook since I kept hearing how sweet they were. Then we were going to go to Stanley Park, hang out at the beach, but maybe go to London Drugs (I don't know about you Americans, but in Canada we go to the drugstore for everything...car insurance, furniture, computers, you name it. Oh, and occasionally prescriptions) to see what prices were like. Then we were buying one. It all happened so fast.

So far, it's pretty damned impressive. After all the trouble I had to go to get gphoto to work with our digital camera, my wife just plugged it in here and it worked with iPhoto right away. Not only that, but we were looking at a slideshow of the crack-induced photos we'd taken while Fur Elise played in the background. Fucking unreal, man.

It's weird: I do feel a bit like I've made a deal with the devil. I've come to agree more and more with RMS about Free-as-in-Freedom, and here I am with a closed-source OS. Yada-yada-Darwin, what about Aqua? But it's sooooo nice...well, mostly, anyway.

I'm trying to use MacStumbler at the moment to find a wireless network to hook up to, but no luck: it just sits there, looking like it's scanning but with no more feedback than a scrolling bar. Dammit, I thought W2K was the only culprit there...and dammit, if I can't blog from the steps of the Vancouver Art Gallery, this thing is going back to the store. I suspect a problem with MacStumbler, but it's hard to be sure; I managed to find five or six access points at the office with Knoppix and the work laptop, and (apparently) wasn't able to find a thing with MS. I need to find a command-line version.

So far, though, that's my only complaint. Pretty fucking sweet, if you ask me.

Had a problem at work with Debian and VNC: the alt keys wouldn't work, for some reason. This was pretty annoying for the developer who really, really wanted to use Emacs. It took me about an hour of poring through Google -- Jesus Christ, the number of complaints about ALT keys disappearing, and Good God the long uber-thread about the change in keyboard behaviour between Debian versions -- to find the solution: vncserver --compatiblekbd A-ha!

Back to work and still no wireless access. Carousel is a LIE!!!

UPDATE: The VNC trick doesn't work. Details: The developer is running VNCViewer under VNC to connect to an X desktop on a Debian machine. On that machine, he's opening up an xterm and running User-Mode Linux. Alt-equals-meta works for Emacs when run on the Debian machine, but not for Emacs when run in the User-Mode Linux xterm. Fuck. UPDATE: Buddy found the trick: shift-left-click in the xterm to get the menu, then click "Meta sends escape". Double fuck!

Tags: emacs, hardware.
LD_LIBRARY_PATH vs. BCM4306
2004-08-17 07:06:50

At the Pacific Slamatarium, SATURDAY! SATURDAY! SATURDAY!

I wrote earlier about a developer who found that ls, among other commands, would dump core when he went to a certain directory. What's more, it only happened for him, and only if he used tcsh -- if he switched to bash, everything was fine. Well, I was a bit of an idiot for wondering if I should be compiling debug versions of ls. First clue was when he went to another directory nearby, ran ls and got this message:

    ls: error while loading shared libraries: libc.so.6: ELF file data encoding not little-endian

What the...Then I realized that another significant thing about this was what was in the directories he was having problems with: different versions of GCC/glibc/Linux, cross- and native-compiled. Okay, so somehow ld was looking in the current working directory for libraries to load (ack!). But why? I took a look at his environment and found:

    LD_LIBRARY_PATH=:/home/foo/this:/home/foo/that:/usr/local/foo:/usr/local/bar [...]

Sure enough, take out that leading colon at the beginning and everything was fine. I'm not sure right now if this would be a bug^wfeature of ld or the shell, but it was good to get to the bottom of it.

So the next thing to get working is wireless access. First of all, the Airport Extreme that we bought for the iBook will not do passive mode sniffing/tracking/blogging (still learning all this, so pls. correct errors in terminology); it uses a Broadcom chipset, and Broadcom is not interested in helping the folks at Kismac (thank you, Sam and anonymous stranger). Hm. And the Linksys WMP54GS won't work on my machine for two reasons:

  1. It uses the BCM4306 chipset from Broadcom.
  2. It needs a PCI 2.2 motherboard, and I've got this old Abit BH6 which almost certainly isn't.

Back to the store with the PCI card, and the hunt will continue. I might get the WAP54 for the Linux-running coolness, but we'll have to see.
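Back on the LD_LIBRARY_PATH business for a second: as far as I know, the dynamic loader treats an empty entry in that variable (a leading colon, a trailing colon, or two colons in a row) as "the current directory", which would explain why ls only blew up in directories full of cross-compiled libc. Here's a small sketch of a cleanup function; the name clean_path is made up, and it's a guess at a tidy fix rather than what was actually done on that workstation.

```shell
# Remove empty entries (leading, trailing, or doubled colons) from a
# colon-separated search path and print the result.
clean_path() {
    printf '%s\n' "$1" | awk -F: '{
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i != "")
                out = (out == "" ? $i : out ":" $i)
        print out
    }'
}

# Example for a Bourne-style shell (tcsh users would need setenv instead):
#   LD_LIBRARY_PATH=$(clean_path "$LD_LIBRARY_PATH"); export LD_LIBRARY_PATH
```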

No tags
*NEWS FLASH
2004-08-18 11:20:56

When you have:

  1. a PHP-enabled Apache web server,
  2. with a working MySQL connection,
  3. already-working pages in PHP that can connect successfully to the database in question,
  4. account details for MySQL, and
  5. all the necessary privileges in MySQL and the server

you DO NOT need me to install phpMyAdmin in order to manipulate tables. Nor do you get bonus points for asking me how to connect to MySQL without phpMyAdmin. No, thank you.

No tags
Notes from today
2004-08-19 19:39:46
  1. DNS vs. access.db: It's a strong man's battle. In the end, though, having an unresolvable domain in the MAIL FROM: address means that Sendmail's DNS checks will trump anything you might have put in access.db. There's always the accept_unresolvable_domains feature, but that's about as ugly a kludge as maintaining your own DNS entry for the domain in question.
  2. Connectors connect: If there's a problem with your new iBook's wireless reception, make sure the antenna connector is firmly seated into the Airport card. Still unable to warbus, but I'm blaming my tinfoil hat.
  3. You damn betcha I am, ratface: umlazi sounds pretty damned neat indeed. I'm told that User-Mode Linux is, currently, a hack that should be replaced by more machines, but I'm keeping this in my bookmarks file anyway.

And that's my bus stop, folks.

No tags
My scrollback buffer is bigger than your scrollback buffer
2004-08-20 23:59:19

There are two big-ass reasons why FTP sucks ass: clear-text passwords and the way it fucks with firewalls. Both are awful hangovers from the early days of the Internet where cute little elves would pop out of your compiler to offer hints on the fun they were having next door. We laugh now at the pooheads who would telnet to their server, or open up their firewalls a port further than necessary. So why the fuck don't Dreamweaver et al. have scp plugins? Why are we constantly having to open up an old, insecure protocol for the sake of poorly designed, overpriced software? Ahem. As you were.

In other news, Knoppix 3.4 will not only boot from a USB CDROM without trouble, it will not hang on autodetecting partitions and writing them to /etc/fstab. Both these steps tripped up 3.3. Whee, what a mad merry-go-round my life is!

Also, here are some stats on kernel compilation times. In the one corner we have a 2.8GHz P4, 512KB cache, 800MHz frontside bus with 1GB of RAM and a 7200 RPM IDE hard drive. In the other corner, we have an EPIA-M MiniITX mobo with 1 GHz Via CPU, 64KB cache, 256MB of RAM, a FSB speed I can't be bothered to look up and a 4400 RPM IDE laptop drive. The time was for "make dep && make bzImage" on version 2.4.26 of the Linux kernel with a pretty random (by which I mean specific to our needs) configuration. Try to guess which is which:

    real 1m51.998s   user 1m45.920s   sys 0m5.120s
    real 6m7.849     user 5m24.530    sys 0m25.130

Just for fun, I tried swapping the drives around: the P4 got the laptop drive, and the MiniITX board got the full-on Kevin's mom. Results:

    real 2m8.743s    user 1m44.840s   sys 0m6.160s
    real 6m39.898s   user 5m25.500s   sys 0m25.940s

for the first time, and then:

    real 1m54.601s   user 1m45.330s   sys 0m5.550s
    real 5m54.717s   user 5m26.410s   sys 0m25.690s

after that. The fuck?

Also, have a look at this thread. I think I speak for all of us when I say that Linux will simply not be ready for the desktop until its scrollback buffer behaves like FreeBSD's. After all, the REAL measure of a man's worth is the size of his scrollback buffer. Yeah, baby!
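About those compile timings: if you'd rather not do the minutes-to-seconds arithmetic in your head, here's a quick sketch in shell. The function names to_seconds and ratio are made up for illustration; the numbers in the usage comments are the first pair of "real" times quoted above.

```shell
# Convert a time(1)-style value such as "1m51.998s" (trailing "s" optional)
# to plain seconds with three decimal places.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times longer the second duration is than the first,
# both given in seconds.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

# Usage with the first pair of real times:
#   to_seconds 1m51.998s    # 111.998
#   to_seconds 6m7.849      # 367.849
#   ratio 111.998 367.849   # 3.28
```

So the slower box takes a bit over three times as long in wall-clock terms, and swapping the drives barely moves that.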

2 comments. No tags
Cool
2004-08-23 20:50:30

From procinfo(8), part of sysutils:

-Ffile Redirect output to file (usually a tty).

Nice if, for example, you want to run procinfo permanently on a
virtual console or on a terminal, by starting it from init(8) with
a line like:

p8:23:respawn:/usr/bin/procinfo -biDn1 -F/dev/tty8

At last, a Linux equivalent of systat -vm. Or nearly, anyway.

No tags
File under Golden
2004-08-24 19:44:41

So I had a bit of a brainstorm the other day. I've got two servers: Here and There. There's some stuff Here that needs to move There. The problem is that the server Here is in use a fair bit, and part of that use involves INSERTing things in MySQL and then SELECTing them back again. It's a pain to shut down things Here altogether in preparation for moving There, particularly as the move is liable to take, oh, twenty-four hours or so. The database needs to be consistent between the two, but the length of the move makes that impractical unless Special Measures are taken. Dark server room. Midnight. We see THE SUPERVISOR talking to THE SYSADMIN.

SUPERVISOR: That database needs to be consistent, dammit! SYSADMIN: (tightly) I can't do that without taking...special measures.

SUPERVISOR grimaces.

SUPERVISOR: Whatever it takes, dammit. I don't want to know. SYSADMIN: All right, then. I'll do your dirty work.

SYSADMIN turns slowly and walks out the door.

SUPERVISOR: Dammit!

I will concede that's a little dramatic. But what else would you call MILITARY-GRADE ENCRYPTION, i.e. SSH tunnels from Here to There? (It must be military grade; it's developed in Canada.) Okay, so it's not that big a deal for you people what think all the time. But it was pretty clever, I thought, and would ensure that everything was, like, cool and stuff because -- this is the good part, see -- we would tunnel the MySQL connection from Here to There over SSH! Brilliant! It only needs a short break in the service from Here, then all the database updates that might come from Here go There! Yeah! So I began trying that out today. It was a bit of a pain to set up. I had to do some funky firewall-fu There to get SSH in in the first place. Then I had to figure out the right syntax for netmasks for hosts.allow (for the record, it's 255.255.255.0, not /24). Then I had to figure out how to get the MySQL client to connect to an arbitrary port. That took a while. I offer you this hard-won piece of knowledge in the spirit of Free Knowledge:

When using the MySQL client, do not confuse the -H option (output in HTML, please) with the -h option (connect to the specified host, please). That's a silly mistake to make.

However, what's not a silly mistake is expecting -h localhost to do the right thing and connect. This is either an omission in the otherwise-excellent MySQL, or else a case of our nameserver not having a record for localhost. I strongly suspect the latter.

That said, it appears to be working: I can now be refused a connection to the MySQL server There from Here. Truly, I am a golden god.
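For the record, here's roughly what the tunnel setup looks like as commands. This is a sketch under assumptions: the host name, user names and local port are all invented, and the functions only print the commands so nothing actually connects anywhere. One thing worth knowing about the -h localhost trouble above, though the post doesn't confirm this was the cause: on Unix the MySQL client treats the name localhost specially and talks to the local socket file instead of TCP, so the port is ignored. Pointing it at 127.0.0.1 forces TCP and therefore the tunnel.

```shell
# Hypothetical values; none of these come from the setup described above.
LOCAL_PORT=3307                  # port Here that the tunnel listens on
REMOTE_HOST=there.example.com    # the server There
REMOTE_USER=admin                # SSH login There
DB_USER=dbuser                   # MySQL account

# Print the SSH command that forwards LOCAL_PORT Here to MySQL's
# default port (3306) on the loopback interface There.
tunnel_cmd() {
    echo "ssh -N -L ${LOCAL_PORT}:127.0.0.1:3306 ${REMOTE_USER}@${REMOTE_HOST}"
}

# Print the client command that goes through the tunnel. Note 127.0.0.1,
# not localhost, so the client uses TCP and honours -P.
client_cmd() {
    echo "mysql -h 127.0.0.1 -P ${LOCAL_PORT} -u ${DB_USER} -p"
}

tunnel_cmd
client_cmd
```

Run the first printed command in one terminal, then the second in another. If the server There still refuses the connection, its grant tables probably need an entry for connections arriving from its own loopback address.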

Except maybe when it comes to backups or SCSI or something. I ran into some problems with AMANDA's backups last night. I saw these rather frightening messages this morning in dmesg. After sticking my tongue cutely out the side of my mouth to indicate fierce concentration and colouring in some printed log files in different fluorescent colours, I was left with this series of messages:

Aug 23 23:46:57 localhost /kernel: (sa0:ahc0:0:3:0): SCB 0xe - timed out
Aug 23 23:46:57 localhost /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Aug 23 23:46:57 localhost /kernel: <<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Aug 23 23:46:58 localhost /kernel: (sa0:ahc0:0:3:0): Queuing a BDR SCB
Aug 23 23:46:58 localhost /kernel: (sa0:ahc0:0:3:0): Bus Device Reset Message Sent
Aug 23 23:46:59 localhost /kernel: (sa0:ahc0:0:3:0): SCB 0xe - timed out
Aug 23 23:46:59 localhost /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Aug 23 23:46:59 localhost /kernel: <<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Aug 23 23:46:59 localhost /kernel: (sa0:ahc0:0:3:0): no longer in timeout, status = 34b
Aug 23 23:46:59 localhost /kernel: ahc0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 23 23:46:59 localhost /kernel: (sa0:ahc0:0:3:0): failed to write terminating filemark(s)
Aug 23 23:47:59 localhost /kernel: (sa0:ahc0:0:3:0): SCB 0xe - timed out
Aug 23 23:47:59 localhost /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<

...and on it goes.

Saint Google asserts that this is probably a case of SCSI cables not being terminated properly, or getting too close to the power supply. Sure enough, the latter may be a problem. I made what adjustments I could without taking down the server, and we'll see what happens tomorrow. Weird. I am having the strangest sense of deja vu right now looking at that log entry in vi. Huh.

What else? I'm typing this right now at a local coffee shop where I was able to pick up wireless service; unfortunately, the cheap bastards want money. I tried pinging various addresses for a while, thinking about setting up an IP-over-ICMP-or-possibly-over-DNS proxy from my home network, then gave up and turned off the wireless card. It's good to know that it works, and it's good to know that there are places left where you can hear both Lisa Stansfield and Rick Astley in the space of five minutes. And there was much rejoicing.

Cool bit of the day from the PHP docs:

<directory /var/www/html/mydatabase>
    php_value mysql.default_user fred
    php_value mysql.default_password secret
    php_value mysql.default_host server.example.com
</directory>

Graham Rule at ed dot ac dot uk, you rule.

No tags
Remember that great swooping shot out of Cam's mouth in Ferris Bueller's Day Off?
2004-08-27 14:35:16

From this June posting to the wine-users mailing list:

On Sat, Jun 19, 2004 at 06:54:09PM -0400, eternal wrote:
> > The cvs sources reserve memory up front. This is incompatible with FreeBSD's
> > mmap address allocation algorithm. The current Wine implementation *can't*
> > work by design on FreeBSD.
>
> mhmm... rather weak, if you ask me... when is this as of? the
> wine-20040505 port didnt have this issue, but, then again, it had alot
> of other issues that made it useless.....

Some time in May. Check the creation date of wine/libs/wine/mmap.c when the wine_anon_mmap() function was moved out of wine/libs/wine/loader.c.

I've asked a question on FreeBSD's arch@ mailing list, but haven't had a reply yet. I'll give it a week or 10 days and if no response by then, I'll email one of the FreeBSD vm developers directly with a cc to the private developers mailing list. I see no reason why the FreeBSD algorithm can't be changed to allow Wine to function the way it is now coded.

It is unlikely that a change to the mmap address allocation algorithm will ever make it into the FreeBSD4 tree though. Hopefully by the time a FreeBSD5 stable branch is created.

-- John Birrell

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

No tags
