Carousel is a lie!

Entries tagged "mysql".

Stay on target IV...
14th August 2004

Getting closer to getting MySQL working. I came across this post today which seemed to be nearly identical to what was happening to me. I followed the suggestion and took out the --enable-static option I'd been putting into configure. Result: much happier, with hardly any crashing at all. Now if I can just get it to find the user.frm table, I'll be a happy monkey. All this to pick up a copy of libmysqlclient.so. I must be on crack.

Tags: mysql.
*NEWS FLASH
18th August 2004

When you have:

  1. a PHP-enabled Apache web server,
  2. with a working MySQL connection,
  3. already-working pages in PHP that can connect successfully to the database in question,
  4. account details for MySQL, and
  5. all the necessary privileges in MySQL and the server

you DO NOT need me to install phpMyAdmin in order to manipulate tables. Nor do you get bonus points for asking me how to connect to MySQL without phpMyAdmin. No, thank you.

Tags: mysql, rant.
File under Golden
24th August 2004

So I had a bit of a brainstorm the other day. I've got two servers: Here and There. There's some stuff Here that needs to move There. The problem is that the server Here is in use a fair bit, and part of that use involves INSERTing things in MySQL and then SELECTing them back again. It's a pain to shut down things Here altogether in preparation for moving There, particularly as the move is liable to take, oh, twenty-four hours or so. The database needs to be consistent between the two, but the length of the move makes that impractical unless Special Measures are taken.

Dark server room. Midnight. We see THE SUPERVISOR talking to THE SYSADMIN.

SUPERVISOR: That database needs to be consistent, dammit!

SYSADMIN: (tightly) I can't do that without taking...special measures.

SUPERVISOR grimaces.

SUPERVISOR: Whatever it takes, dammit. I don't want to know.

SYSADMIN: All right, then. I'll do your dirty work.

SYSADMIN turns slowly and walks out the door.

SUPERVISOR: Dammit!

I will concede that's a little dramatic. But what else would you call MILITARY-GRADE ENCRYPTION, i.e. SSH tunnels from Here to There? (It must be military grade; it's developed in Canada.) Okay, so it's not that big a deal for you people what think all the time. But it was pretty clever, I thought, and would ensure that everything was, like, cool and stuff because -- this is the good part, see -- we would tunnel the MySQL connection from Here to There over SSH! Brilliant! It only needs a short break in the service from Here, then all the database updates that might come from Here go There! Yeah!

So I began trying that out today. It was a bit of a pain to set up. I had to do some funky firewall-fu There to get SSH in in the first place. Then I had to figure out the right syntax for netmasks for hosts.allow (for the record, it's 255.255.255.0, not /24). Then I had to figure out how to get the MySQL client to connect to an arbitrary port. That took a while. I offer you this hard-won piece of knowledge in the spirit of Free Knowledge:

When using the MySQL client, do not confuse the -H option (output in HTML, please) with the -h option (connect to the specified host, please). That's a silly mistake to make.

However, what's not a silly mistake is expecting -h localhost to do the right thing and connect. This is either an omission in the otherwise-excellent MySQL, or else a case of our nameserver not having a record for localhost. I strongly suspect the latter.

That said, it appears to be working: I can now be refused a connection to the MySQL server There from Here. Truly, I am a golden god.
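For the record, here is roughly what the tunnel setup boils down to, sketched in Python only so that both command lines sit in one place. The host name there.example.com, the account names and the local port 3307 are placeholders made up for illustration; the flags themselves (ssh -N -L, mysql --host/--port/--user) are the standard ones. One detail worth noting: on Unix the MySQL client treats the host name localhost specially and connects through the local socket file, ignoring any port it was given, so the near end of the tunnel has to be addressed as 127.0.0.1.

```python
# Sketch only: builds the two command lines, it does not run them.
# "there.example.com", "tunneluser", "dbuser" and port 3307 are placeholders.


def tunnel_command(remote_host, ssh_user, local_port=3307, remote_port=3306):
    """ssh argv forwarding local_port on this machine to MySQL on remote_host."""
    forward = f"{local_port}:127.0.0.1:{remote_port}"
    # -N: no remote command, just hold the forward open
    return ["ssh", "-N", "-L", forward, f"{ssh_user}@{remote_host}"]


def client_command(db_user, local_port=3307):
    """mysql argv that goes through the tunnel.

    127.0.0.1 makes the client use TCP; "localhost" would select the
    local socket file and the port would be ignored.
    """
    return ["mysql", "--host=127.0.0.1", f"--port={local_port}",
            f"--user={db_user}", "-p"]


if __name__ == "__main__":
    print(" ".join(tunnel_command("there.example.com", "tunneluser")))
    print(" ".join(client_command("dbuser")))
```

With the tunnel up, the server There sees the connection as coming from its own loopback address, so the MySQL account used has to be allowed from there.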

Except maybe when it comes to backups or SCSI or something. I ran into some problems with AMANDA's backups last night. I saw these rather frightening messages this morning in dmesg. After sticking my tongue cutely out the side of my mouth to indicate fierce concentration and colouring in some printed log files in different fluorescent colours, I was left with this series of messages:

Aug 23 23:46:57 localhost /kernel: (sa0:ahc0:0:3:0): SCB 0xe - timed out
Aug 23 23:46:57 localhost /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Aug 23 23:46:57 localhost /kernel: <<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Aug 23 23:46:58 localhost /kernel: (sa0:ahc0:0:3:0): Queuing a BDR SCB
Aug 23 23:46:58 localhost /kernel: (sa0:ahc0:0:3:0): Bus Device Reset Message Sent
Aug 23 23:46:59 localhost /kernel: (sa0:ahc0:0:3:0): SCB 0xe - timed out
Aug 23 23:46:59 localhost /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Aug 23 23:46:59 localhost /kernel: <<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Aug 23 23:46:59 localhost /kernel: (sa0:ahc0:0:3:0): no longer in timeout, status = 34b
Aug 23 23:46:59 localhost /kernel: ahc0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 23 23:46:59 localhost /kernel: (sa0:ahc0:0:3:0): failed to write terminating filemark(s)
Aug 23 23:47:59 localhost /kernel: (sa0:ahc0:0:3:0): SCB 0xe - timed out
Aug 23 23:47:59 localhost /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<

...and on it goes.

Saint Google asserts that this is probably a case of SCSI cables not being terminated properly, or getting too close to the power supply. Sure enough, the latter may be a problem. I made what adjustments I could without taking down the server, and we'll see what happens tomorrow. Weird. I am having the strangest sense of deja vu right now looking at that log entry in vi. Huh.

What else? I'm typing this right now at a local coffee shop where I was able to pick up wireless service; unfortunately, the cheap bastards want money. I tried pinging various addresses for a while, thinking about setting up an IP-over-ICMP-or-possibly-over-DNS proxy from my home network, then gave up and turned off the wireless card. It's good to know that it works, and it's good to know that there are places left where you can hear both Lisa Stansfield and Rick Astley in the space of five minutes. And there was much rejoicing.

Cool bit of the day from the PHP docs:

<directory /var/www/html/mydatabase>
    php_value mysql.default_user fred
    php_value mysql.default_password secret
    php_value mysql.default_host server.example.com
</directory>

Graham Rule at ed dot ac dot uk, you rule.

Tags: mysql.
Aha!
6th September 2004

A while back I set up greylisting on Postfix for my home server. It works well, but I have the same concerns now that I did then. The script (smtpd-policy.pl from the examples section of Postfix' source) feels like a bit of a crock; yes, it's just the example script, but I don't like the Berkeley DB files, and comments in the code like "DO NOT create the greylist database in a file system that can run out of space" make me nervous. It hasn't been a problem -- in, oh, six months of running the file is only up to about 5.5 MB. But still: there's no provision for removing old entries, which means an awful soul-searching battle with the database if you ever need to trim it.

I had a brief look at the script tonight, hoping to find a way to maybe hack in MySQL support, but decided to check with Saint Google first. Sure enough, there's gps, the Greylist Policy Service for Postfix. Uses C++ for speed and MySQL/PostgreSQL for the backend, which is nice. I should be able to hack up a migration script for the old entries (just as soon as I hack up a migration script for all the old journal entries...), and all should be good.

One thing I'm noticing with greylisting, though, is just how many attempts are being made from multiple IP addresses within a short time; one sender, today, tried from four different IP addresses within five minutes, all using the same made-up email address. The original Perl script has the advantage that I can change it easily -- I know Perl, and I'd be pretty much starting from scratch with C++ -- and maybe add the ability to track this sort of thing. It'd be nice to be able to tarpit attempts to do this, say on the third attempt.
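The tracking idea can be sketched in a few lines. This is not smtpd-policy.pl or gps, just a stand-alone Python illustration of the bookkeeping: the class name, the five-minute window and the threshold of three addresses are assumptions for the example, and the store is an in-memory dict rather than Berkeley DB or MySQL. Old entries are dropped as new attempts arrive, which also covers the "no provision for removing old entries" worry on a small scale.

```python
from collections import defaultdict

WINDOW_SECONDS = 300   # five minutes, as in the example above
MAX_DISTINCT_IPS = 3   # hypothetical threshold: react on the third address


class SenderTracker:
    """Track which client IPs have used a given envelope sender recently."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_DISTINCT_IPS):
        self.window = window
        self.limit = limit
        self._seen = defaultdict(dict)  # sender -> {ip: last-seen timestamp}

    def record(self, sender, ip, now):
        """Record one attempt at time `now` (seconds).

        Returns True when the sender has been seen from `limit` or more
        distinct IPs within the window, i.e. a candidate for tarpitting.
        """
        attempts = self._seen[sender]
        attempts[ip] = now
        # Forget addresses whose last attempt fell out of the window.
        cutoff = now - self.window
        for old_ip in [a for a, seen in attempts.items() if seen < cutoff]:
            del attempts[old_ip]
        return len(attempts) >= self.limit
```

A policy script would call record() once per delivery attempt with the current time and decide what to do with the answer; what "tarpit" means in practice is left open here.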

Tarpitting...another problem with Linux. The TARPIT module for netfilter has yet to be updated to work with the 2.6 kernel, and I really don't want to switch back to 2.4 just for this. LaBrea is nice, and I'm running a lashed-together natd configuration on my FreeBSD firewall box in conjunction with LaBrea running on my desktop on a second interface. It works, but it doesn't work in the case of a Linux webserver running on its own, outside the main firewall. I'm even less a kernel hacker than I am a C++ programmer, and figuring out the compiling problems and changed skbuff route structures (say) is beyond me. It's things like this that make me want to move to OpenBSD. Yeah, rebuilding a server and learning a new firewall language is a pain in the ass, but at least it's one I can handle.

Tags: mysql, postfix, spam.
HOWTO: Recover from old MySQL data files
7th March 2005

Reminder for myself.

So you've got some backed-up MySQL table files (if that's the right term), rather than a proper dump. Untar them somewhere, and note the path to the data files -- say, /home/foo/mysql_recovery/data. Copy /etc/my.cnf to your home directory. Edit it and change the port to something different -- say, 3307. Run:

/usr/local/mysql/bin/mysqld --defaults-file=/home/foo/my.cnf --datadir=/home/foo/mysql_recovery/data

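Then run (forcing TCP, since on a local connection the client otherwise goes for the default socket and ignores the port):

mysqldump --protocol=TCP -P 3307 --opt -u foo -p database > recovery.sql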

Of course, all this could be prevented if you were running mysqldump nightly instead of just copying the data directories...

Tags: mysql.
It's a love affair...mainly Nagios and my network
28th July 2007

I can get really, really focussed sometimes. Every now and then that happens with Nagios.

Yesterday I had some time to kill before I went home, so I looked over my tickets in RT. (I work in a small shop, so a lot of the time the tickets in RT are a way of adding things to my to-do list.) There was one that said to watch for changes in our web site's main page; I'd added that one after MySQL'd had problems one time -- ran out of connections, I think -- and Mambo had displayed a nice "Whoops! Can someone please tell the sysadmin?" page (a nice change from the usual cryptic error when there's no database connection). Someone did, but it would've been nice to be paged about it.

At home I use WebSec to keep track of some pages that don't change very often (worse luck…), and I thought of using that. It sends you the new web page with the different bits highlighted, which is a nice touch. But I wanted something tied in with Nagios, rather than another separate and special system.

So I started looking at the Nagios plugins I had, and I was surprised to find that check_http has a raft of different options, including the ability to check for regexes in the content. Sweet! I added a couple strings that'll almost certainly be there until The Next Big Redesign(tm), and done.
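The content check is simple enough to sketch. This is not check_http itself, only a small Python illustration of what "look for a regex in the page and tell Nagios" amounts to; the function name and the message wording are made up, while the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_content(body, pattern):
    """Return (exit_code, message) for a page body and a regex to look for."""
    try:
        regex = re.compile(pattern)
    except re.error as exc:
        # A broken pattern is a problem with the check, not with the site.
        return UNKNOWN, f"UNKNOWN - bad pattern: {exc}"
    if regex.search(body):
        return OK, f"OK - pattern {pattern!r} found"
    return CRITICAL, f"CRITICAL - pattern {pattern!r} not found"
```

A wrapper would fetch the page, print the message and exit with the code; a "Whoops!" page then fails the check simply because the expected string is missing.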

I started looking at the other plugins, and noticed check_hpjd. A few minutes later I was checking our printers for errors...just in time to notice a weird error that someone had emailed me about 30 seconds before. Nice!

This morning (I work from home on Saturdays in return for getting Wednesdays off to take care of Arlo) I was checking Cacti (which rocks even if they do call it a solution). /home/visitors with no free space? Wha'? Someone had run a job that'd managed to fill the whole damned partition.

Well, there's check_disk, but that's only for mounted disks — and I don't want the monitoring machine freezing if there's a problem with NFS. SNMP should do this, right? Right — the net-snmp project has the ability to throw errors if there's less than a certain amount of free space on a disk. For some reason I'd never set that up before, nor got Nagios to monitor for it. A few minutes later and check_snmp was looking for non-empty error messages:

$USER1$/check_snmp -H $HOSTADDRESS$ -o UCD-SNMP-MIB::dskErrorMsg.$ARG1$ -s ""

I looked ahead in snmpd.conf and noticed the process section. Well, hell! It's all very good to check that the web server is running, but what if there are too many Apache processes? Or too few of MySQL? Or no Postfix? Can't believe I never set this up before…

I've finally come up for breath. This wasn't what I planned on doing this morning, but I love it when a plan will come together next time.

Tags: monitoring, mysql.
mmm_mysql
4th September 2009

I've spent many hours today at $WORK banging my head against the keyboard, trying to figure out why MMM-MySQL didn't work. MMM is meant to switch write roles, or master-slave roles, among different database servers for failover and such.

While the task as a whole is complex, the steps are simple enough: the monitor daemon accepts commands from a client, then forwards those commands to agents on the different MySQL servers. At its heart it's a bunch of Perl scripts that do the things this task entails: switching IP addresses, sending arp packets, toggling read-only status on the databases, and so on.

The problem came when, for example, the monitor would tell everyone to change their IP addresses and report success -- only I could see that wasn't working. Or the agent would run the command to make the database writable and report success, yet I could see that it wasn't working.

There were two factors at work here.

In the latter example, the agent would run the command bin/mysql_allow_write. Here's the relevant bit of code, edited for clarity:

# Read config file and status
our $config = ReadConfig("mmm_agent.conf");

print MySqlAllowWrite();

exit(0);

sub MySqlAllowWrite($) {

    [snip]

    # connect to server
    my $dsn = "DBI:mysql:host=$host;port=$port";
    my $dbh = DBI->connect($dsn, $user, $pass, { PrintError => 0 });
    return "ERROR: Can't connect to MySQL (host = $host:$port, user = $user)!" unless ($dbh);

    # set read_only to OFF
    (my $read_only) = $dbh->selectrow_array(q{select @@read_only});
    return "ERROR: SQL Query Error: " . $dbh->errstr unless (defined $read_only);
    return "OK" unless ($read_only);

    my $sth = $dbh->prepare("set global read_only=0");
    my $res = $sth->execute;
    return "ERROR: SQL Query Error: " . $dbh->errstr unless($res);
    $sth->finish;

    $dbh->disconnect();
    $dbh = undef;

    return "OK";
}

The subroutine is reporting errors but nothing watches for them. The code that calls the script itself just uses backticks and does no checking:

sub ExecuteBin {
    my $command = shift;
    my $params = shift;
    my $return_all = shift;

    my $path = "$config->{bin_path}/$command";

    return undef unless (-x $path);
    LogDebug("Core: Execute_bin('$path $params')");
    my $res = `$path $params`;

    unless ($return_all) {
        my @lines = split /\n/, $res;
        return pop(@lines);
    }

    return $res
}

The code to change IP address is much the same:

sub AddInterfaceIP($$) {
    my $if = shift;
    my $ip = shift;

    if ($^O eq 'linux') {
        `/sbin/ip addr add $ip/32 dev $if`;
    } elsif ($^O eq 'solaris') {
        `/usr/sbin/ifconfig $if addif $ip`;
        my $logical_if = FindSolarisIF($ip);
        unless ($logical_if) {
            print "ERROR: Can't find logical interface with IP = $ip\n";
            exit(1);
        }
        `/usr/sbin/ifconfig $logical_if up`;
    } else {
        print "ERROR: Unsupported platform!\n";
        exit(1);
    }
}

Needless to say I'll be filing bug reports.
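For contrast, here is a minimal sketch of what "watching for errors" could look like. It is Python rather than the Perl that MMM is written in, and it is not a patch for MMM: the names run_checked and CommandFailed are made up for the example. The idea is simply to look at the exit status and at the last line of output (the "OK"/"ERROR: ..." convention the scripts above already print) instead of discarding both.

```python
import subprocess
import sys


class CommandFailed(Exception):
    """Raised when a helper command exits non-zero or reports ERROR."""


def run_checked(argv):
    """Run argv and return the last line of its output.

    Raises CommandFailed if the exit status is non-zero or if the last
    line starts with "ERROR".
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        raise CommandFailed(
            f"{argv[0]} exited with {result.returncode}: {result.stderr.strip()}"
        )
    lines = result.stdout.splitlines()
    last = lines[-1] if lines else ""
    if last.startswith("ERROR"):
        raise CommandFailed(last)
    return last


if __name__ == "__main__":
    # Stand-in for a helper script: a child Python process that prints "OK".
    print(run_checked([sys.executable, "-c", "print('OK')"]))
```

Passing the command as a list also sidesteps the shell, so parameters are not re-parsed the way they are inside backticks.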

The other factor that was going on was my ignorance about the tools I was using. I couldn't figure out why the ip addr add and ip addr del commands weren't working. The agent would report success adding addresses, yet ifconfig didn't show them. What I didn't realize was that ip can manipulate addresses that ifconfig doesn't seem to see. With ifconfig, you add an additional address to an interface like so:

ifconfig eth0:0 10.0.0.2

and you see a new device called eth0:0. But with ip, you do that like so:

ip addr add 10.0.0.2/32 dev eth0

and you don't see additional devices and ifconfig doesn't see the additional address (ip addr show dev eth0 does list it). I wasn't thinking hard enough about what I meant by "I can see that it doesn't work" -- something I'm all too prone to take other people to task for (or at least act smugly about).

Ah well...the good news is that I learned something. The other good news is that, since at least a couple of these errors are in the latest versions of mmm_control, I should be able to spend some time at work improving them. Hasta la source, baby! (Or something like that...)

1 comment. Tags: bugs, mysql.
Well, that'll teach me
31st December 2009

While trying to figure out why Nagios was suddenly unable to check up on our databases, I suddenly realized that the permissions on /dev/null were wrong: 0600 instead of 0666. What the hell? I've had this problem before, and I was in the middle of something, so I set them back and went on with my life. Then it happened again, not half an hour later. I was in the same shell, so I figured it had to have been a command I'd run that had inadvertently done this.

Yep: don't run the MySQL client as root. Yes yes yes, it's bad anyway, I'll go to sysadmin hell, but this is an interesting bug. The environment variable MYSQL_HISTFILE is set to /dev/null for root...and when you exit the client, it sets the permissions for the history file to 0600. So, you know, don't do that then. (Still no fix committed, btw...)
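Since this is the kind of thing that breaks quietly, a check for it is cheap. The sketch below is an assumption about how one might watch for it, not something from the bug report: a small Python function (the name mode_problem is made up) that compares a path's permission bits against what they should be, 0666 for /dev/null.

```python
import os
import stat


def mode_problem(path, expected=0o666):
    """Return None if path has the expected permission bits, else a message."""
    actual = stat.S_IMODE(os.stat(path).st_mode)
    if actual == expected:
        return None
    return f"{path} is {actual:04o}, expected {expected:04o}"


if __name__ == "__main__":
    print(mode_problem("/dev/null") or "/dev/null looks fine")
```

Wrapped in a monitoring plugin, a non-None result would be the cue to raise an alert.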

2 comments. Tags: bug, mysql.
Checks
20th January 2010

The more I work with Python, the more I don't just like it but admire it.

Ugh...not much more right now. I've got a blocked eustachian tube that I'm self-medicating with a Python script^W^Wcold medicine, and the acetaminophen in it is making me hazy.

Tags: monitoring, mysql, security.
Nothing left to make me want to stay
9th November 2010
Growing up was wall-to-wall excitement, but I don't recall
Another who could understand at all...

-- Sloan

Monday: day two of tutorials. I found Beth Lynn in the lobby and congratulated her on being very close to winning her bet; she's a great deal closer than I would have guessed. She convinced me to show up at the Fedora 14 BoF tomorrow.

First tutorial was "NASes for the Masses" with Lee Damon, which was all about how to do cheap NASes that are "mostly reliable" -- which can be damn good if your requirements are lower, or your budget smaller. You can build a multi-TB RAID array for about $8000 these days, which is not that bad at all. He figures these will top out at around 100 users...200-300 users and you want to spend the money on better stuff.

The tutorial was good, and a lot of it was stuff I'd have liked to know about five years ago when I had no budget. (Of course, the disk prices weren't nearly so good back then...) At the moment I've got a good-ish budget -- though, like Damon, Oracle's ending of their education discount has definitely cut off a preferred supplier -- so it's not immediately relevant for me.

QOTD:

Damon: People load up their file servers with too much. Why would you put MSSQL on your file server?

Me: NFS over SQL.

Matt: I think I'm going to be sick.

Damon also told us about his experience with Linux as an NFS server: two identical machines, two identical jobs run, but one ran with the data mounted from Linux and the other with the data mounted from FreeBSD. The FreeBSD server gave a 40% speed increase. "I will never use Linux as an NFS server again."

Oh, and a suggestion from the audience: smallnetbuilder.com for benchmarks and reviews of small NASes. Must check it out.

During the break I talked to someone from a movie studio who talked about the legal hurdles he had to jump in his work. F'r example: waiting eight weeks to get legal approval to host a local copy of a CSS file (with an open-source license) that added mouseover effects, as opposed to just referring to the source on its original host.

Or getting approval for showing 4 seconds of one of their movies in a presentation he made. Legal came back with questions: "How big will the screen be? How many people will be there? What quality will you be showing it at?" "It's a conference! There's going to be a big screen! Lots of people! Why?" "Oh, so it's not going to be 20 people huddled around a laptop? Why didn't you say so?" Copyright concerns? No: they wanted to make sure that the clip would be shown at a suitably high quality, showing off their film to the best effect. "I could get in a lot of trouble for showing a clip at YouTube quality," he said.

The afternoon was "Recovering from Linux Hard Drive Disasters" with Ted Ts'o, and this was pretty amazing. He covered a lot of material, starting with how filesystems worked and ending with deep juju using debugfs. If you ever get the chance to take this course, I highly recommend it. It is choice.

Bits:

I got to ask him about fsync() O_PONIES; he basically said if you run bleeding edge distros on your laptop with closed-source graphics drivers, don't come crying to him when you lose data. (He said it much, much nicer than that.) This happens because ext4 assumes a stable system -- one that's not crashing every few minutes -- and so it can optimize for speed (which means, say, delaying sync()s for a bit). If you are running bleeding edge stuff, then you need to optimize for conservative approaches to data preservation and you lose speed. (That's an awkward sentence, I realize.)

I also got to ask him about RAID partitions for databases. At $WORK we've got a 3TB disk array that I made into one partition, slapped ext3 on, and put MySQL there. One of the things he mentioned during his tutorial made me wonder if that was necessary, so I asked him what the advantages/disadvantages were.

Answer: it's a tradeoff, and it depends on what you want to do. DB vendors benchmark on raw devices because it gets a lot of kernel stuff out of the way (volume management, filesystems). And if you've got a SAN where you can a) say "Gimme a 2.25TB LUN" without problems, and b) expand it on the fly because you bought an expensive SAN (is there any other kind?), then you've got both speed and flexibility.

OTOH, maybe you've got a direct-attached array like us and you can't just tell the array to double the LUN size. So what you do is hand the raw device to LVM and let it take care of resizing and such -- maybe with a filesystem, maybe not. You get flexibility, but you have to give up a bit of speed because of the extra layers (vol mgt, filesystem).

Or maybe you just say "Screw it" like we have, and put a partition and filesystem on like any other disk. It's simple, it's quick, it's obvious that there's something important there, and it works if you don't really need the flexibility. (We don't; we fill up 3TB and we're going to need something new anyhow.)

And that was that. I called home and talked to the wife and kids, grabbed a bite to eat, then headed to the OpenDNS BoF. David Ulevitch did a live demo of how anycast works for them, taking down one of their servers to show the routing tables adjust. (If your DNS lookup took an extra few seconds in Amsterdam, that's why.) It was a little unsettling to see the log of queries flash across the screen, but it was quick and I didn't see anything too interesting.

After that, it was off to the Gordon Biersch pub just down the street. The food was good, the beer was free (though the Marzen tasted different than at the Fairmont...weird), and the conversation was good. Matt and Claudio tried to set me straight on US voter registration (that is, registering as a Democrat/Republican/Independent); I think I understand now, but it still seems very strange to me.

Tags: beer, lisa, mysql, scaryvikingsysadmins.
Xmas Maintenance 2010: Lessons learned
11th January 2011

Xmas vacation is when I get to do big, disruptive maintenance with a fairly free hand. Here's some of what I did and what I learned this year.

Order of rebooting

I made the mistake of rebooting one machine first: the one that held the local CentOS mirror. I did this thinking that it would be a good guinea pig, but then other machines weren't able to fetch updates from it; I had to edit their repo files. Worse, there was no remote console on it, and no time (I thought) to take a look.

Automating patching

Last year I tried getting machines to upgrade using Cfengine like so:

centos.some_group_of_servers.Hr14.Day29.December.Yr2009::
          "/usr/bin/yum -q -y clean all"
          "/usr/bin/yum -q -y upgrade"
          "/usr/bin/reboot"

This didn't work well: I hadn't pushed out the changes in advance, because I was paranoid that I'd miss something. When I did push it out, all the machines hit on the cfserver at the same time (more or less) and didn't get the updated files because the server was refusing connections. I ended up doing it by hand.

This year I pushed out the changes in advance, but it still didn't work because of the problems with the repo. I ran cssh, edited the repos file and updated by hand.

This worked okay, but I had to do the machines in separate batches -- some needed to have their firewall tweaked to let them reach a mirror in the first place, some I wanted to watch more carefully, and so on. That meant going through a list of machines, trying to figure out if I'd missed any, adding them by hand to cssh sessions, and so on.

I may need to give in and look at RHEL, or perhaps func or better Cfengine tweaking will do the job.

Staggering reboots

Quick and dirty way to make sure you don't overload your PDUs:

sleep $(expr $RANDOM / 200 ) && reboot
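In bash, $RANDOM runs from 0 to 32767, so the one-liner sleeps somewhere between 0 and 163 seconds. A variation on the same idea, swapped in here purely as an illustration, is to derive the delay from the hostname, so that each machine always lands in the same slot and two runs don't shuffle the order. The function name and the choice of SHA-256 are assumptions for the sketch; nothing like this was used during the maintenance described here.

```python
import hashlib


def reboot_delay(hostname, max_delay=163):
    """Deterministic delay in seconds, between 0 and max_delay inclusive.

    The delay is derived from a hash of the hostname, so it is stable
    across runs but spread out across machines.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (max_delay + 1)


if __name__ == "__main__":
    import socket
    print(reboot_delay(socket.gethostname()))
```

The trade-off is that the spread is only as even as the hash makes it for a given set of names; the random version has no such memory but also no repeatability.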

Remote consoles

Rebooting one server took a long time because the ILOM was not working well, and had to be rebooted itself.

Upgrading the database servers w/the 3 TB arrays took a long time: stock MySQL packages conflicted with the official MySQL rpms, and fscking the arrays takes maybe an hour -- and there's no sign of life on the console while you're doing it. Problems with one machine's ILOM meant I couldn't even get a console for it.

OpenSuSE

Holy mother of god, what an awful time this was. I spent eight hours on upgrades for just nine desktop machines. Sadly, most of it was my fault, or at least bad configuration:

Special machines

These machines run some scientific software: one master, three slaves. When the master starts up at boot time, it tries to SSH to the slaves to copy over the binary. There appears to be no, or poor, rate throttling; if the slaves are not available when the master comes up, you end up with the following symptoms:

The problem is that umpty scp processes on the slave are holding open the binary, and the kernel gets confused trying to run it.

I also ran into problems with a duff cable on the master; confusingly, both the kernel and the switch said it was still up. This took a while to track down.

Virtual Machines

It turned out that a couple of my kvm-based VMs did not have jumbo frames turned on. I had to use virt-manager to shut down the machines, turn on virtio on the drivers, then reboot. However, kudzu on the VMs then saw these as new interfaces and did not configure them correctly. This caused problems because the machines were LDAP clients and hung when the network was unavailable.

Tags: cfengine, jumboframes, mysql, rant, toptip, work.
Linux Con -- Day 2
19th August 2011

Thursday morning was the keynote from Dr. Irving Wladawsky-Berger at IBM. His memories of Linux ascendancy were interesting...possibly because of the cheerleading/"We would simply prevail" feeling I felt. But his speculation on what would come was fuzzy and handwavy...slides with things like "Smart retail / Smart traffic/ Smart cities / Smart regions / Smart planet / Intelligent oil field technology" (wait, what happened to smart?) and graphs of Efficiency vs. Transformation, with a handy downward-sloping line delineating "Reinventing Business" from "Rethinking IT", just made THE RAGE come on.

The HP speech that came after wasn't much better, so I ducked out after five minutes...perhaps a mistake, in retrospect. I will say, though, that it amazes me that multitasking, in 2011, is something to brag about.

Next up was the presentation from IBM on "Improving Storage in KVM-based clouds". Despite teh buzzwords, it boiled down to an interesting war story about debugging crappy FS performance, from verifying ("Yes, the users are right when they say it sucks") to fixes ("This long-term kernel project will add the feature we need to stop sucking!"). If I can find the slides, I highly recommend reading them...there's a lot of practical advice in there.

Next up was a presentation by the mysteriously-employed Christoph on Linux in the world of finance. It was a short presentation -- a lot of presentations at LinuxCon have been short -- but he made up for it with a lively Q&A afterward. (To be fair, he explained at the beginning that he was used to a much more hostile/loud audience and a much more interactive presentation style, and actively solicited questions.)

Right, so: Linux is used in finance a lot, because it's fast and very, very tweakable. He describes this as "Linux hotrodding", which seems to capture the attitude very well. Sadly, a lot of this stays in-house because these tweaks are considered part of the "secret sauce" that makes them money.

I asked if the traders were involved in the technical side of things, or if it was more like "Let me know when my brilliant algorithm is sufficiently fast." Answer: no, traders are very, very technical (some give keynotes at tech conferences), and there is very tight integration between the two. I asked if the culture was as loud, macho and aggressive as the stereotype. Answer: yes. Someone asked why Solaris usage had declined. Answer: neither traders ("You got bought! You're a loser!") nor techies ("Oracle kills MySQL and puppies!") liked Oracle buying Sun.

And now for an opposing view.

I spoke after the talk to three sysadmins from the same trading company, and they disputed some of Christoph's points. First, their company contributes back to open source/Free software; their CTO says it's a moral imperative. They've open-sourced their own trading software, though not the algorithms ("algos" if you're a trader type) that make them money. They admit that this makes their company unusual; in their industry, secrecy is the rule.

Second, they said the culture varies from company to company, and that anyhow it's very different now that MIT PhDs and such are being hired. It's not all "Wall Street".

And one bit they confirmed: hotrodding. Things like overclocking their chips -- but to the degree that the vendors phone them up to say "You'll burn out your CPU in a week!" Response: "Okay." Because it'll make more money in the first hour it's running than the CPU costs.

I had lunch with Chris, who I used to work with, and caught up on everything. Then I hung out in the vendor area a bit. The PandaBoard was neat: Ubuntu 10.10, playing a 1080p movie trailer and drawing less than two watts. Incredible.

I buttonholed the FreeIPA guy; complimented him on the talk, and asked some questions. Master-slave in FreeIPA LDAP server? No, multi-master only. Doesn't that make you nervous? No. Doesn't keeping config information for the LDAP server in LDAP, rather than a plain text file, make you nervous? Shrug; if you can't read LDAP, you're probably hosed anyway. Oh, and btrfs is coming to Fedora 17, probably RHEL 7. Doesn't that make you nervous? No. (Conclusion for the home listeners: I am a misinformed worrywart.)

And Rik van Riel was there, but I forgot to hug him.

In the afternoon I went to a two-hour introduction to KVM-based virtualization. This was excellent; while I'm using KVM at the moment, I'm not familiar with the tools available. (Which probably means I shouldn't be using it....) He covered tools like virt-p2v, KSM, and how to monitor performance of VMs from the host, even if you don't have root privileges. Good stuff.

Tags: linuxcon, mysql.
Compacting the Bacula catalog
13th September 2011

Just compacted the Bacula catalog, which we keep in MySQL, as the partition it was on was starting to run out of space. (Seriously, 40 GB isn't enough?)

First thing I tried was running "optimize table" on the File table; that saved me 3 GB and took about 15 minutes. After that, I ran mysqldump and reloaded the db; that saved me another 300 MB and took closer to 30 minutes. Lesson: "optimize table" does just fine.
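
A minimal sketch of how the effect of "optimize table" could be measured, assuming the catalog schema is called bacula. The function name, the schema name and the mysql invocation in the last comment are illustrative rather than taken from the actual setup. It only prints SQL: a size report, the optimize itself, and the size report again. The data_free column is the space allocated but not used, which is roughly what an optimize can hope to give back.

```shell
#!/bin/sh
# Hypothetical helper: print SQL that reports a table's size, optimizes it,
# then reports the size again. Schema and table names are assumptions.
optimize_report_sql() {
    schema=$1
    table=$2
    size_query="SELECT data_length, index_length, data_free FROM INFORMATION_SCHEMA.TABLES WHERE table_schema = '$schema' AND table_name = '$table';"
    printf '%s\n' "$size_query"
    printf 'OPTIMIZE TABLE `%s`.`%s`;\n' "$schema" "$table"
    printf '%s\n' "$size_query"
}

# Example (not run here): optimize_report_sql bacula File | mysql -u bacula -p
```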

Tags: bacula, mysql.
SELinux and Apache - MySQL connections
10th January 2012

A long-standing project at $WORK is to move the website to a new server. I'm also using it as a chance to get our website working under SELinux, rather than just automatically turning it off. There's already one site on this server, running Wordpress, and I decided to get serious about migrating the other website, which runs Drupal.

First time I fired up Drupal, I got this error:

avc:  denied  { name_connect } for  pid=30789 comm="httpd" dest=3306
scontext=system_u:system_r:httpd_t:s0
tcontext=system_u:object_r:mysqld_port_t:s0 tclass=tcp_socket

As documented here, the name_connect permission allows you to name sockets ("these are the mysql sockets, these are the SMTP sockets...") and set permissions that way. Okay, so Drupal isn't working because SELinux has denied httpd access to the mysqld TCP port.
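
To make the denial easier to read, here is a small sketch that pulls out the three fields that matter: the process (comm), the destination port (dest) and the object class (tclass). The function name is made up, and the pattern assumes the fields appear in the order shown in the message above.

```shell
#!/bin/sh
# Hypothetical helper: read an AVC denial on stdin (possibly wrapped over
# several lines) and print "process -> port N (class)".
# Assumes comm=, dest= and tclass= appear in that order.
avc_summary() {
    tr '\n' ' ' |
        sed -n 's/.*comm="\([^"]*\)".*dest=\([0-9]*\).*tclass=\([a-z_]*\).*/\1 -> port \2 (\3)/p'
}

# Example: feed it the denial from the audit log.
printf '%s\n' \
    'avc:  denied  { name_connect } for  pid=30789 comm="httpd" dest=3306' \
    'scontext=system_u:system_r:httpd_t:s0' \
    'tcontext=system_u:object_r:mysqld_port_t:s0 tclass=tcp_socket' |
    avc_summary
# prints: httpd -> port 3306 (tcp_socket)
```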

What surprised me is that the Wordpress site did not seem to be encountering this error. The two relevant parts of the config files are:

Hm, the only difference is that localhost-vs-127.0.0.1 thing...

After some digging, it appears to be PHP's mysqli at work. From the documentation:

host: Can be either a host name or an IP address. Passing the NULL value or the string "localhost" to this parameter, the local host is assumed. When possible, pipes will be used instead of the TCP/IP protocol.

See the difference? Without looking up the code for mysqli, I think that an IP address -- even 127.0.0.1 -- makes mysqli just try TCP connections; using "localhost" makes it try a local socket first (the documentation says "pipes"; on Linux that means a Unix domain socket). Since TCP connections to the MySQL port apparently aren't allowed by default CentOS SELinux policy, the IP address version fails.
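
A rough sketch of that rule as the documentation states it, not of what mysqli does internally. The function name and the drupal user in the comments are made up for illustration. The mysql command-line client follows the same convention on Linux and can be told which transport to use with --protocol, which is a handy way to reproduce the difference outside PHP.

```shell
#!/bin/sh
# Hypothetical helper: which transport the documented rule implies for a host.
# An empty host or "localhost" means a local socket; anything else means TCP.
mysql_transport() {
    case "$1" in
        ''|localhost) echo socket ;;
        *)            echo tcp ;;
    esac
}

mysql_transport localhost   # prints: socket
mysql_transport 127.0.0.1   # prints: tcp

# The mysql client can force either one (not run here; user name is assumed):
#   mysql --protocol=SOCKET -u drupal -p
#   mysql --protocol=TCP -h 127.0.0.1 -u drupal -p
```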

Solution: make it "localhost" in both, and remember not to make assumptions.

Tags: mysql, selinux.
Which MySQL engine am I using?
20th September 2012

How to find out which MySQL engine you're using for a particular database or table? Run this query:

SELECT table_schema, table_name, engine FROM INFORMATION_SCHEMA.TABLES;
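
That query lists every table on the server, system schemas included. A small sketch for narrowing it to one database follows; the function name and the bacula schema in the example are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the engine query restricted to one schema.
# Only letters, digits and underscores are accepted in the schema name.
engine_query() {
    case "$1" in
        ''|*[!A-Za-z0-9_]*) echo "invalid schema name" >&2; return 1 ;;
    esac
    printf "SELECT table_name, engine FROM INFORMATION_SCHEMA.TABLES WHERE table_schema = '%s';\n" "$1"
}

# Example (not run here): mysql -u bacula -p -e "$(engine_query bacula)"
```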

Thanks to Electric Toolbox for the answer.

Tags: mysql.
Migrating Bacula database to InnoDB: a failed attempt
29th September 2012

On Tuesday I attempted migrating the Bacula database at work from MyISAM to InnoDB. In the process, I was also hoping to get the disk space down on the /var partition where the tables resided; I was running out of room. My basic plan was:

Here was the shell script I used to split the dump file, change the engine, and reload the tables:

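# Split the dump into one file per table, breaking at each DROP TABLE statement
csplit -ftable /net/conodonta-private/export/bulk/bacula/bacula.sql '/DROP TABLE/' '{*}'
# Switch the storage engine in every piece
sed -i 's/ENGINE=MyISAM/ENGINE=InnoDB/' table*
# Rename each piece after the table named on its first line
for i in table* ; do mv "$i" "$(head -1 "$i" | awk '{print $NF}' | tr -d '`' | sed -e 's/;/.sql/')" ; done
# Reload the tables, smallest first
for i in $(du *sql | sort -n | awk '{print $NF}') ; do echo "$i"; mysql -u bacula -ppassword bacula < "$i" ; done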

(This actually took a while to settle on, and I should have done this part of things well in advance.)

Out of 31 tables, all but three were trivially small; the big ones are Path, Filename and File. File in turn is huge compared with the others.

I had neglected to turn off binary logging, so the partition kept filling up with binary logs...which took me more than a few runs through to figure out. Eventually, late in the day, I switched the engines back to MyISAM and reloaded. That brought disk space down to 64% (from 85%). This was okay, but it was a stressful day and one that I'd brought on myself for not preparing well.
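
One way the binary logs could have been kept out of the picture, sketched under assumptions: MySQL lets a session switch off its own binary logging with SET sql_log_bin = 0, but the account needs the SUPER privilege to do so, and whether the bacula user had that is not something I can confirm. The helper name is made up; it just puts that statement in front of a dump file.

```shell
#!/bin/sh
# Hypothetical helper: emit a table dump with binary logging disabled
# for the session that loads it. Requires an account allowed to set
# sql_log_bin (assumed here, not verified).
without_binlog() {
    printf 'SET sql_log_bin = 0;\n'
    cat "$1"
}

# Example (not run here), using the file names from the reload loop:
#   without_binlog File.sql | mysql -u bacula -ppassword bacula
```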

When next I do this, I will follow this sequence:

Tags: bacula, mysql.
Slow MySQL makes Bacula cry
5th November 2012

A while back I upgraded the MySQL for Bacula at $WORK. I tested it afterward to make sure it worked well, but evidently not thoroughly enough: I discovered today that the query needed to restore a file was taking for*ever*. I lost patience at 30 minutes and killed the query; fortunately, I was able to find another way to get the file from Bacula. This is a big slowdown from previous experience. Time for some benchmarking and exploratory tweaking...

Incidentally, the faster way to get the file was to select "Enter a list of files to restore", rather than let it build the directory tree for the whole thing. The fileset in question is not that big, so I think I must be doing something pretty damned wrong to get MySQL so slow.

Tags: bacula, mysql.
Holy God, It's Done At Last
20th November 2012

After a lot of faffing about, I've accomplished the following on the backup server at $WORK:

I encountered the following problems:

I learned:

I still have to do these things:

Tags: bacula, mysql.
