Quotas are on, right?
16 Oct 2012
Tomorrow I'm upgrading firmware on a disk array that's attached to a small cluster I manage; yesterday, in preparation for that, I ran a full backup of the disks in question. I noticed that the home directories were taking longer than I expected, so I checked how full they were. The answer was 97%. Oh, fuck.
The prof whose cluster this is asked for quotas to be set up for everyone; he didn't have a lot of disk space to attach, and wanted to impose some discipline on his lab. And I'd done so...only somehow, the quotas were off now, probably because I'd left them off the last time I'd had to fiddle with quotas. Because of that, one user was taking up nearly half the disk, and another was taking up almost a third. To make things worse, I had not set up my usual Nagios monitoring (disk space, say) for this machine because Ganglia was set up on it, and I'd vaguely thought that running two such systems would be silly...so I was not getting my usual "OMG WTF BBQ" messages from Nagios.
It gets worse. I'd put in cron scripts that maintained the quota files, nagged users by email and CC'd me...but the permissions were 544, which meant they never ran. No email? Well, then, everything must be fine, right? Sigh.
So:
I talked to the user w/half the disk space, and it turned out that almost all of it was in a directory called "old" which she could delete w/o problems. That got us space.
I whipped up a simple Nagios plugin to check that quotas were on, and made sure I got a complaint; I turned on quotas on another partition, and made sure Nagios told me it was fine.
I fixed the permissions on the cron scripts, and made sure they ran (I left the debug setting on, and holy crap is it verbose...I'll need to fix that).
I'm considering adding a Nagios plugin that checks for cron files (/etc/cron.*) that are not executable (although if I'm lucky, maybe there's something in the cron runner that'll complain about this).
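A check for non-executable cron files could look roughly like the sketch below. It is only an illustration, not an existing plugin: the function names, the messages and the choice of WARNING as the severity are all made up here. It skips /etc/cron.d on purpose, since files there are crontab fragments rather than scripts and are not supposed to be executable, and it skips dotfiles on the assumption that they are placeholders.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check for non-executable files in cron directories."""
import glob
import os
import stat

# Nagios plugin exit codes
OK, WARNING = 0, 1

ANY_EXEC = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def non_executable_files(directories):
    """Return regular, non-hidden files that have no execute bit set at all."""
    found = []
    for directory in directories:
        for name in sorted(os.listdir(directory)):
            if name.startswith("."):
                continue  # assumption: dotfiles are placeholders, not jobs
            path = os.path.join(directory, name)
            mode = os.stat(path).st_mode
            if stat.S_ISREG(mode) and not (mode & ANY_EXEC):
                found.append(path)
    return found


def check_cron_dirs(pattern="/etc/cron.*", skip=("cron.d",)):
    """Return (exit_code, message) for the directories matching the pattern."""
    dirs = [
        d for d in sorted(glob.glob(pattern))
        if os.path.isdir(d) and os.path.basename(d) not in skip
    ]
    bad = non_executable_files(dirs)
    if bad:
        return WARNING, "CRON WARNING - not executable: " + ", ".join(bad)
    return OK, f"CRON OK - all scripts in {len(dirs)} directories are executable"

# When installed as a plugin, finish with something like:
#   code, message = check_cron_dirs(); print(message); sys.exit(code)
```

Note that this only looks for files with no execute bit at all; a file that is executable by its owner but not by anyone else passes the check.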
And as a reminder to myself: if repquota gives horribly wrong information, run "quotaon -p" to verify that quotas are, in fact, on.
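For the record, a quota check built on "quotaon -p" might be sketched like this. This is not the plugin I actually wrote, just a minimal illustration: the function names and messages are invented, and it assumes that quotaon prints one line per quota type in the form "user quota on /home (/dev/sdb1) is off". If your version of the quota tools prints something different, the regex will need adjusting. It parses stdout only and deliberately ignores the exit status of quotaon.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check that quotas are on, using `quotaon -p`."""
import re
import subprocess
import sys

# Nagios plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed line format: "user quota on /home (/dev/sdb1) is off"
STATE_RE = re.compile(r"^(\w+) quota on (\S+) \((.*?)\) is (on|off)\b")


def parse_quota_state(text):
    """Return {(quota_type, mountpoint): True/False} parsed from the output."""
    states = {}
    for line in text.splitlines():
        match = STATE_RE.match(line.strip())
        if match:
            qtype, mount, _device, state = match.groups()
            states[(qtype, mount)] = (state == "on")
    return states


def check(text, mountpoint, qtype="user"):
    """Map the parsed state to an (exit_code, message) pair."""
    states = parse_quota_state(text)
    if (qtype, mountpoint) not in states:
        return UNKNOWN, f"QUOTA UNKNOWN - no {qtype} quota line for {mountpoint}"
    if states[(qtype, mountpoint)]:
        return OK, f"QUOTA OK - {qtype} quotas are on for {mountpoint}"
    return CRITICAL, f"QUOTA CRITICAL - {qtype} quotas are OFF for {mountpoint}"


def main(argv):
    """Run quotaon -p for one mountpoint (default /home) and report."""
    mountpoint = argv[1] if len(argv) > 1 else "/home"
    try:
        result = subprocess.run(
            ["quotaon", "-p", mountpoint], capture_output=True, text=True
        )
    except OSError as exc:
        print(f"QUOTA UNKNOWN - could not run quotaon: {exc}")
        return UNKNOWN
    code, message = check(result.stdout, mountpoint)
    print(message)
    return code

# When installed as a plugin, finish with: sys.exit(main(sys.argv))
```

Keeping the parsing in its own function means the logic can be tested against canned output, without needing a machine where quotas happen to be off.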