Carousel is a LIE!

Plus

Where I'm going, you cannot come...
"Theologians", Wilco

Silly simple lies
They made a human being out of you...
"Flair", Josh Rouse

Careful with words -- they are so meaningful
Yet they scatter like the booze from our breath...
"The White Trash Period Of My Life", Josh Rouse

And if I ever was myself,
I wasn't that night...
"Handshake Drugs", Wilco

commands:
  !wordpress_tarball_is_present::
    "/usr/bin/wget -q -O $($(params)[_tarfile]) $($(params)[_downloadurl])"
      comment => "Downloading latest version of WordPress.";

And my conscience has it stripped down to science
Why does everything displease me?
Still, I'm trying...

"Christmas with Jesus", Josh Rouse

* The easiest and best by far is for the app to do it.  It knows its
  state intimately and is in the best position to do this.  However,
  the app needs to support this.  It doesn't have to explicitly save
  the process (as in, kernel-resident memory image, registers, etc.);
  if it can look at logs or something and say "Oh, I'm 3/4 done",
  then that's good too.

* The Condor scheduler supports this, *but* you have to do this by
  linking in its special libraries when you compile your program.  And
  none of the big vendors do this (Matlab, Mathematica, etc).

* BLCR: "It's 90% working, but the 10% will kill you." Segfaults,
  restarts only work 2/3 of the time, etc.  Open-source project from a
  federal lab and until very recently not funded -- so the response to
  "There's this bug..." was "Yeah, we're not funded. Can't do nothing
  for you." Funding has been obtained recently, so keep your fingers
  crossed.
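
A minimal sketch of the first option above (the app tracking its own progress), in Python.  The checkpoint file, the function names and the `stop_after` knob are all hypothetical; the only idea being illustrated is "record how far you got, atomically, and pick up from there on restart".

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next unit of work, or 0 if no checkpoint exists."""
    try:
        with open(path) as fh:
            return json.load(fh)["next_item"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_item):
    """Write the checkpoint to a temp file, then rename it into place.

    os.replace() is atomic on POSIX, so a crash mid-write never leaves
    a half-written checkpoint behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump({"next_item": next_item}, fh)
    os.replace(tmp, path)


def run(items, path, process, stop_after=None):
    """Process items in order, resuming from the checkpoint if there is one.

    stop_after exists only to simulate a job being killed partway through.
    Returns the index of the next unprocessed item.
    """
    start = load_checkpoint(path)
    for i in range(start, len(items)):
        if stop_after is not None and i >= stop_after:
            return i  # simulated interruption
        process(items[i])
        save_checkpoint(path, i + 1)
    return len(items)
```

Calling `run()` again with the same checkpoint path after an interruption skips the items that were already processed.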

One admin had problems with his nodes:  random slowdowns, not caused
by C-states or the other usual suspects.  It's a BIOS problem of some
sort and they're working it out with the vendor, but in the meantime
the only way around it is to pull the affected node and let the power
drain completely.  This was pointed out by a user ("Hey, why is my job
suddenly taking so long?") who was clever enough to write a
dirt-simple 10 million iteration for-loop that very, very obviously
took a lot longer on the affected node than the others.  At this point
I asked if people were doing regular benchmarking on their clusters to
pick up problems like this.  Answer: no.  They'll do benchmarking on
their cluster when it's stood up so they have something to compare it
to later, but users will unfailingly tell them if something's slow.
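
For what it's worth, the user's test can be as dumb as the sketch below.  This is my own guess at the idea in Python, not the user's actual code; the `is_slow` helper and its 1.5x tolerance are made-up numbers for illustration.

```python
import time


def busy_loop(iterations=10_000_000):
    """Do a fixed amount of trivial integer work.

    Returns (elapsed_seconds, total) so the work can't be skipped.
    """
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i
    elapsed = time.perf_counter() - start
    return elapsed, total


def is_slow(elapsed, baseline, tolerance=1.5):
    """Flag a run that took more than `tolerance` times the baseline."""
    return elapsed > baseline * tolerance


if __name__ == "__main__":
    elapsed, _ = busy_loop()
    print(f"{elapsed:.3f} s")
```

Run it on a known-good node to get a baseline, then on the suspect node; if the difference isn't obvious by eye, this probably isn't the problem you're chasing.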

I asked about HPL; my impression when setting up the cluster was, yes,
benchmark your own stuff, but benchmark HPL too 'cos that's what you
do with a cluster.  This brought up a host of problems for me, like
compiling it and figuring out the best parameters for it.  Answers:

* Yes, HPL is a bear.  Oak Ridge: "We've got someone for that and
  that's all he does."  (Response: "That's your answer for everything
  at Oak Ridge.")

* Fiddle with the params P, Q and N, and leave the rest alone.  You
  can predict the FLOPS you should get on your hardware, and if you
  get within 90% or so of that, you're fine.

* HPL is not that relevant for most people, and if you tune your
  cluster for linear algebra (which is what HPL does) you may get
  crappy performance on your real work.

* You can benchmark it if you want (and download Intel's binary if you
  do; FIXME: add link), but it's probably better and easier to stick
  to your own apps.
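
The "predict the FLOPS, then pick P, Q and N" advice above can be sketched with the usual rules of thumb.  None of these numbers came from the discussion: the 80% memory fraction, the near-square process grid and the function names are my assumptions, and the FLOPs-per-cycle figure depends entirely on your CPU.

```python
import math


def theoretical_peak_gflops(nodes, cores_per_node, ghz, flops_per_cycle):
    """Peak = nodes x cores x clock (GHz) x double-precision FLOPs per cycle."""
    return nodes * cores_per_node * ghz * flops_per_cycle


def problem_size(total_mem_bytes, mem_fraction=0.8):
    """Largest N such that an N x N matrix of 8-byte doubles fits in the
    given fraction of total memory (0.8 is an assumed rule of thumb)."""
    return int(math.sqrt(mem_fraction * total_mem_bytes / 8))


def process_grid(nprocs):
    """Pick P <= Q with P * Q == nprocs, as close to square as possible."""
    p = math.isqrt(nprocs)
    while nprocs % p:
        p -= 1
    return p, nprocs // p


def efficiency(measured_gflops, peak_gflops):
    """Fraction of theoretical peak actually achieved."""
    return measured_gflops / peak_gflops
```

For example, 4 nodes of 16 cores at 2.5 GHz and 8 FLOPs/cycle gives a peak of 1280 GFLOPS, and `process_grid(64)` suggests an 8 x 8 grid.  Compare the measured result against the peak with `efficiency()`.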

Random:

* There's a significant number of clusters that expose interactive
  sessions to users via qlogin; that had not occurred to me.

* Recommended tools:
  * ubmod: accounting graphs
  * Healthcheck scripts (Werewolf)
  * stress: cluster stress test tool
  * munin: to collect arbitrary info from a machine
  * collectl: good for, e.g., millisecond resolution of traffic spikes

* "So if a box gets knocked over -- and this is just anecdotal -- my
  experience is that the user that logs back in first is the one who
  caused it."

* A lot of the discussion was prompted by questions like "Is anyone
  else doing X?" or "How many people here are doing Y?"  Very helpful.

* If you have to return warranty-covered disks to the vendor but you
  really don't want the data to go, see if they'll accept the metal
  cover of the disk.  You get to keep the spinning rust.

* A lot of talk about OOM-killing in the bad old days ("I can't tell
  you how many times it took out init.").  One guy insisted it's a lot
  better now (3.x series).

* "The question of changing schedulers comes up in my group every six
  months."

* "What are you doing for log analysis?" "We log to /dev/null."
  (laughter) "No, really, we send syslog to /dev/null."

* Splunk is eye-wateringly expensive: 1.5 TB data/day =~ $1-2 million
  annual license.

* On how much disk space Oak Ridge has:  "It's...I dunno, 12 or 13 PB?
  It's 33 tons of disks, that's what I remember."

* Cheap and cheerful NFS:  OpenSolaris or FreeBSD running ZFS. For
  extra points, use an Aztec Zeus for a ZIL: a battery-backed 8GB
  DIMM that dumps to a compact flash card if the power goes out.

* Some people monitor not just for overutilization, but for
  underutilization: it's a chance for user education ("You're paying
  for my time and the hardware; let me help you get the best value for
  that").  For Oak Ridge, though, there's less pressure for that:
  scientists get billed no matter what.

* "We used to blame the network when there were problems.  Now their
  app relies on SQL Server and we blame that."

* Sweeping for expired data is important.  If it's scratch, then
  *treat* it as such: negotiate expiry dates and sweep regularly.

* Celebrity resemblances: Michael Moore and the guy from Dead Poets
  Society/The Good Wife.  (Those are two different sysadmins, btw.)

* Asked about my .TK file problem; no insight.  Take it to the lists.
  (Don't think I've written about this, and I should.)

* On why one lab couldn't get Vendor X to supply DKMS kernel modules
  for their hardware:  "We're three orders of magnitude away from
  their biggest customer.  We have *no* influence."

* Another vote for SoftwareCarpentry.org as a way to get people up to
  speed on Linux.

* A lot of people encountered problems upgrading to Torque 4.x and
  rolled back to 2.5.  "The source code is disgusting.  Have you ever
  looked at it?  There's 15 years of cruft in there. The devs
  acknowledged the problem and announced they were going to be taking
  steps to fix things. One step: they're migrating to C++.
  [Kif sigh]"

* "Has anyone here used Moab Web Services? It's as scary as it sounds.
  Tomcat...yeah, I'll stop there." "You've turned the web into RPC. Again."

* "We don't have regulatory issues, but we do have a
  physicist/geologist issue."

* 1/3 of the Top 500 use Slurm as a scheduler.  Slurm's srun =~
  Torque's pbsdsh; I have the impression it does not use MPI (well,
  okay, neither does Torque, but a lot of people use Torque + mpirun),
  but I really need to do more reading.

* Lmod (FIXME: add link) is an Environment Modules-compatible
  replacement (it works with old module files) that fixes some
  problems with old EM.  It's actively developed and written in Lua.

* People have had lots of bad experiences with external Fermi GPU
  boxes from Dell, particularly when attached to non-Dell equipment.

* Puppet has git hooks that let you pull out a particular branch on a node.
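
The scratch-sweeping point above is easy to sketch.  This is a minimal Python version under my own assumptions: "expired" means "mtime older than N days", and the function names and the dry-run default are hypothetical.  Whatever expiry rule you negotiate with users goes where the mtime test is.

```python
import os
import time


def expired_files(root, max_age_days, now=None):
    """Yield paths under root whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime < cutoff:
                    yield path
            except FileNotFoundError:
                continue  # file vanished while we were walking


def sweep(root, max_age_days, dry_run=True):
    """Return the expired paths; delete them only if dry_run is False."""
    victims = list(expired_files(root, max_age_days))
    if not dry_run:
        for path in victims:
            os.remove(path)
    return victims
```

Defaulting to a dry run means the first pass only reports what would go, which is handy for the "negotiate" part.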

And finally:

Q: How do you know you're with a Scary Viking Sysadmin?

A: They ask for Thor's Skullsplitter Mead at the Google BoF.

Hotel in Arizona made us all wanna feel like stars...
"Hotel Arizona", Wilco

Wasted days, wasted nights
Try to downplay being uptight...
-- "(nothinsevergonnastandinmyway) Again", Wilco

samtools view input.bam | cut -f 3 | uniq -c | sed 's/^[[:space:]]*//' | sort -k1,1nr > output.txt
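
For comparison, here is a rough Python equivalent of the `cut | uniq -c | sort` part of that pipeline.  It's a sketch: the script name in the usage line is made up, and it assumes every input line is a tab-separated SAM record with at least three fields (no header lines).  Like `uniq -c`, it counts consecutive runs, so it only gives per-reference totals if the input is sorted by reference.

```python
import sys
from itertools import groupby


def count_runs(lines):
    """Count consecutive runs of column 3, largest count first."""
    names = (line.rstrip("\n").split("\t")[2] for line in lines)
    counts = [(sum(1 for _ in group), name) for name, group in groupby(names)]
    # sorted() is stable, so runs with equal counts keep their input order
    return sorted(counts, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for count, name in count_runs(sys.stdin):
        print(f"{count} {name}")
```

Usage would be something like `samtools view input.bam | python count_refs.py > output.txt`.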

Running the mapper.sh/reducer.sh files works fine; the problem is that
under Hadoop, it fails:

I'm unsure right now if that's [this error][3] or something else I've
done wrong.  Oh well, it'll be fun to turn on debugging and see what's
going on under the hood...

...unless, of course, I'm wasting my time.  A quick search
turned up a number of Hadoop-based bioinformatics tools
([Biodoop][4], [SeqPig][5] and [Hadoop-BAM][6]), and I'm sure there
are a crapton more.

Other chores:

* Duplicating pythonbrew/modules work on another server since our
  cluster is busy
* Migrating our mail server to a VM
* Setting up print accounting with Pykota (latest challenge:
  dealing with usernames that aren't in our LDAP tree)
* Accumulated paperwork
* Renewing lapsed support on a Very Important Server

Oh well, at least I'm registered for [LISA][7].  Woohoo!

How many times have I tried  
Just to get away from you, and you reel me back?  
How many times have I lied  
That there's nothing that I can do?  

-- Sloan  

I missed my chance, but I think I'm gonna get another...

-- Sloan

I raise my glass to the cut-and-dried,  
To the amplified  
I raise my glass to the b-side.  

-- Sloan, "A-Side Wins"  

Growing up was wall-to-wall excitement, but I don't recall
Another who could understand at all...

-- Sloan

Hey you!
We've been around for a while.
If you'll admit that you were wrong, then we'll admit that we're right.

-- Sloan

There's been debate and some speculation
Have you heard?

-- Sloan

Many miles wandering from room to room
Many trees slain just to write it to you...

"Soundtrack to Mary", Soul Coughing

 Put the fake goatee on
 And it moves as cool as sugar free jazz.

 "Sugar Free Jazz", Soul Coughing

And I wondered with great admiration...

"Moon Sammy", Soul Coughing

And I hear a rumbling
I hear transmission grind
I bear witness
I have the clutch now...

"City of Motors", Soul Coughing

Los Angeles beckons the teenagers to come to her on buses
Los Angeles loves love

It is 5am, and you are listening to Los Angeles.

"Screenwriter's Blues", Soul Coughing

Born to be a god among salesmen
Working the skinny tie
Slugging down fruit juice
Extra tall, extra wide

"Blueeyed Devil", Soul Coughing

I've seen the rains of the real world come forward on the plains
I've seen the Kansas of your sweet little myth...
I'm half-drunk on babble you transmit
Through your true dreams of Wichita.

"True Dreams of Wichita", Soul Coughing

  Me:  So how do the big guys test their firewall changes?
  Matt:  I dunno...probably separate routers, duplicate hardware...
  Me:  Probably golden coffee cup holders, too.
  Matt:  Jerks.

Some kind of verb, some kind of moving thing
Something unseen, some hand is motioning to rise, to rise, to rise

Too fat fat, you must cut clean
You gotta take the elevator to the mezzanine
Chump change, and it's on, super bon bon
Super bon bon, super bon bon...

"Super Bon Bon", Soul Coughing

I got the will to drive myself sleepless
I got the will to drive myself sleepless
Sleepless....

"Sleepless", Soul Coughing

Saskatoon is in the room
Pyongyang is in the room...
Is Chicago
Is not Chicago

"Is Chicago, Is Not Chicago" -- Soul Coughing

CONNECTED(00000003)

$ ifconfig
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 9000

    lladdr 00:15:17:ab:cd:ef
    media: Ethernet autoselect (1000baseT full-duplex)
    status: active
    inet6 fe80::215:17ff:feab:cdef%em0 prefixlen 64 scopeid 0x1

em1: flags=8d43<UP,BROADCAST,RUNNING,PROMISC,OACTIVE,SIMPLEX,MULTICAST> mtu 9000

    lladdr 00:15:17:ab:cd:ee
    groups: egress
    media: Ethernet autoselect (1000baseT full-duplex)
    status: active
    inet 10.0.0.1 netmask 0xffffff80 broadcast 10.0.0.1
    inet6 fe80::215:17ff:feab:cdee%em1 prefixlen 64 scopeid 0x2

$ sudo ifconfig em1 down
$ sudo ifconfig em1 up

$ df -hl
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd0a      509M   42.4M    442M     9%    /
/dev/sd0g      106G   11.4G   89.1G    11%    /home
/dev/sd0d      3.9G    6.0K    3.7G     0%    /tmp
/dev/sd0f     15.7G    2.4G   12.5G    16%    /usr
/dev/sd0e     15.7G   13.6G    1.4G    91%    /var

Oct 28 02:46:15 bacula-fd: backup-fd JobId 3761: Fatal error: backup.c:892 Network send error to SD. ERR=Broken pipe
Oct 28 02:46:15 bacula-fd: backup-fd JobId 3761: Error: bsock.c:306 Write error sending 36841 bytes to Storage daemon:backup.example.com:9103: ERR=Broken pipe

$agent->click_button(value => 'Okay to submit');

Can't call method "header" on an undefined value at /home/admin/hugh/perl/lib/perl5/WWW/Mechanize.pm line 2003.

    my $request;
    .
    .
    .
    elsif ( $args{value} ) {
        my $i = 1;
        while ( my $input = $form->find_input(undef, 'submit', $i) ) {
            if ( $args{value} && ($args{value} eq $input->value) ) {
                $request = $input->click( $form, $args{x}, $args{y} );
                last;
            }
            $i++;
        } # while
    } # $args{value}

    return $self->request( $request );

sub request {
    my $self = shift;
    my $request = shift;

    $request = $self->_modify_request( $request );

    if ( $request->method eq "GET" || $request->method eq "POST" ) {
        $self->_push_page_stack();
    }

    $self->_update_page($request, $self->_make_request( $request, @_ ));
}

Come on, come out of the rain.

You're not oppressed, you're just too learned...

"Streets of Fire", The New Pornographers

I stole a page from your book, and a line from your page

And flew into a lesbian rage...

"Chump Change", The New Pornographers

Introducing for the first time, Pharaoh on the microphone!

Sing: All hail what will be revealed today

From the fear of the great unknown, from the line to the throne.

"The Laws Have Changed", The New Pornographers

Sound of tires, sound of God...

"Electric Version", The New Pornographers

You looked as though I'd picked your name out of a hat

Next thing I know, you're fast asleep in someone's lap...

"The Bleeding Heart Show", The New Pornographers

To wild homes we go,

To wild homes we return,

To wild homes we go.

"To Wild Homes", The New Pornographers

Cities and circles drawn perfect, complete

These are the fables on my street, my street, my street

"My Street", The New Pornographers

What the last ten minutes have taught me:

Bet the hand that your money's on

"Letter From An Occupant", The New Pornographers

Two sips from the cup of human kindness, and I'm shit-faced

Just laid to waste

If there's a choice between chance and flight,
Choose it tonight.

"Choose It", The New Pornographers

As we sift through the bones of an idol

We dig for the bones of an idol

When the will is gone

'Cause something keeps turning us on

"Bones of an Idol", The New Pornographers

Jackie, you yourself said it best when you said

 There's been a break in the continuum 

The United States used to be lots of fun... 

"Jackie", The New Pornographers 

