Aw hell, it's more LISA coverage
11 Nov 2009

(Turns out you need at least three good, verbose albums to come up with that many quotable lyrics.)
Thursday morning (November 5, 2009):
While waiting for the room to fill up for the Planck telescope talk, I had a ponies moment and realized that Tobi Oetiker has the coolest Beatles haircut ever. That is all.
The Planck (pronounced almost like "plonk") telescope is going to give the highest resolution maps of the cosmic microwave background, and it's going to be dealing with a metric fuckton (my words) of data -- on the order of 10^12 observations, or 10^8 sky pixels, or 10^4 power spectra (which is where the really interesting data is). To do this, you need a metric fuckton of computing power, and that's NERSC...which, the presenter said, has gone from being a data producer to a data sink, as more stuff comes in to be processed. (Even that has changed; scaling limits and other constraints have changed the math that they use to analyze the data.)
To handle all this data, they use a variety of techniques and hardware:
They've got 60PB of storage in 10 Sun Ultrium 4 tape libraries (but as he said later, that's a made-up number based on maximum capacity; in order to minimize retrieval times, they use a mix of Ultrium 3 and Ultrium 4)
A 130 TB disk cache (!)
About 400TB of storage in GPFS
"One of the tricks to doing large data is: don't use I/O." Fast I/O is great, but avoiding it entirely is better. One byte/s of I/O is about 1000x the cost of one FLOP/s. It's easier to calculate it and keep it in memory than to look it up again.
Use common data models across the community of users, to avoid duplicating and re-munging data; it's as much a social challenge as a technical one, but addressing it early pays off.
And remember: data from observations and experiments tends to increase in value over time (due to new analysis techniques), while data from simulations decreases in value over time (as computing capacity increases).
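The "don't use I/O" advice above boils down to: if a derived value will be needed more than once, keep it in memory instead of writing it to disk and reading it back. A minimal sketch of that idea in Python, using memoization (the function name and the toy "reduction" are my own illustration, not anything from the talk):

```python
import functools

@functools.lru_cache(maxsize=None)
def power_spectrum(pixel_block):
    # Stand-in for an expensive reduction over a block of sky pixels.
    # A real pipeline would run an FFT or spectral estimator here;
    # this toy version just sums squares.
    return sum(x * x for x in pixel_block)

block = tuple(range(1000))
first = power_spectrum(block)   # computed once
second = power_spectrum(block)  # served from the in-memory cache
```

The second call never touches the "data" again: the cached result is returned straight from memory, which is exactly the trade (memory and recomputation over I/O) the speaker was advocating.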
One question from the audience: Do you use GPU computing? A: No; the lack of ECC is the biggest reason. PCI speed is also a factor, but we already deal with different speeds in different subsystems.
After that came the presentation for Anton, which is a specially-built supercomputer for molecular dynamics simulations. It was an interesting talk, and I'll be pointing one of the faculty members I work with at the slides and paper when they're available. Top quote: "Our user community is faster than our monitoring system."