Spider-spam, spider-spam

Just for fun, a couple of days ago I added a link on my website's index page pointing to a hidden page. On that page was a mailto: link with a throwaway address at my domain. I wanted to see how quickly the page would get picked up, and how quickly I would start getting spam at that address.

Well, the first bit has happened. I created the page at 6.41am local time on November 19; at 2.07pm that same day, it was spidered, then again at 2.40am this morning (Nov. 21).
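For the curious, here is a quick sketch of the arithmetic behind those timestamps. The times are the ones quoted above; the year is an arbitrary placeholder (the delays don't depend on it), and the variable and function names are just mine for illustration.

```python
from datetime import datetime

# Placeholder year: only the month, day, and time of day are known.
YEAR = 2000

created = datetime(YEAR, 11, 19, 6, 41)      # page created, 6.41am Nov 19
visits = [
    datetime(YEAR, 11, 19, 14, 7),           # first spidering, 2.07pm Nov 19
    datetime(YEAR, 11, 21, 2, 40),           # second spidering, 2.40am Nov 21
]

def elapsed_hours_minutes(start, end):
    """Return the gap between two datetimes as (hours, minutes)."""
    total_minutes = int((end - start).total_seconds() // 60)
    return divmod(total_minutes, 60)

delays = [elapsed_hours_minutes(created, visit) for visit in visits]

for visit, (hours, minutes) in zip(visits, delays):
    print(f"{visit:%b %d %H:%M}: {hours}h {minutes:02d}m after creation")
```

So the first spider showed up a little under seven and a half hours after the page went live, and the second just short of 44 hours after.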

The first spidering appears to have been done by [Thunderstone|http://www.thunderstone.com/], so I don't think there's too much to worry about there. I'll have to set up a robots.txt file to keep the nice spiders out. The second, however, is from a NY ISP, so I'm guessing something will come of that.
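As a rough sketch of what that robots.txt might look like, the snippet below holds a two-line rule set and checks it with Python's standard `urllib.robotparser`. The `/hidden/` path, the `example.com` host, and the `SomeBot` user-agent are all made up for illustration; they are not the real names from my site. Only well-behaved spiders honor robots.txt, so this would do nothing against a harvester.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ask every crawler to stay out of /hidden/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /hidden/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())   # parse from memory, no network access

# Hypothetical URLs and user-agent, used only to exercise the rules.
hidden_allowed = parser.can_fetch("SomeBot", "http://example.com/hidden/page.html")
index_allowed = parser.can_fetch("SomeBot", "http://example.com/index.html")

print("hidden page allowed:", hidden_allowed)
print("index page allowed:", index_allowed)
```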

It would be interesting to figure out the average time-to-live of a published email address: how long it can sit on a webpage before it gets spammed (and will therefore be spammed unto the end of time, yea, and beyond). This would be like Lance Spitzner's research into the TTL of an unpatched Win98 system on the Internet (Dammit -- all I could find was [this link|http://amsterdam.nettime.org/Lists-Archives/nettime-l-0106/msg00126.html], but I know I've seen the original paper somewhere...), or the idea of mailpings mentioned in this excellent book (track email delivery time to a given address to monitor performance/health).
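If I ever collect enough throwaway addresses to make this worthwhile, the calculation itself would be simple. Here is a minimal sketch, assuming each address is recorded as a (published, first spam) pair of timestamps, with `None` for an address that hasn't been spammed yet. The function name and the sample values are invented purely to exercise the code; they are not measurements.

```python
from datetime import datetime, timedelta
from statistics import mean

def average_ttl(observations):
    """Average delay between publishing an address and its first spam.

    observations: list of (published, first_spam) datetime pairs, where
    first_spam is None for an address that has not been spammed yet.
    Returns a timedelta, or None if no address has been spammed.
    """
    delays = [
        (first_spam - published).total_seconds()
        for published, first_spam in observations
        if first_spam is not None
    ]
    if not delays:
        return None
    return timedelta(seconds=mean(delays))

# Made-up placeholder values, not real data.
sample = [
    (datetime(2000, 1, 1), datetime(2000, 1, 3)),   # spammed after 2 days
    (datetime(2000, 1, 1), datetime(2000, 1, 5)),   # spammed after 4 days
    (datetime(2000, 1, 1), None),                   # not spammed yet
]

print(average_ttl(sample))
```

One caveat: leaving out the addresses that haven't been spammed yet biases the average low, so a real study would want to wait a good while, or report how many addresses were still clean.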
