Posts tagged “spam”

October 05, 2016 CAN I ROCK BUSINESS WITH YOU?
Today's title from the subject line of some spam I just got. ("a spam"? "a spammy email"? just "spam"?)
- Mystery flu-like illness continues, or at least its fallout; I've had lower back pain for the last ~ 4 weeks. Doctor says removing spine is "not an option" but I've done some Googling and
- $WORK continues apace. After taking a week of Python training, we're using Go for a new tool we're building. Haven't got a good sense for what it's like just yet, but so far I don't seem to be making a mess of things.
- Tried out drone.io at $WORK yesterday and holy god, is it good. Auth with our internal Github, then activate repos, and boom! it runs tests on every new commit on any branch, watches for PRs, the whole nine yards. When I think of the amount of work we had to do to get Jenkins to do this, it's insane. Plus the whole run-as-a-Docker-container, fire-up-sibling-docker-containers-for-tests thing is very, very impressive.
- Sportsball has started up again with a vengeance: practices on Monday and Wednesday, games on Fridays and Saturdays. Somebody stop this merry-go-round!
- I've registered for LISA 16, woot! This will be my fifth -- wait, sixth? -- LISA, ten years after my first time attending. Not sure who's gonna be the theme band this year -- I've done New Pornographers, Josh Rouse, Soul Coughing and Sloan. And since he's co-chair this year, it seems like a good time to pull out that picture of Matt Simmons (@standaloneSA) as a PHP dev:

March 07, 2012 Oh, so now Libya

I received this bit of spam a few days ago:

From: AUW-RSVP <melud.halasa@example.com>
To: undisclosed-recipients: ;
Subject: hi
Organization: Gen. Melud

My name is Gen. Melud Massoud Halasa,I was a Libyan army General in the
military force of Gardaffi in Libya,i have $23 Million Dollars hidden in
Libya,i need your assistance to move this money out of Libya to your
country,i have resigned from the army and i want to go into business in your
country as your partner.If interested REPLY ONLY VIA MY PERSONAL EMAIL
melud.halasa@example.org for more details.

I forwarded it off to a friend of mine and asked him if he knew anything about it. His reply:

Well, Melud's been trying to move that cash for about six months now but no
one will help him. The thing is, it's actually a physical pallet of $100
bills that we had used to buy his loyalty back during the uprising. That's
why he's starting to reach out farther and farther afield, trying to find
someone who will come to Libya and help him carry the danged pallet to the
local Western Union branch. I'd say it's not worth it ... that Western Union
is at least a mile from where he has the pallet stashed in his mom's house.
I value my lower back more than that.

And, of course, by sending this email back and forth we've guaranteed that
it's being read by some automated spy system that is trying to determine if
this counts as terrorist "chatter." In the interests of being friendly and
polite, I'd like to say hello to that automated system, and perhaps to any
actual human analyst who stumbles across it.

I'd just like to say: that is the funniest thing I hope to read all week. Maybe all month. And definitely on my top ten for the year.

September 07, 2009 Wordpress worm
Just spent the better part of five hours cleaning up four old, out-of-date Wordpress installations after they got infected with this worm. I host nine sites on my home server for friends and family; I'm cutting that down to three (just family), and maybe looking at mu-wordpress, as of Real Soon Now.

Happy Labour Day, everyone!

Update: I meant to add in here a few things I looked for, because this info was hard to track down.
- I found extra admin-level users in the wp_users table; some had their email address set to "www@www.com", some had random made-up or possibly real addresses, and some had the same email address as already-existing users.
- On one blog (possibly infected much earlier) I found 42,000 (!!) approved, spammy comments.
- I searched for infected posts using a query from here:
```
SELECT * FROM wp_posts WHERE post_content LIKE '%iframe%'
UNION
SELECT * FROM wp_posts WHERE post_content LIKE '%noscript%'
UNION
SELECT * FROM wp_posts WHERE post_content LIKE '%display:%'
```
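If you'd rather triage the results outside of MySQL, the same check can be sketched in Python. This is a rough sketch, not part of the original cleanup: the function name and the sample rows are made up for illustration, and the marker list is just the three strings from the query above, not a complete signature set.

```python
# Markers copied from the SQL query above; a starting point only.
SUSPICIOUS_MARKERS = ("iframe", "noscript", "display:")


def find_suspicious_posts(posts):
    """Return (post_id, matched_markers) for every post containing a marker.

    `posts` is any iterable of (post_id, post_content) pairs, for example
    rows fetched with "SELECT ID, post_content FROM wp_posts".
    Matching is case-insensitive (an assumption about what you want).
    """
    hits = []
    for post_id, content in posts:
        lowered = content.lower()
        matched = [marker for marker in SUSPICIOUS_MARKERS if marker in lowered]
        if matched:
            hits.append((post_id, matched))
    return hits


# Hypothetical rows, invented for the example.
sample_rows = [
    (1, "An ordinary post about Labour Day."),
    (2, '<IFRAME src="http://bad.example/"></IFRAME>'),
]
for post_id, markers in find_suspicious_posts(sample_rows):
    print(post_id, ", ".join(markers))
```

Unlike the UNION query, this reports which marker tripped for each post, which helps when deciding whether a hit is an infection or just a post that legitimately mentions CSS.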
August 07, 2008 A note on comments
About a year ago, I started using a cobbled-together system of Bash and Perl scripts and Makefiles to put together this blog. One of the reasons was my general dislike for PHP; another was my desire to try living (at least in some small way) by Saint Aardvark's Axiom of Information Utility, and try keeping this in plain text. (Another was a desire to use Emacs to write these damn things; I want the control that's thrown out when you start using a GUI to edit.)

But one of the problems that faced me was how to deal with comments, and comment spam. Having a web form that allowed comments made commenting easy, but the downside was that it made spamming easy too. WP and others keep this down to a dull roar, but it's not perfect and I've had problems with false positives — people being unable to post comments because their IP address was on some blacklist, and the plugin had made no provision for whitelisting.

I decided to lash together something that would use email. For me — a very small, low-traffic website, with a blog devoted to a rather obscure set of concerns and a tech-savvy audience (Hi Dad!) — this seemed like a good choice. Email spam, for me, has been pretty much solved by greylisting and SpamAssassin. (There's the problem of a ten — no, fourteen — year-old email address that I've been meaning to get changed for a while now, but that's another story; they don't seem to do greylisting, and SpamAssassin does catch most of it.) So taking comments by email seemed, you know, righteous, dude.

The system for comments is pretty simple: every post gets an epoch timestamp embedded in it. (I think if you look in the HTML source, you can see it.) I use it for sorting the order of the posts, and I use it to generate email addresses for post-specific comments. The format is simple: comments+(seconds since the epoch)@saintaardvarkthecarpeted.com. The address is included in the post, though I haven't done much to make it obvious. (This blog, and I think this whole website, would make baby Jakob Nielsen cry.)
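The address scheme is small enough to sketch in a few lines of Python. The real system is Bash, Perl and Makefiles, so this is only an illustration of the format described above; the function names are hypothetical.

```python
import re

COMMENT_DOMAIN = "saintaardvarkthecarpeted.com"  # domain from the post

# Matches comments+<digits>@<domain> and captures the digits.
_ADDRESS_RE = re.compile(r"^comments\+(\d+)@" + re.escape(COMMENT_DOMAIN) + r"$")


def comment_address(epoch_seconds):
    """Build the per-post comment address from the post's epoch timestamp."""
    return f"comments+{int(epoch_seconds)}@{COMMENT_DOMAIN}"


def post_timestamp(address):
    """Return the post's epoch timestamp, or None if the address doesn't fit."""
    match = _ADDRESS_RE.match(address.strip().lower())
    return int(match.group(1)) if match else None


print(comment_address(1181577610))
print(post_timestamp("comments+1181577610@saintaardvarkthecarpeted.com"))
```

Because the timestamp doubles as the post's sort key, going from an incoming comment back to its post is a single lookup; an address that has lost its `comments+` prefix simply returns `None`.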

My thinking was that, even though I was publishing the addresses, it wouldn't matter: as I mentioned, spam for me has been mainly solved (insert disclaimers here). Between greylisting and SpamAssassin, I figured I pretty much wouldn't see any spam at all.

Turns out there's another benefit: the addresses have been picked up by spam bot crawlers, but they're screwing up the scraping. From 24 days of mail logs, I see a crapload of attempts to deliver to the wrong address:
```
$ perl -ne'/NOQUEUE/ && s{.*to=<(\S+?)>.*}{$1} && print "$_\n";' mail.log* | sort | uniq -c | sort -n
[much snippage]
 36 1181577610@saintaardvarkthecarpeted.com
 36 1182947701@saintaardvarkthecarpeted.com
 37 1181326150@saintaardvarkthecarpeted.com
 37 1183667208@saintaardvarkthecarpeted.com
 38 1182949918@saintaardvarkthecarpeted.com
 40 1183349604@saintaardvarkthecarpeted.com
```
There were more than 2500 of these messages turned away by greylisting. They've all stripped off everything up to the plus, not realizing (as I didn't until a few years ago) that a plus in an email address is valid.
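For anyone allergic to Perl one-liners, the same tally can be sketched in Python. The log lines below are invented and only loosely follow Postfix's format, so treat the exact layout as an assumption; the only things the code relies on are the `NOQUEUE` marker and the `to=<...>` field, same as the one-liner.

```python
import re
from collections import Counter

TO_RE = re.compile(r"to=<(\S+?)>")


def tally_rejected_recipients(lines):
    """Count recipient addresses on log lines that mention NOQUEUE."""
    counts = Counter()
    for line in lines:
        if "NOQUEUE" not in line:
            continue
        match = TO_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def missing_plus_prefix(counts):
    """Keep only recipients whose local part is bare digits (the mangled ones)."""
    return {addr: n for addr, n in counts.items() if re.fullmatch(r"\d+@\S+", addr)}


# Made-up sample lines for illustration.
sample = [
    "postfix/smtpd[1]: NOQUEUE: reject: RCPT from unknown[192.0.2.1]: 450 "
    "from=<a@example.net> to=<1181577610@saintaardvarkthecarpeted.com> proto=ESMTP",
    "postfix/qmgr[2]: ABC123: from=<a@example.net>, size=100",
]
for addr, n in sorted(missing_plus_prefix(tally_rejected_recipients(sample)).items()):
    print(n, addr)
```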

In fact, the only attempts to deliver to legitimate comment addresses were two actual comments to my blog…which brings up a shortcoming: I never got that many comments with WordPress, but I sure got more than I do now. It's possible my writing has just gone 'way downhill, but I think it's more likely that this system just puts people off, or they're just unable to find it with my current (crappy) design.

(One interesting problem: my wife tried to comment once, using Lotus Notes at her workplace. It converted the plus sign into an underscore. Weird.)

I still regard this setup for comments as an experiment. Its results are definitely mixed; no spam, but fewer comments as well. Given the tiresome mess that comes with the lack of an HTTP equivalent of greylisting, I'm inclined to keep doing it.

Anyhow...that's my interesting research result for the day. You may now talk amongst yourselves.
February 26, 2008 Deep thoughts
I've been listening to the presentations from LISA07, and I have a few observations.

Trey Darley's presentation reminded me a lot of my last job, but much more intense: fast growth, no control, and no budget. The difference is that he had the experience and the chops to deal with it well. Also, if he can present at LISA, so can I.

Andrew Hume's presentation, "No Terabyte Left Behind", was interesting, by which I mean frightening. People mostly just trust that hardware does what it says it does/will do when it comes to storage. But that doesn't always work: he tells the story of a prof he worked with who checksummed all his files once a week. When a checksum changed — and it did about every 6 months — he'd retrieve it from backup. His rough guess for undetectable errors: 1 per 10 terabyte-years. And we're getting to the point where that's going to be significant very soon.
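The weekly-checksum habit is easy to approximate. Here's a minimal Python sketch, assuming SHA-256 and an in-memory manifest; the talk as summarized above doesn't say which checksum or tooling the prof used, so every name here is hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """Files present in both manifests whose checksum no longer matches."""
    return sorted(
        name
        for name, digest in new_manifest.items()
        if name in old_manifest and old_manifest[name] != digest
    )
```

Run `build_manifest` once a week, save the result, and diff against last week's with `changed_files`. Note that a checksum can't tell silent corruption from a deliberate edit, so this only makes sense for files that aren't supposed to change.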

Tony Cass' presentation on grid computing for CERN was fascinating. This is the place I wanted to work (though as a particle physicist). UBC/TRIUMF is doing some work for this project as well, which makes me think I should jump over.

David Josephson's presentation was interesting, as much for the Q&A afterward as for his point. Which was? Glad you asked: that focussing on IP-based spam filtering (RBLs, greylisting) provides an incentive to spammers to hijack network prefixes via BGP attacks, and generally do nasty things to the Internet; please switch to content-based filtering post-haste. (To clarify, he was talking in particular about fast naive Bayesian classifiers, not SpamAssassin.) Since IP-based filtering treats IPs as valuable things — tokens that demonstrate your email is worth accepting — spammers steal IP addresses.

I'm not sure how much I buy his argument; he kept promising that the BGP attacks he described were only part of the problem, but he never seemed to get beyond that. But during the Q&A Brad Knowles got up and said (my summary) Content filtering doesn't scale, at least in his experience (as Senior Internet Mail Systems Administrator for AOL). At that point, another guy got up and said (again, my summary) that sort of thing is heard all the time, but with no data to back it up. The responder had co-authored a paper with Josephson that got Best Paper award at LISA '04, and they'd made damn sure to include a ton of footnotes. If their conclusions were wrong, people were free to challenge them; if Knowles' were wrong, they were unchallengeable because there was no data to back it up -- it was all just story that got passed along and became myth.

Knowles' response was "I don't have time to write papers; I'm a technician, not an academic." Which is true, in lots of ways. And I don't mean any insult to Knowles; he's done things I will probably never match, we are all flooded with work, and so on. I'm one guy, working at a small shop, with none of his experience, or chops, or rep, or audience.

But there's a reason my .signature says "Because the plural of Anecdote is Myth": it's to remind me that unless you can back something up with facts, preferably written down and logged and repeatable, all you've got is a bunch of stories that become more and more True the more you repeat them.

It's obnoxious to sneer and say, "Cite, please"; it's worse to be ignorant.

Lots more listening to do. If you haven't downloaded them yet, you really should.
November 02, 2007 Greylisting bug with Exchange
Earlier this week the boss forwarded some bounced emails to me and asked me to figure out what had gone wrong. The weird thing was that the email was being greylisted, so it shouldn't have bounced:
```
This is the Symantec Mail Security program at host
mail.globalsuite.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to <postmaster>

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                    The Symantec Mail Security program

<example@example.com>: host smtpbackup.example.com said: 451
<example@example.com>: Recipient address rejected: Please
try sending again. (in reply to RCPT TO command)
```
Turns out that Symantec Mail Security is meant to sit in front of an Exchange server, and it turns out that Exchange has a bug (or had; I'm unsure if it's been fixed) where it doesn't requeue email that's been greylisted, and later on bounces it back to the sender without ever having retried.
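The reason this counts as a bug: greylisting depends on the sender treating a 4xx reply (like the 451 above) as "try again later", while only a 5xx reply means "give up". A minimal Python sketch of that decision follows; the function name is made up, and a real MTA also has a retry schedule and a maximum queue lifetime that this ignores.

```python
def next_action(reply_code):
    """Map an SMTP reply code to what a sending mail server should do."""
    if 200 <= reply_code < 300:
        return "delivered"
    if 400 <= reply_code < 500:
        # Transient failure: keep the message queued and retry later.
        # This is the step a greylisting server is counting on.
        return "requeue"
    if 500 <= reply_code < 600:
        # Permanent failure: return the message to the sender.
        return "bounce"
    raise ValueError(f"unexpected SMTP reply code: {reply_code}")


print(next_action(451))
```

The behaviour described above amounts to answering "bounce" for a 451 without ever having gone through "requeue".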

From what I can tell, globalsuite.net is run by guest-tek.com, which provides high-speed access for hotels…so I'm probably not the only one being asked to explain this bug. :-)
August 14, 2007 The deluge opens
Somehow in the move of the websites and files from Linode back to Thornhill (home server on the other end of DSL; 1.5GHz Sempron and 1GB of RAM in a nice Shuttle box), I copied ~/.spamassassin to the wrong directory...and wow, did this ever make a difference to spam filtering. My mailbox was flooded with stuff coming in to an old (12 years!) address that I pretty much just use for WHOIS contacts these days.

I didn't realize what was going on at first, so I tried training it on my saved spam and ham. 90k messages later, it still didn't do it properly. I did some digging, then figured out what had happened and copied the files to the right place. Boom — the sweet, sweet sound of a nearly-empty inbox.

The user_prefs files were the same each time, so it was just the Bayes token files that were different. The only thing I can think of is that the working files were the result of training SA on its mistakes, rather than on its successes.

Of course, I should probably just get the address cancelled or changed…the last time I looked, well over 95% of the spam I've got came to that address. But still, I'm starting to think that I should be keeping the Bayes files under revision control...
April 17, 2007 Why bother
...trying to report a phishing email to a bank whose website:
- requires JavaScript for its index page, and provides no failover page, or
- does not include an easily found link to contact their security people, or
- gives a 404 error when clicking on the "Help" page?
Because it's The Right Thing(tm). But that doesn't make it fun.
April 16, 2007 In case you hadn't read the newsletter
From Bruce Schneier's newsletter comes this blog entry suggesting that there simply aren't that many serious spammers. Interesting data.

Managed to get the Perl/PHP parser extended so that it would see nested PHP arrays and translate them to the proper hash/array references in Perl. It was good to do that, but then other problems arise — like the fact that, as the parser stands right now, it simply stops parsing if it finds something it doesn't understand. This could be something like a comment in a nested array, or something like if ($debug == 1) { $foo = "bar"; } else { … }.

Again, I'm concluding that this would all be much, much easier if it was in a database…just have PHP and Perl suck out the data and do what they want. Either that, or just start writing everything in Perl…

Update: Also, this is not what I expect to see at the top of Planet Solaris — though maybe this should've prepared me. Rockwood's coworker's post is worth reading too.

Update2: Just for completeness, I'll mention that Ben's updates and comments are also worth reading. That's it from the Obvious Dep't.
March 21, 2007 It hurts, it hurts
The only thing I love more than a printer that does SMTP is a printer whose default address for an email alert is name@company.com, a legitimate (though spammy) domain. Did Donald E. Eastlake 3rd and Aliza R. Panitz die for our sins in vain? Hm?
November 02, 2005 Upgrades^3
Upgrading SpamAssassin at work; we're using 2.63, and they're up to, what, 3.1.0 now? The upgrade itself was relatively painless, but for complicated reasons it was integrated with Mimedefang, and I didn't like that. MDF is great, but:
1. It takes out the SA score header. This can be corrected, but
2. it turns the SA score into a number, rather than a series of asterisks, which makes it difficult to filter with a regex, or with Outlook. (I have SA set conservatively, but the header makes it easy to filter more aggressively if that's what you want.)
3. Finally, MDF puts the SA report into the message as an attachment. Admittedly, it's a plain-text attachment, but that doesn't console the Outlook users who are worried (and rightly so) about clicking on attachments.
Hm. Will have to figure out a way around that; maybe just run spamc/spamd like I currently do.
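To make the asterisks point concrete: with one star per point of score, "at least N" becomes a plain repetition count in a regex, no numeric comparison needed. A small Python sketch, assuming the stock `X-Spam-Level` header name and a made-up sample message:

```python
import re


def spam_level_at_least(headers, threshold):
    """True if the X-Spam-Level header carries at least `threshold` asterisks."""
    pattern = re.compile(
        r"^X-Spam-Level:\s*\*{%d,}" % threshold,
        re.MULTILINE | re.IGNORECASE,
    )
    return bool(pattern.search(headers))


# Invented headers for illustration.
sample_headers = (
    "Subject: hi\n"
    "X-Spam-Level: ******\n"
    "X-Spam-Status: Yes, score=6.3 required=5.0\n"
)
print(spam_level_at_least(sample_headers, 5))
print(spam_level_at_least(sample_headers, 8))
```

The same `\*{N,}` idea carries over to procmail recipes, and to mail clients that can only do "header contains" matching, since a literal run of N asterisks works there too.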

I've also got word that, due to some old prototype equipment no longer being needed, I will have three new boxes to play with. Woohoo! I'm already planning the DRBD fileserver.

Finally, I managed to get the new version of uClinux to compile and run on the NWR04B. Sweet...except that I didn't check out a particular tag, and I'm having to guess at the date when I did check out my tree, which makes it difficult to say exactly what I've got. Currently checking out with the date set to when I think I grabbed it, then may upgrade/downgrade to the latest tag (currently 2.4.31).
March 06, 2005 Pissed off
I am fucking pissed off.

Over the last few weeks, I've been noticing attempts to spam the wiki on my website. The spammers would create a new page similar to one already existing, and fill it full of links to Russian linkfarms (right term? who cares?). It was annoying, and I figured it would only get worse, but I didn't get too worried. I deleted the pages, blocked the IP address (it was all coming from one open proxy), and watched the changes page for further action.

Last night I checked the changes page again. It was late (well, sort of; it had been a long day) and I was making one last check before going to bed. Just to make sure that everything was okay, you know?

Every single fucking goddamned page had been vandalized. Every single page that I had put up had been replaced with spam, and there were a dozen new pages with even more spam. Over the course of maybe four hours, all my work had been removed. My only consolation is that Google had not visited the wiki since the changes had been made.

There were maybe a hundred pages to revert. And PHPWiki, the software I was using, sucks ass through straws when it comes to reverting changes. Check this out, ladies and germs:
- There is no easy, documented way to revert to a specific revision of a page using the web interface. The version I was using (1.3.4) forces you to go edit an old version, then save that version. The new version I tried upgrading to (1.3.10) allegedly has "action=revert", but I was unable to get this to work: it appeared to do nothing different from "action=edit". To be fair, this may be because the spammer seemed to edit most pages multiple times, perhaps to get around action=revert. But why couldn't I find any documentation on this? All I could find was this page and the words "See action=revert".
- There is no easy way to revert to a specific revision of a page using the database directly. Check it out: The database appears to store metadata in a column dedicated to compressed, cached markup. That's right: instead of breaking out metadata like revision, author IP and so on into a separate table, it's stored in the middle of a big gzipped, serialized PHP object. This means I can't do something like "delete from version where versiondata like '%10.0.0.1%'"; going to the page I've done this on hits an assert in the code that appears to check that the revision listed in the cache column is available in the pagedata table. Whee! Let's get all our programming ideas from MS Office!
As a result, I'm pulling a backup of the database from Friday in order to get the old pages back. I'm going to dump the pages to HTML, figure out how to script whatever changes I want to make, then leave PHPWiki forever the fuck behind me. Shame, really, 'cos I do like the ease of use of Wikis. But I do not have time for this fucking nonsense. Shame on me for not remembering these words:

Someone challenged me, Well, how am I supposed to continue hosting these low-barrier discussions? I'm sorry, but I don't know. To quote Bruce Schneier, "I feel rather like the physicist who just explained relativity to a group of would-be interstellar travelers, only to be asked, 'How do you expect us to get to the stars, then?' I'm sorry, but I don't know that, either."

Those of you looking for info on the NWR04B, please continue to leave comments on my blog. I'll get the documentation from the wiki back as soon as I can.
December 22, 2004 Two good deeds
Well, I did the right thing today -- twice. Damn right I'm bragging.

First off, it turns out that the FreeBSD Foundation has run into a (good!) problem: its donations have been too big. In order to keep its US charitable status, it needs to have two-thirds of its donations be relatively small. Due to a couple of big donations, this ratio is a little out of whack at the moment, and they need a bunch of small donations.

Welp, I've been administering FreeBSD systems for a living for...well, I was gonna say four years, but it's more like two and a half or three. I've been working on them for four, though; my rent and food have been paid in large part because of the generosity of the people who put together FreeBSD. A donation went off in short order.

Then I remembered that I've been meaning to join the Free Software Foundation for a while now. The motivation is the same: I've been paying my bills for a long time now (and enjoying myself immensely in the process) because of the generosity of Free-as-in-Freedom software people: Stallman, Torvalds, Wall, and a zillion others. I have a hard time imagining what I'd be doing now without Free software; I suspect that, if I was lucky, I'd be working as a grocery store manager right now. So: off to the FSF website to sign up for an associate membership.

And what did I find but two, count 'em TWO cool things:
1. If you refer three people to the FSF for associate memberships, RMS or Eben Moglen will record a message for you, suitable for voicemail, Hallowe'en or impressing the ladies. I did a quick search on Google, but couldn't find anyone with the link...damn shame. Better than a free iPod, cooler than a CmdrTaco TiVo -- join the FSF and get RMS to say "All Hail Liddy!"
2. The FSF is looking for a senior sysadmin. God, that'd be cool. Decent enough pay (no, it's not the sort of job you take because of the money, but it's nice to think about), all the Free software you can handle, and an IBM Thinkpad to run it on. Of course, I think I'd have some 'plainin' to do about the laptop I'm writing this on...and, of course, it would mean living in the US. Frankly, that scares the crap out of me these days. Goddamned PATRIOT Act...
In other news, work continues apace. We're losing two coop students and gaining one, gaining another full-time person, and I'm still trying to get my RAID array -- credit app is with the boss, and after that's done the order'll finally go in.

Rough guess (wild hope) at this point is that it'll be in my hands in mid-January, which won't be a moment too soon. There's a new Linux server I'm setting up that I'm desperately hoping won't have problems due to proprietary kernel modules in the software I'm installing. (I'm just writing myself further and further out of that job, aren't I?)

And I'm wondering if the simplest way to get Nagios to make sure the right machines are exporting the right filesystems is to check if amd is mounting them correctly. (No matter whether the machine or amd fails, something needs to be fixed.) Or maybe I just need to figure out the right wrapper for showmount -e.
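A wrapper for showmount -e could look something like the Python sketch below. It assumes the usual output layout (one header line, then one export path per line followed by its client list) and the standard Nagios plugin exit codes; the function names and any hosts or paths you feed it are hypothetical, and export paths containing spaces aren't handled.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def parse_exports(showmount_output):
    """Extract export paths from `showmount -e` output, skipping the header line."""
    exports = set()
    for line in showmount_output.splitlines()[1:]:
        fields = line.split()
        if fields:
            exports.add(fields[0])
    return exports


def check_exports(showmount_output, expected):
    """Return (exit_code, message) for the expected list of export paths."""
    missing = sorted(set(expected) - parse_exports(showmount_output))
    if missing:
        return CRITICAL, "CRITICAL: not exported: " + ", ".join(missing)
    return OK, "OK: all %d expected filesystems exported" % len(set(expected))


def main(host, expected):
    """Run showmount against `host`, print a status line, return the exit code."""
    try:
        result = subprocess.run(
            ["showmount", "-e", host],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (OSError, subprocess.SubprocessError) as exc:
        print("UNKNOWN: showmount failed: %s" % exc)
        return UNKNOWN
    status, message = check_exports(result.stdout, expected)
    print(message)
    return status
```

To use it as a plugin, finish the script with `sys.exit(main(host, expected_paths))`, taking both from the command line. Keeping the parsing apart from the `subprocess` call means the logic can be tested against canned output without an NFS server in sight.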

On the spam front: good god, what a smoking hole Movable Type is turning out to be. First there were the license changes, then the comment spammers (who seem to be posting a lot more aggressively to MT than to WordPress)...Of course, comment spam affects all blogs, not just MT. Still, this whole idea of rebuilding static pages every time the stars move seems to be causing them a lot of trouble. (Yep, that last sentence was pure FUD. Or bullshit.) And okay, no, I don't use MT, so what precisely is my beef?

As I'm not going to put up, I should shut up. I still have to upgrade WP -- though according to this posting, there are still lots of XSS issues left unfixed. I'm also upgrading PHP, and I should probably use ApacheToolbox to do that automagically, rather than periodically editing my own Makefile.

The release party for Where Are They Coming From? came off JUST FINE, thank you. EVERYONE was there. Top Stars include Topo, Phil Knight and Mos Def, fresh from the set of HHGTTG. Uh huh.

Further thoughts on the MySQL + GPhoto2 thing: gphoto2 does have the ability to pipe to STDOUT, which I don't think I knew...maybe it won't be as much work to insert directly into a database as I thought. Might even be able to do it as a Perl script.

Finally: what a gorgeous day. It's downtown Vancouver on the back steps of the Art Gallery, it's sunny (in December, too) and just cold enough to make you go "brr". The skater kids are practicing their synchronised jumping -- just in time for the Olympics, I'm sure. A far-too-generous co-worker has handed out chocolate, another has handed out home-made rum and brandy balls, and I'm taking off early to go drinking with a third. Feeling pretty damned good right now.

Update: Too bad Topo's not so great -- fever of 102.8F, as of a couple minutes ago. (Still haven't figured out what that is in Celsius; bad Canuckistanian!) It's down a bit from earlier this afternoon, though, so I'm thinking good things. And these pages say to not worry if it's less than a couple days, so I'm not worrying. Nope.
November 30, 2004 Comment Spam v. SURBL
A quick Google turns up this entry on using SURBL to fight comment spam. More information here. A quick look at the WP-Blacklist plugin shows it shouldn't be that hard to add a quick DNS check...Hm. And the SURBL mailing list has discussed this too:

>The quick and easy answer, which may be wrong, is that they're
>different folks, or at least different domains.
>
>Jeff C.
>

Oh please don't think that just yet!! Seriously. I'm working with some ninjas and the 6dos data and a new tool to let you look up this info! So far it ROCKS beyond belief! But more coming, and trying to keep data source anonymous of course. Also trying to tie in some other tools that other SURBL submitters have been asking for. Bottom line is that these guys ARE the same people. Data shows it.

Hm. Update, Nov. 30: Double hm
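For the record, the "quick DNS check" idea boils down to something like this Python sketch: pull the hostnames out of a comment's URLs, append the SURBL zone, and see whether the name resolves to a 127.x address. This is not the WP-Blacklist plugin's code; the function names are made up, and it cuts a corner by querying the full hostname rather than reducing it to the registered domain first, which a real implementation would need to do.

```python
import re
import socket
from urllib.parse import urlparse

SURBL_ZONE = "multi.surbl.org"  # SURBL's combined list
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def domains_in_comment(text):
    """Collect the lowercased hostnames of every http(s) URL in the text."""
    domains = set()
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname
        if host:
            domains.add(host)
    return domains


def surbl_query_name(domain):
    """Name to look up: the domain with the SURBL zone appended."""
    return domain.rstrip(".") + "." + SURBL_ZONE


def is_listed(domain, resolver=socket.gethostbyname):
    """True if the lookup answers with a 127.x address.

    Any lookup failure (including network trouble) counts as "not listed",
    which is a deliberate fail-open simplification.
    """
    try:
        answer = resolver(surbl_query_name(domain))
    except OSError:
        return False
    return answer.startswith("127.")
```

The `resolver` parameter exists so the logic can be exercised with a fake lookup function instead of hitting DNS for every test.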
November 29, 2004 WordPress Upgrades Part Two: Comment Spammers
As I mentioned, it's been a busy weekend for Gecko and me. With anything good and joyous on the Internet come spammers. Comment spam has been a minor irritant for a while -- nothing I couldn't handle by logging into MySQL directly and running DELETE statements with extreme prejudice -- but in the last few weeks it's gone off the hook. With dozens a day, it was time to start doing something automatically.

WordPress is pretty good this way -- you can set up your comments so that everything needs to be approved by the admin, or just stuff that matches certain words in the comment or URL fields. That worked for a while -- "poker", "debt" and "cialis" took care of most things. But it isn't a very sophisticated filter, so I started looking around for something else.

I found Fahim Farook's WPBlacklist plugin, and it works pretty damned well. It imports a copy of Jay Allen's blacklist, then holds for approval anything that matches the HOLY CRAP two thousand three hundred forty five lines of regexes (a few) and domains (the bulk of the list). Plus, you can tell it to delete a comment and harvest information from it -- so it knows to watch out for that (domain, email address) in the future. All in all, I was pretty happy.
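The hold-for-approval logic, both the keyword version and the blacklist version, comes down to roughly the following. It's a Python sketch of the idea only, not WordPress's or WPBlacklist's actual code; the function name is made up and the regex is an invented example, not an entry from Jay Allen's list.

```python
import re


def needs_moderation(comment_text, author_url, keywords, patterns):
    """True if the comment or its URL matches any keyword or blacklist regex.

    Keywords are matched as case-insensitive substrings; patterns are
    precompiled regular expressions.
    """
    haystack = (comment_text + "\n" + author_url).lower()
    if any(keyword.lower() in haystack for keyword in keywords):
        return True
    return any(pattern.search(haystack) for pattern in patterns)


keywords = ["poker", "debt", "cialis"]  # the three words mentioned above
patterns = [re.compile(r"casino-\d+\.example", re.IGNORECASE)]  # invented entry

print(needs_moderation("Play online POKER now", "", keywords, patterns))
print(needs_moderation("Nice post", "http://casino-42.example/", keywords, patterns))
```

Plain substring matching is also where the false positives come from: "cialis" is hiding inside "specialist", so a perfectly innocent comment from one gets held too.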

But then Gecko pointed out this elegant solution. My first name is not so obvious ("Saint? What kinda first name is that? Damn kids..."), so I put in my own simple question.

It's a brilliant idea, really: come up with a question whose answer is obvious if you a) are at the site and b) are not a spammer's computer. Which makes me wonder what'll happen when/if AI gets a bit more common, or if spammers will start funding natural language parsing research...shudder.

In other comment spammer news, there's a really good article here about what one guy managed to find out about a comment spammer. Finally, turns out that what I was going to say was said a year ago:

...but just like everything else, the weblogging community seems intent on (a) thinking they're special and unique and nobody has ever had their problems before, and proceeding to (b) ignore all the work that has come before and reinventing the wheel. Now, certainly some adaptation of code and algorithms will be necessary. Existing tools probably can't be used as-is. Email spam fighting relies a lot on the structure of an email, the chain of headers that give away so much information to the trained eye, and none of that information is available in weblog spam. But I see from Jay's Comment Spam Clearinghouse that the latest and greatest tool available to us is a master list of domain names and a few regular expressions. No offense to Jay or all the people who have contributed to the list so far, but how quaint! I mean really. Savor this moment, folks. You can tell your children stories of how, back in the early days of weblogging, you could print out the entire spam blacklist on a single sheet of paper. Maybe with two or three columns and a smallish font, but still. Boy, those were the days.

Holy crap. I thought I was cynical. The entire article is highly recommended.
November 09, 2004 Fetch me m'shotgun!
The sumbitches are at it agin', mother. Comment spam is infecting both my blog and my wife's. So far a relatively small number of keywords -- poker, Texas, debt -- is sufficient to keep 'em away from where Google can see 'em. Well, that and OCD-like running of SELECT statements in MySQL. But the fuckers are gonna be the death of me, or at least blog comments. Although maybe some sort of SURBL plugin for URLs in the post...that'd be cool. Someone must have something like that already.

Not that I notice a whole lot of comments, anyhow, at least away from the Slashdot side of things...although I do notice that I've made it onto somebody's blogroll. How'd that happen?

In other news: I finally decided what to do about new computers: buy a new Shuttle Sk43G, Sempron processor, and make that my web server; then, make my current webserver (older Compaq P3-500 desktop machine) my desktop and firewall: lots of room for ethernet cards, tape drives and whatnot.

I agree, it's a little silly that the more powerful box becomes the horribly underutilized server, but such is life. If there was a comparably cheap shuttle that came with two onboard ethernet interfaces, I'd be buying that instead.

So dive right in, right? I got the new box home last night, assembled it and booted w/o problems. It took little effort to move the hard drive from the web server and put it in the new, tiny box; sure, I had to recompile the kernel (8 minutes! eat that, P90!) to get the right drivers in, but nothing big. Until, that is, it froze. Hard. And only a few minutes after booting. If I ran top and set it to update continuously, I could get it to freeze within seconds.

Some fiddling with Grub (boot loader of the GODS, man) showed that the problem seemed to go away if I went with the original Slackware stock 2.4.20 kernel instead of the 2.6.7 kernel I'd last compiled. (I'm a packrat, and that includes keeping every kernel compiled on this damned thing, Just In Case, because You Never Know.) We've got one of these boxes at work with an Athlon XP and it works fine; admittedly, it's not doing much, but neither is my web server. (Ba-zing!)

God only knows what's going on there, but it didn't last: I left it on overnight to see if it'd keep going, and sure enough it froze again around 10pm. I put the HD back in the P3 and left it. I'm going to see Wilco tonight (Whoo! WilCO! WHOO!), so this'll take a back seat to some serious RAWK. Except I'll probably be speculating about crappy memory or badly applied heatsink paste the whole time. No. No, I won't. It's Wilco.

Actually, I'm thinking I may have to upgrade the BIOS in order to get it to work properly with the Sempron; originally it was detected as a 900MHz Athlon, and I had to tweak the bus speed and whatnot to get it to run at 1.5GHz. (Interestingly, this seemed to have no effect whatsoever on how quickly it would crash, compared to the difference the different kernel version made.) (God, that's an awful sentence. I'm sorry, everyone.)

Anyhow, there's probably lots wrong with the settings; I never really wanted to learn about memory spacings and CPU voltages and I don't know what-all.

In other other news, I mentioned that I moved last week, but I didn't mention that I came back to two, count 'em TWO dead computers. (Before you ask: Support contracts are for the weak, and I suspect I'm about to get very weak.) One was a Linux box whose hard drive gave up the ghost. Stupid IDE hard drives in a dusty, hot environment anyway! But the other was an old Duron whose motherboard's capacitors yearned to be one with the cosmos (ie, they blew up real good). That was running Windows, so the whole let's-just-throw-the-hard-drive-into-another-box-and-see-if-it-boots thing was good for a very, very bitter laugh but little else.

Instead, I reinstalled not only Windows but Cygwin, too. That proved to be harder; we use Cygwin to compile very particular things that depend on version 2.2 of Python. Version 2.3 makes things cry. And no matter how much you tell the Cygwin installer that you don't want to upgrade Python, it goes ahead and does so anyway like some hyperactive sugar-fueled kid who's certain he knows how to fix things.

After far too much experimentation, I did what I should have done in the first place: I found an old archive of Cygwin, with the right version of Python, and I mirrored it. One gigantic, nine-hour long sucking sound later, and I had a local copy to point the Cygwin installer at. Thank god.

Finally, just got in the first 19" LCD monitor at work. This was, of course, two weeks after assuring someone that they were too expensive to get past the boss. My bad. I'm going to get a lot of mean looks, I think. But then, if I was a people person, why would I have become a sysadmin?

Recommendation of the Day: Vicious Battle Rap, by DJ Format and Abdominal. Bow down, baby.
September 06, 2004 Aha!
A while back I set up greylisting on Postfix for my home server. It works well, but I have the same concerns now that I did then. The script (smtpd-policy.pl from the examples section of Postfix' source) feels like a bit of a crock; yes, it's just the example script, but I don't like the Berkeley DB files, and comments in the code like "DO NOT create the greylist database in a file system that can run out of space" make me nervous. It hasn't been a problem -- in, oh, six months of running the file is only up to about 5.5 MB. But still: there's no provision for removing old entries, which means an awful soul-searching battle with the database if you ever need to trim it.
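
To make the missing piece concrete, here is a minimal greylist sketch with a pruning step. It is not the Postfix example script: the class, its in-memory dict, and the default delays are all assumptions for illustration. The only things borrowed from Postfix are the two action strings, `defer_if_permit` and `dunno`, which a policy service would hand back.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=300, max_age=35 * 24 * 3600):
        self.min_delay = min_delay  # seconds a new triplet must wait
        self.max_age = max_age      # seconds before an entry is pruned
        self._first_seen = {}       # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now=None):
        """Return the policy action for this delivery attempt."""
        now = time.time() if now is None else now
        key = (client_ip, sender.lower(), recipient.lower())
        first = self._first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "dunno"            # waited long enough: let it through
        return "defer_if_permit"      # too soon: ask the client to retry

    def prune(self, now=None):
        """Drop entries older than max_age; return how many were removed."""
        now = time.time() if now is None else now
        stale = [k for k, t in self._first_seen.items() if now - t > self.max_age]
        for key in stale:
            del self._first_seen[key]
        return len(stale)
```

Expiry here is measured from the first attempt, which is a simplification; a real setup would more likely age entries by when they were last seen, and run `prune()` from cron.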

I had a brief look at the script tonight, hoping to find a way to maybe hack in MySQL support, but decided to check with Saint Google first. Sure enough, there's gps, the Greylist Policy Service for Postfix. Uses C++ for speed and MySQL/PostgreSQL for the backend, which is nice. I should be able to hack up a migration script for the old entries (just as soon as I hack up a migration script for all the old journal entries...), and all should be good.

One thing I'm noticing with greylisting, though, is just how many attempts are being made from multiple IP addresses within a short time; today, one message was tried from four different IP addresses within five minutes, all using the same made-up email address. The original Perl script has the advantage that I can change it easily -- I know Perl, and I'd be pretty much starting from scratch with C++ -- and maybe add the ability to track this sort of thing. It'd be nice to be able to tarpit attempts to do this, say on the third attempt.
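
The tracking idea could look roughly like this. It's a Python sketch, not a patch to the Perl script; the class name, the five-minute window, and the "flag on the third distinct address" rule are assumptions taken loosely from the paragraph above.

```python
from collections import defaultdict, deque


class SenderTracker:
    """Flag a sender address that shows up from several IPs in a short window."""

    def __init__(self, window=300, max_ips=3):
        self.window = window    # seconds of history to keep per sender
        self.max_ips = max_ips  # distinct IPs that trigger the flag
        self._attempts = defaultdict(deque)  # sender -> deque of (time, ip)

    def record(self, sender, client_ip, now):
        """Record one attempt; return True if the sender should be tarpitted."""
        attempts = self._attempts[sender.lower()]
        attempts.append((now, client_ip))
        # Forget attempts that have fallen out of the window.
        while attempts and now - attempts[0][0] > self.window:
            attempts.popleft()
        distinct_ips = {ip for _, ip in attempts}
        return len(distinct_ips) >= self.max_ips
```

What to do with a flagged sender (tarpit, reject, just log) is left open on purpose.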

Tarpitting...another problem with Linux. The TARPIT module for netfilter has yet to be updated to work with the 2.6 kernel, and I really don't want to switch back to 2.4 just for this. LaBrea is nice, and I'm running a lashed-together natd configuration on my FreeBSD firewall box in conjunction with LaBrea running on my desktop on a second interface. It works, but it doesn't work in the case of a Linux webserver running on its own, outside the main firewall. I'm even less a kernel hacker than I am a C++ programmer, and figuring out the compiling problems and changed skbuff route structures (say) is beyond me. It's things like this that make me want to move to OpenBSD. Yeah, rebuilding a server and learning a new firewall language is a pain in the ass, but at least it's one I can handle.
August 13, 2004 thebulkclub.com
So a while back, Slashdot posted a story about TheBulkClub.com, an online forum for heathen cowfucking spammer scum ("Suppose you were a lying, sociopathic thief. And suppose you were a spammer. But I repeat myself." -- Mark Twain) that, sadly, left its membership list and other goodies exposed.

Being the good citizen that I am, I posted a reply that, I flatter myself, was both informative and helpful: it pointed the way to several mirrors of the information, including one on my own site. Well, what do I receive the other day but this charming email:

Date: Wed, 11 Aug 2004 10:23:03 -0700 (PDT)
From: EmailSupplyNET <emailsupplynet@yahoo.com>
Subject: Question about website
To: aardvark@example.com

Hey, I like (part) of your website, http://saintaardvarkthecarpeted.com It's informative. There was something on your site about "thebulkclub.com" Did you create that site for them or something? I run an email list site and am trying to contact them for advertising on their forums/boards... Any ideas/help?

Thanks in advance,

Thanks,
www.EmailSupply.net
EmailSupplyNet@Yahoo.Com
877.426.6636

---------------------------------
Do you Yahoo!?
Yahoo! Mail is new and improved - Check it out!

It's quite the site. They offer a sample list -- 4MB of email addresses, meant to be a sample of the up to 14 million you can buy. I must warn you, it would be wrong to run this command:
```
while [ true ] ; do
wget http://www.emailsupply.net/sample.txt -O /dev/null
done
```
So don't do that. But my question is, what should I do? I'm open to ideas, suggestions, thoughts, plans and dicta.
August 29, 2003 Formmail tarpitting
I've been trying to come up with a way to tarpit formmail spammer probes/attacks, and I haven't had much luck yet. This is an outline of what I've done and what I plan on trying next. If anyone has any thoughts on this, please let me know. In particular, I'm looking for any approaches I've overlooked; I'm sure there's a lot.

Background: Matt's old version of Formmail, up to at least version 1.9, had serious terrible bad vulnerabilities that would let a spammer use it to send any email anywhere, no matter how much you tried to secure it.

At my last job (ISP helpdesk), I'd get complaints every now and then of spam coming from our mail server; it was almost always spammers using Formmail to do their dirty work. I'd have to track down websites where an old copy of Formmail was lurking, shut it down, and try to clear the mail queue of as much crap as possible. (This got a lot easier once I discovered [ngrep|http://www.packetfactory.net/projects/ngrep/] and had the root password). Eventually I went and replaced all the copies w/the NMS version of Formmail, which did the trick wonderfully. I could drop it in to a website under attack and it would work right away: spam would stop, legitimate requests would still work.

I still get Formmail probes on my website all the time. A while back, I decided to send the spammers something more than just a [404|http://saintaardvarkthecarpeted.com/wheredidthispagego?] page. Using Apache's ScriptAliasMatch directive, any request that matches "/cgi-bin/formmail" (case-insensitive) in the URL gets redirected to my copy of (ta-da!) Formmail Weasel.

Formmail Weasel is a boringly simple Perl script that parses the request made to it, logs everything to a database, and displays an innocuous "thanks for the submission" page (not that the robot ever read it). There's another script that displays the last ten requests in horrible tables. That's it so far.
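
The real thing is Perl, and I'm not reproducing it here. As a rough Python sketch of the same three jobs (parse, log, say thanks), with a made-up table layout and function names and SQLite standing in for whatever database is actually behind it:

```python
import sqlite3
from urllib.parse import parse_qs

THANKS_PAGE = "<html><body><p>Thanks for the submission.</p></body></html>"


def open_log(path=":memory:"):
    """Open (or create) the probe log. The schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS probes ("
        "id INTEGER PRIMARY KEY, seen_at TEXT, remote_addr TEXT, "
        "script TEXT, field TEXT, value TEXT)"
    )
    return db


def handle_request(db, remote_addr, script, body, seen_at):
    """Log every submitted form field, then return the innocuous page."""
    fields = parse_qs(body, keep_blank_values=True)
    rows = [(f, v) for f, values in fields.items() for v in values]
    # Log empty requests too, so bare probes still leave a trace.
    for field, value in rows or [("", "")]:
        db.execute(
            "INSERT INTO probes (seen_at, remote_addr, script, field, value) "
            "VALUES (?, ?, ?, ?, ?)",
            (seen_at, remote_addr, script, field, value),
        )
    db.commit()
    return THANKS_PAGE


def last_requests(db, limit=10):
    """Return the most recent logged fields, newest first."""
    return db.execute(
        "SELECT seen_at, remote_addr, script, field, value "
        "FROM probes ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Note that `last_requests` returns logged fields rather than whole requests; grouping them back into requests (and into horrible tables) is left out.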

Once, I got curious and sent off a fake reply to an address mentioned in one of the probes, making it look like a vulnerable Formmail script had been found. (Future plans for Formmail Weasel include the ability to send off these fake replies automatically, and x-ray vision.) Within a week, there were all kinds of attempts to send spam going on -- maybe one a minute or so. After a few weeks of this, the spammer figured out that it wasn't working, and stopped.

That was interesting, and moderately gratifying, but I wanted to cause pain. I want to imagine spammer wails of dismay. Tarpitting immediately leaped to mind. But I can't simply tarpit port 80 and be done with that: I'm still running Apache to serve a few websites, and I don't want to interfere with that. Besides, Formmail probes go by website, not IP addresses, so I need to have www.somethingorother resolve to my server in order to attract scans.

First I decided to try directing Formmail requests to a separate port. Using Apache's RedirectMatch directive and a separate VirtualHost thingy, I sent all requests for formmail to port 2348 (aka port random) where Formmail Weasel would be listening and Apache would be logging. For good measure, I set up tcpdump too.

My first hope was that the probe robots looking for Formmail scripts would follow the redirect, and I'd be able to capture the traffic on port 2348 w/tcpdump for analysis. ("Lookee here: spammers use SYN packets! Guess we know what to look for now, professor!")

My second hope was that I could provoke an attack by sending off a fake reply, and see whether the attack robots would follow the redirect. Maybe, if I was extraordinarily lucky, I could just tarpit port 2348 and be done with it.

I forgot about it for a week after sending off the decoy email. Today I checked the Apache logs and the tcpdump file: nothing on either one. But when I checked the main logs for my website, there had been half a dozen requests for formmail; the robots simply didn't follow the redirect. I made sure that the redirect was still working, then cried for a bit.

As I see it, this leaves me with a couple options that don't involve deep heavy network hacking:
- Leave Formmail Weasel the way it is: an essentially passive annoyance to spammers.
- Be a little more crafty.
From the requests I've seen, Formmail probes will look for a few common variations on the extension (.pl, .cgi) with some capitalization variations thrown in for good measure (formmail, Formmail, FormMail). This gives me a way of distinguishing an attack (an attempt to send spam) from a probe (seeing whether or not there's a script that can be exploited).

Formmail Weasel could designate one of these (let's say FormMail.cgi) as one that is the signal of an attack. Probes that came in for other variations would result in an email being sent off to the spammer, but with the attack address in it. In other words, any probes for FormMail.pl, formmail.cgi, or Formmail.cgi would result in an email back to the spammer indicating that FormMail.cgi was successful. At that point, the spammer (hopefully) takes the bait and begins the attack.
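
The probe-versus-attack split is simple enough to sketch. This is hypothetical Python, not part of Formmail Weasel; the function names and the regex are assumptions, and the bait name is the one suggested above.

```python
import re

# The one spelling reserved as the "attack" signal.
ATTACK_SCRIPT = "FormMail.cgi"

# Any capitalization of formmail with a .pl or .cgi extension under /cgi-bin/.
_FORMMAIL = re.compile(r"/cgi-bin/(formmail\.(?:pl|cgi))", re.IGNORECASE)


def classify(path):
    """Return 'attack', 'probe', or 'other' for a requested URL path."""
    match = _FORMMAIL.search(path)
    if match is None:
        return "other"
    return "attack" if match.group(1) == ATTACK_SCRIPT else "probe"


def bait_reply(path):
    """For a probe, return the script name to advertise back; otherwise None."""
    return ATTACK_SCRIPT if classify(path) == "probe" else None
```

One obvious hole: a robot whose very first probe happens to ask for FormMail.cgi gets classified as an attack, so the bait spelling should be one that probes rarely try.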

At this point we can use the [Linux iptables string matching kernel module|http://www.netfilter.org/documentation/pomlist/pom-extra.html#string] to look for packets that have the request in them, and tarpit them. You'd have to be specific about what exactly to look for: something like "GET ~or /cgi-bin/FormMail.cgi", plus the host/site/whatever directive. But this is a small enough part of the request, and close enough to the beginning, that it should serve as a way of flagging that address as one that should be tarpitted.

Another option, and probably an easier and more effective one, would be to have Formmail Weasel set up separate iptables rules to tarpit the addresses that are part of the attack. You could age them and phase them out after a short/long while. One possible problem with this is that I've seen Formmail attacks that have come from many different IP addresses simultaneously; these usually end up being open proxies. You'd have to take care not to flood your firewall with tarpitting rules.
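
A sketch of the bookkeeping, in Python: a table of tarpitted addresses that ages entries out and refuses to grow past a cap. Everything here is an assumption for illustration, including the class, the limits, and the rule layout. The generated command also assumes a kernel with the TARPIT target available, and nothing below actually runs iptables.

```python
from collections import OrderedDict


class TarpitTable:
    """Bounded, expiring set of addresses to tarpit."""

    def __init__(self, max_entries=256, ttl=3600):
        self.max_entries = max_entries  # hard cap on simultaneous rules
        self.ttl = ttl                  # seconds an address stays tarpitted
        self._expires = OrderedDict()   # ip -> expiry time, oldest first

    def add(self, ip, now):
        """Tarpit ip until now + ttl; return any IPs evicted to respect the cap."""
        self._expires.pop(ip, None)     # re-adding refreshes the entry
        self._expires[ip] = now + self.ttl
        evicted = []
        while len(self._expires) > self.max_entries:
            old_ip, _ = self._expires.popitem(last=False)
            evicted.append(old_ip)
        return evicted

    def expire(self, now):
        """Remove and return the IPs whose time is up."""
        gone = [ip for ip, t in self._expires.items() if t <= now]
        for ip in gone:
            del self._expires[ip]
        return gone

    def active(self):
        """Return the currently tarpitted IPs, oldest first."""
        return list(self._expires)


def iptables_args(ip, action="-A"):
    """Build (not run) an iptables command; '-A' adds the rule, '-D' deletes it."""
    return ["iptables", action, "INPUT", "-s", ip,
            "-p", "tcp", "--dport", "80", "-j", "TARPIT"]
```

Each address returned by `add()` or `expire()` would need a matching `-D` command so the firewall stays in step with the table.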

Original entry
February 13, 2003 Wow
My prayers are answered. Submitted this as a story, but got rejected; in case it doesn't show up, have a look: SpamAssassin for Windows, Perl Artistic License, easy to set up. Just trying it out now. Slow so far, but it's in beta.

Found out about it here. And read this while you're at it.

Original entry.
January 20, 2003 At last!
Didn't think it was ever going to happen, but I finally got spam today on my [spider-trap address|SlashdotJournal_21November2002]. Helen Baker, who appears to be pretty active, emailed me today. About time, too. Can't believe I posted that back in November.

They're located in [San Jose|http://www.coolstats.com/helpdesk/contactus.html], though their servers appear to be in China (surprise). Sadly, the California Attorney General's office is only interested in spam that, among other things, is received by California residents. Fair enough, I guess.

Now if only Ms Helen had a Slashdot account and I could mod her down. Heh...wonder if there are any spammers w/accounts on Slashdot. That'd make for an interesting time...

Original entry
November 21, 2002 Spider-spam, spider-spam
Just for fun, a couple days ago I added a link to the index page of my website to a hidden page. On that page was a mailto: link with a throwaway address for my domain. I wanted to see how quickly it would get picked up, and how quickly I would get spam for it.

Well, the first bit has happened. I created the page at 6.41am local time on November 19; at 2.07pm that same day, it was spidered, then again at 2.40am this morning (Nov. 21).

The first spidering appears to have been done by [Thunderstone|http://www.thunderstone.com/], so I don't think there's too much to worry about there. I'll have to set up a robots.txt file to keep the nice spiders out. The second, however, is from a NY ISP, so I'm guessing something will come of that.

It would be interesting to figure out the average time-to-live of a published email address: how long it can be on a webpage before it gets spammed (and will therefore be spammed unto the end of time, yea, and beyond). This would be like Lance Spitzner's research into the TTL of an unpatched Win98 system on the Internet (Dammit -- all I could find was [this link|http://amsterdam.nettime.org/Lists-Archives/nettime-l-0106/msg00126.html], but I know I've seen the original paper somewhere...), or the idea of mailpings mentioned in this excellent book (track email delivery time to a given address to monitor performance/health).
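
The arithmetic is easy enough to script. A small Python sketch, using the page-creation and spidering times from above (with the year taken from this entry's date); the function names are made up, and a real measurement would feed in the time of the first spam rather than the first spidering.

```python
from datetime import datetime, timedelta


def time_to_event(published, events):
    """Return how long after publication each event happened, earliest first."""
    return [event - published for event in sorted(events)]


def mean_delay(delays):
    """Average a list of timedeltas."""
    return sum(delays, timedelta()) / len(delays)


published = datetime(2002, 11, 19, 6, 41)
spidered = [
    datetime(2002, 11, 19, 14, 7),   # first spidering
    datetime(2002, 11, 21, 2, 40),   # second spidering
]

delays = time_to_event(published, spidered)
for delay in delays:
    print(delay)   # 7:26:00, then 1 day, 19:59:00
```

With one address this is just subtraction; `mean_delay` only earns its keep once there are several throwaway addresses on several pages.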

Original entry
October 03, 2002 SpamAssassin
SpamAssassin is set up now on our new front-end mail server, and it pretty much rox. Got it going this afternoon, and it hasn't fallen over or anything. We even took the other front-end box out of round-robin dns, and the new box has held up perfectly well.

For the record, we've got a 1.4GHz Athlon w/512MB RAM doing about 100 messages a minute right now (in + out), and sending 'em all through SpamAssassin via spamd/spamc. Threshold is set to 15; not as aggressive as I run it at home (8) or as it runs out of the box (5), but we have had some false positives in the first little while (only a few). Load is noticeably up, but not obnoxious by any means.

We've caught about 6500 messages since turning it on at noon, which is a little -- no, wait, just fired up bc -- a lot better than our previous average. (Please note that this graph will now be hopelessly messed up until I get it set up again to monitor spamcatchin' on the new server.)

Tired. Enough for now.

Original entry.
September 25, 2002 Fucking Spammers
Update time.

I got into work today and found that the mail server had just come up after *half a fucking hour* of being down because of the insane load placed on it by spam -- just spam -- coming in. The owner of the company couldn't send email. I started setting up the new mail server.

And it was nice. I got to go away, away from the help desk, sit down and figure out how to make it work. FreeBSD's vinum + Promise raid controller == kernel panic (details later on). Finally got vinum figured out -- I've only worked w/it once before -- and before I was grabbed back to help desk had the disk setup about 80% done.

So some more details: there's 4 x 40GB Maxtor IDE drives. (Yeah yeah yeah SCSI.) We've got an onboard Promise controller chip; I'll put in the mobo tomorrow and make this all seamless. First it turns out we've got the Promise Lite (Less Filling!) BIOS, which means we can only have one (1) array of two disks; the other two disks can be single arrays on their own, which is useful in some alternate universe I'm sure. So okay, try setting up one mirrored (Raid 1? 0? I can't keep 'em straight) array, and we'll use vinum to tie it together with the other single drives...

Only as soon as I try using vinum to do _anything_ with the Promise'd arrays, BANG: kernel panic. This is 4.6, not the latest (4.7RC1 as I type), but still. Arghh. Doesn't matter whether vinum tries raid 0, 1 or 5 -- just panics right away. If I had more time and a box of my own to fool around with, I'd try [Michael Lucas'|http://www.oreillynet.com/pub/a/bsd/2002/03/21/Big_Scary_Daemons.html] article (Buy his book!) and contribute something useful to the FreeBSD folk. Alas, it's not my box or my time, and if I were to post this message to freebsd-hackers-important-vinum-people tomorrow I'd (deservedly) get laughed at so hard I'd feel it over the ether.

Anyway. Point is I can't get vinum to play nice w/the Promise'd chip even as an IDE controller. The BIOS of the box allows you to turn the Promise chip on, off, or to ATA/IDE; but even set to the latter, it panics once vinum touches /dev/ar*. You have been warned.

So get vinum using the four drives on the first two IDE channels, and that works fine once I learn the intricacies of disklabel (set type to vinum, kids!) and vinum init (and that takes a long time w/3*35GB partitions^H^H^H^H^H^H^H^H^subsooperplexen). 1 5m 5o 133t!

OT: One of my side notes was going to be about how I'm posting this w/Lynx 'cos Mozilla won't let me use vi, editor of the Elder Gods, as an editor. Then I realized I could have just fired up a shell and used vi in there. Sigh. Rumours of my cleverness have been exaggerated.

Original entry.
August 11, 2002 Honeypot Fun
So I set up a honeypot here at home, to try and learn a bit about computer security. I don't know a whole lot about security beyond the obvious (strong passwords, ssh, turn off services, firewall), so I figured this would be a good way to learn. I took an old Pentium, installed Red Hat 6.2, and away I went.

Welp, as the good folks at the Honeynet Project suggested, the first while was spent making mistakes and learning from them. First, I went for the default workstation install -- which meant no services running. After a day, I took it down and installed a default server install. Next, I watched as there were a million probes for NetBIOS or IIS (there's a guy at work with a Win98 box at home on cable w/no firewall...I should show him the logs), and then...aha! SunRPC probes! Whee! ...only I was firewalling the replies. D'oh!

That was last weekend. I didn't want to leave it running w/o me being around to keep an eye on it, so I left it 'til this weekend to turn it back on. Friday night I booted and watched.

...and then it happened: inside of *ten seconds* the cracker detected the ftp server and rooted me. I was agog; all of a sudden I was watching commands being typed in by the cracker, who had logged in with the new user ID he'd just added for himself.

Unfortunately, the timing was bad (silly cracker!). My wife's company was having a [boat cruise|http://www.konawindscharters.com/] that afternoon, and he got in literally ten minutes before I had to leave. I watched for a little while, then shut everything down and ran out the door. (Not that I was sad to go. The boat cruise took us up Indian Arm and it was absolutely amazing: beautiful weather, free food and Bheer...a gorgeous day.)

I'll add more on my honeypot later, but it was pretty stock: RH6.2, firewall, tcpdump, Bash patch to log commands, logging offsite. The one thing I forgot to do was run tripwire.

Music: such a cliched thing to add to something like this (can't even bring myself to say "weblog" or "journal entry"), but: Harry Belafonte and Kate Bush. Old Harry Belafonte is so very much fun; Kate Bush's "Running Up That Hill" is incredible.

Original entry.