04 Jan 2005
Okay, so as I mentioned I'm trying to get a Subversion repository
working in a way that a) keeps the repository safely on an
NFS-exported, mirrored set of drives, and b) does not require
YAFPF. Today I've been banging my head against Apache2 +
mod_auth_pam. The problem is that while passwords are successfully
checked (hurray! one less FPF!), group membership is not. This does
not work:
AuthPAM_Enabled on
AuthPAM_FallThrough on
AuthGROUP_Enabled on
AuthGROUP_FallThrough on
AuthType Basic
AuthGroupFile /etc/group
AuthName "secure area"
Require group subversion
(For one brief, spastic moment I thought Satisfy any was the missing
magic. Then I tried it without typing in a password. Sigh.) We're
using FreeBSD and NIS; from what I've been able to find so far, that
might be problematic. OTOH, I might have the entirely wrong idea about
PAM and its ability to check group membership.
UPDATE: Logical as it seems, AuthGroupFile has no place in the
modern kitchen. Removing that directive allowed everything to
work. Whee!
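When debugging this sort of thing, it helps to know whether the OS itself (NIS included) thinks the user is in the group, before blaming Apache or PAM. Here's a minimal sketch of that check, assuming Python's standard grp and pwd modules, which go through the same C library lookups as everything else on the box. The function name user_in_group and the injectable lookup parameters are my own invention for illustration; they are not part of mod_auth_pam.

```python
import grp
import pwd


def user_in_group(user, group, getgrnam=grp.getgrnam, getpwnam=pwd.getpwnam):
    """Return True if the system databases say `user` belongs to `group`.

    The lookup functions are parameters only so the logic can be tried
    with fake data; by default they query the real group/passwd databases
    (files, NIS, or whatever the system is configured to consult).
    """
    try:
        group_entry = getgrnam(group)
    except KeyError:
        return False  # no such group

    # Supplementary membership: the user is listed in the group entry.
    if user in group_entry.gr_mem:
        return True

    # Primary membership: the user's own passwd entry points at the group.
    try:
        return getpwnam(user).pw_gid == group_entry.gr_gid
    except KeyError:
        return False  # no such user


if __name__ == "__main__":
    import getpass

    me = getpass.getuser()
    print(me, "in subversion:", user_in_group(me, "subversion"))
```

If this says yes and Apache still says no, the problem is in the Apache config rather than in NIS.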
Tags:
security
04 Jan 2005
We're going to switch from CVS to Subversion at work. I don't make a
whole lot of use of CVS, so the finer points of change management are
more academic to me than anything else. But authentication...ah,
that's a different story. Right now, Unix clients access the CVS
repository by NFS; Windows users use the pserver
protocol/authentication. NFS access does cause some problems for CVS,
but it's completely out of the question for Subversion if you use
their Berkeley DB filesystem. It's okay for read-only access if you use
their FSFS (actual real filesystem files filesystem; the equivalent of
CVS' bunch of directories and files). This leads to questions about
how we'll allow access over the network, and how we'll authenticate
users. Here's my thinking so far.
- Daemon + DB2
- Pro: Can restrict access through file permissions to prevent access by NFS.
- Con: Plain text password file. YAFPF.
- svn+ssh + DB2
- Pro: Secure access from home. SSH key-based authentication.
- Con: The mirrored drive where the repository should be kept is available by NFS and Samba; this can't change. Since file permissions would need to be open to allow read/commit, there's nothing preventing access by NFS and resultant corruption. The other alternative is putting it on a non-mirrored drive, which isn't an option either.
- Apache + PAM
- Pro: Can restrict file permissions to prevent NFS/Samba access. Uses already existing FPF, and since we're not using PAM now we can eliminate AFPF. Prod to switch Samba to PAM, which would be AFPF gone.
- Con: Haven't worked with A2, DAV or mod_auth_foo before. Since will need to coexist for a while with A1, possibility of calcification.
- Apache + LDAP
- Pro: Full buzzword compliance. One FPF to bind them all. Get ready for the groupware that will someday be coming down the pike. Can restrict file permissions to prevent NFS/Samba access.
- Con: Haven't worked with LDAP, either. Will need to convert current password file rather than access directly, creating YAFPF (at least in the short term). Much bigger change, so even bigger danger of calcifictation. (Heh...I like that typo.)
I think I can do Daemon + FSFS, but I need to reread the Subversion
book (truly excellent, BTW). This might be the best way to get
things going quickly. And of course, any insights or hints are
welcome.
Tags:
revisioncontrol
01 Jan 2005
I found a link on Gecko's blog to Project Honeypot. Turns
out it's a project to watch for, and attempt to track, spammer-run
robots that scrape pages for email. I was intrigued, but a little put
off by the terms of use. I did a bit more digging around, and
found I wasn't the only person who thought that way. However, there
were some strong rebuttals from the SpamCop forums, discussion
on SURBL mailing list, and from one of the principals (who
also replied here).
Reassured, I signed up. It's still in the early stages, so there
hasn't been a lot of spam received yet (350-odd pieces, according to
the stats page on the site). Still, I'm hopeful it'll be a Good
Thing.
Another approach: a Java SMTP honeypot. Huh.
Tags:
security
email
01 Jan 2005
Firewalled off from NTP? HTP to the rescue!
HTP is not really a protocol, but uses a feature from HTTP, aka web
traffic. According the specifications of HTTP (RFC 2616) a web
server needs to put a timestamp in a response to a web browser
request. In web browsers you don't see the HTTP headers, but these
headers contain a timestamp in Greenwich Mean Time (GMT), accurate
in seconds.
Available in Perl or C. My compliments to Eddy Vervest.
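The trick is simple enough to sketch. This is not Eddy Vervest's code, just a rough illustration of the idea in Python: read the Date header from an HTTP response and compare it with the local clock. The function names are made up for this example, and the result is only good to about a second, since that's all the header gives you.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_offset_seconds(date_header, local_now):
    """Return (server time - local time) in seconds.

    `date_header` is the value of an HTTP Date header, e.g.
    "Sat, 01 Jan 2005 12:00:05 GMT"; `local_now` must be a
    timezone-aware datetime.
    """
    server_time = parsedate_to_datetime(date_header)
    return (server_time - local_now).total_seconds()


def fetch_date_header(url, timeout=5.0):
    """Fetch only the headers of `url` and return its Date header."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers["Date"]


if __name__ == "__main__":
    # example.com is a placeholder; any web server you trust will do.
    header = fetch_date_header("http://example.com/")
    offset = clock_offset_seconds(header, datetime.now(timezone.utc))
    print("local clock is off by about %+.0f seconds" % -offset)
```

A real client would presumably ask several servers and average the answers; this sketch asks one.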
Tags:
29 Dec 2004
Got a bad feeling in the pit of my stomach this morning when I came
back to work. I'd deliberately stayed away from the usual non-Slashdot
news sources (Internet Storm Center, Bugtraq, Full Disclosure), so
there was a lot of catching up to do. Let's see: eighty-four new
remote holes in Windows -- always fun -- and it turns out the phpBB
worm is no longer a phpBB worm but a PHP worm. Jesus
Christ.
I checked the logs on my home server, and sure enough there were tons
of the little bastards hitting me. (The server at work was completely
clean.) It looked like there was nothing there, but I couldn't be sure
without more time spent on it than a few minutes' grepping -- which
meant leaving it 'til I got home tonight. (Update: looks like I was
fine. I tried the URLs in the logs, and none of them tried to fetch
anything. Dodged a bullet there.)
OpenBSD has the right idea when it chroots Apache, but there's also
the matter of initiating connections out. And yes, I'm guilty of this:
Thornhill + port 80 + tcp syn should be firewalled off, but was
not. Changed now, of course. Still, it would be nice to have Thornhill
not be locked down entirely. Why not let me initiate a connection
out, but prevent Apache from doing the same?
This gets back to What's Wrong With Unix?, and I still say a
good part of it is the lack of fine-grained permissions on both ports
and files. (That, and my inability to type a good post when I'm in a
hurry...God, that was incoherent.) The sheer idiocy of continuing to
insist on root permissions to open a port under 1024 is just
ridiculous. Why do we do this? In a world of Unix on the desktop,
where anyone can get root, what does this mean anymore? Nothing at
all: it's a totem, a fetish, and the Unix equivalent of knocking on
wood for luck.
Worse, by insisting that you need to be root to open port 80, you
invite all sorts of security problems. Better hope you drop privileges
effectively; better hope no one figures out a way to extract r00t from
any lingering privileges; better hope you didn't make one single
mistake, or you'll get 0wned. Serving web pages, answering DNS queries
or answering QOTD requests (ports 80, 53 and 17, respectively) do not
require root permissions. (This is quite a different question from
whether or not J. Random User should be able to modify web pages, zone
files, or the QOTD database.) qmail, Postfix and others have shown
that delivering mail doesn't need root, either. (Other applications
can be taken on a port-by-port basis; the full extent of my
hand-waving is left as an exercise to the reader.)
So why is there no way to let UID www send a syn+ack, but not a syn?
Or to let some range of UIDs do both? Why, Lord, can't I change
ownership, groups and permissions on /proc/net/ipv4/tcp/port/80 so
that UID www can open this port and nothing else? How long, O Lord,
how long?
There is a patch I came across today that supposedly offers this
sort of thing, but again: it SHOULD NOT be an option; it SHOULD NOT be
a patch; it SHOULD be built-in and used, just like we use UIDs to
restrict privileges now. (The key words "MUST", "MUST NOT", "SHOULD",
"SHOULD NOT", and "MAY" are to be interpreted as described in RFC
2119.)
Ahem. In other news: At Staples today I picked up a Network
Everywhere NWR04B 802.11b wireless router. --I'm sorry, "Network
Everywhere"? Looks like Cisco/Linksys in disguise. But it was 18
Soviet Canuckistan pesos! Boxing Day special! How could I possibly
resist? Better yet, it turns out that the damn thing can run
Linux. It's got 8MB of RAM, 2MB of flash memory, and something
like a 60MHz ARM CPU.
The folks over at the Hardware Recycling Initiative are working
on getting this and other broadband router boards running
Linux. Sweet! Now to figure out how the hell to get it to work on this
thing...I can identify a soldering iron six times out of ten, but
that's about it.
Tags:
nwr04b
26 Dec 2004
title: Hm.
date: 2004-12-26 21:32:50
I don't like the WPBlacklist plugin as much as I used to. Reasons why:
- Stupid insistence on banning IPs. There are so many zombie PCs out there acting as open proxies that this is a waste of time. I've watched the traffic, and comments don't come from the same IP twice.
- Stupid insistence on looking at email addresses. When it's so easy to make up an email address, why bother? Tracking 1600 variations on byob@[some number].com is just filling up the tables.
- Problems tonight with Topo's blog.
Topo mentioned tonight that not only had it been a while since a new
comment was posted on her blog, but a test comment posted tonight
never showed up. Part of the problem turned out to be a stupid PHP
syntax error I'd introduced; I'd been editing one of the files in an
attempt to force WPBlacklist to stop emailing re: deleted posts. (Yes,
we'd turned off all possible email-me-please settings, and it still
kept filling up her inbox.)
And then somehow, our IP address and my URL got put into the blacklist
tables. There's no note of when an entry is added to the blacklist
(FIXME!), so I can't tell when it was added -- but every test comment
I made was getting caught by this. Finally, there were at least two
blank entries in the tables, and I'm afraid they might have been
destroying everything in sight, too.
After a little bit of browsing around, it turns out the
let's-insert-a-blank-line problem has been addressed in the 2.8
version of WPBlacklist, available from the new download page. I'll
give that a try. Also, it'd be nice to clear up the license -- no
mention of how WPBlacklist is available, and if I'm going to work on
this (and hopefully improve it), I want to make sure I can distribute
any changes. I'll post a question on the forum and see what happens.
Tags:
23 Dec 2004
title: No BR for you!
date: 2004-12-23 08:18:06
Thanks to these two posts, I've finally managed to turn
off WP's stupid, borked,
let's-throw-in-a-<BR>-tag-every-time-a-line-ends-in-the-editing-box
behaviour. Since I use Mozex, Firefox plugin of the gods, this
was seriously pissing me off. I agree with OtherMichael: this
behaviour is a bug, and should be option-controllable.
Tags:
23 Dec 2004
title: It's deja vu all over again
date: 2004-12-23 23:39:56
Holy crap:
IP addresses are easy to fake as well. The design principles of
TCP/IP allows the sender of a packet to specify its IP address. The
message will still be routed to its destination using the fake
origin address. Return packets would be mis-routed, however, because
TCP/IP would send responses to the true location of the IP address
rather than where it actually came from. This means that IP spoofing
is ineffective in situations where you need to interact with a
remote server, but very effective in a one-way conversation. I can't
retrieve a Web page using a spoofed IP address because I need to
make the request and then have the server send me the page. But I
can send requests all day long if I don't care about the response.
I thought this was just a slight muddying of the waters. But no. The
VERY NEXT PARAGRAPH:
Posting a comment (or TrackBack) doesn't require interaction. I can
send a comment in a POST or GET message and not worry about the
response if I don't care about receiving acknowledgment that it was
successful.
...what, has Apache moved to UDP all of a sudden? Sweet Zombie Jesus!
(And don't talk to me about guessing TCP sequence numbers; that is not what
this idiot is talking about.) (Although to give him his due, he is
talking about this in an article explaining why blocking IP addresses
from blogs won't work, and he comes up with a great summary: "This
[approach] is fundamentally flawed because it assumes IP addresses are
both unique and hard to come by.") (But oh, this is a very painful
case of bending over backwards to be fair.) And then:
Now spammers have turned their attention to weblogs and comment
forms. In order to increase search engine rankings you are posting
advertisements to our Web pages. What you failed to understand is
that bloggers are smarter, better connected, and more
technologically savvy than the average email user. We control the
medium that you are now attempting to exploit. You've picked a fight
with us and it's a fight you cannot win. Bloggers will track you down
and notify your hosting providers about your activities. We will
tell your ISPs what you are using their connections for. We will let
the makers of the products you are advertising know of your
despicable sales methods. We will hit you where it hurts by
attacking your source of income. You can move to a new host, find a
new ISP, or sign up for a different affiliate plan. The end result
will be the same. Each time you rise out of the muck we will strike
you down and send you back to the hole you crawled out of.
Do you smell that? That is the sound of sweet, virgin superiority,
fresh and naive and unmingled. This is from Dive Into Mark. I
quoted it before, but here's a bit more context:
If you want to be an anti-spam advocate, if you want to write
software or maintain a list or provide a service that identifies
spam or blocks spam or targets spam in any way, you will be
attacked. You will be attacked by professionals who have more money
than you, more resources than you, better programmers than you, and
no scruples at all. They want to make money, this is how they have
decided to make money, they really can make a lot of money, and
you're getting in their way. This is old hat to anyone who's been
involved in anti-spam efforts in other domains (Usenet and email
spring to mind), but just like everything else, the weblogging
community seems intent on (a) thinking they're special and unique
and nobody has ever had their problems before, and proceeding to (b)
ignore all the work that has come before and reinventing the
wheel. [....]Someone challenged me, Well, how am I supposed to
continue hosting these low-barrier discussions? I'm sorry, but I
don't know. To quote Bruce Schneier, "I feel rather like the
physicist who just explained relativity to a group of would-be
interstellar travelers, only to be asked, 'How do you expect us to
get to the stars, then?' I'm sorry, but I don't know that, either."
The low barrier is exactly the problem here. We got away with it
(please, come post random links on my site which is well indexed,
poorly managed, and open to unlimited anonymous contributions!)
because we were collectively very young and naive and thought no one
could hurt us. Now it's like we're turning 30 and being told we need
to go on a diet and asking, "Well when can I go back to my old
eating habits?" Um, you can't. Your old eating habits don't work
anymore. Weblogging is growing up. Oh wait, you thought that would
be a good thing? You must still be young.
It is still worth reading every single depressing and true sentence in there, if only to keep yourself from being drowned in bullshit, nonsense and fairy tales.
Tags:
23 Dec 2004
title: Debian Irritants
date: 2004-12-23 08:21:19
Yes, it's trouble in paradise time:
- WTF does Vim try to connect to X?
- WTF does grip come with gconf?
- WTF does gconf refuse to give up information?
ARGHHH.
Tags:
22 Dec 2004
Well, I did the right thing today -- twice. Damn right I'm
bragging.
First off, it turns out that the FreeBSD Foundation has run into
a (good!) problem: its donations have been too big. In order to keep
its US charitable status, it needs to have two-thirds of its donations
be relatively small. Due to a couple of big donations, this ratio is
a little out of whack at the moment, and they need a bunch of
small donations.
Welp, I've been administering FreeBSD systems for a living
for...well, I was gonna say four years, but it's more like two and a
half or three. I've been working on them for four, though; my rent
and food have been paid in large part because of the generosity of the
people who put together FreeBSD. A donation went off in short
order.
Then I remembered that I've been meaning to join the Free Software
Foundation for a while now. The motivation is the same: I've been
paying my bills for a long time now (and enjoying myself immensely in
the process) because of the generosity of Free-as-in-Freedom
software people: Stallman, Torvalds, Wall, and a
zillion others. I have a hard time imagining what I'd be
doing now without Free software; I suspect that, if I was lucky, I'd
be working as a grocery store manager right now. So: off to the FSF
website to sign up for an associate membership.
And what did I find but two, count 'em TWO cool things:
If you refer three people to the FSF for associate memberships, RMS
or Eben Moglen will record a message for you, suitable for voicemail,
Hallowe'en or impressing the ladies. I did a quick search on Google,
but couldn't find anyone with the link...damn shame. Better than a
free iPod, cooler than a CmdrTaco TiVo -- join the FSF and get
RMS to say "All Hail Liddy!"
The FSF is looking for a senior sysadmin. God, that'd be
cool. Decent enough pay (no, it's not the sort of job you take
because of the money, but it's nice to think about), all the Free
software you can handle, and an IBM Thinkpad to run it on. Of course,
I think I'd have some 'plainin' to do about the laptop I'm writing
this on...and, of course, it would mean living in the US. Frankly,
that scares the crap out of me these days. Goddamned PATRIOT Act...
In other news, work continues apace. We're losing two coop students
and gaining one, gaining another full-time person, and I'm still
trying to get my RAID array -- credit app is with the boss, and
after that's done the order'll finally go in.
Rough guess (wild hope) at this point is that it'll be in my hands in
mid-January, which won't be a moment too soon. There's a new Linux
server I'm setting up that I'm desperately hoping won't have problems
due to proprietary kernel modules in the software I'm installing. (I'm
just writing myself further and further out of that job, aren't I?)
And I'm wondering if the simplest way to get Nagios to make sure the
right machines are exporting the right filesystems is to check if amd
is mounting them correctly. (No matter whether the machine or amd
fails, something needs to be fixed.) Or maybe I just need to figure
out the right wrapper for showmount -e.
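For the showmount route, the wrapper would look something like the sketch below. It assumes the usual output shape of showmount -e (one header line, then one export per line with the path in the first column) and the standard Nagios plugin exit codes (0 for OK, 2 for CRITICAL). The names parse_exports and check_exports, and the host and paths in the example, are hypothetical.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def parse_exports(showmount_output):
    """Return the set of exported paths found in `showmount -e` output.

    Assumes the first line is a header and every later non-blank line
    starts with an exported path.
    """
    paths = set()
    for line in showmount_output.splitlines()[1:]:
        fields = line.split()
        if fields:
            paths.add(fields[0])
    return paths


def check_exports(showmount_output, expected):
    """Return (exit_code, message) in the style of a Nagios plugin."""
    missing = sorted(set(expected) - parse_exports(showmount_output))
    if missing:
        return CRITICAL, "EXPORTS CRITICAL - missing: " + ", ".join(missing)
    return OK, "EXPORTS OK - %d filesystems exported" % len(expected)


if __name__ == "__main__":
    # Usage: check_exports.py HOST PATH [PATH ...]
    host, expected_paths = sys.argv[1], sys.argv[2:]
    result = subprocess.run(
        ["showmount", "-e", host], capture_output=True, text=True, check=True
    )
    code, message = check_exports(result.stdout, expected_paths)
    print(message)
    sys.exit(code)
```

This only checks what the server claims to export. It says nothing about whether amd can actually mount it, so the two checks answer slightly different questions.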
On the spam front: good god, what a smoking hole Movable Type is
turning out to be. First there were the license changes, then the
comment spammers (who seem to be posting a lot more
aggressively to MT than to WordPress)...Of course, comment
spam affects all blogs, not just MT. Still, this whole idea of
rebuilding static pages every time the stars move seems to be causing
them a lot of trouble. (Yep, that last sentence was pure FUD. Or
bullshit.) And okay, no, I don't use MT, so what precisely is my beef?
As I'm not going to put up, I should shut up. I still have to upgrade
WP -- though according to this posting, there are still lots of
XSS issues left unfixed. I'm also upgrading PHP, and I should
probably use ApacheToolbox to do that automagically, rather than
periodically editing my own Makefile.
The release party for Where Are They Coming From? came off JUST
FINE, thank you. EVERYONE was there. Top Stars include Topo,
Phil Knight and Mos Def, fresh from the set of HHGTTG. Uh huh.
Further thoughts on the MySQL + GPhoto2 thing: gphoto2 does have
the ability to pipe to STDOUT, which I don't think I knew...maybe it
won't be as much work to insert directly into a database as I
thought. Might even be able to do it as a Perl script.
Finally: what a gorgeous day. It's downtown Vancouver on the back
steps of the Art Gallery, it's sunny (in December, too) and just cold
enough to make you go "brr". The skater kids are practicing their
synchronised jumping -- just in time for the Olympics, I'm sure. A
far-too-generous co-worker has handed out chocolate, another has
handed out home-made rum and brandy balls, and I'm taking off early
to go drinking with a third. Feeling pretty damned good right
now.
Update: Too bad Topo's not so great -- fever of 102.8F, as of
a couple minutes ago. (Still haven't figured out what that is in
Celsius; bad Canuckistanian!) It's down a bit from earlier this
afternoon, though, so I'm thinking good things. And these
pages say to not worry if it's less than a couple days, so I'm
not worrying. Nope.
Tags:
wontyoupleaselendahand
bsd
politics
meta
rant
hardware
spam
16 Dec 2004
title: tcpdrop
date: 2004-12-16 16:56:38
tcpdrop looks 'way cool. More and more reasons to make my next
server run OpenBSD.
Tags:
16 Dec 2004
title: Random reminders
date: 2004-12-16 16:53:20
- When compiling a Linux kernel, you need to run make config (with your saved .config file, of course!) after running make mrproper.
- NFSv3 support is not included in the kernel just because you compiled in NFS support.
- In the 2.4.28 kernel, at least, serial ATA drives are available at /dev/sd[abcd], not /dev/hd[efgh] like in Knoppix...at least, when using the libata interface.
Tags:
15 Dec 2004
title: Fun with awk
date: 2004-12-15 22:58:52
As I've mentioned before, I've set up Greylisting on my mail
server. The basic principle is simple: if you haven't seen an IP and
email address combo before, you give them a 450 ("Come back later")
error. If they come back later, you let 'em in and whitelist 'em in
the future. The theory is that spamming depends on volume, and a
spammer bot won't try again. One thing I've been noticing,
though, is that spammers are trying again -- but from different IP
addresses, which means they still don't get past the Greylisting. How
many IP addresses? Looking at my logs over the last week, here's what
I see:
```
Number of connections from separate IPs | Number of occurrences
----------------------------------------+----------------------
1                                       | 102
2                                       | 26
3                                       | 24
4                                       | 24
5                                       | 15
9 (!)                                   | 1
Total:                                  | 190
```
This means that more than half try once, then give up -- but more than
46% try again. It's only because they're trying from different IP
addresses that Greylisting still works. What happens when someone
decides to make their bot try again from the same proxy? BTW, all this
reminds me that, while it's okay doing this with awk and sort, I
still need to get msyslog working...this'd be a whole lot
easier in SQL.
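For the record, here's roughly what's going on, sketched in Python rather than awk. It's a simplification under a few assumptions: the greylist key is just the (IP, sender address) pair described above, there's no minimum retry delay or expiry like a real implementation would have, and the histogram assumes the log has already been boiled down to (IP, sender) pairs. The names Greylist and ips_per_sender_histogram are made up for this example.

```python
from collections import Counter, defaultdict


class Greylist:
    """Toy greylist: defer the first attempt from each (ip, sender) pair."""

    def __init__(self):
        self.seen = set()

    def check(self, ip, sender):
        """Return the SMTP reply code for this delivery attempt."""
        key = (ip, sender)
        if key in self.seen:
            return 250  # seen before: let it in
        self.seen.add(key)
        return 450  # first sighting: "come back later"


def ips_per_sender_histogram(attempts):
    """Map "number of distinct IPs" to "how many senders used that many".

    `attempts` is an iterable of (ip, sender) pairs pulled from the logs.
    """
    ips_by_sender = defaultdict(set)
    for ip, sender in attempts:
        ips_by_sender[sender].add(ip)
    return Counter(len(ips) for ips in ips_by_sender.values())
```

The point of the first half: a retry from a new IP is a new key, so it gets another 450. A bot that retries from the same proxy would sail through on the second attempt.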
Tags:
11 Dec 2004
After a lot of consideration, and some reassurance from JWSmythe,
I'm going with the Promise VTrak 15100 array for work. It has
almost everything I want: serial ATA, dual SCSI adapters, and an
ethernet interface. The downside is that Promise doesn't have an
office in Canada, so there's the possibility that getting parts across
the border could be a problem. However, there's a local company
that'll do service, so that makes me feel better.
The other options just weren't as good: one was parallel ATA and had
no ethernet interface. The other was the Fastora DAS-315, which
certainly looked good -- but the local resellers couldn't be
bothered to give me the time of day, let alone answer the questions I
had. Best bit: when I asked for a copy of the service level agreement,
the sales guy replied that he'd "have to see" if he could release it.
And at home, I've been running into problems with bridging, the 2.6.9
kernel and the 8139too driver. I thought I would enable bridging on
Thornhill for some User-mode Linux fun, so I enabled it as a
module, then rebuilt and reinstalled the modules. However, when I
tried inserting it, I got unknown symbol: br_handle_frame_hook.
Okay, what about rebuilding the kernel and
including bridging within it? Tried that; when I booted, the kernel
panicked as soon as it came time for the onboard 8139 interface to
grab an address by DHCP.
It was similar to the earlier problems I had with the Shuttle, in
that if I took out the ethernet cable everything was fine -- it was
only when the response came in that the kernel panicked. And keep in
mind this was without setting up a bridge at boot time, or anything
like that. I had to go to the backup 2.6.7 kernel in order to calm
things down.
I found this thread on LKML, and it seems to match pretty closely
what I saw -- the stack trace matches what I saw; I wasn't able to see
the whole message, because it would scroll off the screen. However,
I'm reluctant to try this patch; I spent a whole evening rebooting
(Sorry, Aaron) and trying different things before I finally confirmed
that having bridging in the kernel was just a bad thing.
Interesting bit: I didn't realize that Linux does not have panic core
dumping built into the kernel, as FreeBSD does; it's only
available as a separate patch. Minus one for Linux.
Finally, it's the day after the office Xmas party, and what am I
doing? Heading into work to unplug everything. The power is being shut
off in our building (thirty-floor or so high-rise) while upgrades are
done, so I'm shutting everything down and disconnecting it just in
case. Tomorrow I go back in to reverse the process. Whee!
Tags:
hardware
11 Dec 2004
title: Cool little utility
date: 2004-12-11 10:48:00
SWAKS, a command-line utility for testing SMTP. Available as a Debian package. Sample output:
$ swaks
To: aardvark at thingy saintaardvarkthecarpeted communist
=== Trying thornhill.saintaardvarkthecarpeted.com:25...
=== Connected to thornhill.saintaardvarkthecarpeted.com.
<- 220 thornhill.saintaardvarkthecarpeted.com ESMTP Postfix All Hail Liddy!
-> EHLO rearden.saintaardvarkthecarpeted.com
<- 250-thornhill.saintaardvarkthecarpeted.com
<- 250-PIPELINING
<- 250-SIZE 52428800
<- 250-ETRN
<- 250 8BITMIME
-> MAIL FROM:<aardvark at thingy rearden saintaardvarkthecarpeted communist>
<- 250 Ok
-> RCPT TO:<aardvark at thingy saintaardvarkthecarpeted communist>
<- 250 Ok
-> DATA
<- 354 End data with <CR><LF>.<CR><LF>
-> Date: Sat, 11 Dec 2004 09:45:49 -0800
-> To: aardvark at thingy saintaardvarkthecarpeted communist
-> From: aardvark at thingy rearden saintaardvarkthecarpeted communist
-> Subject: test Sat, 11 Dec 2004 09:45:49 -0800
-> X-Mailer: swaks v20040404.1 jetmore.org/john/code/#swaks
->
-> This is a test mailing
->
-> .
<- 250 Ok: queued as 594AA2CF
-> QUIT
<- 221 Bye
=== Connection closed by foreign host.
Wish I'd known about this a long time ago.
Tags:
30 Nov 2004
A quick Google turns up this entry on using SURBL to fight
comment spam. More information here. A quick look at the
WP-Blacklist plugin shows it shouldn't be that hard to add a
quick DNS check...Hm. And the SURBL mailing list has discussed
this too:
>The quick and easy answer, which may be wrong, is that they're
>different folks, or at least different domains.
>
>Jeff C.

Oh please don't think that just yet!! Seriously. I'm working
with some ninjas and the 6dos data and a new tool to let you look up
this info! So far it ROCKS beyond belief! But more coming, and
trying to keep data source anonymous of course. Also trying to tie
in some other tools that other SURBL submitters have been asking
for. Bottom line is that these guys ARE the same people. Data shows
it.
Hm. Update, Nov. 30: Double hm
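The "quick DNS check" would go something like this sketch: take the host from a URL in the comment, look it up under the SURBL zone, and treat an answer in 127.0.0.0/8 as "listed". This is an illustration, not WP-Blacklist code. The function names are invented, multi.surbl.org is assumed as the zone, and the sketch lazily queries the full hostname, whereas a real client should reduce it to the registered domain first.

```python
import socket
from urllib.parse import urlparse

SURBL_ZONE = "multi.surbl.org"  # assumed zone; check SURBL's own docs


def surbl_query_name(url, zone=SURBL_ZONE):
    """Build the DNS name to look up for the host part of `url`."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError("no hostname in URL: %r" % (url,))
    return host.rstrip(".") + "." + zone


def is_listed(url, resolve=socket.gethostbyname):
    """Return True if the lookup answers with a 127.x.x.x address.

    `resolve` is a parameter only so the logic can be tried without
    touching the network.
    """
    try:
        answer = resolve(surbl_query_name(url))
    except socket.gaierror:
        return False  # no such record: not listed (or DNS is down)
    return answer.startswith("127.")
```

Note that a DNS failure looks the same as "not listed" here, so it fails open. For comment spam that's probably the right default.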
Tags:
spam
30 Nov 2004
title: My chance for fame and fortune gone...
date: 2004-11-30 18:35:02
Dammit! Someone's already integrated SpamAssassin with Wordpress!
Now I'll have to show my legs to get attention... Actually, something
that could be useful here is a blog honeypot in order to figure out
how effective different mechanisms are. That could be interesting...
Tags:
29 Nov 2004
As I mentioned, it's been a busy weekend for Gecko and
me. With anything good and joyous on the Internet come
spammers. Comment spam has been a minor irritant for a while --
nothing I couldn't handle by logging into MySQL directly and running
DELETE statements with extreme prejudice -- but in the last few weeks
it's gone off the hook. With dozens a day, it was time to start
doing something automatically.
WordPress is pretty good this way -- you can set up your comments so
that everything needs to be approved by the admin, or just stuff
that matches certain words in the comment or URL fields. That worked
for a while -- "poker", "debt" and "cialis" took care of most
things. But it isn't a very sophisticated filter, so I started
looking around for something else.
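To show just how unsophisticated, here's a sketch of that kind of filter in Python. It is not WordPress's actual code; it assumes plain case-insensitive substring matching against the comment text and URL, and the name needs_approval is invented for this example.

```python
def needs_approval(comment, url, flagged_words):
    """Return True if any flagged word appears in the comment or URL.

    Matching is a plain case-insensitive substring test, so a flagged
    word also matches when it is buried inside a longer word.
    """
    text = (comment + " " + url).lower()
    return any(word.lower() in text for word in flagged_words)


FLAGGED = ["poker", "debt", "cialis"]

# The obvious spam gets held...
print(needs_approval("Play online POKER now", "http://example.com/", FLAGGED))
# ...but so does an innocent comment, because "specialist" contains "cialis".
print(needs_approval("You should ask a specialist", "", FLAGGED))
```

Both calls come back True, which is the whole problem in two lines.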
I found Fahim Farook's WPBlacklist plugin, and it works pretty
damned well. It imports a copy of Jay Allen's blacklist, then
holds for approval anything that matches the HOLY CRAP two thousand
three hundred forty five lines of regexes (a few) and domains (the
bulk of the list). Plus, you can tell it to delete a comment and
harvest information from it -- so it knows to watch out for that
(domain, email address) in the future. All in all, I was pretty
happy.
But then Gecko pointed out this elegant solution. My first name
is not so obvious ("Saint? What kinda first name is that? Damn
kids..."), so I put in my own simple question.
It's a brilliant idea, really: come up with a question with an answer
that's obvious if you a) are at the site and b) are not a spammer's
computer. Which makes me wonder what'll happen when/if AI gets a bit
more common, or if spammers will start funding natural language
parsing research...shudder.
In other comment spammer news, there's a really good article here
about what one guy managed to find out about a comment
spammer. Finally, turns out that what I was going to say was said a
year ago:
...but just like everything else, the weblogging community seems
intent on (a) thinking they're special and unique and nobody has
ever had their problems before, and proceeding to (b) ignore all the
work that has come before and reinventing the wheel. Now, certainly
some adaptation of code and algorithms will be necessary. Existing
tools probably can't be used as-is. Email spam fighting relies a lot
on the structure of an email, the chain of headers that give away so
much information to the trained eye, and none of that information is
available in weblog spam. But I see from Jay's Comment Spam
Clearinghouse that the latest and greatest tool available to us is a
master list of domain names and a few regular expressions. No
offense to Jay or all the people who have contributed to the list so
far, but how quaint! I mean really. Savor this moment, folks. You
can tell your children stories of how, back in the early days of
weblogging, you could print out the entire spam blacklist on a
single sheet of paper. Maybe with two or three columns and a
smallish font, but still. Boy, those were the days.
Holy crap. I thought I was cynical. The entire article is highly
recommended.
Tags:
spam
28 Nov 2004
So Gecko and I have been doing some interesting work with
WordPress this weekend. My wife and I visited him and the lovely
Arwen on Friday, drank too much wine, and when we woke up in the
morning had a lovely Logitech USB headset, originally meant for a
Sony PlayStation, sitting in our laps. We also had fuzzy memories of
barked instructions to start "podcasting". Wha'?
First, we had to get the headset working. On my box (Debian
Testing with a 2.6.9 kernel) it was a simple matter of getting
ALSA modules compiled. Since I still hadn't got around to getting
my sound card going after the big move, this had the pleasant
bonus of being able to listen to music again.
When everything was done, I was able to run:
arecord -D plughw:Headset | oggenc - -o foo.ogg & sleep 60; killall arecord
and have a tasty OGG file at the end of it. Sweet!
But what about Ms Topo's computer? She's running RH9, and I had no
interest in picking this weekend to migrate her to something newer. I
knew (well, okay, I Googled and found out) that RH didn't do ALSA, so
that left me with the fun of trying to bolt it on. I tried following
these instructions, and it didn't work: whenever I tried to
modprobe the new ALSA modules, I got lots of "unresolved symbol"
errors. NFG.
Well, what about a new kernel? Could try upgrading to 2.6.9, right?
Nope: RH uses initrd when booting, and I've never wrapped my head
around that. But guess what? When booting back into 2.4, kudzu found
and configured the USB headset automagically. Teach me to
underestimate RH...
Okay, so that part solved. Next part was to figure out what the hell
"podcasting" is. And for the love o' Linus, it's just a URI in an RSS
2.0 feed that points to a thing: an image, an MP3 file,
whatever. They call it an enclosure, but it's just a fucking
link! RSS is the new HTML. Somebody, somewhere, is going to figure
out how to do TCP over RSS, and I won't know whether to laugh or cry.
(Hey! Google finds no pages with the phrase "TCP over RSS". You
heard it here first, kids.)
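To show how little there is to it, here's a sketch that builds an RSS 2.0 item with an enclosure using Python's standard XML library. The enclosure element carries three attributes (url, length in bytes, and MIME type); everything else here, including the function name build_item and the example.com URLs, is made up for illustration and has nothing to do with how WordPress generates its feed.

```python
import xml.etree.ElementTree as ET


def build_item(title, link, media_url, length_bytes, mime_type="audio/mpeg"):
    """Return an RSS 2.0 <item> element with one <enclosure> child."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # The "podcast" part: a link to a file, plus its size and type.
    ET.SubElement(
        item,
        "enclosure",
        {"url": media_url, "length": str(length_bytes), "type": mime_type},
    )
    return item


if __name__ == "__main__":
    item = build_item(
        "The ADD show, episode 1",
        "http://example.com/blog/add-show-1",
        "http://example.com/audio/add-show-1.mp3",
        1234567,
    )
    print(ET.tostring(item, encoding="unicode"))
```

Point a podcast client at a feed whose items look like that and it downloads whatever the url attribute says.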
But back to our story. So how the hell do you get an enclosure in your
RSS 2.0 feed? Well, if you're using the ever-lovin' WordPress,
you can either get the Alpha nightly releases, or you can make
some judicious modifications to a few files. I backed up the
originals, copied the others into place, made the right changes to the
database, and baaaaaaaaaaaaaam!
Last step: oh yeah, an MP3. (Stupid patented file formats...) Quick
look around found Audacity, and holy crap is that cool. The
first time I started it, I got a little popup:
There was an error initializing the audio i/o layer. You will not be
able to play or record audio. Error: Host error.
but turning off XMMS fixed that right up. I quickly recorded two
tracks, exported the mess to MP3, put it up on the server, and hey-ho,
let's go! Sir Gecko checked it out, and it worked on his
iPod. What is it with Apple people, anyway?
Next step: Topo and Gecko do the ADD show. Watch for updates.
Tags: