A note on tracking Postfix messages through the logs

Recently I had to track messages to make sure they were being delivered, and this turned out to be much harder than I thought. Some of this is due to the way we have our mail set up, and some of it is just due to confusion (mine) about how Postfix logs things. So in the spirit of Chris Siebenmann, here's my explanation for my future self of how this works.

First, an explanation of the flow of messages at my workplace, whether sent or received from a local network or a remote network:

Here's a log with an example message. It was sent from pepperoni@pepperoni.pizza to otheruser@example.net, which we forward off to ilikesandwiches@club.sandwich. (Rememver the days when you could be assured that "club.sandwich" was a totally bogus TLD?)

   Apr  9 13:25:13 liddy postfix/smtpd[12203]: 4DC3914C086: client=mail.example.com[192.168.0.2]
   Apr  9 13:25:13 liddy postfix/cleanup[12206]: 4DC3914C086: message-id=<deadbeef@example.com>
   Apr  9 13:25:13 liddy postfix/qmgr[995]: 4DC3914C086: from=<pepperoni@pepperoni.pizza>, size=2019, nrcpt=1 (queue active)
   Apr  9 13:25:13 liddy spamd[5110]: spamd: processing message <deadbeef@example.com> for nobody:99
   Apr  9 13:25:13 liddy spamd[5110]: spamd: result: . 0 - scantime=0.2,size=1981,user=nobody,uid=99,required_score=5.0,rhost=localhost.localdomain,raddr=127.0.0.1,rport=42925,mid=<deadbeef@example.com>,autolearn=failed
   Apr  9 13:25:13 liddy postfix/pickup[11886]: 85E1F14C089: uid=99 from=<pepperoni@pepperoni.pizza>
   Apr  9 13:25:13 liddy postfix/cleanup[12206]: 85E1F14C089: message-id=<deadbeef@example.com>
   Apr  9 13:25:13 liddy postfix/pipe[12207]: 4DC3914C086: to=<otheruser@club.sandwich>, orig_to=<ilikesandwiches@example.org>, relay=spamassassin, delay=0.25, delays=0.05/0/0/0.2, dsn=2.0.0, status=sent (delivered via spamassassin service)
   Apr  9 13:25:13 liddy postfix/qmgr[995]: 4DC3914C086: removed
   Apr  9 13:25:13 liddy postfix/qmgr[995]: 85E1F14C089: from=<pepperoni@pepperoni.pizza>, size=2323, nrcpt=1 (queue active)
   Apr  9 13:25:43 liddy postfix/smtp[12212]: 85E1F14C089: to=<otheruser@club.sandwich>, relay=mail-relay.isp.tld[10.0.0.1]:25, delay=30, delays=0.01/0.01/30/0.35, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 92585180F0)
   Apr  9 13:25:43 liddy postfix/qmgr[995]: 85E1F14C089: removed

My confusion stemmed from thinking of the Postfix queue ID as the message ID. THIS IS WRONG. The message ID does not change. The queue ID changes if the message goes through the queue again. And, the way we have things set up, it goes through the queue twice: once when received, then again when SpamAssassin is done with it.

(Incidentally, I had wondered why SpamAssassin did not include the Postfix ID. This makes perfect sense when you realize it's a queue ID, and SA is not part of the Postfix queue.)

The other thing that makes this difficult is that the entries for a particular message are:

Thus, the process for tracking down what's happened to a particular mail goes like this:

Look for the bit you have (who the message is from, let's say):

$ grep pepperoni@pepperoni.pizza /var/log/maillog
Apr  9 13:25:13 liddy postfix/qmgr[995]: 4DC3914C086: from=<pepperoni@pepperoni.pizza>, size=2019, nrcpt=1 (queue active)
Apr  9 13:25:13 liddy postfix/pickup[11886]: 85E1F14C089: uid=99 from=<pepperoni@pepperoni.pizza>
Apr  9 13:25:13 liddy postfix/qmgr[995]: 85E1F14C089: from=<pepperoni@pepperoni.pizza>, size=2323, nrcpt=1 (queue active)

Note the two queue IDs and search for those:

$ grep -E '85E1F14C089|4DC3914C086' /var/log/maillog.1
Apr  9 13:25:13 liddy postfix/smtpd[12203]: 4DC3914C086: client=mail.example.com[192.168.0.2]
Apr  9 13:25:13 liddy postfix/cleanup[12206]: 4DC3914C086: message-id=<deadbeef@example.com>
Apr  9 13:25:13 liddy postfix/qmgr[995]: 4DC3914C086: from=<pepperoni@pepperoni.pizza>, size=2019, nrcpt=1 (queue active)
Apr  9 13:25:13 liddy postfix/pickup[11886]: 85E1F14C089: uid=99 from=<pepperoni@pepperoni.pizza>
Apr  9 13:25:13 liddy postfix/cleanup[12206]: 85E1F14C089: message-id=<deadbeef@example.com>
Apr  9 13:25:13 liddy postfix/pipe[12207]: 4DC3914C086: to=<otheruser@club.sandwich>, orig_to=<ilikesandwiches@example.org>, relay=spamassassin, delay=0.25, delays=0.05/0/0/0.2, dsn=2.0.0, status=sent (delivered via spamassassin service)
Apr  9 13:25:13 liddy postfix/qmgr[995]: 4DC3914C086: removed
Apr  9 13:25:13 liddy postfix/qmgr[995]: 85E1F14C089: from=<pepperoni@pepperoni.pizza>, size=2323, nrcpt=1 (queue active)
Apr  9 13:25:43 liddy postfix/smtp[12212]: 85E1F14C089: to=<otheruser@club.sandwich>, relay=mail-relay.isp.tld[10.0.0.1]:25, delay=30, delays=0.01/0.01/30/0.35, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 92585180F0)
Apr  9 13:25:43 liddy postfix/qmgr[995]: 85E1F14C089: removed

Note the message ID and add it to the mix to get what SpamAssassin thought of it:

$ grep -E '85E1F14C089|4DC3914C086|deadbeef@example.com' /var/log/maillog.1
Apr  9 13:25:13 liddy postfix/smtpd[12203]: 4DC3914C086: client=mail.example.com[209.85.160.41]
Apr  9 13:25:13 liddy postfix/cleanup[12206]: 4DC3914C086: message-id=<deadbeef@example.com>
Apr  9 13:25:13 liddy postfix/qmgr[995]: 4DC3914C086: from=<pepperoni@pepperoni.pizza>, size=2019, nrcpt=1 (queue active)
Apr  9 13:25:13 liddy spamd[5110]: spamd: processing message <deadbeef@example.com> for nobody:99
Apr  9 13:25:13 liddy spamd[5110]: spamd: result: . 0 - scantime=0.2,size=1981,user=nobody,uid=99,required_score=5.0,rhost=localhost.localdomain,raddr=127.0.0.1,rport=42925,mid=<deadbeef@example.com>,autolearn=failed
Apr  9 13:25:13 liddy postfix/pickup[11886]: 85E1F14C089: uid=99 from=<pepperoni@pepperoni.pizza>
Apr  9 13:25:13 liddy postfix/cleanup[12206]: 85E1F14C089: message-id=<deadbeef@example.com>
Apr  9 13:25:13 liddy postfix/pipe[12207]: 4DC3914C086: to=<otheruser@club.sandwich>, orig_to=<ilikesandwiches@example.org>, relay=spamassassin, delay=0.25, delays=0.05/0/0/0.2, dsn=2.0.0, status=sent (delivered via spamassassin service)
Apr  9 13:25:13 liddy postfix/qmgr[995]: 4DC3914C086: removed
Apr  9 13:25:13 liddy postfix/qmgr[995]: 85E1F14C089: from=<pepperoni@pepperoni.pizza>, size=2323, nrcpt=1 (queue active)
Apr  9 13:25:43 liddy postfix/smtp[12212]: 85E1F14C089: to=<otheruser@club.sandwich>, relay=mail-relay.isp.tld[137.82.224.74]:25, delay=30, delays=0.01/0.01/30/0.35, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 92585180F0)
Apr  9 13:25:43 liddy postfix/qmgr[995]: 85E1F14C089: removed

And now you know...the rest of the story.