.qmail delivery filtering by address and subject

My goal was simple: filter incoming emails by subject and To address (on a catch-all address…) on the server, and move the matches to a specific folder instead of delivering them normally. I’m running a Qmail+Vpopmail system, and these directions should be valid for similar setups.

I could probably do this with Procmail or Maildrop, but they seemed far too complicated for a simple one-time task, so I opted for the fun route and did it myself. For recurring email filtering, multiple accounts, customization, or any kind of heavy email filtering, I strongly suggest you stop here and go read about Procmail or Maildrop.

Still here? Good. First, add a .qmail file (see the dot-qmail man page) in the Maildir folder of the user you want to set up a filter for:

| /home/vpopmail/domains/domain.com/user/filter
/home/vpopmail/domains/domain.com/user/Maildir/

Save it and take care with the permissions: the vpopmail user must be able to read it. What we are doing here is really simple: the first line pipes incoming emails through filter, and depending on the filter’s exit code qmail executes (or skips) the second line, which is the normal delivery.
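The exit-code contract can be sketched in a few lines of PHP. The constant and function names here are made up for illustration; only the two codes (0 to continue with the next .qmail line, 99 to stop) come from the filter script in this post.

```php
<?php
// Hypothetical helper: maps the filter's decision to a qmail exit code.
const QMAIL_CONTINUE = 0;   // success, go on to the next .qmail line
const QMAIL_STOP     = 99;  // success, ignore the remaining .qmail lines

function filterExitCode($matched) {
  return $matched ? QMAIL_STOP : QMAIL_CONTINUE;
}

// The last statement of a filter would then be:
//   exit(filterExitCode($matched));
```

With the two-line .qmail above, a return of 99 means the Maildir line is never reached, so the message is not delivered to the Inbox.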

Now set up the filter itself. It’s written in dirty and messy PHP, but it gets the job done. It also depends on PEAR Mail_mimeDecode, so go ahead and install it:

pear install Mail_mimeDecode

Here is the filter script itself; the CONFIG section must be customized to your needs:

#! /usr/local/bin/php
<?php

/*
 * QMAIL FILTER EMAILS
 *
 * invoked by .qmail files
 * | /path/to/this/script
 *
 * parses email to address and subject against target strings
 * if BOTH are matched email is saved in $save_matched_dir
 * and qmail is instructed to ignore further .qmail lines
 *
 * CONFIG:
 */

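$max_bytes        = 262144;            // mail size > 256 KB is not scanned
$to_address       = 'to@domain.com';   // to address filter
// subjects are compared after decodeHeader() lowercases the header and
// strips everything but letters and spaces, so write them the same way
$subjects         = array('subject one',
                          'match this subject',
                          'other subject');
$save_matched_dir = '/home/vpopmail/domains/domain.com/save-matched-emails/';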

/*
 * QMAIL EXIT CODES
 *
 * 0 - Success (go to next .qmail line)
 * 99 - Success and abort (do not execute next lines)
 * 100 - permanent error (bounce)
 * 111 - soft error (retry later)
 */

try {
  function decodeHeader($input) {
    // Remove white space between encoded-words
    $input = preg_replace('/(=\?[^?]+\?(q|b)\?[^?]*\?=)(\s)+=\?/i', '\1=?',
                         $input);

    // For each encoded-word...
    while (preg_match('/(=\?([^?]+)\?(q|b)\?([^?]*)\?=)/i', $input, $matches)) {

      $encoded  = $matches[1];
      $charset  = $matches[2];
      $encoding = $matches[3];
      $text     = $matches[4];
      
      switch (strtolower($encoding)) {
        case 'b':
          $text = base64_decode($text);
          break;

        case 'q':
          $text = str_replace('_', ' ', $text);
          preg_match_all('/=([a-f0-9]{2})/i', $text, $matches);
          foreach($matches[1] as $value)
            $text = str_replace('='.$value, chr(hexdec($value)), $text);
          break;
      }

      $input = str_replace($encoded, $text, $input);
    }

    if (! isset($charset))
      $charset = 'ASCII';

    $input = strtolower(
               preg_replace('/[^a-z ]/i', 
                            '', 
                 iconv($charset, 
                       'ASCII//TRANSLIT//IGNORE', 
                       $input)
               )
             );
    return $input;
  }

  $mail  = '';
  $bytes = 0;

  $fr = fopen("php://stdin", "r");
  while (!feof($fr)) {
    $chunk  = fread($fr, 1024);
    $mail  .= $chunk;
    $bytes += strlen($chunk);   // count what was actually read
    if ($bytes > $max_bytes) {
      fclose($fr);
      exit(0);
    }
  }
  fclose($fr);

  require_once 'Mail/mimeDecode.php';
  $decoder   = new Mail_mimeDecode($mail);
  $structure = $decoder->decode(array('decode_headers' => true));

  // check to address
  $patt = '/[a-z0-9]+([_\\.-][a-z0-9]+)*@([a-z0-9]+([\.-][a-z0-9]+)*)+\\.[a-z]{2,}/i';
  preg_match($patt, 
             $structure->headers['to'], $matches);
  if (isset($matches[0]) && $matches[0] == $to_address) {
    // check subject
    $structure->headers['subject'] = decodeHeader($structure->headers['subject']);

    foreach ($subjects as $subject) {
       if (strpos($structure->headers['subject'], $subject) !== false) {
         // Maildir-style name: time.random.host,S=size:2,
         $fw = fopen($save_matched_dir.
                     time().
                     '.'.
                     rand(1000, 99999).
                     '.'.
                     gethostname().
                     ',S='.
                     strlen($mail).
                     ':2,', 'w');
         fwrite($fw, $mail);
         fclose($fw);
         exit(99);
         // exit(0);
       }
    }
  }
} catch (Exception $e) {
}


// default, continue normal processing
exit(0);

?>

Save it, make sure the first line is the PHP hashbang pointing to your PHP binary, mark it executable and make it owned by vpopmail. Also, the script does not create $save_matched_dir; the directory must already exist on your system.
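One detail of the script is easy to miss: decodeHeader() lowercases the subject and drops everything except letters and spaces before strpos() runs, so an entry in $subjects only matches if it is written in that same form. A minimal sketch of that normalization follows; normalizeForMatch is a hypothetical helper that mirrors only the last step of decodeHeader(), not the MIME or charset handling.

```php
<?php
// Hypothetical helper mirroring only the last step of decodeHeader():
// drop everything except ASCII letters and spaces, then lowercase.
function normalizeForMatch($text) {
  return strtolower(preg_replace('/[^a-z ]/i', '', $text));
}

$subjects = array_map('normalizeForMatch',
                      array('Match THIS subject!', 'Subject 1'));
// 'Match THIS subject!' becomes 'match this subject'
// 'Subject 1' becomes 'subject ' (the digit is dropped, the space stays)
```

Running the configured subjects through the same function is one way to avoid entries that can never match, such as ones containing digits or punctuation.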

That’s it. After this setup you should start seeing a steady flow of matched emails being saved in the matched directory instead of delivered to your Inbox. As usual, this works like a charm for me but could work incredibly badly for you, so use it at your own peril.
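Before trusting the filter with live mail, the address check can be tried in isolation. The sketch below reuses the pattern from the script; extractToAddress is a hypothetical helper, and the lowercasing is an addition of mine (the script itself compares the address case-sensitively, so 'To@Domain.com' would not equal 'to@domain.com' there).

```php
<?php
// Same address pattern as in the filter script.
$patt = '/[a-z0-9]+([_\\.-][a-z0-9]+)*@([a-z0-9]+([\.-][a-z0-9]+)*)+\\.[a-z]{2,}/i';

// Hypothetical helper: first address found in a To header, lowercased,
// or null when the header contains no address.
function extractToAddress($header, $patt) {
  if (preg_match($patt, $header, $matches))
    return strtolower($matches[0]);
  return null;
}

$address = extractToAddress('"Catch All" <To@Domain.com>', $patt);
// $address is 'to@domain.com'
```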

(Automagically) deleting old spam emails

If you followed my Qmail installation guide, or are using a webmail or IMAP client that puts emails marked as “SPAM” in a special folder, you can automatically delete old spam emails.

The popular Roundcube webmail has the movespam plugin (it’s actually broken, but it’s an easy fix) that moves spam to the …/user/Maildir/.Junk folder. This makes it very easy to have a cron-invoked script that deletes old spam emails.

So, in PHP, with a threshold of 30 days (older emails are deleted):

$threshold = time() - (60*60*24*30); // 30 days

$junk_folders = shell_exec('find /home/vpopmail/domains/ -name .Junk -type d');
$junk_folders = explode("\n", $junk_folders);
$junk_folders = array_filter($junk_folders);

foreach ($junk_folders as $junk_folder) {
  foreach (array('new','cur') as $subfolder) {
    $path = $junk_folder.'/'.$subfolder;
    if (!is_dir($path))
      continue; // this .Junk folder has no such subfolder
    $d = dir($path);
    while (false !== ($entry = $d->read())) {
      $file = $path.'/'.$entry;
      // is_file() already excludes the '.' and '..' entries
      if (is_file($file) && filemtime($file) < $threshold)
        unlink($file);
    }
    $d->close();
  }
}
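Since unlink() is not reversible, a dry run that only lists the candidates may be worth doing first. This is a minimal sketch under my own assumptions: findOldFiles is a hypothetical helper, and the path in the example call is a placeholder, not a real account.

```php
<?php
// Hypothetical dry-run helper: returns the files in $dir whose
// modification time is older than $threshold, without deleting anything.
function findOldFiles($dir, $threshold) {
  $old = array();
  if (!is_dir($dir))
    return $old;
  foreach (scandir($dir) as $entry) {
    $file = $dir.'/'.$entry;
    if (is_file($file) && filemtime($file) < $threshold)
      $old[] = $file;
  }
  return $old;
}

// Example call; the path is a placeholder.
$threshold = time() - (60*60*24*30); // 30 days
foreach (findOldFiles('/home/vpopmail/domains/domain.com/user/Maildir/.Junk/cur',
                      $threshold) as $file)
  echo $file, "\n";
```

Once the list looks right, the echo can be swapped for unlink().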

Deleting emails directly like this throws off the user quota accounting, so it’s probably a good idea to rebuild quotas after running the old spam delete script.

Parsing Media RSS with PHP SimpleXML

Parsing XML docs with PHP SimpleXML is pretty straightforward. Yesterday I lost around 5 minutes parsing a Media RSS XML, which was weird because with SimpleXML it normally takes about 30 seconds… A Media RSS (MRSS) document is just an RSS document with media extensions:

<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>RSS Title</title>
    <link>http://www.domain.com/mylink</link>
    <description>My description</description>
    <item>
      <title>Title item 1</title>
      <link>http://www.domain.com/item_1.html</link>
      <description>Item 1 description</description>
      <guid>http://www.domain.com/item_1.html</guid>
      <media:content url="http://www.domain.com/item_1.jpg" height="240" width="320" />
    </item>
    <item>
      <title>Title item 2</title>
      <link>http://www.domain.com/item_2.html</link>
      <description>Item 2 description</description>
      <guid>http://www.domain.com/item_2.html</guid>
      <media:content url="http://www.domain.com/item_2.jpg" height="240" width="320" />
    </item>
    .... etc 
  </channel>
</rss>

The “problem” is accessing media:content or the other media:* elements. But don’t worry, I’m going to show you how to do it 🙂

$xml = simplexml_load_file('http://domain.com/mrss.xml');
$namespaces = $xml->getNamespaces(true); // get namespaces

// iterate items and store in an array of objects
$items = array();
foreach ($xml->channel->item as $item) {

  $tmp = new stdClass(); 
  $tmp->title = trim((string) $item->title);
  $tmp->link  = trim((string) $item->link);
  // etc... 
  // now for the url in media:content
  //
  $tmp->media_url = trim((string) 
                    $item->children($namespaces['media'])->content->attributes()->url);

  // add parsed data to the array
  $items[] = $tmp;
}

There, a piece of cake!
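For trying this out without a network request, here is a self-contained sketch that parses an inline copy of the sample feed. It passes the Media RSS namespace URI to children() directly instead of looking it up with getNamespaces(); the URI is what identifies the namespace, whatever prefix a given feed uses. The variable names are mine.

```php
<?php
// Inline sample feed so the snippet runs without a network request.
$feed = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Title item 1</title>
      <media:content url="http://www.domain.com/item_1.jpg" height="240" width="320" />
    </item>
  </channel>
</rss>
XML;

$xml   = simplexml_load_string($feed);
$mrss  = 'http://search.yahoo.com/mrss/'; // Media RSS namespace URI
$item  = $xml->channel->item[0];
$attrs = $item->children($mrss)->content->attributes();

$url    = (string) $attrs->url;
$height = (int) $attrs->height;
$width  = (int) $attrs->width;
```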

UPDATE

I received a comment about the Picasa RSS feed, where you have to dig just a bit deeper, as the media:content element is inside a media:group. The XML feed is as follows:

<?xml version='1.0' encoding='UTF-8'?>
<rss xmlns:atom='http://www.w3.org/2005/Atom' 
xmlns:media='http://search.yahoo.com/mrss/' 
xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' version='2.0'>
  <channel>
    <atom:id>https://picasaweb.google.com/data/feed/base/user/103218581909188195000</atom:id>
    <lastBuildDate>Wed, 16 Apr 2014 07:28:42 +0000</lastBuildDate>
    <title>Galerie fotografií uživatele Jiřetín JINAK</title>
    .... etc
    <item>
      <pubDate>Thu, 10 Apr 2014 07:16:22 +0000</pubDate>
      <atom:updated>2014-04-16T07:28:42.202Z</atom:updated>
      <author>Jiřetín JINAK</author>
      .... etc
      <media:group>
        <media:content url='https://lh6.googleusercontent.com/-C6WmXjRnV8Y/U0ZFRnm-ujE/AAAAAAAAAPQ/AbwIc0Ycugk/s100-c/RizikovaMistaVHornimJiretine.jpg' type='image/jpeg' medium='image'/>
        <media:credit>Jiřetín JINAK</media:credit>
        <media:description type='plain'/>
        <media:keywords/>
        <media:thumbnail url='https://lh6.googleusercontent.com/-C6WmXjRnV8Y/U0ZFRnm-ujE/AAAAAAAAAPQ/AbwIc0Ycugk/s160-c/RizikovaMistaVHornimJiretine.jpg' height='160' width='160'/>
        <media:title type='plain'>Riziková místa v Horním Jiřetíně</media:title>
      </media:group>
    </item>
    .... etc
  </channel>
</rss>

The PHP code follows the same logic; just add another step to take media:group into account:

$xml = simplexml_load_file('http://picasaweb.google.com/data/feed/...&prettyprint=true');
$namespaces = $xml->getNamespaces(true); // get namespaces

$items = array();
foreach ($xml->channel->item as $item) {

  $tmp = new stdClass();
  $tmp->title = trim((string) $item->title);
  $tmp->link  = trim((string) $item->link);
  // etc...

  // now for the data in the media:group
  //
  $media_group = $item->children($namespaces['media'])->group;

  $tmp->media_url =    trim((string)
                       $media_group->children($namespaces['media'])->content->attributes()->url);
  $tmp->media_credit = trim((string)
                       $media_group->children($namespaces['media'])->credit);
  // etc

  // add parsed data to the array
  $items[] = $tmp;
}
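The same approach reaches the other elements inside media:group, for example the thumbnail. Below is a self-contained sketch on a shortened, made-up feed (the URLs and names are placeholders, not Picasa data), again using the namespace URI directly.

```php
<?php
$feed = <<<XML
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <item>
      <title>Sample item</title>
      <media:group>
        <media:content url="http://www.domain.com/item_1.jpg" type="image/jpeg" medium="image"/>
        <media:credit>Sample author</media:credit>
        <media:thumbnail url="http://www.domain.com/item_1_thumb.jpg" height="160" width="160"/>
      </media:group>
    </item>
  </channel>
</rss>
XML;

$xml  = simplexml_load_string($feed);
$mrss = 'http://search.yahoo.com/mrss/'; // Media RSS namespace URI

$items = array();
foreach ($xml->channel->item as $item) {
  // children of media:group that live in the media namespace
  $group = $item->children($mrss)->group->children($mrss);

  $tmp = new stdClass();
  $tmp->title         = trim((string) $item->title);
  $tmp->media_url     = (string) $group->content->attributes()->url;
  $tmp->media_credit  = trim((string) $group->credit);
  $tmp->thumbnail_url = (string) $group->thumbnail->attributes()->url;
  $items[] = $tmp;
}
```

This sketch assumes every item has a media:group; a real feed may need a check before dereferencing it.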

DNS server with djbdns (VegaDNS GUI + bonus replication)

This is my personal guide to setting up the djbdns tinydns DNS server to publish domain name information to the Internets. If djbdns, tinydns, name server and DNS are strange words to you, just move along, move along…

Setting up SSH access with server keys (no password) in FreeBSD

You want to log in from user@host-a to user@host-b automatically (with no password prompt).

Install ssh-copy-id on host-a. This handy little script takes care of all the dirty details for you.

cd /usr/ports/security/ssh-copy-id
make install clean

Then run (still on host-a):

ssh-copy-id user@host-b

If you get the error “ERROR: No identities found”, you must generate your keys first and then re-run ssh-copy-id. To generate a private/public SSH key pair, run:

mkdir ~/.ssh
chmod 700 ~/.ssh
ssh-keygen -t rsa

You can leave the passphrase empty for automatic logins (no password prompt), or set one for the more secure but less practical option of being prompted each and every time you use the newly generated keys.

The next login from host-a to user@host-b will be made with SSH keys.

The dirty details

  • the user@host-a public key, usually ~/.ssh/id_rsa.pub, is appended to ~/.ssh/authorized_keys of user on host-b
  • the host-b public host key (/etc/ssh/ssh_host_rsa_key.pub) is added to the known hosts of user@host-a (~/.ssh/known_hosts)