Sunday, May 20, 2018

OpenDKIM Key Retrieval Failed

I set up OpenDKIM on my mailserver years ago, but while I got it working for signing just fine, I could never get it working for verification. Whenever I'd receive a message signed with DKIM, I'd see an error message like this in my mailserver logs:

May  4 01:20:20 mail opendkim[24874]: 7EE5982132: key retrieval failed (s=20161025, d=gmail.com): '20161025._domainkey.gmail.com' query timed out

When I tried dig on the DNS record listed in the logs, however, I was able to retrieve it just fine:

dig TXT 20161025._domainkey.gmail.com
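The record name in that log line is just the selector (s=) and the signing domain (d=) joined by "._domainkey.". Here's a minimal shell sketch for repeating the lookup, either through the system resolver or against a specific DNS server. The function names dkim_record_name and dkim_lookup are my own inventions for illustration, not part of OpenDKIM or dig:

```shell
# Build the DNS name that holds a DKIM public key:
#   <selector>._domainkey.<domain>
# (dkim_record_name is a made-up helper name, not an OpenDKIM command.)
dkim_record_name() {
    printf '%s._domainkey.%s\n' "$1" "$2"
}

# Look up the DKIM TXT record with dig. An optional third argument names a
# specific DNS server to query (dig's @server syntax); without it, dig uses
# the system resolver configuration.
dkim_lookup() {
    name=$(dkim_record_name "$1" "$2")
    if [ -n "${3:-}" ]; then
        dig +short TXT "$name" "@$3"
    else
        dig +short TXT "$name"
    fi
}
```

With those defined, "dkim_lookup 20161025 gmail.com" queries the system resolver and "dkim_lookup 20161025 gmail.com 1.1.1.1" queries an external one directly. In my case dig succeeded either way, which is why this alone didn't explain the timeout.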

Recently I set aside some time to "dig" into it further. I found I could use the opendkim-testkey command to at least reproduce the issue (instead of having to keep sending test emails from other accounts to myself). For example, the following command tries to retrieve the DKIM key from the 20161025._domainkey.gmail.com DNS TXT record (20161025 is the DKIM selector and gmail.com is the signing domain):

opendkim-testkey -s 20161025 -d gmail.com

This command gave me the same "query timed out" error message that I saw in my logs. Through a lot of trial and error, I figured out that I could avoid the error by setting the Nameservers property in my /etc/opendkim.conf file to an external DNS server (any external one will do), and restarting the OpenDKIM daemon. I've been running my mailserver on an Ubuntu EC2 instance, and apparently OpenDKIM does not like something about the combination of the Ubuntu DNS resolver and the internal EC2 DNS servers.

So, I added the line below to my /etc/opendkim.conf and restarted the OpenDKIM daemon. Now I no longer see the "key retrieval failed" error message in my logs; instead, I get a nice Authentication-Results header added by OpenDKIM to my incoming mail! The line:

Nameservers 1.1.1.1
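To double-check that the setting is really in the file the daemon reads, something like the following shell sketch can help. It assumes the config lives at /etc/opendkim.conf and that the setting is written with the same capitalization as above; opendkim_nameservers is a hypothetical helper name of mine, not an OpenDKIM tool:

```shell
# Print the value of the Nameservers setting from an OpenDKIM config file.
# Defaults to /etc/opendkim.conf. Commented-out lines are skipped because
# their first field is not exactly "Nameservers".
# (opendkim_nameservers is a hypothetical helper name.)
opendkim_nameservers() {
    awk '$1 == "Nameservers" { print $2 }' "${1:-/etc/opendkim.conf}"
}
```

If "opendkim_nameservers" prints 1.1.1.1, the config is in place. After restarting the daemon (on a systemd-based Ubuntu that's probably "sudo systemctl restart opendkim", assuming the service is named opendkim), rerunning "opendkim-testkey -s 20161025 -d gmail.com" should no longer time out.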

1.1.1.1 is Cloudflare's public DNS resolver, but any external DNS server should work. One thing to keep in mind when using an external DNS server is that if your network has a stateless firewall, you need to allow inbound UDP traffic in the "ephemeral" port range, since that's where the DNS replies will arrive. If you're using EC2's Network ACLs (and not just using the defaults), this means adding a rule like the following to the ACL for the subnet in which your mailserver lives (32768-61000 covers Ubuntu's default ephemeral port range):

Rule #: [any number lower than your DENY rules]
Type: Custom UDP Rule
Protocol: UDP (17)
Port Range: 32768-61000
Source: 1.1.1.1/32
Allow/Deny: ALLOW
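The ephemeral range can be changed per machine, so it's worth confirming what your server actually uses before settling on the port range in the rule. Here's a rough shell sketch, assuming a Linux box that exposes the range at /proc/sys/net/ipv4/ip_local_port_range; the function names are hypothetical helpers of mine:

```shell
# Succeed if the ACL range (args 3 and 4) fully covers the ephemeral
# range (args 1 and 2). (acl_covers_range is a hypothetical helper name.)
acl_covers_range() {
    [ "$3" -le "$1" ] && [ "$4" -ge "$2" ]
}

# Read the kernel's current ephemeral port range (two numbers, low then
# high) and compare it with the 32768-61000 range used in the rule above.
check_ephemeral_range() {
    read -r low high < /proc/sys/net/ipv4/ip_local_port_range
    if acl_covers_range "$low" "$high" 32768 61000; then
        echo "ACL range covers $low-$high"
    else
        echo "ACL range does NOT cover $low-$high"
    fi
}
```

Running "check_ephemeral_range" on the mailserver reports whether the rule's port range is wide enough; if it isn't, widen the Port Range in the ACL rule to match.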