Stop Overcomplicating Local DNS: Advanced Dnsmasq

I have a confession to make. Despite all the fancy service meshes, CoreDNS setups, and the omnipresent systemd-resolved trying to take over my life, I still deploy dnsmasq on almost every gateway I manage. Call me old-fashioned. Call me stubborn. But when you need a lightweight DHCP server and DNS forwarder that doesn’t require a PhD in BIND zone files to configure, it’s still the best tool for the job.

Most people install it, leave the defaults, and forget it exists. That’s fine. It works.

But you’re leaving a lot of power on the table. I’ve spent the last decade debugging weird network issues, and half the time, the solution was just a smarter dnsmasq.conf. So, rather than giving you the “Hello World” tour, I want to share the specific configurations that have saved my bacon in production environments. We’re talking about split-horizon DNS, clever DHCP tagging, and debugging queries without filling your disk with logs.

The “Sanity Check” Base Config

Before we get to the cool stuff, let’s talk about the basics. Default configs are usually too permissive or weirdly restrictive. Here is the boilerplate I drop onto every new server before I even think about specific requirements.

# /etc/dnsmasq.conf

# Never forward plain names (without a dot or domain part)
domain-needed

# Never forward addresses in the non-routed address spaces.
bogus-priv

# Don't read /etc/resolv.conf. Get upstream servers from command line or this file.
no-resolv
server=1.1.1.1
server=8.8.8.8

# Set the cache size. Default is 150, which is tiny for a modern network.
cache-size=1000

# Bind only to the interface you explicitly want. 
# Prevents accidental DNS exposure to the WAN if your firewall rules slip.
interface=eth1
bind-interfaces

Simple, right? The bogus-priv and domain-needed options are non-negotiable for me. domain-needed stops plain names and local typos (like ping server1) from leaking out to upstream DNS servers, and bogus-priv does the same for reverse lookups of private address ranges such as 192.168.x.x. Cloudflare doesn’t need to know you’re trying to reach your local NAS.
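
If you also use a private domain, the same idea extends to it. Here is a minimal sketch, assuming the LAN’s domain is home.arpa and that local hosts are listed in /etc/hosts; neither appears in the config above, so treat both as placeholders.

```conf
# Assumption: "home.arpa" is the private domain for this LAN.
# Answer queries for it only from /etc/hosts and DHCP leases; never forward them.
local=/home.arpa/

# Hand the same domain to DHCP clients and append it to plain names in /etc/hosts.
domain=home.arpa
expand-hosts
```

Running dnsmasq --test before a restart checks the file for syntax errors without starting the daemon.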

Wildcards: The Local Dev Savior

If you are a developer, or you support them, you know the pain of /etc/hosts. It’s a mess. You add an entry, you forget it, you wonder why things break three months later.

I stopped editing host files years ago. Instead, I use dnsmasq to handle entire wildcard subdomains pointing to local ingress controllers or load balancers. It’s cleaner, and it works for every device on the network, not just my laptop.

# Point all .lab.internal domains to my K8s ingress IP
address=/lab.internal/192.168.10.50

# You can even force specific domains to specific upstream servers
# Useful if you have a corporate VPN DNS that handles internal names
server=/corp.example.com/10.0.0.5

The address= line is magic. Now api.lab.internal, db.lab.internal, and whatever.lab.internal all resolve to 192.168.10.50 instantly. No more syncing host files across five different VMs.
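
If one name under the wildcard needs to point somewhere else, a more specific entry should take precedence over the broader one. A small sketch; the db host and its 192.168.10.60 address are made up for illustration.

```conf
# Wildcard: everything under lab.internal goes to the ingress
address=/lab.internal/192.168.10.50

# Hypothetical exception: the more specific domain takes precedence
address=/db.lab.internal/192.168.10.60
```

Entries in /etc/hosts and DHCP leases also override address= for individual names, so a stale hosts entry is worth ruling out if a wildcard seems to be ignored.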

DHCP Tagging: Different Rules for Different Devices

This is where things get interesting. Most people treat DHCP as a binary thing: you either get an IP, or you don’t. But dnsmasq has a tagging system that lets you apply logic to your leases.

I had a scenario recently where I needed my IoT devices to use a specific gateway (for traffic isolation) while my main workstations used the faster, direct gateway. I didn’t want to mess with VLANs because the switch was unmanaged dumb hardware.

Enter tagging.

# Define a tag "iot" for specific MAC addresses
dhcp-host=aa:bb:cc:dd:ee:ff,set:iot
dhcp-host=11:22:33:44:55:66,set:iot

# Default gateway for everyone else
dhcp-option=3,192.168.1.1

# Gateway for devices tagged "iot"
dhcp-option=tag:iot,3,192.168.1.254

# Force IoT devices to use a different DNS server too (maybe a Pi-hole)
dhcp-option=tag:iot,6,192.168.1.5

It’s logic-based networking without the overhead of heavy routing protocols. You can even tag based on the vendor class identifier sent by the client. So if you want all VoIP phones to get a specific boot server option, you just match their signature string. It feels like cheating, honestly.
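
As a rough sketch of the vendor-class approach, assuming the phones identify themselves with a string containing "ExamplePhone" and that the boot server lives at 192.168.1.20 (both are invented for this example):

```conf
# Assumption: the phones send a vendor class containing "ExamplePhone".
# dhcp-vendorclass does a substring match on that string.
dhcp-vendorclass=set:voip,ExamplePhone

# Hand tagged phones a TFTP/boot server name (DHCP option 66).
# The quotes force the value to be sent as a string.
dhcp-option=tag:voip,option:tftp-server,"192.168.1.20"
```

If you don’t know what string a device sends, temporarily enabling log-dhcp should show the options each client presents and the tags dnsmasq applied.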

Debugging Without Going Blind

DNS issues are notoriously annoying to troubleshoot because the failure mode is usually just “it hangs.” Is it the firewall? Is the upstream down? Is dnsmasq ignoring the request?

When things break, I don’t guess. I turn on the firehose. But—and this is key—I don’t leave it on.

# Log all queries. WARNING: This generates massive logs.
log-queries
log-facility=/var/log/dnsmasq.log

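Sending SIGUSR1 to the process makes it dump cache statistics to the log, which is a handy first look, but it does not toggle query logging. dnsmasq does not re-read dnsmasq.conf on SIGHUP either (that only clears the cache and reloads files such as /etc/hosts), so turning log-queries on or off means restarting the service.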

A trick I use involves log-async. If you enable logging on a busy server, slow log writes can actually block DNS processing. log-async makes logging asynchronous: dnsmasq queues log lines (5 by default, 100 at most) instead of stalling its main loop, and if the queue overflows it records how many messages were lost. If you are debugging a high-traffic environment, this prevents your debugging tool from becoming the cause of the outage.
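
Put together, a temporary debugging block might look like the sketch below. The queue length of 25 is my own pick from the range the man page calls sane, not a measured value.

```conf
# Temporary debugging block: remove it and restart dnsmasq when done.
# "extra" adds a serial number and the requesting client's address to each line.
log-queries=extra
log-facility=/var/log/dnsmasq.log

# Queue up to 25 log lines instead of blocking on slow log writes.
log-async=25
```

When logging straight to a file like this, sending SIGUSR2 makes dnsmasq close and reopen the log, which is what you want after rotating it.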

Split-Horizon Without the Headache

Split-horizon DNS (where myserver.com resolves to a private IP when you’re inside the LAN, and a public IP when you’re outside) is often the reason people reach for BIND. They think they need “views.”

You don’t.

With dnsmasq, it’s just a matter of defining the local behavior. The public DNS records exist out there in the world (Cloudflare, Route53, etc.). Inside your network, you just override it.

# Inside the LAN, this domain is local
host-record=git.example.com,10.0.0.15

# Everything else for example.com goes to the upstream nameservers
server=/example.com/8.8.8.8

Wait, why host-record and not address?

Because host-record adds both the forward record (A, and AAAA if you give it an IPv6 address) and the matching PTR (reverse DNS) record in one shot. If you use address, you get no PTR record, so reverse lookups fail or return nothing. I prefer my logs to show hostnames, not just IPs, so host-record is my default choice these days.
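
host-record also accepts several names and an IPv6 address on the same line. A short sketch; the short name git and the fd00::15 address are illustrative and not part of the setup above.

```conf
# Assumption: the short name "git" and the IPv6 address are illustrative.
# One line adds A, AAAA, and PTR records.
host-record=git.example.com,git,10.0.0.15,fd00::15
```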

The PXE Boot Trap

I run a small homelab cluster that I re-provision fairly often. PXE booting is finicky. The number of times I’ve fought with TFTP timeouts is embarrassing.

dnsmasq includes a TFTP server. Use it. Don’t install tftpd-hpa unless you have a really weird edge case. Keeping the DHCP and TFTP logic in the same binary simplifies the “next-server” logic immensely.

enable-tftp
tftp-root=/var/lib/tftpboot

# Boot file for legacy BIOS PXE clients. UEFI clients need their own
# EFI bootloader, and UEFI HTTP boot is configured separately.
dhcp-boot=pxelinux.0

One gotcha I hit last year: some modern UEFI clients are extremely picky about the block size. If you see transfers starting and then stalling immediately, try tftp-no-blocksize, which stops dnsmasq from negotiating the blocksize option with the client, or cap the negotiated size with tftp-mtu. It’s rare, but it happens.
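
For a mixed fleet, one way to serve BIOS and UEFI clients from the same server is to tag on the architecture the client reports. This is a sketch under assumptions: the boot file names are hypothetical and must match whatever actually sits under your tftp-root.

```conf
# Tag clients by the architecture they report in DHCP option 93 (client-arch).
# 7 is x86-64 UEFI.
dhcp-match=set:efi-x86_64,option:client-arch,7

# Hypothetical boot files: adjust to whatever lives under tftp-root.
dhcp-boot=tag:efi-x86_64,grubx64.efi
dhcp-boot=tag:!efi-x86_64,pxelinux.0

# Only if a client stalls right after the transfer starts:
#tftp-no-blocksize
```

Some firmware reports a different architecture value (9 is one I’d check for), so if a UEFI machine keeps getting pxelinux.0, look at what it actually sends before blaming TFTP.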

Why I Stick With It

Look, dnsmasq isn’t perfect. It’s single-threaded (mostly), so it won’t scale to millions of queries per second. If you are an ISP, use something else.

But for the rest of us? For the edge gateways, the branch offices, the dev labs, and the home networks? It is the Swiss Army knife of connectivity. It’s a single binary that handles DNS, DHCP, TFTP, and IPv6 Router Advertisements with a configuration file you can actually read.

The next time you’re about to spin up a Docker container just to run a DNS forwarder, ask yourself if you really need that complexity. Usually, the answer is already installed on your system, waiting in /etc/dnsmasq.conf.
