When rate limiting (and firewalling) goes wrong


Recently I experienced a few power failures that lasted hours. This means that when the power is back, all of my infrastructure reboots and reconnects. For the most part this is 100% automatic, but the last time I ran into an interesting problem.

My pi-hole was running with the default rate limiting of 1000/60. This means that each device can make up to 1000 requests per minute, and if it exceeds that it will be put on a deny list for 60 seconds.
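To make the 1000/60 setting concrete, here is a minimal Python sketch of a per-client limiter that behaves the way the paragraph above describes. It is a toy model, not Pi-hole's actual implementation; the class name, the `block_for` parameter and the client address are all made up for illustration.

```python
from collections import defaultdict


class RateLimiter:
    """Toy per-client limiter: `limit` queries per `window` seconds,
    then refuse everything from that client for `block_for` seconds."""

    def __init__(self, limit=1000, window=60, block_for=60):
        self.limit = limit
        self.window = window
        self.block_for = block_for
        self.window_start = {}          # client -> start time of its current window
        self.count = defaultdict(int)   # client -> queries seen in that window
        self.blocked_until = {}         # client -> time the block is lifted

    def query(self, client, now):
        """Return "OK" or "REFUSED" for one query arriving at time `now` (seconds)."""
        if now < self.blocked_until.get(client, 0):
            return "REFUSED"
        start = self.window_start.get(client)
        if start is None or now - start >= self.window:
            # Start a fresh counting window for this client.
            self.window_start[client] = now
            self.count[client] = 0
        self.count[client] += 1
        if self.count[client] > self.limit:
            self.blocked_until[client] = now + self.block_for
            return "REFUSED"
        return "OK"


# A hypothetical server firing 1500 lookups in the same second.
limiter = RateLimiter()
replies = [limiter.query("192.168.1.10", now=0) for _ in range(1500)]
print(replies.count("OK"), replies.count("REFUSED"))  # 1000 500
```

Once the client crosses the limit, every further query inside the block period is refused, no matter how important it is.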

It turns out that my main server that runs a bunch of docker containers makes a lot of DNS requests when everything is starting up all at once. This creates a storm of requests to the pi-hole and the server ends up being blocked for DNS requests (responding with REFUSED) due to rate limiting.

Unfortunately the behaviour of enough of the containers is to retry when this happens. The retry logic generates more DNS requests, which cause another wave of traffic, which gets the server blocked again. Some of my containers entered error conditions due to the unexpected DNS failures. These needed to be restarted later, but at least they stopped contributing to the problem.

My email container was pretty unhappy. It really wants to be able to use DNS, even when receiving email. Since my server had been unavailable for a while, external email servers were trying to deliver mail that had been queued, which contributed to the load. Additionally, I couldn’t connect any email clients to the server, which left me scratching my head a little. More on that later.

The ‘fix’ was easy enough. Modify the pi-hole DNS rate-limiting setting to 0/0 to remove any rate limiting. This is imperfect, but at one point I saw 30,000 requests in a minute from my struggling server and I think I’d rather have no limit and deal with that problem than hit the limit and run into this denial of service issue.

Now that the pi-hole was happy, I was able to get most of my containers to be happy with a little poking at them. Email was still sad, and this took me a coffee break to realize what was wrong. The email container was receiving email just fine, but I could not connect with a client. This felt like a networking problem, but how could that be?

I had forgotten (again) that the email server has fail2ban running in it. This scans logs looking for suspicious activity and will ban an IP for a period of time by inserting a firewall rule. Furthermore, I use the domain name to configure my email client, and this resolves to the external IP. The external IP means that the client talks to my OpenWRT router, which provides NAT and then redirects/maps that external IP back into my network. This has the effect that the originating IP looks like it is my router, not the client machine on the internal IP address. This process is called NAT reflection, or NAT hairpinning.

NAT reflection is a super handy feature for my OpenWRT router to have. From inside my home network I can visit a machine I’ve exposed to the outside world via port mapping, using the same DNS entry that points at the external IP address. The downside is that services on that machine see my router IP as the client IP. When any of the machines in my house have problems connecting to my email server, in this case because of the DNS REFUSED errors on the email server, fail2ban decides that is a bad client and bans it, thus banning all traffic originating from my home network.
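The interaction between NAT reflection and fail2ban is easier to see in a small model. The Python sketch below is only an illustration: the addresses are invented, `nat_reflect` is a stand-in for what the router does, and `Fail2BanLike` is a toy counter rather than fail2ban's real logic.

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# All addresses below are made up for illustration.
ROUTER_LAN_IP = "192.168.1.1"
EXTERNAL_IP = "203.0.113.7"
MAIL_SERVER_IP = "192.168.1.20"


def nat_reflect(packet):
    """Hairpin a LAN packet aimed at the external IP back to the mail server.

    The destination becomes the internal server and the source becomes the
    router, so the server never sees the real client address.
    """
    if packet.dst == EXTERNAL_IP:
        return replace(packet, src=ROUTER_LAN_IP, dst=MAIL_SERVER_IP)
    return packet


class Fail2BanLike:
    """Ban a source IP once it has produced `max_retry` failures."""

    def __init__(self, max_retry=5):
        self.max_retry = max_retry
        self.failures = Counter()
        self.banned = set()

    def record_failure(self, src_ip):
        self.failures[src_ip] += 1
        if self.failures[src_ip] >= self.max_retry:
            self.banned.add(src_ip)

    def unban(self, src_ip):
        self.banned.discard(src_ip)
        self.failures[src_ip] = 0


jail = Fail2BanLike(max_retry=5)
# Three different clients in the house each fail twice while DNS is broken.
for client in ["192.168.1.50", "192.168.1.51", "192.168.1.52"]:
    for _ in range(2):
        seen = nat_reflect(Packet(src=client, dst=EXTERNAL_IP))
        jail.record_failure(seen.src)

print(jail.banned)  # {'192.168.1.1'}
```

No single client failed often enough to be banned, but because every failure is attributed to the router, the router's address crosses the threshold and the whole house is locked out.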

This is easy to fix once you understand what is happening. I just needed to unban my router IP and my email clients could connect.

Wireguard – self hosted VPN

After my recent adventures setting up IoT devices with local only access, I now needed to sometimes be able to talk to those devices when I’m not home. There are plenty of solutions, including setting up SSH tunnels which I’ve done in the past. Wireguard seems like a nice solution and it was high time I had VPN access to my home network.

The linuxserver.io folks have a nicely curated wireguard container with documentation. There are also plenty of good tutorials on installing wireguard. You can even go deeper and build your own, or explore alternatives.

Here is a makefile – based on my template for docker makefiles.

Once you create this, go pull the .png files for the QR codes from the config directory. This will make it trivial to set up your phone.

On mobile data, this just works. I can now control the local only Tasmota devices when away from home and it’s super easy. What doesn’t work with this setup is accessing other docker containers on the same host as the wireguard container.

I explored a few options to solve this, but it boils down to the problem of containers not easily being able to see each other. This bugs me, because while I can appreciate the security of containers being isolated from each other – if I expose a port on the host to a container – then other containers should be able to see that same port – but they can’t. This means that containers actually have less visibility into the host than an external machine – that seems wrong.

You can solve the network visibility problem by giving the container a unique IP address. Here is a brief recap of creating a macvlan docker network; details can be found in my previous post on this topic.

Now from the makefile above, all we need to do is add --network myNewNet to the docker flags and update the container and we’re good to go.

It’s interesting that the docker ps command seems to not show as much about the container when it is run in this mode (No port information – but yes, ports are exposed).

One thing to keep in mind: if you first set up the container just on the docker host without macvlan, you may need to adjust your port mapping to account for the new IP.

If I want the docker host machine to be able to see this container on the new IP, I will need to use the --aux-address from the macvlan network to build a network path. This is optional, but useful so it’s worth doing.
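Before wiring up the host side, it can help to sanity check the address plan. The Python sketch below uses the standard `ipaddress` module; every address in it is hypothetical and should be swapped for the values used when the macvlan network was created. It only checks the arithmetic, it does not talk to docker.

```python
import ipaddress

# Hypothetical values; substitute the ones used when the network was created.
subnet = ipaddress.ip_network("192.168.1.0/24")      # the LAN (--subnet)
ip_range = ipaddress.ip_network("192.168.1.192/28")  # slice for containers (--ip-range)
gateway = ipaddress.ip_address("192.168.1.1")        # the router (--gateway)
aux_host = ipaddress.ip_address("192.168.1.193")     # set aside for the host (--aux-address)


def check_plan(subnet, ip_range, gateway, aux_host):
    """Return a list of problems with the address plan (empty means it looks sane)."""
    problems = []
    if not ip_range.subnet_of(subnet):
        problems.append("ip-range is not inside the subnet")
    if gateway not in subnet:
        problems.append("gateway is not inside the subnet")
    if aux_host not in subnet:
        problems.append("aux-address is not inside the subnet")
    return problems


print(check_plan(subnet, ip_range, gateway, aux_host))  # []

# Addresses in the range other than the one set aside for the host.
container_ips = [ip for ip in ip_range if ip != aux_host]
print(len(container_ips))  # 15
```

The point of the reserved address is that the host can use it on its own macvlan interface without docker ever handing it to a container.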

The version of Ubuntu I’m using doesn’t ship with rc.local enabled. I started down the path of enabling rc.local, but the further I got the more it seemed this was the wrong answer. This post talking about rc.local pointed me at cron’s ability to execute commands on reboot. The cron @reboot capability seems like the easy path here. The other choice is to create a systemd service, which is effectively what the rc.local solution is.

Let’s create a script in /usr/local/bin/macvlansetup, making sure it’s executable.

Then we’ll edit root’s crontab to call this on reboot

Adding the new job

Now we’re set. The wireguard container has a unique IP address and no visibility problems to any of my other containers on the same host. The IoT devices can also be seen just fine when I am remote and enable the VPN. The one trade-off is a slightly more complicated networking setup.

With the default wireguard settings, this acts like a full tunnel VPN – meaning all of the network traffic runs over the tunnel. This is useful as a security measure if I’m on an untrusted wifi network – all the traffic will flow securely from my device to my home network then back out again to the internet. In my case with my pi-hole configured as the DNS server, I get ad-blocking over the VPN.
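Whether traffic goes over the tunnel comes down to the AllowedIPs list in the client's peer config. The Python sketch below assumes the full tunnel case uses 0.0.0.0/0, which is the usual way to route everything over WireGuard; the split tunnel subnet and the test addresses are hypothetical.

```python
import ipaddress


def uses_tunnel(destination, allowed_ips):
    """True if `destination` falls inside any of the client's AllowedIPs networks."""
    addr = ipaddress.ip_address(destination)
    return any(addr in ipaddress.ip_network(net) for net in allowed_ips)


full_tunnel = ["0.0.0.0/0"]        # everything goes through the VPN
split_tunnel = ["192.168.1.0/24"]  # hypothetical home LAN only

print(uses_tunnel("198.51.100.10", full_tunnel))   # True
print(uses_tunnel("198.51.100.10", split_tunnel))  # False
print(uses_tunnel("192.168.1.40", split_tunnel))   # True
```

With a split tunnel only the home network is reachable over the VPN, and ordinary internet traffic (including anything on untrusted wifi) bypasses it.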

OpenWRT Guest Network (and beyond)

As my kids get to the age where both they and their friends have devices, this means granting access to our internet to a growing circle of people. OpenWRT has the ability to support guest networks and I’ve been meaning to set this up for some time.

Beyond simply having a guest network, I also want to set up an IoT network where I can isolate some of the network enabled things but not give them wide access to the rest of my internal infrastructure.

Let’s start with a simple guest network setup. This is well documented on the OpenWRT site. I’ll be using the CLI instructions to make these changes. FWIW this is based on the 21.02.0 version of OpenWRT.

The first part of this will be pretty much copy and paste from the OpenWRT instructions:

I of course modified the ipaddr for my setup, but pretty much used the command as is.

On the Web UI (LuCI) you should now see under Network->Interfaces a new Guest network. All of the changes landed in the /etc/config/network file.

Now we set up a wireless network configuration with the first radio

Again, I’ve pretty much followed the directions but have customized the SSID and not shared the password. I’ve rolled in the extras to isolate guest users and use encryption.

The LuCI web UI now looks a little more concerning (Network->Wireless)

But the previous Network->Interfaces now seems to have come to life.

Still the expected changes are reflected in /etc/config/wireless.

It took a bit of head-scratching, but I figured out what was wrong. I had not specified a wireless.guest.key that met the minimum length (8 to 63 characters) – this apparently caused everything to go sideways. Once I fixed this my new wireless interface came to life.
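A quick check like the following would have caught this before committing the config. It is a minimal Python sketch of the 8 to 63 character rule mentioned above; the function name is made up, and it deliberately ignores any other key formats the router may accept.

```python
def valid_wpa_passphrase(key):
    """Check the 8 to 63 character length rule for a WPA passphrase."""
    return 8 <= len(key) <= 63


for candidate in ["short", "longenough", "x" * 64]:
    print(len(candidate), valid_wpa_passphrase(candidate))
# 5 False
# 10 True
# 64 False
```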

Let’s continue on with the DHCP configuration

And the Firewall

Now not only should you be able to see the new WiFi SSID available to connect to, but when you do you will be isolated from all other devices on the network and only able to see the internet. A network scan will turn up the existence of the router, but attempts to connect to the web UI fail – that’s pretty cool isolation.

Devices connected to the guest network still show up in the OpenWRT status page. They are assigned a DHCP address from the network.guest.ipaddr subnet, which is distinct from my normal network.
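To confirm which network a lease belongs to, the standard `ipaddress` module is enough. The addressing in this Python sketch is hypothetical; the real values come from network.lan.ipaddr and network.guest.ipaddr along with their netmasks.

```python
import ipaddress

# Hypothetical addressing; use the router addresses and netmasks from your own config.
lan = ipaddress.ip_interface("192.168.1.1/24").network
guest = ipaddress.ip_interface("192.168.3.1/24").network


def which_network(client_ip):
    """Name the network a leased address belongs to."""
    addr = ipaddress.ip_address(client_ip)
    if addr in guest:
        return "guest"
    if addr in lan:
        return "lan"
    return "unknown"


print(lan.overlaps(guest))             # False
print(which_network("192.168.3.120"))  # guest
print(which_network("192.168.1.120"))  # lan
```

The `overlaps` check is the important one: if the two subnets overlapped, the router could not tell guest traffic from normal traffic by address alone.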

Apparently I do give up a little performance having two (or more) networks hung off of the same radio, but the utility of having a restrictive guest network is pretty cool.

The Archer C7 has two radios, and we’ve not configured the guest network on the second radio. Let’s do that now.

Cool – now I have a guest network that is on both radio bands. You’ll note that I run both access points with the same SSID, this mostly just works and devices figure it out. I even run my dumb AP with the same SSID. This is one approach that works and lets people move around the house with seamless connections.

I do know that some people try to force a device to a particular radio type, and will run their legacy network on a different SSID. This is of course also valid. It really depends on what you’re looking to achieve. I’m taking the simple to configure devices approach, and giving the devices the responsibility to work out which radio band and which access point to connect to.

Now let’s create a second ‘guest’ like network for IoT devices. This time I’ll just combine all of the steps together

Nice – now I have a third subnet which will hand out DHCP addresses valid for 24hrs. The devices are all isolated from each other.

For devices on the IoT network, while I don’t want the device to be able to see anything other than the internet – for my own monitoring and use, I’d like to be able to see the devices on the IoT network from my normal lan network. This turns out to be very easy.

Connecting to the IoT network and doing a scan shows me that I can only see myself and the router (because the device has to send traffic somewhere). Again, with the isolation I can’t connect to the web interface of the router. However, with this new “zone->forwardings” I can from my lan network see devices on the IoT network. Super cool, and actually very easy.
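One way to picture the result is as a set of one-way forwardings between zones. The Python sketch below is a simplified model, not the router's firewall: the zone names match this post, but the rule set is an assumption about the finished setup and it only covers who may start a connection.

```python
# Zone names and rules here are a simplified, hypothetical picture of the setup.
FORWARDINGS = {
    ("lan", "wan"),
    ("guest", "wan"),
    ("iot", "wan"),
    ("lan", "iot"),  # the extra forwarding that lets the LAN reach IoT devices
}


def may_initiate(src_zone, dest_zone):
    """True if a new connection from src_zone to dest_zone would be forwarded."""
    return (src_zone, dest_zone) in FORWARDINGS


print(may_initiate("lan", "iot"))    # True
print(may_initiate("iot", "lan"))    # False
print(may_initiate("guest", "iot"))  # False
```

The forwarding is directional. Devices on the IoT network can reply to connections opened from the lan, but they cannot open connections into the lan themselves.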

There is plenty more tweaking we can do here, but to avoid going too far down the rabbit hole we’ll stop here.