When Mirrors Break: RAID1 failure and recovery

A couple of years ago I added a second drive to my server in a RAID1 (mirror) configuration. Originally I was using the single drive for logs, but with a more durable mirror setup I moved more (important) data to it.

RAID is not a backup story. If you really care about the data, you want to back it up. There are two hard lessons I learned with this recent failure (and my recovery). Two bits of data on this mirrored volume that are valuable to me are email and photoprism storage (but not the photos themselves). Stupidly, I did not have regular backups of either of these. Please learn from my mistake.

The two lessons I hope to learn from this are:

Back up your data; even a bad backup is better than nothing
Do not ignore any signs of problems, replace any suspicious hardware ASAP

If you read the comments on my previous post, you will see a history of minor failures that I clearly willfully ignored. I mean, hey – it’s a mirrored setup and mostly I had 2 drives working fine.. right? Stupid me.

The replacement 500GB SSD drive cost me $56.49 taxes in, and it even has a 5 year manufacturer warranty, compared to the 3 year warranty on the failed ADATA drive. Sadly, checking the ADATA warranty shows me it made it just past the 3 year mark (not that a ‘free’ replacement drive would fix my problem).

While ADATA has been mostly reliable for me in the past, I’ll pick other brands for my important data. The ADATA products are often very cheap which is attractive, but at the current cost of SSDs it’s easy to pay for the premium brands.

Here is a brief replay of how the disaster rolled out. The previous day I had noticed that something was not quite right with email, but restarting things seemed to resolve the issue. The next morning email wasn’t flowing, so there was something wrong.

Looking at the logs, I was seeing a lot of “structure needs cleaning” messages, which indicate that there is some sort of ext4 filesystem problem and that a check needs to run to clean things up. It also appeared that the ADATA half of the mirror had failed in some way. Rebooting the system seemed like a good idea, and everything seemed to come back.

Checking the logs for the mail system showed all was well, but then I checked email on my phone, and there were no messages? Stupidly I then opened up my mail client on my laptop, which then proceeded to synchronize with the mail server and delete all of the email stored on my laptop to mirror the empty mailbox on the server.

What was wrong? It took a while, but I figured out that my RAID1 array had completely failed to initialize: both volumes were marked as ‘spare’.

$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdf1[2](S) sde1[0](S)
      937435136 blocks super 1.2
       
unused devices: <none>

Ugh, well that explains what happened. When the system rebooted, the mount failed, and my mail server just created new data directories on the mount point (which is on my root volume).
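A pre-start check along these lines might have stopped the mail server from writing to the bare mount point. This is only a sketch: the function names are made up for illustration, and /srv/mail is a placeholder for wherever the mirror is actually mounted.

```shell
# Sketch of a pre-start check. The function names and the /srv/mail path are
# illustrative assumptions, not taken from the setup described in this post.

# Print the name of every md array that mdstat reports as inactive.
# Reads /proc/mdstat unless another file is given as $1.
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}

# Succeed only if $1 is currently a mount point.
require_mount() {
    mountpoint -q "$1"
}

# Possible use in a wrapper that runs before the mail server starts:
#   [ -z "$(inactive_arrays)" ] && require_mount /srv/mail || exit 1
```

With the mdstat output shown above, inactive_arrays would print md0, and the wrapper would refuse to start the service.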

At this point I realize I’m in a bad place, having potentially flushed decades of email. Have I mentioned that running your own email is a bad idea?

Time to start capturing things for recovery. I did a copy of the two drives using dd:

$ sudo dd if=/dev/sde1 of=/other/volume/sde1-dd.img
$ sudo dd if=/dev/sdf1 of=/other/volume/sdf1-dd.img

In the process of doing this, it became obvious that sdf (the ADATA drive) had hard read errors, where in contrast I was able to complete the image creation of sde (a Kingston drive).
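By default dd gives up at the first read error. A more forgiving invocation, sketched below, asks dd to carry on and pad unreadable blocks with zeros so that offsets in the image stay aligned. The function name is hypothetical, and for a badly failing drive a dedicated tool such as GNU ddrescue is usually the better choice.

```shell
# image_device SRC DST: copy SRC to DST, continuing past read errors.
# conv=noerror keeps going after a failed read; conv=sync pads every short or
# failed input block with zeros up to the block size, so the image can end up
# slightly longer than the source.
image_device() {
    dd if="$1" of="$2" bs=65536 conv=noerror,sync
}

# Example, run as root (same paths as the dd commands above):
#   image_device /dev/sdf1 /other/volume/sdf1-dd.img
```

A smaller block size loses less data around each bad sector, at the cost of a slower copy.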

Once I had some time to think about the situation, I was able to re-add the good drive to the array to make it become active. This let me mount the volume and make a copy of the email for backup purposes. Once this was done I unmounted the volume and ran fsck -y /dev/md0 to fix all of the filesystem errors.

I then stopped the currently running mail server, renamed the mount point directory to keep the email that had come into the system while I was doing repairs, and created a new (empty) mount point. Then a reboot.

Sigh of relief as all of my mail appeared back. Sure, I’m running with a degraded RAID1 array and the fsck clearly removed some corrupted files, but at least the bulk of my data is back.

Fixing the broken mirror was relatively straightforward. I bought a new drive. Then I captured the output of ls /dev/disk/by-id/ before powering down the system and physically swapping the bad drive for the new one. I could then repeat the ls /dev/disk/by-id/ and look at the diffs. This allowed me to see the new drive appear, and inspect which drive letter it mapped to.
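The before and after comparison can be sketched as two small helpers. The function names and the snapshot file locations are illustrative assumptions. The snapshots need to live somewhere that survives the power cycle.

```shell
# snapshot_ids DIR OUT: write a sorted listing of DIR to OUT.
snapshot_ids() {
    ls "$1" | sort > "$2"
}

# new_ids BEFORE AFTER: print entries present in AFTER but not in BEFORE.
new_ids() {
    comm -13 "$1" "$2"
}

# Before powering down:
#   snapshot_ids /dev/disk/by-id /root/ids-before.txt
# After the swap:
#   snapshot_ids /dev/disk/by-id /root/ids-after.txt
#   new_ids /root/ids-before.txt /root/ids-after.txt
```

Each name printed by new_ids is a symlink, so ls -l on it shows which sdX device the new drive was given.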

$ ls -l /dev/disk/by-id/ata-WD_Blue_SA510_2.5_500GB_224753806202
lrwxrwxrwx 1 root root 9 Aug  9 19:12 /dev/disk/by-id/ata-WD_Blue_SA510_2.5_500GB_224753806202 -> ../../sdf

Nice, it appears to have slotted in just where the previous ADATA drive was, not important but comforting. I then dumped the fdisk information of the healthy Kingston drive.

$ sudo fdisk -l /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026841D62B77E8
Disk /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026841D62B77E8: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: KINGSTON SA400S3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 6C260AF2-796D-5E49-8CB0-6E95DA5C3900

Device                                                           Start       End   Sectors   Size Type
/dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026841D62B77E8-part1  2048 937701375 937699328 447.1G Linux filesystem

We want our new drive to be partitioned the same way; luckily the new SSD is even bigger. Mostly this is accepting defaults, with the exception of typing in the last sector to match the Kingston drive.

$ sudo fdisk /dev/disk/by-id/ata-WD_Blue_SA510_2.5_500GB_224753806202

Welcome to fdisk (util-linux 2.34).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xad299882.

Command (m for help): p
Disk /dev/disk/by-id/ata-WD_Blue_SA510_2.5_500GB_224753806202: 465.78 GiB, 500107862016 bytes, 976773168 sectors
Disk model: WD Blue SA510 2.
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xad299882

Command (m for help): g
Created a new GPT disklabel (GUID: 300BCC0D-C0F3-A640-B717-DFBB3311378F).

Command (m for help): n
Partition number (1-128, default 1): 
First sector (2048-976773134, default 2048): 
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-976773134, default 976773134): 937701375

Created a new partition 1 of type 'Linux filesystem' and of size 447.1 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
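The numbers in the two fdisk listings can be sanity-checked with a little arithmetic. The sketch below uses values copied from the output above; the variable and function names are made up for illustration.

```shell
# Values copied from the fdisk output above.
KINGSTON_START=2048
KINGSTON_END=937701375
WD_LAST_USABLE=976773134

# A partition covering sectors START..END inclusive has END - START + 1 sectors.
partition_sectors() {
    echo $(( $2 - $1 + 1 ))
}

# Succeed if a partition ending at sector $1 fits on a disk whose last usable
# sector is $2.
fits_on_disk() {
    [ "$1" -le "$2" ]
}

partition_sectors "$KINGSTON_START" "$KINGSTON_END"   # 937699328, matching fdisk
fits_on_disk "$KINGSTON_END" "$WD_LAST_USABLE" && echo "new drive is large enough"
```

Because the new partition uses the same start and end sectors, both halves of the mirror have exactly the same size.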

This is similar to the process in my original post on creating the RAID1 array, but we can now skip to step 8 and add the new volume.

sudo mdadm /dev/md0 --add /dev/disk/by-id/ata-WD_Blue_SA510_2.5_500GB_224753806202-part1

And that’s it, now we just wait for the mirror to re-sync. It is interesting to note that while I can talk about the device ‘by-id’, mdstat uses the legacy drive letters.

$ cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdf1[2] sde1[0]
      468717568 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.1% (862656/468717568) finish=36.1min speed=215664K/sec
      bitmap: 4/4 pages [16KB], 65536KB chunk

unused devices: <none>

And a short while later, it’s nearly done.

$ cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdf1[2] sde1[0]
      468717568 blocks super 1.2 [2/1] [U_]
      [===================>.]  recovery = 97.5% (457392384/468717568) finish=3.7min speed=50854K/sec
      bitmap: 4/4 pages [16KB], 65536KB chunk

unused devices: <none>
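Rather than re-running cat by hand, the progress figure can be pulled out of mdstat with a small helper. This is a sketch with hypothetical function names, and it only looks for the "recovery" line seen in this rebuild.

```shell
# recovery_percent [FILE]: print the recovery progress figure (e.g. 97.5) from
# mdstat-style input, or nothing when no recovery is running.
recovery_percent() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p' "${1:-/proc/mdstat}"
}

# Poll once a minute until the recovery line disappears.
wait_for_resync() {
    while [ -n "$(recovery_percent)" ]; do
        echo "recovery at $(recovery_percent)%"
        sleep 60
    done
    echo "recovery finished"
}
```

mdadm also has a --wait option (mdadm --wait /dev/md0) that blocks until any resync or recovery finishes, which may be simpler if no progress output is needed.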

At this point my email appears to be working correctly. The ext4 filesystem corruption I blame on the failing ADATA drive in the mirror, but this is a guess. The corruption caused a few emails to be ‘lost’, but had a bigger impact on the photoprism data, which in part was the mariadb storage. I also noticed that both my prometheus data and mimir data were corrupted; neither of these is critical though.

Backups are good, they don’t have to be perfect – future you will be thankful.

OpenWRT 21.02 to 22.03 upgrade

Here are my notes on upgrading OpenWRT, they are based on my previous post on upgrading.

In this case I’m upgrading specifically TP-Link Archer C7 v2 – the process will be similar for other OpenWRT devices but it’s always worth reviewing the device page. I’ve also got some v5 versions, and this means a slightly different firmware, but the same exact process.

For a major version upgrade it is worth starting by reading the release notes. Nothing seems to be specific to my device or to require any special consideration, so I can just proceed.

An upgrade from OpenWrt 21.02 or 22.03 to OpenWrt 22.03.5 is supported in many cases with the help of the sysupgrade utility which will also attempt to preserve the configuration.

I personally prefer the cli based process, so we’ll be following that documentation.

Step 1. While I do nightly automated backups, I should also just do a web UI based backup – this is mostly for peace of mind

Step 2. Download the correct sysupgrade binary. The easy way to do this is by using the firmware selector tool. I recommend that you take the time to verify the sha256sum of your download; this is rarely an issue, but I have experienced bad downloads and it’s hard to debug after the fact.

It is recommended to check that you have enough RAM free. Thankfully the Archer has a lot of RAM (which is used for the /tmp filesystem too), so I have lots of space.
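Both checks in step 2 can be scripted. The sketch below uses made-up function names, and it assumes the expected hash is the sha256sum value published alongside the download. The free space test is only a rough lower bound, since the upgrade itself needs some working memory as well.

```shell
# verify_sha256 FILE EXPECTED: succeed only if FILE hashes to EXPECTED.
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# has_room DIR BYTES: succeed if the filesystem holding DIR has more than
# BYTES of free space.
has_room() {
    free_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    need_kb=$(( $2 / 1024 + 1 ))
    [ "$free_kb" -gt "$need_kb" ]
}

# Example (file name, hash, and size are placeholders):
#   verify_sha256 firmware.bin <expected-hash> || echo "bad download"
#   has_room /tmp <size-in-bytes> || echo "not enough space in /tmp"
```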

Step 3. Get ready to flash – if you review the post install steps, you’ll see that while the sysupgrade will preserve all of our configuration files – it won’t preserve any of the packages.

This script will print out all of the packages you’ve installed.

cat << "EOF" > /tmp/listuserpackages.awk
#!/usr/bin/awk -f
BEGIN {
    ARGV[ARGC++] = "/usr/lib/opkg/status"
    cmd="opkg info busybox | grep '^Installed-Time: '"
    cmd | getline FLASH_TIME
    close(cmd)
    FLASH_TIME=substr(FLASH_TIME,17)
}
/^Package:/{PKG= $2}
/^Installed-Time:/{
    INSTALLED_TIME= $2
    # Find all packages installed after FLASH_TIME
    if ( INSTALLED_TIME > FLASH_TIME ) {
        cmd="opkg whatdepends " PKG " | wc -l"
        cmd | getline WHATDEPENDS
        close(cmd)
        # If nothing depends on the package, it is installed by user
        if ( WHATDEPENDS == 3 ) print PKG
    }
}
EOF
 
# Run the script
chmod +x /tmp/listuserpackages.awk
/tmp/listuserpackages.awk

Save the list away so you can easily restore things post install. There is a flaw with this script as I’ll point out later, but in many cases it’ll work fine for you.

On my dumb access points I get this list of packages

prometheus-node-exporter-lua-netstat
prometheus-node-exporter-lua-wifi
prometheus-node-exporter-lua-openwrt
prometheus-node-exporter-lua-wifi_stations
rsync
prometheus-node-exporter-lua-nat_traffic
libpcap1

Mostly I have the prometheus exporter (for metrics) and rsync (for backups) installed. My main gateway has a few more packages (vnstat and sqm) but it’s similar.

Step 4. Time to flash. Place the firmware you downloaded onto the openwrt router in /tmp and run sysupgrade.

# Flash firmware
sysupgrade -v /tmp/openwrt-22.03.5-ath79-generic-tplink_archer-c7-v5-squashfs-sysupgrade.bin

This is a bit scary because you lose your ssh connection as part of the upgrade. It took about a minute and a half of radio silence before the device came back. However, I was then greeted with the new web UI, and over ssh I get the 22.03.5 version splash.

Step 5. Check for any package updates – usually I leave things well enough alone, but we just did a full upgrade so it’s worth making sure we are fully current. Note, this may mess with the script in step 3 since the install dates will change for other components.

opkg update
opkg list-upgradable

If you get any packages listed, we can easily upgrade using opkg upgrade <pkg name>

Step 6. Install packages captured in step 3. Do this by creating a simple script to opkg install <pkg name> for each package.

opkg install prometheus-node-exporter-lua-netstat
opkg install prometheus-node-exporter-lua-wifi
opkg install prometheus-node-exporter-lua-openwrt
opkg install prometheus-node-exporter-lua-wifi_stations
opkg install rsync
opkg install prometheus-node-exporter-lua-nat_traffic
opkg install libpcap1
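The install script can be generated from the saved package list instead of being typed by hand. A minimal sketch, assuming the list was saved one package name per line in a file (packages.txt here is a placeholder name):

```shell
# install_commands FILE: turn a package list (one name per line) into
# "opkg install" commands, skipping blank lines.
install_commands() {
    sed -n '/[^[:space:]]/s/^[[:space:]]*/opkg install /p' "$1"
}

# Review the output first, then pipe it to sh on the router:
#   install_commands packages.txt
#   install_commands packages.txt | sh
```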

Post install, take a careful look at the output of the installs, and look for any *-opkg files in /etc/config or /etc. These are config files which conflicted with local changes.
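Hunting for the leftover files is easy to script. A small sketch (the function name is made up) that lists every file ending in -opkg under /etc:

```shell
# list_opkg_conflicts [DIR]: print config files that opkg saved with a -opkg
# suffix instead of overwriting the local copy. Defaults to /etc.
list_opkg_conflicts() {
    find "${1:-/etc}" -type f -name '*-opkg' | sort
}

# For each hit, compare the packaged version with the one in use, for example
# (foo is a placeholder name, and diff may need to be installed first):
#   diff -u /etc/config/foo /etc/config/foo-opkg
```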

Sometimes you will want to keep your changes – others you’ll want to replace your local copy with the new -opkg file version. Take your time working through this as it will avoid tricky problems to debug later.

When I upgraded my main router, vnstat seems to have been busted in some way. The data file (and its backup) was no longer readable. I suspect that some code change made the format incompatible. I had to remove it and create a new one. Oh well.

Things mostly went smoothly, it took about 30mins per openwrt device and I was going slowly and taking notes. There was one tiny glitch in the upgrade. The /root/.ssh directory was wiped out – I use this to maintain a key based ssh/scp from each of my dumb AP to the main router.
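One possible fix for the wiped directory is the sysupgrade keep list. OpenWRT reads /etc/sysupgrade.conf for extra files and directories to preserve across an upgrade. The sketch below adds an entry to it; the function name is made up, and this was not tested as part of the upgrade described here.

```shell
# keep_on_upgrade PATH [CONF]: add PATH to the sysupgrade keep list unless it
# is already there. CONF defaults to /etc/sysupgrade.conf.
keep_on_upgrade() {
    conf=${2:-/etc/sysupgrade.conf}
    touch "$conf"
    grep -qxF "$1" "$conf" || echo "$1" >> "$conf"
}

# Run on each device before the next upgrade:
#   keep_on_upgrade /root/.ssh/
```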

Bonus. I found a new utility: Attended Sysupgrade. This is pretty slick as it makes it very easy to roll minor versions (so 22.03.2 -> 22.03.5 for example) but it will not do a major upgrade (21.02 -> 22.03). I’ve installed this on all of my openwrt devices and will use it to stay current. It takes care of all of the upgrade steps above, but it does suffer the same ‘glitch’ in that /root/.ssh is wiped out. The other downside is that the custom firmware it builds breaks the script in step 3, since the flash install date is the same for all of the components. I’ll need to go refactor that script for my next upgrade.

OpenWRT as a wireguard client

Previously I’ve written about running wireguard as a self hosted VPN. In this post I’ll cover how to connect a remote site back to your wireguard installation allowing that remote site to reach machines on your local (private) network. This is really no different than configuring a wireguard client on your phone or laptop, but by doing this on the router you build a network path that anyone on the remote network can use.

I should probably mention that there are other articles that cover a site-to-site configuration, where you have two wireguard enabled routers that extend your network across an internet link. While this is super cool, it wasn’t what I wanted for this use case. I would be remiss in not mentioning tailscale as an alternative if you want a site-to-site setup, it allows for the easy creation of a virtual network (mesh) between all of your devices.

In my case my IoT devices can all talk to my MQTT installation, and that communication not only allows the gathering of data from the devices, but offers a path to controlling the devices as well. What this means is that an IoT device at the remote site, if it can see the MQTT broker I host on my home server – will be controllable from my home network. Thus setting up a one way wireguard ‘client’ link is all I need.

I will assume that the publicly visible wireguard setup is based on the linuxserver.io/wireguard container. You’ll want to add a new peer configuration for the remote site. This should generate a peer_remote.conf file that should look something like:

[Interface]
Address = 10.13.13.4
PrivateKey = SECRETPRIVKEY=
ListenPort = 51820
DNS = 10.0.0.8

[Peer]
PublicKey = SECRETPUBKEY=
PresharedKey = SECRETPRESHARE=
Endpoint = mydomain.com:51820
AllowedIPs = 0.0.0.0/0, ::/0

[Interface]

Address = 10.13.13.4

PrivateKey = SECRETPRIVKEY=

ListenPort = 51820

DNS = 10.0.0.8

[Peer]

PublicKey = SECRETPUBKEY=

PresharedKey = SECRETPRESHARE=

Endpoint = mydomain.com:51820

AllowedIPs = 0.0.0.0/0, ::/0

This is the same conf file you’d grab and install into a wireguard client, but in our case we want to set up an OpenWRT router at a remote location to use this as its client configuration. The 10.13.13.x address is the default wireguard network for the linuxserver.io container.

I will assume that we’re on a recent version of OpenWRT (21.02 or above); as of this writing 23.05.2 is the latest stable release. As per the documentation page on setting up the client, you’ll need to install some packages. This is easy to do via the cli.

# Install packages
opkg update
# 21.02 or above
opkg install wireguard-tools luci-app-wireguard luci-proto-wireguard

Now there are some configuration parameters you need to set up (again in the cli, as we’re going to set some shell variables and then use them later).

# Configuration parameters
WG_IF="wg0"
WG_SERV="mydomain.com"
WG_PORT="51820"
WG_ADDR="10.13.13.4/32"
WG_KEY="SECRETPRIVKEY="
WG_PSK="SECRETPRESHARE="
WG_PUB="SECRETPUBKEY="

Now this is where I got stuck following the documentation. It wasn’t clear to me that the WG_ADDR value should be taken from the peer_remote.conf file as I’ve done above. I thought this was just another private network value to uniquely identify the new wg0 device I was creating on the OpenWRT router. Thankfully some kind folk on the OpenWRT forum helped point me down the right path to figure this out.

Obviously WG_SERV points at our existing wireguard installation, and the three secrets WG_KEY, WG_PSK, and WG_PUB all come from the same peer_remote.conf file. I do suspect that one of these might be allowed to be unique for the remote installation. However, I know that this works, and I do not believe we are introducing any security issues.
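To avoid copy and paste mistakes, the variables can be filled in straight from the conf file. This is a sketch with made-up helper names. It assumes the file follows the layout of the peer_remote.conf shown earlier, with one "Key = value" pair per line.

```shell
# conf_get FILE KEY: print the value of the first "KEY = value" line in FILE.
# Only the first "=" is treated as the separator, so base64 values that end
# in "=" come through intact.
conf_get() {
    sed -n "s/^$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# load_wg_vars FILE: set the WG_* variables from a peer conf file.
load_wg_vars() {
    addr=$(conf_get "$1" Address)
    WG_ADDR="${addr%%/*}/32"          # force a /32, whether or not a mask was given
    WG_KEY=$(conf_get "$1" PrivateKey)
    WG_PSK=$(conf_get "$1" PresharedKey)
    WG_PUB=$(conf_get "$1" PublicKey)
    endpoint=$(conf_get "$1" Endpoint)
    WG_SERV=${endpoint%:*}            # host part of host:port
    WG_PORT=${endpoint##*:}           # port part of host:port
}

# load_wg_vars peer_remote.conf
```

WG_IF is not in the conf file and still has to be set by hand.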

At this point we have all the configuration we need, and can proceed to configure the firewall and network:

# Configure firewall
uci rename firewall.@zone[0]="lan"
uci rename firewall.@zone[1]="wan"
uci del_list firewall.wan.network="${WG_IF}"
uci add_list firewall.wan.network="${WG_IF}"
uci commit firewall
/etc/init.d/firewall restart

# Configure network
uci -q delete network.${WG_IF}
uci set network.${WG_IF}="interface"
uci set network.${WG_IF}.proto="wireguard"
uci set network.${WG_IF}.private_key="${WG_KEY}"
uci add_list network.${WG_IF}.addresses="${WG_ADDR}"
 
# Add VPN peer
uci -q delete network.wgserver
uci set network.wgserver="wireguard_${WG_IF}"
uci set network.wgserver.public_key="${WG_PUB}"
uci set network.wgserver.preshared_key="${WG_PSK}"
uci set network.wgserver.endpoint_host="${WG_SERV}"
uci set network.wgserver.endpoint_port="${WG_PORT}"
uci set network.wgserver.route_allowed_ips="1"
uci set network.wgserver.persistent_keepalive="25"
uci add_list network.wgserver.allowed_ips="0.0.0.0/0"
uci commit network
/etc/init.d/network restart

This sets up a full tunnel VPN configuration. If you want to permit a split-tunnel then we need to change one line in the above script.

uci add_list network.wgserver.allowed_ips="0.0.0.0/0"

The allowed_ips needs to change to specify the subnet you want to route over this wireguard connection.

One important note. You need to ensure that your home network and remote network do not have overlapping IP ranges. This would introduce confusion about where to route what. Let’s assume that the home network lives on 192.168.1.0/24. We’d want to ensure that our remote network does not use that range, so let’s assume we’ve configured the remote OpenWRT setup to use 192.168.4.0/24. By doing this, we make it easy to know which network we mean when we are routing packets around.

Thus if we wanted to only send traffic destined for the home network over the wireguard interface, we’d specify:

uci add_list network.wgserver.allowed_ips="192.168.1.0/24"
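The requirement that the two networks must not overlap can be checked mechanically. The sketch below handles IPv4 only and uses made-up function names: two CIDR ranges overlap exactly when they agree on the bits covered by the shorter of the two prefixes.

```shell
# ip_to_int A.B.C.D: print the address as a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1                 # split the dotted quad into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# cidr_overlap NET1/LEN1 NET2/LEN2: succeed if the two ranges share addresses.
cidr_overlap() {
    a_len=${1#*/}
    b_len=${2#*/}
    len=$a_len
    [ "$b_len" -lt "$len" ] && len=$b_len
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    a_net=$(( $(ip_to_int "${1%/*}") & mask ))
    b_net=$(( $(ip_to_int "${2%/*}") & mask ))
    [ "$a_net" -eq "$b_net" ]
}

# Home and remote ranges from the example above:
if cidr_overlap 192.168.1.0/24 192.168.4.0/24; then
    echo "ranges overlap, pick a different remote subnet"
else
    echo "ranges are distinct"
fi
```

With the two example networks this reports that the ranges are distinct.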

As another way of viewing this configuration, let’s go take a peek at the config files on the OpenWRT router.

/etc/config/network will have two new sections

config interface 'wg0'
	option proto 'wireguard'
	option private_key 'SECRETPRIVKEY='
	list addresses '10.13.13.4/32'

config wireguard_wg0 'wgserver'
	option public_key 'SECRETPUBKEY='
	option preshared_key 'SECRETPRESHARE='
	option endpoint_host 'mydomain.com'
	option endpoint_port '51820'
	option route_allowed_ips '1'
	option persistent_keepalive '25'
	list allowed_ips '192.168.1.0/24'

and the /etc/config/firewall will have one modified section

config zone 'wan'
	option name 'wan'
	list network 'wan'
	list network 'wan6'
	list network 'wg0'
	option input 'REJECT'
	option output 'ACCEPT'
	option forward 'REJECT'
	option masq '1'
	option mtu_fix '1'

You’ll note that the wg0 device is part of the wan zone.

It really is pretty cool to have IoT devices at a remote site, magically controlled over the internet – and I don’t need any cloud services to do this.