Ubuntu adding a 2nd data drive as a mirror (RAID1)

Over the years I’ve had the expected number of hard drive failures. Some were more catastrophic because I didn’t have a good backup strategy in place; others felt avoidable if I’d paid attention to the warning signs.

My current setup for data duplication is based on Snapraid, a non-traditional RAID solution. It allows mixed sizes of drives, and replication happens by regularly running the sync operation. Mine runs daily: files are sync’d across the drives, and a data validation is done from time to time as well. This means that while I might lose up to 24hrs of data if the primary drive fails, I get lower wear on the main parity drive and the assurance that silent file corruption hasn’t happened.

Snapraid is very bad when you have either many small files or frequently changing files. It is ideal for backing up media like photos or movies. To deal with the more rapidly changing data I’ve got an SSD for storage. I haven’t yet had an SSD fail on me, but that is bound to happen at some point; Backblaze is already seeing some failure rate information that is concerning. Couple this with the fact that my storage SSD started throwing errors the other day and only a full power cycle of the machine brought it back – it’s fine now, but for how long? Time to set up a mirror.

For this storage I’m going back to traditional RAID. The SSD is a 480GB drive, and thankfully the price of these has dropped to easily under $70. This additional drive now fills all 6 of the SATA ports on my motherboard; the next upgrade will need to be a SATA port expansion card. I’ve written about RAID a few times here.

I’ve moved away from specifying drives as /dev/sdbX because these values can change. Even this new SSD caused the drive that was at /dev/sdf to move to /dev/sdg, allowing the new drive to take /dev/sdf. My /etc/fstab is now set up using /dev/disk/by-id/xxx because these names are persistent. Most of the disk utilities understand this format just fine, as you can see with this example with fdisk.

Granted, working with /dev/disk/by-id is a lot more verbose – but that id will not change if you re-organize the SATA cables.
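To illustrate what this looks like in practice (the drive id and mount point below are placeholders, not my actual fstab entry):

```shell
# List the persistent names – each is a symlink back to the current /dev/sdX
ls -l /dev/disk/by-id/

# A hypothetical /etc/fstab entry using by-id instead of /dev/sdf1:
# /dev/disk/by-id/ata-SOME_DRIVE_SERIAL-part1  /mounted/original  ext4  defaults  0  2
```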

Let’s get going on setting up the new drive as a mirror for the existing one. Here’s the basic set of steps:

  1. Partition the new drive so it is identical to the existing one
  2. Create a RAID1 array in degraded state
  3. Format and mount the array
  4. Copy the data from the existing drive to the new array
  5. Un-mount both the array and the original drive
  6. Mount the array where the original drive was mounted
  7. Make sure things are good – the next step is destructive
  8. Add the original drive to the degraded RAID1 array making it whole

It may seem like a lot of steps, and some of them are scary – but on the other side we’ll have a software RAID protecting the data. The remainder of this post covers the details of those steps.

Step 1 – Partitioning the new drive

Any time you’re about to partition (or re-partition) a drive, it is important to be careful. We could very easily target the wrong one, and then we’d have a big problem. Above you can see that I’ve run a sudo fdisk -l /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026B77841D62E8 and have been able to confirm it is not yet partitioned. This is a good way to make sure we have the right device. I also want to look at the drive we want to mirror and figure out how it is partitioned.
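The inspection commands look something like this (the id of the existing drive is a placeholder here – substitute your own from /dev/disk/by-id):

```shell
# New, blank drive – should report no partition table yet
sudo fdisk -l /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026B77841D62E8

# Existing drive we want to mirror – note its partition start/end sectors
sudo fdisk -l /dev/disk/by-id/ata-EXISTING_DRIVE_ID
```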

We want the new drive to look like that once we are done.

Note that I selected a non-default last sector to match the existing (smaller) drive. If the situation had been reversed, I’d be re-partitioning the existing drive to match the smaller new one. Net: to keep things sane with the mirror, we want the same layout on both drives. It’s not a bad idea now to compare the partition layouts for the two drives to make sure we got this right.
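If you’d rather not step through fdisk by hand, one non-interactive way to replicate the layout is sfdisk (again, the existing drive id is a placeholder):

```shell
# Dump the partition table from the existing drive and apply it to the new one
sudo sfdisk --dump /dev/disk/by-id/ata-EXISTING_DRIVE_ID > parts.txt
sudo sfdisk /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026B77841D62E8 < parts.txt

# For GPT disks, randomize the copied GUIDs so the two disks stay distinct
sudo sgdisk -G /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026B77841D62E8

# Verify both drives now show the same layout
sudo fdisk -l /dev/disk/by-id/ata-EXISTING_DRIVE_ID
sudo fdisk -l /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026B77841D62E8
```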

Step 2 – Create a RAID1 array

We are going to create a RAID1 array, but with a missing drive – thus it will be in a degraded state. For this we need to look in /dev/disk/by-id and select the partition we just created in step 1. It will have -part1 at the end of the device name we used.

I can use /dev/md0 because this is the first RAID array on this system.
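The create command looks roughly like this – the literal word missing tells mdadm to build the mirror with only one member for now:

```shell
# Create a RAID1 array with the new partition and a deliberately
# missing second member – the array starts out degraded
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 \
    /dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026B77841D62E8-part1 missing
```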

We can look at our array via /proc/mdstat — we should also expect to get an email from the system informing us that there is a degraded RAID array.
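Checking on the array, and (on Ubuntu) persisting its definition so it assembles at boot, looks something like:

```shell
# One active member, one missing – mdstat will show [U_]
cat /proc/mdstat
sudo mdadm --detail /dev/md0

# Record the array in mdadm.conf and refresh the initramfs
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u
```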

Step 3 – Format and mount

We can now treat the new /dev/md0 as a drive partition. This is standard linux formatting and mounting. I’ll be using ext4 as the filesystem.

And now we mount it on the /mnt endpoint.
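Both operations target /dev/md0 itself – the filesystem goes on the array, not on the member partition:

```shell
# Format the array with ext4
sudo mkfs.ext4 /dev/md0

# Mount it temporarily at /mnt for the data copy
sudo mount /dev/md0 /mnt
df -h /mnt
```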

Step 4 – Copy the data

For this step, I’m going to use rsync, and I’ll probably run it multiple times because right now I have a live workload changing some of those files. I’ll have to shut down all processes that are updating the original volume before doing the final rsync.
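A sketch of the copy, assuming the original drive is mounted at /mounted/original as it is later in this post:

```shell
# -a preserves permissions, ownership, and timestamps; -H keeps hard links;
# -A and -X carry over ACLs and extended attributes.
# Trailing slashes copy the contents of the directory, not the directory itself.
sudo rsync -aHAX --info=progress2 /mounted/original/ /mnt/
```

Because rsync only transfers what changed, re-running the same command after stopping the workloads is cheap.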

This will run for some time, depending on how much you’ve got on the original disk. Once it is done, shut down anything that might be changing the original drive and run the same rsync command again. Then you can move on to the next step.

Step 5 – Un-mount both the array and the original drive

Un-mounting /mnt was easy, because this was a new mount and my repeated rsync runs were the only thing targeting it.

In the previous step I’d already stopped the docker containers that were using the volume as storage, so I thought it’d be similarly trivial to unmount. I was wrong.

Tracking down what was preventing the unmount required digging through the verbose output of sudo lsof, which shows all open files. It turned out that I had forgotten about the logging agent I have running, which reads some log files that live on this storage. Once I’d also stopped that process, I was good to go.
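The unmount attempts, and a way to narrow lsof down to just the offending mount, look like this:

```shell
sudo umount /mnt

# This one initially failed with "target is busy"
sudo umount /mounted/original

# List only the processes holding files open on that filesystem
sudo lsof +f -- /mounted/original
```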

Step 6 – Mount the array where the original drive was mounted

This should be as easy as modifying /etc/fstab to point to /dev/md0 where we used to point to the physical disk by-id.

Once /etc/fstab is fixed – we can just mount /mounted/original and restart everything we shut down.
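The before/after fstab change and the remount, with the old drive id as a placeholder:

```shell
# /etc/fstab – old entry (commented out) and its replacement:
# /dev/disk/by-id/ata-EXISTING_DRIVE_ID-part1  /mounted/original  ext4  defaults  0  2
# /dev/md0                                     /mounted/original  ext4  defaults  0  2

# With the mount point defined in fstab, a bare mount works
sudo mount /mounted/original
df -h /mounted/original
```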

Step 7 – Make sure things are good

At this point we have a degraded RAID1 array and a full (but aging) copy of the data on a second physical drive. This isn’t a bad place to be, but we should make sure that everything is working as expected. Check that all of the workloads you have are not generating unusual logs and whatever else you can think of to check that the new copy of the data appears to be good to go.

Step 8 – Complete the RAID1 array

We are now going to add the original drive to the RAID1 array, changing it from degraded to whole. This is a little scary because we are about to destroy the original copy of the data, but the trade-off is that we’ll end up with a resilient mirrored drive setup of the newly mounted /dev/md0 filesystem.

Again, you will notice that I’m using the by-id specification of the original drive partition which ends in -part1 as there is only one partition.
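The command itself is short – but note the warning above: this wipes the original partition’s contents as it becomes a mirror member (drive id is a placeholder):

```shell
# Destructive for the original copy: the partition is overwritten
# as it syncs up to become the second half of the mirror
sudo mdadm --add /dev/md0 /dev/disk/by-id/ata-EXISTING_DRIVE_ID-part1
```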

Once we’ve done this, we can monitor the progress of the two drives being mirrored (aka: recovering to RAID1 status)
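A couple of ways to watch the resync:

```shell
# Refresh the kernel's view of the rebuild every few seconds
watch -n 5 cat /proc/mdstat

# Or pull the state and percent-complete from mdadm directly
sudo mdadm --detail /dev/md0 | grep -E 'State|Rebuild'
```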

Once things have completed and the system is stable, a reboot isn’t a bad idea to ensure that everything will start up fine. This is generally a good thing to do whenever you are making changes to the system.

While this isn’t perfect protection from any sort of data loss, it should allow us to gracefully recover when one of the SSD drives stops working. Having a backup plan that you test regularly is a very good thing to add as another layer of data protection.

One thought on “Ubuntu adding a 2nd data drive as a mirror (RAID1)”

  1. Nice write-up!

    I’ve been using e2label and mounting by label, rather than mounting by drive ID. It’s a little less verbose, and lets me use automated scripts with multiple copies of a backup disk that I iterate through (mount --label weeklybackup /mnt).
