Add drive to RAID5 on Ubuntu

Some time ago I migrated from a RAID1 setup to RAID5, running on the minimum of three drives. At some point this summer I spotted a good deal on a 1TB drive matching what I already had in my array and bought it. My purchase sat in my desk drawer for a month (or two) before I finally got around to installing it in the server. At least another couple of months went by before I added it to the array. It turns out to be really simple, and I'm kicking myself for dragging my feet.

With any hardware upgrade (especially drives) it's a good idea to capture what the system thinks things look like before you make any changes. For the most part Ubuntu refers to drives by UUID, but a couple of places (at least in my install) use the /dev/sd*# names, and those can trip you up when you shuffle hardware around. Capturing the drive assignments is simply a matter of:

$ sudo fdisk -l | grep ^/dev

Post hardware installation I was surprised at how much the /dev/sd*# assignments shuffled around. I was glad I had a before and after capture of the data; it also let me identify the new drive pretty easily.
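
A couple of extra commands (not part of my original capture, but easy cross-checks) give you identifiers that survive the shuffle:

$ sudo blkid                      # UUIDs and filesystem types per partition
$ ls -l /dev/disk/by-id/          # symlinks named after drive model/serial, pointing at /dev/sd*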

Early in my notes I have "could it be this simple?" and a link to the kernel.org wiki on RAID. It turns out that yes, it really is that simple, but you do need to follow the steps carefully. I also found an Ubuntu Forums post that was a good read for background.

I had temporarily used the new drive on an OS X system to do some recovery work, so it arrived with a GUID partition table (GPT) that fdisk wasn't very happy to work with. It turns out parted was happy to work with the volume and even let me change it back into something fdisk could handle.
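
From memory, the parted fix looked roughly like this, assuming the new drive shows up as /dev/sdc (double-check the device name first, since mklabel wipes the partition table):

$ sudo parted /dev/sdc
(parted) print                    # confirm this really is the new, empty drive
(parted) mklabel msdos            # replace the GPT with a plain MBR label
(parted) quit
$ sudo fdisk /dev/sdc             # then create a single Linux partition as usual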

I puzzled over the fact that this new drive wanted its partition to start at sector 2048 instead of 63; I was initially under the incorrect assumption that this was some leftover of the GPT setup I hadn't managed to fully clean up. Consider two basically identical volumes (old followed by new):

$ sudo fdisk -l /dev/sdb

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1           63  1953520064   976760001   83  Linux

$ sudo fdisk -l /dev/sdc

Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Device Boot      Start         End      Blocks   Id  System
/dev/sdc1         2048  1953525134   976761543+  83  Linux

The key difference to notice is the physical sector size, 4096 vs. 512 bytes; newer partitioning tools align partitions at sector 2048 (1MiB) to suit these 4K-sector drives, and that is the reason for the different start position. Ok, diversion over. Let's actually follow the wiki and get this drive added to the RAID array.

Start by looking at what we have:

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md_d3 : active raid5 sdf1[1] sdd1[0] sdb1[2]
1953519872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

So, my RAID5 array is /dev/md_d3, and I know my new drive is /dev/sdc1 after my parted/fdisk adventure above.

$ sudo mdadm --add /dev/md_d3 /dev/sdc1

Now we look at mdstat again and it shows we have a spare. Honestly, this much I should have done immediately after installing the drive: having a spare lets the RAID array fail over to it with no administrator intervention.

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md_d3 : active raid5 sdc1[3](S) sdf1[1] sdd1[0] sdb1[2]
1953519872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
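
If you want more detail than /proc/mdstat gives, mdadm will report on the array directly; the new disk should show up as a spare in the device list:

$ sudo mdadm --detail /dev/md_d3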

Next we grow the array across the new device:

$ sudo mdadm --grow --raid-devices=4 /dev/md_d3

You can peek at /proc/mdstat from time to time (or use the watch command) to monitor progress. This may take a while.
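
For example, to refresh the status every 10 seconds:

$ watch -n 10 cat /proc/mdstat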

Once this is done, don’t forget to modify /etc/mdadm/mdadm.conf as per the wiki: “To make mdadm find your array edit /etc/mdadm.conf and correct the num-devices information of your Array”
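
One way to see the current ARRAY definition, to compare against what's already in /etc/mdadm/mdadm.conf, is mdadm's scan mode. Depending on your mdadm version the scan output may or may not include num-devices, so merge it with the existing line rather than blindly appending:

$ sudo mdadm --detail --scan      # prints the ARRAY line(s) for the running arrays
$ sudoedit /etc/mdadm/mdadm.conf  # update the entry for md_d3 to match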

At this point we have our data spread across more drives, but we don't yet have a larger volume; we need to resize the filesystem to take advantage of the new space. It's recommended that you do the resize with the RAID5 volume unmounted (offline). I set about doing this and hit problems unmounting the volume: it turned out to be Samba holding on to it, and turning that service off fixed things.
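
If you hit the same thing, fuser will tell you what's holding the mount open. The mount point here is /stuff (as you'll see below), and the Samba service name is an assumption on my part; it's smbd or samba depending on the release:

$ sudo fuser -vm /stuff           # list processes with files open on the mount
$ sudo service smbd stop          # or 'samba', depending on the release
$ sudo umount /stuff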

Then I hit a showstopper: the resize2fs command failed:

$ sudo resize2fs -p /dev/md_d3
resize2fs 1.42 (29-Nov-2011)
resize2fs: Device or resource busy while trying to open /dev/md_d3
Couldn't find valid filesystem superblock.

Huh? This is something I suppose I'll sort out one day, but it beats me what is going on here. Fortunately you can also resize the filesystem while the RAID5 volume is online; it's slower and a bit scarier, but it works.
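
For reference, the offline route normally goes: unmount, forced fsck (resize2fs insists on one), then resize. It didn't work out for me here, but for completeness, and assuming the /stuff mount point from below:

$ sudo umount /stuff
$ sudo e2fsck -f /dev/md_d3       # forced check; resize2fs requires this before an offline resize
$ sudo resize2fs -p /dev/md_d3
$ sudo mount /stuff               # assumes an fstab entry for the mount point

The online resize, which is what I actually ended up running: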

$ sudo resize2fs /dev/md_d3
resize2fs 1.42 (29-Nov-2011)
Filesystem at /dev/md_d3 is mounted on /stuff; on-line resizing required
old_desc_blocks = 117, new_desc_blocks = 175
Performing an on-line resize of /dev/md_d3 to 732569952 (4k) blocks.

This was followed by a few moments of terror as I realized I was doing this over an SSH connection: what if the connection was lost? Next time I'll run it under screen, or nohup the process.
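
Something as simple as a named screen session would have taken the terror out of it (install screen with apt-get if it isn't already there):

$ screen -S resize                # start a named session
$ sudo resize2fs /dev/md_d3       # the long-running job now survives an SSH disconnect

Detach with Ctrl-a d, and reattach later with screen -r resize if the connection drops.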

It was neat to watch the free space on the volume creep upwards, at about 1GB every 2 seconds. Once this finishes, you're done. My RAID volume went from 1.9T to 2.8T with the new drive.
