{"id":1939,"date":"2021-11-08T09:52:04","date_gmt":"2021-11-08T13:52:04","guid":{"rendered":"https:\/\/lowtek.ca\/roo\/?p=1939"},"modified":"2021-11-08T09:52:04","modified_gmt":"2021-11-08T13:52:04","slug":"ubuntu-adding-a-2nd-data-drive-as-a-mirror-raid1","status":"publish","type":"post","link":"https:\/\/lowtek.ca\/roo\/2021\/ubuntu-adding-a-2nd-data-drive-as-a-mirror-raid1\/","title":{"rendered":"Ubuntu adding a 2nd data drive as a mirror (RAID1)"},"content":{"rendered":"<p>Over the years I&#8217;ve had the expected number of hard drive failures. Some have been more catastrophic to me as I didn&#8217;t have a good backup strategy in place; others felt avoidable if I&#8217;d paid attention to the warning signs.<\/p>\n<p>My current setup for data duplication is based on <a href=\"https:\/\/www.snapraid.it\/\">Snapraid<\/a>, a non-traditional RAID solution. It allows mixed drive sizes, and replication is done by regularly running the sync operation. Mine runs daily: files are sync&#8217;d across the drives, and a data validation pass is done from time to time as well. This means that while I might lose up to 24 hours of data if the primary drive fails, I get lower usage of the main parity drive and the assurance that file corruption hasn&#8217;t happened.<\/p>\n<p>Snapraid is very bad when you have either many small files or frequently changing files. It is ideal for backing up media like photos or movies. To deal with the more rapidly changing data I&#8217;ve got an SSD for storage. I haven&#8217;t yet had an SSD fail on me, but that is sure to happen at some point. <a href=\"https:\/\/www.backblaze.com\/b2\/hard-drive-test-data.html\">Backblaze<\/a> is already seeing some <a href=\"https:\/\/www.backblaze.com\/blog\/are-ssds-really-more-reliable-than-hard-drives\/\">concerning failure rate information<\/a>. 
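For flavour, a daily sync plus an occasional scrub can be driven from cron. This fragment is purely illustrative &#8211; the schedule, paths, and scrub percentage are my assumptions, not my actual configuration:

```shell
# /etc/cron.d/snapraid -- illustrative schedule (times, paths, and
# percentage are assumptions, not the author's real config)
# Sync the parity daily at 03:00; scrub ~10% of the array on Sundays
0 3 * * *  root  /usr/bin/snapraid sync
0 5 * * 0  root  /usr/bin/snapraid scrub -p 10
```

The `scrub` pass is what provides the "data validation from time to time" mentioned above: it re-reads a percentage of the array and checks it against the stored parity and hashes.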
Couple this with the fact that my storage SSD started throwing errors the other day and only a full power cycle of the machine brought it back &#8211; it&#8217;s fine now, but for how long? Time to set up a mirror.<\/p>\n<p>For this storage I&#8217;m going back to traditional <a href=\"https:\/\/en.wikipedia.org\/wiki\/RAID\">RAID<\/a>. The SSD is a 480GB drive, and thankfully prices have dropped to well under $70. This additional drive now fills all 6 of the SATA ports on my motherboard; the next upgrade will need to be a SATA port expansion card. I&#8217;ve written about RAID <a href=\"https:\/\/lowtek.ca\/roo\/2012\/add-volum-to-raid5-on-ubuntu\/\">a few<\/a> <a href=\"https:\/\/lowtek.ca\/roo\/2010\/mirrored-drives-with-ubuntu\/\">times<\/a> <a href=\"https:\/\/lowtek.ca\/roo\/2011\/how-to-migrate-from-raid1-to-raid5\/\">here<\/a>.<\/p>\n<p>I&#8217;ve moved away from specifying drives as <code>\/dev\/sdbX<\/code> because these values can change. Even this new SSD caused the drive that was at <code>\/dev\/sdf<\/code> to move to <code>\/dev\/sdg<\/code>, allowing the new drive to use <code>\/dev\/sdf<\/code>. My <code>\/etc\/fstab<\/code> is now set up using <code>\/dev\/disk\/by-id\/xxx<\/code> because these names are persistent. 
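To find the persistent name for a drive, `ls -l /dev/disk/by-id/` shows the symlinks and which `sdX` device each one currently points at. An fstab entry built on one looks like this sketch &#8211; the device id and mount point here are illustrative examples, not my actual line:

```shell
# /etc/fstab fragment (illustrative: device id and mount point are examples)
/dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026841D62B77E8-part1  /data  ext4  defaults  0  2
```

Because the by-id name encodes the model and serial number, it survives SATA cable reshuffles and device enumeration changes that would break a `/dev/sdX` entry.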
Most of the disk utilities understand this format just fine, as you can see in this example with fdisk.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo fdisk -l \/dev\/sdf\r\nDisk \/dev\/sdf: 447.1 GiB, 480103981056 bytes, 937703088 sectors\r\nUnits: sectors of 1 * 512 = 512 bytes\r\nSector size (logical\/physical): 512 bytes \/ 512 bytes\r\nI\/O size (minimum\/optimal): 512 bytes \/ 512 bytes\r\n$ sudo fdisk -l \/dev\/disk\/by-id\/ata-KINGSTON_SA400S37480G_50026841D62B77E8\r\nDisk \/dev\/disk\/by-id\/ata-KINGSTON_SA400S37480G_50026841D62B77E8: 447.1 GiB, 480103981056 bytes, 937703088 sectors\r\nUnits: sectors of 1 * 512 = 512 bytes\r\nSector size (logical\/physical): 512 bytes \/ 512 bytes\r\nI\/O size (minimum\/optimal): 512 bytes \/ 512 bytes<\/pre>\n<p>Granted, working with <code>\/dev\/disk\/by-id<\/code> is a lot more verbose &#8211; but that id will not change if you re-organize the SATA cables.<\/p>\n<p>Let&#8217;s get going on setting up the new drive as a mirror for the existing one. Here&#8217;s the basic set of steps:<\/p>\n<ol>\n<li>Partition the new drive so it is identical to the existing one<\/li>\n<li>Create a RAID1 array in degraded state<\/li>\n<li>Format and mount the array<\/li>\n<li>Copy the data from the existing drive to the new array<\/li>\n<li>Un-mount both the array and the original drive<\/li>\n<li>Mount the array where the original drive was mounted<\/li>\n<li>Make sure things are good &#8211; the next step is destructive<\/li>\n<li>Add the original drive to the degraded RAID1 array making it whole<\/li>\n<\/ol>\n<p>It may seem like a lot of steps, and some of them are scary &#8211; but on the other side we&#8217;ll have a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Mdadm\">software RAID<\/a> protecting the data. 
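The whole sequence can be sketched as a dry-run script. It only echoes each command rather than executing it, the device ids are the ones from this post, and the sfdisk-based layout copy is my substitution for the interactive fdisk session covered below &#8211; treat it as an outline, not a finished implementation:

```shell
#!/bin/sh
# Dry-run outline of steps 1-8: prints each command instead of running it.
NEW=/dev/disk/by-id/ata-KINGSTON_SA400S37480G_50026841D62B77E8
OLD=/dev/disk/by-id/ata-ADATA_SU650_2K1220083359
run() { echo "+ $*"; }      # swap for: run() { sudo "$@"; } to go live

run sfdisk -d "$OLD" \| sfdisk "$NEW"          # 1. clone the partition layout
run mdadm --create /dev/md0 --level=mirror --raid-devices=2 "$NEW-part1" missing  # 2. degraded array
run mkfs -t ext4 /dev/md0                      # 3. format...
run mount /dev/md0 /mnt                        #    ...and mount
run rsync -avxHAX /mounted/original/. /mnt/.   # 4. copy (repeat until quiet)
run umount /mnt                                # 5. unmount both
run umount /mounted/original
run mount /dev/md0 /mounted/original           # 6. remount (after fixing fstab)
# 7. verify workloads look healthy -- the next step is destructive
run mdadm /dev/md0 --add "$OLD-part1"          # 8. complete the mirror
```

The `run` wrapper is just a safety interlock for the sketch; each echoed line corresponds to one of the numbered steps detailed below.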
The remainder of this post will be the details of those steps above.<\/p>\n<p><!--more--><\/p>\n<p><strong>Step 1 &#8211; Partitioning the new drive<\/strong><\/p>\n<p>Any time you&#8217;re about to partition (or re-partition) a drive, it is important to be careful. We could very easily target the wrong one, and then we&#8217;d have a big problem. Above you can see that I&#8217;ve done a <code>sudo fdisk -l \/dev\/disk\/by-id\/ata-KINGSTON_SA400S37480G_50026841D62B77E8<\/code> and have been able to see it is not yet partitioned. This is a good way to confirm we have the right device. I also want to look at the drive we want to mirror and figure out how it is partitioned.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo fdisk -l \/dev\/disk\/by-id\/ata-ADATA_SU650_2K1220083359\r\nDisk \/dev\/disk\/by-id\/ata-ADATA_SU650_2K1220083359: 447.1 GiB, 480103981056 bytes, 937703088 sectors\r\nUnits: sectors of 1 * 512 = 512 bytes\r\nSector size (logical\/physical): 512 bytes \/ 512 bytes\r\nI\/O size (minimum\/optimal): 512 bytes \/ 512 bytes\r\nDisklabel type: gpt\r\nDisk identifier: 0BAA6941-2577-4204-992E-CE9310B75D0C\r\n\r\nDevice                                             Start       End   Sectors   Size Type\r\n\/dev\/disk\/by-id\/ata-ADATA_SU650_2K1220083359-part1  2048 937701375 937699328 447.1G Linux filesystem<\/pre>\n<p>We want the new drive to look like that once we are done.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo fdisk \/dev\/disk\/by-id\/ata-KINGSTON_SA400S37480G_50026841D62B77E8\r\n\r\nWelcome to fdisk (util-linux 2.31.1).\r\nChanges will remain in memory only, until you decide to write them.\r\nBe careful before using the write command.\r\n\r\nDevice does not contain a recognized partition table.\r\nCreated a new DOS disklabel with disk identifier 0x5167495b.\r\n\r\nCommand (m for help): g\r\nCreated a new GPT disklabel (GUID: 6C260AF2-796D-5E49-8CB0-6EA5C3995D00).\r\n\r\nCommand (m for help): n\r\nPartition number (1-128, default 1): \r\nFirst sector (2048-937703054, default 2048): \r\nLast sector, +sectors or +size{K,M,G,T,P} (2048-937703054, default 937703054): 937701375\r\n\r\nCreated a new partition 1 of type 'Linux filesystem' and of size 447.1 GiB.\r\n\r\nCommand (m for help): w\r\nThe partition table has been altered.\r\nCalling ioctl() to re-read partition table.\r\nSyncing disks.\r\n<\/pre>\n<p>Note that I selected a non-default last sector to match the existing drive&#8217;s partition exactly. If the situation had been reversed, I&#8217;d be re-partitioning the existing drive to match the smaller new one. In short, to keep the mirror sane we want the same layout on both drives. It&#8217;s not a bad idea now to compare the partition layouts of the two drives to make sure we got this right.<\/p>\n<p><strong>Step 2 &#8211; Create a RAID1 array<\/strong><\/p>\n<p>We are going to create a RAID1 array, but with a missing drive &#8211; thus it will start in a degraded state. For this we need to look in \/dev\/disk\/by-id and select the partition name we just created in step 1. This will be <code>-part1<\/code> at the end of the device name we used.<\/p>\n<p>I can use <code>\/dev\/md0<\/code> because this is the first RAID array on this system.<\/p>\n<pre class=\"lang:default decode:true \">sudo mdadm --create --verbose \/dev\/md0 --level=mirror --raid-devices=2 \/dev\/disk\/by-id\/ata-KINGSTON_SA400S37480G_50026841D62B77E8-part1 missing\r\nmdadm: Note: this array has metadata at the start and\r\n    may not be suitable as a boot device.  If you plan to\r\n    store '\/boot' on this device please ensure that\r\n    your boot-loader understands md\/v1.x metadata, or use\r\n    --metadata=0.90\r\nmdadm: size set to 468717568K\r\nmdadm: automatically enabling write-intent bitmap on large array\r\nContinue creating array? 
y\r\nmdadm: Defaulting to version 1.2 metadata\r\nmdadm: array \/dev\/md0 started.<\/pre>\n<p>We can look at our array via <code>\/proc\/mdstat<\/code> &#8212; we should also expect to get an email from the system informing us that there is a degraded RAID array.<\/p>\n<pre class=\"lang:default decode:true \">$ cat \/proc\/mdstat \r\nPersonalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] \r\nmd0 : active raid1 sdf1[0]\r\n      468717568 blocks super 1.2 [2\/1] [U_]\r\n      bitmap: 0\/4 pages [0KB], 65536KB chunk\r\n<\/pre>\n<p><strong>Step 3 &#8211; Format and mount<\/strong><\/p>\n<p>We can now treat the new <code>\/dev\/md0<\/code> as a drive partition. This is standard Linux formatting and mounting. I&#8217;ll be using ext4 as the filesystem.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo mkfs -t ext4 \/dev\/md0\r\nmke2fs 1.44.1 (24-Mar-2018)\r\nDiscarding device blocks: done                            \r\nCreating filesystem with 117179392 4k blocks and 29302784 inodes\r\nFilesystem UUID: 45ac1aed-396e-4bb8-82db-abb876d3bf87\r\nSuperblock backups stored on blocks: \r\n        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, \r\n        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, \r\n        102400000\r\n\r\nAllocating group tables: done                            \r\nWriting inode tables: done                            \r\nCreating journal (262144 blocks): done\r\nWriting superblocks and filesystem accounting information: done<\/pre>\n<p>And now we mount it at the <code>\/mnt<\/code> mount point.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo mount \/dev\/md0 \/mnt\r\n$ ls \/mnt\r\nlost+found<\/pre>\n<p><strong>Step 4 &#8211; Copy the data<\/strong><\/p>\n<p>For this step, I&#8217;m going to use <a href=\"https:\/\/superuser.com\/a\/307542\">rsync<\/a> and probably run it multiple times, because right now I have a live workload changing some of those files. 
I&#8217;ll have to shut down all processes that are updating the original volume before doing the final rsync.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo rsync -avxHAX --progress \/mounted\/original\/. \/mnt\/.<\/pre>\n<p>This will run for some time, depending on how much you&#8217;ve got on the original disk. Once it is done, shut down anything that might be changing the original drive and run the same <code>rsync<\/code> command again. If files might have been deleted from the source between passes, that final run is also a sensible place to add <code>--delete<\/code> so the two copies match exactly. Then you can move on to the next step.<\/p>\n<p><strong>Step 5 &#8211; Un-mount both the array and the original drive<\/strong><\/p>\n<p>Un-mounting <code>\/mnt<\/code> was easy, because this was a new mount and my repeated rsync runs were the only thing targeting it.<\/p>\n<p>In the previous step I&#8217;d already stopped the docker containers that were using the volume as storage, so I thought it&#8217;d be similarly trivial to unmount. I was wrong.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo umount \/mounted\/original \r\numount: \/mounted\/original: target is busy.<\/pre>\n<p>Tracking down what was preventing the unmount required digging through the verbose output of <code>sudo lsof<\/code>, which shows all open files. It turned out that I had forgotten about the logging agent I have running, which was reading some log files that live on this storage. 
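The raw `lsof` dump is huge, so filtering on the mount point narrows it down quickly. A sketch of that filtering, run here against a canned two-line sample since the process names and paths are invented for illustration:

```shell
# Filter a (sample) lsof listing down to files under the busy mount point.
# Both lines are invented stand-ins for real `sudo lsof` output.
sample='systemd      1  root  cwd  DIR  8,1   4096    2 /
logagent   812  root    4r  REG  8,81  1024   77 /mounted/original/logs/app.log'
# Keep rows whose last field (the path) lives under /mounted/original/
printf '%s\n' "$sample" | awk '$NF ~ /^\/mounted\/original\//'
```

`fuser -vm /mounted/original` answers the same "who is holding this filesystem open?" question more directly, if the psmisc tools are installed.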
Once I&#8217;d also stopped that process, I was good to go.<\/p>\n<p><strong>Step 6 &#8211; Mount the array where the original drive was mounted<\/strong><\/p>\n<p>This should be as easy as modifying <code>\/etc\/fstab<\/code> to point to <code>\/dev\/md0<\/code> instead of the physical disk&#8217;s by-id path.<\/p>\n<pre class=\"lang:default decode:true \"># Mirrored 480GB SSDs for storage\r\n\/dev\/md0        \/mounted\/original  ext4    defaults        0       2<\/pre>\n<p>Once <code>\/etc\/fstab<\/code> is fixed, we can simply mount <code>\/mounted\/original<\/code> and restart everything we shut down.<\/p>\n<p><strong>Step 7 &#8211; Make sure things are good<\/strong><\/p>\n<p>At this point we have a degraded RAID1 array and a full (but aging) copy of the data on a second physical drive. This isn&#8217;t a bad place to be, but we should make sure that everything is working as expected. Check that your workloads are not generating unusual logs, and run whatever other checks you can think of to confirm that the new copy of the data is good to go.<\/p>\n<p><strong>Step 8 &#8211; Complete the RAID1 array<\/strong><\/p>\n<p>We are now going to add the original drive to the RAID1 array, changing it from degraded to whole. 
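One housekeeping item the numbered steps don't cover, and which I'd verify against your own distro before trusting (this is my addition, not part of the original procedure): recording the array in `mdadm.conf` so it re-assembles with the same `/dev/md0` name at boot. On Ubuntu that is typically:

```shell
# Append the array definition so it assembles as /dev/md0 at boot,
# then rebuild the initramfs so early boot sees it.
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u
```

Without this, the kernel may still auto-assemble the array but under a different name such as `/dev/md127`, which would break the fstab entry above. With that recorded, on to actually adding the second drive.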
This is a little scary because we are about to destroy the original copy of the data, but the trade-off is that we&#8217;ll end up with a resilient mirrored drive setup behind the newly mounted <code>\/dev\/md0<\/code> filesystem.<\/p>\n<pre class=\"lang:default decode:true \">$ sudo mdadm \/dev\/md0 --add \/dev\/disk\/by-id\/ata-ADATA_SU650_2K1220083359-part1<\/pre>\n<p>Again, you will notice that I&#8217;m using the <code>by-id<\/code> specification of the original drive partition, which ends in <code>-part1<\/code> since there is only one partition.<\/p>\n<p>Once we&#8217;ve done this, we can monitor the progress of the two drives being mirrored (aka recovering to a healthy RAID1 state).<\/p>\n<pre class=\"lang:default decode:true \">$ cat \/proc\/mdstat \r\nPersonalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] \r\nmd0 : active raid1 sdg1[2] sdf1[0]\r\n      468717568 blocks super 1.2 [2\/1] [U_]\r\n      [&gt;....................]  recovery =  0.3% (1754496\/468717568) finish=35.4min speed=219312K\/sec\r\n      bitmap: 4\/4 pages [16KB], 65536KB chunk\r\n\r\nunused devices: &lt;none&gt;<\/pre>\n<p>Once the recovery has completed and the system is stable, a reboot isn&#8217;t a bad idea to ensure that everything starts up fine &#8211; generally a good practice after any change to the system.<\/p>\n<p>While this isn&#8217;t perfect protection from every sort of data loss, it should allow us to gracefully recover when one of the SSDs stops working. A backup plan that you test regularly is a very good additional layer of data protection.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Over the years I&#8217;ve had the expected number of hard drive failures. Some have been more catastrophic to me as I didn&#8217;t have a good backup strategy in place; others felt avoidable if I&#8217;d paid attention to the warning signs. 
My current setup for data duplication is based on Snapraid, a non-traditional RAID solution. It allows &hellip; <a href=\"https:\/\/lowtek.ca\/roo\/2021\/ubuntu-adding-a-2nd-data-drive-as-a-mirror-raid1\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Ubuntu adding a 2nd data drive as a mirror (RAID1)&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,12],"tags":[],"class_list":["post-1939","post","type-post","status-publish","format-standard","hentry","category-computing","category-how-to"],"_links":{"self":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/1939","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/comments?post=1939"}],"version-history":[{"count":4,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/1939\/revisions"}],"predecessor-version":[{"id":1943,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/1939\/revisions\/1943"}],"wp:attachment":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/media?parent=1939"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/categories?post=1939"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/tags?post=1939"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}