A while ago I realized one of my RAID arrays was running out of space. Since I didn't have the space required to take a backup of everything, I had to perform the upgrade in place. In this post I replace all drives in my RAID5 array, then resize the mdadm array, the LVM physical and logical volumes, the LUKS container and lastly the file system.
Upgrading the hardware
I have no space left in my case, nor any SATA connections left on my motherboard, so I bought a SATA to USB adapter which I used while replacing drives.
In this section we replace one drive at a time. We assume the old drive is /dev/sda, the new drive is /dev/sdb, and the mdadm volume we operate on is /dev/md126.
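Before touching anything it is worth double-checking which device node corresponds to which physical drive; serial numbers are the safest way to match them. A quick way to do that (device names here are just the ones assumed above):

```shell
# List drives by their stable by-id names, which include the serial number,
# so the USB-attached replacement can be told apart from the array members.
ls -l /dev/disk/by-id/ | grep -v part

# Cross-check with the kernel's view of model, serial and size.
lsblk -o NAME,MODEL,SERIAL,SIZE
```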
Partitioning the new drives
I partitioned the new drives with the same sized partitions as the old drives. I don't think that's required (anything bigger works); I just wanted to postpone the decision on partition size.
# Check the size of the partition of the previous drive
$ sudo parted /dev/sda unit s print
Model: ATA ST4000DM000-1F21 (scsi)
Disk /dev/sda: 7814037168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start  End          Size         File system  Name     Flags
 1      2048s  2930277160s  2930275113s               primary

# Partition the new drive in the same manner.
$ sudo parted /dev/sdb
(parted) mktable gpt
(parted) mkpart primary 2048s 2930277160s

# Check that it looks alright
$ sudo parted /dev/sdb unit s print
Model: ATA WDC WD40EFRX-68W (scsi)
Disk /dev/sdb: 7814037168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start  End          Size         File system  Name     Flags
 1      2048s  2930277160s  2930275113s               primary
Replacing the drives
It's easy to replace drives with mdadm, especially since version 3.3.
$ sudo mdadm --manage /dev/md126 --add-spare /dev/sdb1
mdadm: added /dev/sdb1
$ sudo mdadm /dev/md126 --replace /dev/sda1
mdadm: Marked /dev/sda1 (device 4 in /dev/md126) for replacement
--replace will replace the drive as soon as a replacement is available. When the drive has been replaced, the old drive is marked as faulty. Using --replace compared to simply swapping the drives has the advantage of never putting the array in a degraded state, as the old drive continues to be used until the replacement has been fully synced.
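While the replacement is syncing, progress can be followed just like a normal rebuild (using the array name assumed above):

```shell
# The replacement shows up as a recovery operation in /proc/mdstat.
watch -n 60 cat /proc/mdstat

# --detail shows the per-device state, including which drive is acting
# as the replacement target.
sudo mdadm --detail /dev/md126
```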
Once the drive is replaced, an email is generated with a Fail event. We then need to remove the old drive from the array:
$ sudo mdadm --manage /dev/md126 --remove failed
mdadm: hot removed 8:1 from /dev/md126
Increasing available space
In this section the mdadm device is still called /dev/md126, and both the LVM logical volume and physical volume are called frej.
NOTE: I suggest you simply create the partitions as big as you want them from the beginning; it makes the upgrade simpler.
I do not know why it would be beneficial to leave the partitions unused while resizing them. However, since there seems to be an equal divide between people saying you should and people saying it doesn't matter, my goal was to take the safe path and not use the partitions while resizing. I did, however, forget to stop the RAID when resizing 4 out of my 5 partitions and never noticed any issues.
$ sudo umount /mnt/frej
$ sudo lvchange -an /dev/frej/frej
$ sudo vgchange -an frej
  0 logical volume(s) in volume group "frej" now active
$ sudo cryptsetup close /dev/mapper/frej
$ sudo mdadm --stop /dev/md126
mdadm: stopped /dev/md126
When resizing with parted you provide the end of the partition, not its size. I started by specifying this as 4TB, but since my partition starts at 1048576B, the partition only became 3999998951936B big. Having slightly less than 4TB would have annoyed me, so I resized it to the first multiple of 1 MiB above 4TB. Moreover, we want the last byte of the partition, not the first byte of the next one, so we subtract one:
⌈(4*10¹² + 1024²)/1024²⌉*1024² - 1 = 4000001818623.
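The same calculation can be done in plain shell arithmetic, which also makes it easy to adapt for other target sizes:

```shell
# Compute the partition end byte: the partition starts at 1 MiB, its size
# should be the smallest MiB multiple of at least 4 TB, and the end byte
# is inclusive, hence the final minus one.
target=$((4 * 10**12))                      # desired minimum size in bytes
mib=$((1024 * 1024))                        # alignment unit (1 MiB)
start=$mib                                  # partition starts at 1 MiB
size=$(( (target + mib - 1) / mib * mib ))  # round size up to a MiB multiple
end=$(( start + size - 1 ))                 # inclusive end byte for parted
echo "$end"                                 # prints 4000001818623
```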
$ sudo parted /dev/sdb resizepart 1 4000001818623B
$ sudo parted /dev/sdb unit b print
Model: ATA WDC WD40EFRX-68W (scsi)
Disk /dev/sdb: 4000787030016B
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start     End             Size            File system  Name     Flags
 1      1048576B  4000001818623B  4000000770048B               primary
After resizing all partitions, let's see if it worked:
$ sudo mdadm --assemble --scan
mdadm: /dev/md/frej has been started with 5 drives.
$ sudo vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "frej" using metadata type lvm2
$ sudo vgchange -ay frej
  1 logical volume(s) in volume group "frej" now active
$ sudo mount -a
It seems I didn't have to reopen the LUKS container; I'm guessing this is because I have the device in crypttab and something causes it to be reopened automatically.
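Had the mapping not come back by itself, reopening it manually would look something like this, assuming (as throughout this post) that the LUKS container sits directly on the md device and that the mapping is named frej:

```shell
# Open the LUKS container on top of the assembled array,
# then activate the volume group inside it and mount.
sudo cryptsetup open /dev/md126 frej
sudo vgchange -ay frej
sudo mount -a
```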
Growing the RAID
$ sudo mdadm --grow /dev/md126 --size max
mdadm: component size of /dev/md126 has been set to 3906249728K
unfreeze
This will take a while, since it needs to resync the unused space.
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md126 : active raid5 sdb1 sdf1 sde1 sdd1 sdc1
      15624998912 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      [=======>.............]  resync = 37.5% (1468675572/3906249728) finish=767.9min speed=52901K/sec
For me it took about 13 hours to complete the resync, thus mdadm's estimate was quite accurate.
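If the resync crawls along far below what the drives can sustain, the kernel's rebuild speed limits are worth checking. These are global md settings, not specific to this array:

```shell
# Current limits, in KiB/s per device.
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max

# Temporarily raise the minimum so the resync is not throttled
# in favour of other I/O (the change does not survive a reboot).
sudo sysctl -w dev.raid.speed_limit_min=100000
```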
Resizing the LUKS container and the LVM physical volume
$ sudo cryptsetup resize /dev/mapper/frej
$ sudo pvresize /dev/mapper/frej
  Physical volume "/dev/mapper/frej" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized
Filling the empty space with random data
Since we have newly allocated space, we should fill it with random data to make sure no information is leaked through our LUKS container. I chose to do this by creating a new logical volume from the remaining free space, creating an encrypted device on top of it, and filling that device with zeros. Since the zeros are encrypted with a random key, the resulting data should be indistinguishable from random.
$ sudo lvcreate --extents 100%FREE --name filltemp frej
$ sudo cryptsetup --key-file=/dev/urandom create filltempcrypt /dev/frej/filltemp
$ sudo dd if=/dev/zero of=/dev/mapper/filltempcrypt bs=1M  # This takes a _long_ time
$ sudo cryptsetup close /dev/mapper/filltempcrypt
$ sudo lvchange -an /dev/frej/filltemp
$ sudo lvremove /dev/frej/filltemp
  Logical volume "filltemp" successfully removed
You can check the progress of dd by sending it the SIGUSR1 signal:
$ sudo pkill -USR1 -f '^dd if=/dev/zero'
The output from dd will look something like this:
220340+0 records in
220340+0 records out
112814080 bytes (113 MB) copied, 0.521177 s, 116 MB/s
NOTE: I ran into the same performance issue as I did years ago when setting up the array; since I had reinitialized the array after boot, my fixes had not been applied (they are run from rc.local).
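I will not repeat those fixes here, but a typical example of an md tuning knob that is lost whenever the array is re-assembled (which is why running it from rc.local only helps at boot) is the RAID5 stripe cache size:

```shell
# A larger stripe cache can significantly improve RAID5 write throughput;
# the value is in pages per device and resets when the array is re-assembled.
echo 8192 | sudo tee /sys/block/md126/md/stripe_cache_size
```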
Resize the LVM logical volume and file system
Now we need to extend our logical volume and file system. I chose to add 3 TiB, as that's what I needed to migrate the data I had on a different volume.
$ sudo lvextend --size +3T /dev/frej/frej
  Size of logical volume frej/frej changed from 5.46 TiB (1430796 extents) to 8.46 TiB (2217228 extents).
  Logical volume frej successfully resized
$ sudo resize2fs /dev/mapper/frej-frej
resize2fs 1.42.12 (29-Aug-2014)
Filesystem at /dev/mapper/frej-frej is mounted on /mnt/frej; on-line resizing required
old_desc_blocks = 350, new_desc_blocks = 542
The filesystem on /dev/mapper/frej-frej is now 2270441472 (4k) blocks long.
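A quick sanity check that the extra space actually arrived (paths and names as used throughout):

```shell
# The file system should now report roughly 3 TiB more free space,
# and the logical volume its new total size.
df -h /mnt/frej
sudo lvs frej
```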
Using modern technologies such as mdadm, LVM and LUKS it is really easy to increase the storage capacity of a server, and most of the steps can be performed online. My chassis does not allow drives to be replaced easily, which means I risk damaging them if I try to swap a drive physically while the system is running. Had I had a better chassis, I could have performed this entire procedure online, completely without downtime.