Saturday, January 18, 2020

How-To Safely Remove A Storage Device From A Linux System?

There are situations where a storage device or LUN has to be removed from one system and attached to another for some other important purpose. So, how can this be achieved? A storage device attached to a system can be removed safely, provided the system has enough free space on another available volume to accommodate its data. In this blog page, I’ve documented all the steps, with screenshots, to demonstrate the procedure. I hope this helps someone; if so, please leave a comment or hit the Like button. Thank you!

Demo Setup
In this demonstration, I've taken the vg "datavg" as an example. It is made up of two physical volumes (pv) (/dev/sdb & /dev/sdc) and is used to create the logical volume (lv) "datalv". Consider a requirement to remove one of the pv’s, "/dev/sdc", completely from the system so that the storage team can use it for other purposes.
A snapshot of the device structure is shown here:
In this setup, we plan to move all the extents of the device ‘/dev/sdc’ into the available (free) extents of the device ‘/dev/sdd’. The step-by-step procedure is documented below.

Step 1: Ask all users to stop using the file system.
Notify everyone not to access the targeted file system or run any activity against it. Since such maintenance activities are pre-planned and follow the company's standard IT guidelines, this is not a big concern. Although we are not taking the file system down, as a precautionary measure it is better not to perform any operation on it. We could also broadcast a message; one such message, sent to everyone by the root user, is shown here:
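A broadcast of this kind could be sketched with `wall`, which writes to the terminals of all logged-in users. The mount point `/data`, the time and the helper name `notice` are placeholders of mine, not values from this setup.

```shell
# Compose the notice text; /data and 22:00 are placeholder values.
notice() {
    printf 'Maintenance on %s starts at %s. Please stop using it.\n' "$1" "$2"
}

# Preview the text first. As root, pipe it to wall to broadcast it:
#   notice /data 22:00 | wall
notice /data 22:00
```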

Step 2: Backup data.
This is another important step that has to be carried out and verified systematically. Take a backup before the activity and check that the backups are in good condition. The backup process varies from organization to organization, and each company follows its own procedure.

Step 3: Un-mount the device (or mount point)
Un-mount the block device being used. Again, this is not strictly necessary, but as a precautionary measure we’d un-mount the device. Run the "sync" command before the un-mount to make sure that all pending I/O is flushed to disk.
If the block device cannot be un-mounted, run "lsof <Mount-Point>" to check which process is accessing the device, then either wait for the process to exit cleanly or kill it if it is not important.
In the example shown below, the user “redhat” is accessing the block device on the terminal “tty3”, so we could not un-mount the device.
In such situations we could forcefully kill that user session (force a logout) if it is not important, or otherwise wait until the process exits cleanly (this is just an example). After this, un-mount the file system.
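The un-mount sequence could look roughly like the sketch below. The mount point `/data` is an assumption, and the `run` helper is my own addition: it only prints each command unless `APPLY=1` is set, so the sequence can be reviewed before anything is touched.

```shell
# Dry run by default: each command is printed, not executed.
# Set APPLY=1 and run as root to execute for real.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi; }

MNT=/data                 # assumed mount point, not taken from this setup

run sync                  # flush pending writes to disk
run lsof "$MNT"           # open files on the file system mounted at $MNT
run fuser -vm "$MNT"      # same question, answered per process and user
# Last resort, only for sessions that are not important:
#   run pkill -KILL -t tty3
run umount "$MNT"         # reports "target is busy" while files are still open
```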

Step 4: Flush IO.
Run "blockdev --flushbufs <DeviceName>" to flush any outstanding I/O for the device. NOTE: Please don’t run this command on other devices that may be I/O bound, or outside a planned activity.
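Since the flush should only ever hit the intended device, one option is a small guard around the command. This is a sketch; the function name `flush_dev` is hypothetical.

```shell
# Flush buffers for one device, refusing anything that is not a block device.
flush_dev() {
    if [ ! -b "$1" ]; then
        echo "not a block device: $1" >&2
        return 1
    fi
    blockdev --flushbufs "$1"
}

# Usage (as root):
#   flush_dev /dev/sdc
```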

Step 5: Remove device references to md (RAID) or LVM.
If the device is a member of an LVM volume group, move its data to another device using the ‘pvmove’ command, then run the ‘vgreduce’ command, and finally remove the physical volume (pv) using ‘pvremove’.
First, understand the corresponding volume group: identify its physical volumes, extent size, free extents and other attributes. Let’s run the “vgdisplay -v datavg” command to see the details of this volume group, its mapped physical volumes, etc.
Identify another physical volume that is equal to or bigger than “/dev/sdc”; more precisely, it needs at least as many free extents as are allocated on “/dev/sdc”.
In this demo setup, there is another sparsely used physical volume (/dev/sdd) as shown below, which is 2GB in size and will replace “/dev/sdc”:
Note: If the block device is not yet part of the volume group, one has to run ‘pvcreate’ on the block device and then ‘vgextend’ to make the pv part of the volume group.
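For the case in the note, adding a fresh device to the volume group might look like this. The device `/dev/sde` is hypothetical, and the `run` helper (my own addition) prints the commands unless `APPLY=1` is set.

```shell
# Dry run by default; set APPLY=1 and run as root to execute.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi; }

VG=datavg
NEW_PV=/dev/sde                # hypothetical new device, not part of this demo

run pvcreate "$NEW_PV"         # write LVM metadata to the device
run vgextend "$VG" "$NEW_PV"   # add the new pv to the volume group
run pvs "$NEW_PV"              # the VG column should now show datavg
```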
Let us find out which physical extents are used or free by using the ‘pvdisplay’ command. Run ‘pvdisplay’ with the “-m” option as shown below:

block device /dev/sdb

block device /dev/sdc

block device /dev/sdd
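The inspection commands named in this step can be gathered into one read-only helper. This is a sketch; the function name `show_layout` is mine, and the device names are the ones used in this demo.

```shell
# Read-only inspection of a volume group and its physical volumes.
# Needs root and the LVM2 tools; nothing is modified.
show_layout() {
    vg=$1; shift
    vgdisplay -v "$vg"        # VG summary followed by its LVs and PVs
    for pv in "$@"; do
        pvdisplay -m "$pv"    # extent map: which PE ranges are allocated or free
    done
}

# Usage for this demo:
#   show_layout datavg /dev/sdb /dev/sdc /dev/sdd
```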

From the above output, we can see that the physical volume ‘/dev/sdd’ has 411 physical extents free (411 × 4 MiB = 1644 MiB). Physical extents 100 through 510 are free, while extents 0 through 99 are in use.
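The arithmetic can be checked in the shell. The 4 MiB extent size is the one implied by the figures above, and `pe_to_mib` is a hypothetical helper name.

```shell
# Convert a count of physical extents to MiB for a given extent size (in MiB).
pe_to_mib() { echo $(( $1 * $2 )); }

pe_to_mib 411 4              # free space on /dev/sdd in MiB -> 1644
echo $(( 510 - 100 + 1 ))    # extents in the inclusive range 100-510 -> 411
```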

Step 6: Let us move the physical extents.
In this scenario, we need to move the physical extents of the device ‘/dev/sdc’ to ‘/dev/sdd’. The device ‘sdd’ has enough free extents to accommodate all the extents of ‘sdc’. There are a total of 255 PEs (physical extents) to move out of the device ‘sdc’, so we move them over to the device ‘sdd’ into the extent range 100 to 355 as shown below (the range is inclusive, so it spans 256 extents, one more than needed):
The operation needs enough free extents on the destination device to complete. NOTE: Please read the man page of the ‘pvmove’ command for details.
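A sketch of the move, using the device names and extent numbers from this demo. It checks that the destination range is large enough and then prints the `pvmove` command for review instead of executing it.

```shell
SRC=/dev/sdc
DST=/dev/sdd
FIRST=100; LAST=355    # destination extent range on /dev/sdd
NEEDED=255             # extents allocated on /dev/sdc

# PE ranges are inclusive, so 100-355 covers 256 extents.
AVAILABLE=$(( LAST - FIRST + 1 ))

if [ "$AVAILABLE" -ge "$NEEDED" ]; then
    # Printed for review; run the line as root once it looks right.
    echo "pvmove $SRC $DST:$FIRST-$LAST"
else
    echo "destination range too small: $AVAILABLE < $NEEDED" >&2
fi
```

Restricting the destination to a range is optional: with only the two device names, pvmove picks free extents on the destination itself.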

Step 7: Verify and remove the physical volume.
As the screenshots below show, the physical volume ‘/dev/sdc’ is now completely free and can be safely removed from the volume group.
The next step is to remove the physical volume from the volume group and wipe its physical volume metadata as shown below:
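Verification and removal could be sketched as follows. The `run` helper is my own addition and prints the commands unless `APPLY=1` is set; the volume group and device names are the ones from this demo.

```shell
# Dry run by default; set APPLY=1 and run as root to execute.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi; }

VG=datavg
PV=/dev/sdc

# The allocated-extent count must be 0 before the pv is taken out of the vg.
run pvs -o pv_name,vg_name,pv_pe_count,pv_pe_alloc_count "$PV"
run vgreduce "$VG" "$PV"   # remove the pv from the volume group
run pvremove "$PV"         # wipe the LVM label and metadata from the device
```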
NOTE: If the device is a multipath device, run "multipath -ll" and note all paths to the device, then flush the multipath map using "multipath -f <DeviceName>".

Step 8: Remove Device References.
Remove references to the device's path-based names, such as /dev/sd<x> or entries under /dev/disk/by-path, from any scripts, applications or utilities, so that devices added or removed in the future are not mistaken for the current device.
Finally, remove the device path from the SCSI subsystem. First, take the device offline and then delete it.
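On the SCSI side, both actions are writes to sysfs. The sketch below only prints the two commands for a given disk so they can be reviewed first; `scsi_detach_cmds` is a hypothetical helper name.

```shell
# Print the sysfs writes that detach a SCSI disk; nothing is executed here.
scsi_detach_cmds() {
    dev=${1#/dev/}    # accept either "sdc" or "/dev/sdc"
    printf 'echo offline > /sys/block/%s/device/state\n' "$dev"
    printf 'echo 1 > /sys/block/%s/device/delete\n' "$dev"
}

scsi_detach_cmds /dev/sdc
# Review the two lines, then run them as root.
# Afterwards "lsblk" should no longer list the disk.
```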
Notice that the device ‘/dev/sdc’ is no longer visible in the SCSI subsystem, so it can be detached from the server. The device can now be un-presented on the storage side and made available for other use.
NOTE: All such activities need some downtime. Please carry out this process under the supervision of a senior or expert Linux administrator. Make sure to plan, document and record the session, and have a fallback plan ready in case something goes wrong.


Navinika said...

I am really impressed your written a blog. Hope we are eagerly waiting for such post from your side. HATS OFF for the valuable information shared!

RedMood said...

nice work tiger

Edd said...

Amazing stuff. Thanks!