Integrating LVM with Hadoop for elastic Datanode storage

Adarsh Saxena
Mar 14, 2021

What is LVM?

In Linux, Logical Volume Manager (LVM) is a device mapper framework that provides logical volume management for the Linux kernel.

Let's head over to the practical implementation.

Part 1: Create an LVM partition and mount it for data storage

Step 1: List all the available hard disks

fdisk -l
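If you prefer a quick tree view of the attached block devices, lsblk gives the same information in a more compact form (an optional check, not required for the steps below):

lsblk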

Step 2: Create a PV (Physical Volume)

The physical volume will be created with the same size as the disk we are using.

pvcreate /dev/xvdf
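To verify that the PV was created, you can run pvdisplay against the device (or pvs for a one-line summary):

pvdisplay /dev/xvdf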

Step 3: Create Volume Group with previously created PV

Create a volume group from the previously created physical volume. Here, the VG is named medvg.

vgcreate medvg /dev/xvdf

To see the details of the VG (volume group) that we have just created, you can use the vgdisplay command:

vgdisplay

Step 4: Create a new LV (Logical Volume)

Here, we’re creating a logical volume of size 1 GiB from the medvg volume group.

lvcreate --size 1G --name medlv medvg

You can check whether the LV has been created using the following command:

fdisk -l
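Alternatively, lvdisplay shows the new logical volume's size and the volume group it belongs to:

lvdisplay /dev/medvg/medlv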

Step 5: Format the LV & mount it

mkfs.ext4 /dev/medvg/medlv
mkdir /dnfolder
mount /dev/medvg/medlv /dnfolder
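To confirm the mount, check df; if you also want the mount to survive a reboot (an optional step, assuming the /dnfolder mount point used above), append an entry to /etc/fstab:

df -h /dnfolder
echo "/dev/medvg/medlv /dnfolder ext4 defaults 0 0" >> /etc/fstab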

Now you just need to have the DataNode share this folder with the cluster. For that, add the folder to the DataNode's configuration file.
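As a sketch, assuming Hadoop 2.x or later, the folder is registered through the dfs.datanode.data.dir property in hdfs-site.xml (pointing at the /dnfolder mount point created above); restart the DataNode service afterwards so the change takes effect:

<property>
    <name>dfs.datanode.data.dir</name>
    <value>/dnfolder</value>
</property>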

Part 2: Dynamically increase the shared storage of the DataNode

Step 1: Attach a new storage device

We’ve attached a new storage device of size 4 GiB. After attaching it, you can verify that it is visible using the following command:

fdisk -l /dev/xvdg

Step 2: Create a PV

This time the PV will be 4 GiB, the same size as the storage device we’ve just attached.

pvcreate /dev/xvdg
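You can confirm that both physical volumes now exist with pvs, which lists each PV along with the VG it belongs to:

pvs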

Step 3: Add this PV to the previously created VG

vgextend medvg /dev/xvdg
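Running vgdisplay again should now show medvg with the additional free space contributed by /dev/xvdg:

vgdisplay medvg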

Step 4: Extend the LV previously created

lvextend -L +4G /dev/medvg/medlv -r
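The -r (--resizefs) flag grows the ext4 filesystem together with the logical volume, so no separate resize step is needed. If the new 4 GiB disk leaves slightly less than 4 GiB of usable extents and lvextend complains about free space, one workaround is to claim all remaining free space instead:

# claim all remaining free space in the VG (alternative to -L +4G)
lvextend -l +100%FREE -r /dev/medvg/medlv

# grow the mounted ext4 filesystem manually if -r was omitted
resize2fs /dev/medvg/medlv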

Step 5: Check that everything is working

To check whether the LV has been extended, use the following command:

fdisk -l
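df shows the extension from the filesystem side, i.e. the extra space actually available under the mount point:

df -h /dnfolder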

To check whether the storage has been updated in Hadoop, use the following command:

hadoop dfsadmin -report
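On newer Hadoop releases the hadoop dfsadmin form is deprecated in favour of the hdfs client, so you may need the equivalent command:

hdfs dfsadmin -report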

Conclusion

We’ve successfully configured the DataNode storage with LVM and dynamically increased the storage shared by our DataNode.

Written by Adarsh Saxena

Hey everyone, I am a DevOps practitioner. Cloud computing, big data, and machine learning are my favorite areas. Connect with me on LinkedIn to know more about me.
