Volumes in Kubernetes

In this tutorial, we will take a look at volumes in Kubernetes. Let us look at volumes in Docker first. Docker containers are meant to be transient in nature, which means they are meant to last only for a short period of time.

As we discussed in Docker storage, if we don’t attach a volume to the container at runtime, all of its data is lost when the container is destroyed.

So, to persist the data processed by a container, we attach a volume to the container when it is created.

The data processed by the container is now placed in this volume, thereby retaining it permanently. Even if the container is deleted, the data remains in the volume.

The same holds in the Kubernetes world: PODs are transient in nature. When a POD is created to process data and then deleted, the data processed by it gets deleted as well.

To avoid this, we attach a volume to the POD. The data generated by the POD is now stored in the volume, and even after the POD is deleted, the data remains.

Volumes and Mounts

Let’s look at a simple implementation of volumes. We have a single-node Kubernetes cluster. We create a simple POD that generates a random number between 1 and 100 and writes it to a file at /opt/number.out. When the POD is deleted, the random number is deleted along with it.
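A POD like the one described here could be sketched as the manifest below. This is only an illustration: the POD name, the alpine image, the shuf command, and the restartPolicy setting are assumptions and are not taken from the tutorial itself.

```yaml
# Sketch of a POD that writes a random number to /opt/number.out.
# No volume is attached, so the file disappears when the POD is deleted.
apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator   # hypothetical name
spec:
  restartPolicy: Never            # assumption: run the command once and stop
  containers:
  - name: alpine
    image: alpine                 # assumption: any small image that provides shuf
    command: ["/bin/sh", "-c"]
    # Pick one random integer between 1 and 100 and write it to the file
    args: ["shuf -i 1-100 -n 1 > /opt/number.out"]
```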

To retain the number generated by the POD, we create a volume, and a volume needs storage. When you create a volume, you can choose to configure its storage in different ways.

We will look at the various options in a bit, but for now we will simply configure it to use a directory on the host. In this case, I specify the path /data on the host.

This way, any files created in the volume are stored in the /data directory on my node. Once the volume is created, we access it from a container by mounting the volume to a directory inside the container.

We use the volumeMounts field in each container to mount the data-volume to the directory /opt within the container.

The random number will now be written to the /opt mount inside the container, which is backed by the data-volume, which is in fact the /data directory on the host.
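Putting the pieces together, the POD with its volume and mount might look like the sketch below. The volume name data-volume, the /opt mount path, and the /data host path come from the text above; the POD name, image, command, restartPolicy, and the hostPath type are assumptions made for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator   # hypothetical name
spec:
  restartPolicy: Never            # assumption: run the command once and stop
  containers:
  - name: alpine
    image: alpine                 # assumption: any small image that provides shuf
    command: ["/bin/sh", "-c"]
    args: ["shuf -i 1-100 -n 1 > /opt/number.out"]
    volumeMounts:
    - name: data-volume           # must match the volume name declared below
      mountPath: /opt             # directory inside the container
  volumes:
  - name: data-volume
    hostPath:
      path: /data                 # directory on the node
      type: DirectoryOrCreate     # assumption: create /data if it does not exist
```

With a manifest like this, the file should appear as /data/number.out on the node and stay there after the POD is deleted.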

When the POD gets deleted, the file with the random number still lives on the host. Let’s take a step back and look at the volume storage options.

Volume storage options

We have already used the hostPath option to configure a directory on the host as the storage space for the volume. That works fine on a single node; however, it is not recommended for use in a multi-node cluster.

This is because the PODs would use the /data directory on all the nodes and expect all of them to be the same and have the same data.

Since the nodes are different servers, the directories are in fact not the same unless you configure some kind of external replicated cluster storage solution.

Kubernetes supports several types of standard storage solutions such as NFS, GlusterFS, Flocker, Fibre Channel, CephFS, and ScaleIO, or public cloud solutions like AWS EBS, Azure Disk or Azure File, and Google’s Persistent Disk.

For example, to configure an AWS Elastic Block Store volume as the storage for the volume, we replace the hostPath field of the volume with the awsElasticBlockStore field, along with the volume ID and filesystem type. The volume storage will now be on AWS EBS.
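As a rough sketch, the volumes section of the POD spec would then change as shown below. The volume ID is a placeholder and ext4 is an assumed filesystem type. Note also that newer Kubernetes releases have deprecated this in-tree volume type in favor of the EBS CSI driver, so treat the fragment as an illustration of the field layout rather than a recommended setup.

```yaml
# Fragment of a POD spec: only the volumes section changes,
# the volumeMounts entry in the container stays the same.
volumes:
- name: data-volume
  awsElasticBlockStore:
    volumeID: <volume-id>   # placeholder: ID of an existing EBS volume
    fsType: ext4            # assumption: filesystem type of that volume
```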
