How Rancher's Local Path Provisioner works
For the past month, I’ve been researching bare-metal Kubernetes distributions to use at our company. I’ve been using Rancher’s k3s, a lightweight Kubernetes distribution, and I find it very light and convenient to set up. Among the bunch of built-in tools bundled with k3s, I really like its super cool storage provisioner. Thank you Rancher, you guys rock \m/
What we’ll learn here#
- What it takes to dynamically provision a PV in Kubernetes
- What is Local Path Provisioner
- How to use Rancher’s Local Path Provisioner
- How dynamic storage external-provisioner works
Frankly speaking, I found it a bit hard to get more information about this kind of stuff; unless you work on the Kubernetes project itself or at a storage-focused company, you have to dig around projects like this and examine them yourself. I am sharing this to help people like me get started with Kubernetes storage, and with external provisioners specifically.
How Kubernetes Dynamic Storage Provisioning works#
I recommend reading IBM’s documentation to learn in detail how storage works in Kubernetes.
This is how you attach a storage volume to a pod in Kubernetes:
- A user creates a PVC
- The storage class listens for the request and provisions a PV (see the StorageClass sketch below)
- The PV is created by the storage provisioner
- The PVC and PV are bound together; the status is now Bound
- The pod being created is linked to the new PV
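What wires a PVC to a particular provisioner is the StorageClass object: its provisioner field names the component that must fulfil claims of that class. Here is a rough sketch of such a class, modeled on the local-path class we will use later in this post (check the deploy manifest for the canonical version):

# A StorageClass ties PVCs to a provisioner via its provisioner field
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
provisioner: rancher.io/local-path      # the provisioner that handles claims of this class
volumeBindingMode: WaitForFirstConsumer # delay provisioning until a pod uses the claim
reclaimPolicy: Delete

The WaitForFirstConsumer binding mode is why, in the demo further below, the PVC stays pending until the test pod shows up.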
The role of the storage provisioner here is to provision the PV for you. There are many provisioners available out there, both open source and closed source. If you are using a managed Kubernetes service (e.g. AWS EKS, GKE, AKS), the service provider likely handles steps 2 and 3 for you already, so you do not have to worry about them. Those provisioners are tightly integrated with the providers’ block storage offerings (e.g. Google Cloud Persistent Disk, AWS EBS, etc.)
The problem arises when you are running a bare-metal Kubernetes cluster, where you have to provision and manage the storage yourself. There are several options on a bare-metal setup:
- Use Rook.io or Rancher’s Longhorn (thanks again) to set up clustered storage within the Kubernetes cluster
- Use a networked storage cluster running Ceph or GlusterFS
- Use commercial solutions like Portworx, StorageOS, or OpenEBS
- Use a Local Persistent Volume or hostPath on a local node
- ...and many other ways
I was interested in using Local Persistent Volumes for my setup because of their speed. There’s one caveat with clustered network storage solutions (e.g. Ceph / GlusterFS / Rook.io): I/O performance drops by ~60% of its original level because of data replication, though in exchange you don’t have to worry about data loss when disks or nodes go missing. However, the Local Persistent Volume feature was only just released in official Kubernetes v1.14, and it DOES NOT come with a dynamic storage provisioner; every PV has to be created by hand, as sketched below. I wonder if they are working to make it a dynamic storage provisioner in the future.
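For context, this is roughly what a statically provisioned Local Persistent Volume looks like. You have to write one of these by hand for every disk on every node, including the nodeAffinity that pins it (a sketch; the names and paths are illustrative):

# A hand-written local PV: no provisioner creates this for you
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1           # must already exist on the node
  nodeAffinity:                     # mandatory for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - my-node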
Introducing Local Path Provisioner#
This is where Local Path Provisioner comes in. Local Path Provisioner gives its users the convenience of dynamically provisioning a Persistent Volume (PV) based on a Persistent Volume Claim (PVC), utilizing the local storage on each node. Based on the user’s configuration, Local Path Provisioner automatically creates a hostPath-based persistent volume on the node. It works just fine for my use case so far.
How to use Local Path Provisioner on your local cluster#
Step 1: Install the Local Path Provisioner in your cluster (Skip this if you are using k3s)
$ kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
Step 2: Prepare a PVC
$ kubectl create -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/examples/pvc.yaml
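The manifest behind that URL is a small claim; at the time of writing it looks roughly like this (check the repository for the canonical version):

# A 2Gi ReadWriteOnce claim against the local-path storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-path-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 2Gi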
Step 3: Create a volume-test pod
$ kubectl create -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/examples/pod.yaml
Step 4: Check for the PV being created for you
# PV is created
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-bc3117d9-c6d3-11e8-b36d-7a42907dda78 2Gi RWO Delete Bound default/local-path-pvc local-path 4s
# PVC is bound
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
local-path-pvc Bound pvc-bc3117d9-c6d3-11e8-b36d-7a42907dda78 2Gi RWO local-path 16s
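Behind that one-line listing is a full PV object; you can inspect it with kubectl get pv &lt;name&gt; -o yaml. Abridged, and reconstructed from the behaviour described later in this post, it looks roughly like this:

# Sketch of a provisioned PV: a hostPath into the storage directory,
# pinned via nodeAffinity to the node that owns the directory
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-bc3117d9-c6d3-11e8-b36d-7a42907dda78
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-path
  hostPath:
    path: /var/lib/rancher/k3s/storage/pvc-bc3117d9-c6d3-11e8-b36d-7a42907dda78
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k3s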
Step 5: Create a test file in the PV
$ kubectl exec volume-test -- sh -c "echo local-path-test > /data/test"
$ kubectl exec volume-test -- cat /data/test
> local-path-test
Step 6: Check the directory /var/lib/rancher/k3s/storage on the node’s disk
$ ssh <kubernetes node ip>
$ ls /var/lib/rancher/k3s/storage/
> pvc-d82e480f-2929-4717-a275-3d892ab63d4d
$ cat /var/lib/rancher/k3s/storage/pvc-d82e480f-2929-4717-a275-3d892ab63d4d/test
> local-path-test
Tada!! Your file is persisted on the node’s disk.
Check out the official GitHub repository for more information about this.
What happens under the hood#
The way Local Path Provisioner works is actually quite simple. Generally, Local Path Provisioner implements the provisioner interface from the kubernetes-incubator/external-storage library (the external provisioner) and adds its own logic on top of it.
How Local Path Provisioner is set up:
- A local-path-provisioner pod is deployed in your cluster.
- The pod runs the local-path-provisioner daemon.
- When first started, the daemon loads its config file from a Kubernetes ConfigMap to determine the path on each node under which volumes are created. By default on k3s, the value is /var/lib/rancher/k3s/storage (a sketch of this ConfigMap follows the list).
- The daemon proactively watches the config for changes.
- The daemon listens to Kubernetes cluster events using the Kubernetes Go client.
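That ConfigMap, sketched below (the exact manifest ships with the deploy YAML; the namespace varies by install, and on k3s the provisioner lives in kube-system):

# nodePathMap tells the daemon which host directory to carve volumes
# out of, per node, with a default entry for unlisted nodes
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-path-config
  namespace: kube-system
data:
  config.json: |
    {
      "nodePathMap": [
        {
          "node": "DEFAULT_PATH_FOR_NON_LISTED_NODES",
          "paths": ["/var/lib/rancher/k3s/storage"]
        }
      ]
    }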
The provisioning workflow of Local Path Provisioner (a Go sketch of these pieces follows the footnote below):
- A user creates a PVC request. The Local Path Provisioner daemon picks up the PVC event through its Kubernetes Go client.
- The daemon invokes the external provisioner’s controller.go (see the note below).
- The controller invokes the Provision() function in provisioner.go to prepare the volume.
- Provision() launches a temporary helper pod (a busybox container) which:
  - mounts /var/lib/rancher/k3s/storage from the host filesystem
  - creates a new directory, e.g. /var/lib/rancher/k3s/storage/random-id-1, on the host filesystem
- Once the new directory is ready, Provision() returns a PersistentVolume object to controller.go with the path of the new directory and the NodeAffinity of the node.
- Voila! Your PV is ready to be consumed.
* controller.go refers to /vendor/github.com/kubernetes-incubator/external-storage/lib/controller/controller.go in the repository.
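To make the split between controller.go and provisioner.go concrete, here is a minimal Go sketch of a provisioner built on that library. This is not the actual Local Path Provisioner source, just the shape of it: the library’s controller runs the event loop and leader election, and your code only supplies Provision() and Delete().

// Minimal sketch of an external provisioner (not the real source).
package main

import (
	"path/filepath"

	"github.com/kubernetes-incubator/external-storage/lib/controller"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type localPathProvisioner struct {
	basePath string // e.g. /var/lib/rancher/k3s/storage, from the ConfigMap
}

// Provision is called by controller.go for every PVC that names our
// storage class. It must return a ready-to-save PersistentVolume object.
func (p *localPathProvisioner) Provision(options controller.VolumeOptions) (*v1.PersistentVolume, error) {
	path := filepath.Join(p.basePath, options.PVName)

	// The real provisioner launches a busybox helper pod here to
	// mkdir `path` on the selected node before returning.

	return &v1.PersistentVolume{
		ObjectMeta: metav1.ObjectMeta{Name: options.PVName},
		Spec: v1.PersistentVolumeSpec{
			PersistentVolumeReclaimPolicy: options.PersistentVolumeReclaimPolicy,
			AccessModes:                   options.PVC.Spec.AccessModes,
			Capacity: v1.ResourceList{
				v1.ResourceStorage: options.PVC.Spec.Resources.Requests[v1.ResourceStorage],
			},
			PersistentVolumeSource: v1.PersistentVolumeSource{
				HostPath: &v1.HostPathVolumeSource{Path: path},
			},
			// The real provisioner also sets spec.nodeAffinity here so
			// the PV stays pinned to the node owning the directory.
		},
	}, nil
}

// Delete is the reverse path: remove the directory when the PV is
// released (again via a helper pod in the real implementation).
func (p *localPathProvisioner) Delete(pv *v1.PersistentVolume) error {
	return nil
}

func main() {
	// Wiring (elided): build a clientset, then run
	// controller.NewProvisionController(clientset, "rancher.io/local-path",
	//     &localPathProvisioner{basePath: "/var/lib/rancher/k3s/storage"},
	//     serverVersion).Run(stopCh)
}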
You can see all of this clearly by tailing the log:
$ kubectl logs local-path-provisioner-58fb86bdfd-5xhkv
ERROR: logging before flag.Parse: I1213 08:38:32.264357 1 leaderelection.go:187] attempting to acquire leader lease kube-system/rancher.io-local-path...
ERROR: logging before flag.Parse: I1213 08:38:32.290243 1 leaderelection.go:196] successfully acquired lease kube-system/rancher.io-local-path
ERROR: logging before flag.Parse: I1213 08:38:32.291043 1 controller.go:572] Starting provisioner controller rancher.io/local-path_local-path-provisioner-58fb86bdfd-5xhkv_f1c9359d-1d83-11ea-a159-5aac2052443a!
ERROR: logging before flag.Parse: I1213 08:38:32.292231 1 event.go:221] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"rancher.io-local-path", UID:"c727ff38-96e8-4535-a8cb-73f8e4959ab3", APIVersion:"v1", ResourceVersion:"392", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' local-path-provisioner-58fb86bdfd-5xhkv_f1c9359d-1d83-11ea-a159-5aac2052443a became leader
ERROR: logging before flag.Parse: I1213 08:38:32.391293 1 controller.go:621] Started provisioner controller rancher.io/local-path_local-path-provisioner-58fb86bdfd-5xhkv_f1c9359d-1d83-11ea-a159-5aac2052443a!
ERROR: logging before flag.Parse: I1213 08:43:56.017931 1 controller.go:927] provision "default/local-path-pvc" class "local-path": started
time="2019-12-13T08:43:56Z" level=info msg="Creating volume pvc-d82e480f-2929-4717-a275-3d892ab63d4d at k3s:/var/lib/rancher/k3s/storage/pvc-d82e480f-2929-4717-a275-3d892ab63d4d"
ERROR: logging before flag.Parse: I1213 08:43:56.025417 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"local-path-pvc", UID:"d82e480f-2929-4717-a275-3d892ab63d4d", APIVersion:"v1", ResourceVersion:"713", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/local-path-pvc"
time="2019-12-13T08:44:18Z" level=info msg="Volume pvc-d82e480f-2929-4717-a275-3d892ab63d4d has been created on k3s:/var/lib/rancher/k3s/storage/pvc-d82e480f-2929-4717-a275-3d892ab63d4d"
ERROR: logging before flag.Parse: I1213 08:44:18.120204 1 controller.go:1027] provision "default/local-path-pvc" class "local-path": volume "pvc-d82e480f-2929-4717-a275-3d892ab63d4d" provisioned
ERROR: logging before flag.Parse: I1213 08:44:18.120418 1 controller.go:1041] provision "default/local-path-pvc" class "local-path": trying to save persistentvolume "pvc-d82e480f-2929-4717-a275-3d892ab63d4d"
ERROR: logging before flag.Parse: I1213 08:44:18.132018 1 controller.go:1048] provision "default/local-path-pvc" class "local-path": persistentvolume "pvc-d82e480f-2929-4717-a275-3d892ab63d4d" saved
ERROR: logging before flag.Parse: I1213 08:44:18.132069 1 controller.go:1089] provision "default/local-path-pvc" class "local-path": succeeded
ERROR: logging before flag.Parse: I1213 08:44:18.132427 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"local-path-pvc", UID:"d82e480f-2929-4717-a275-3d892ab63d4d", APIVersion:"v1", ResourceVersion:"713", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-d82e480f-2929-4717-a275-3d892ab63d4d
At the line provision "default/local-path-pvc" class "local-path": started, the daemon has received the PVC request; that is where it starts doing its job. The rest of the lines are self-explanatory.
Let’s inspect the Kubernetes event stream:
$ kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
14m Normal Starting node/k3s Starting kubelet.
...
8m44s Normal WaitForFirstConsumer persistentvolumeclaim/local-path-pvc waiting for first consumer to be created before binding
8m34s Normal Provisioning persistentvolumeclaim/local-path-pvc External provisioner is provisioning volume for claim "default/local-path-pvc"
8m14s Normal ExternalProvisioning persistentvolumeclaim/local-path-pvc waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
8m12s Normal ProvisioningSucceeded persistentvolumeclaim/local-path-pvc Successfully provisioned volume pvc-d82e480f-2929-4717-a275-3d892ab63d4d
<unknown> Normal Scheduled pod/volume-test Successfully assigned default/volume-test to k3s
8m11s Normal Pulling pod/volume-test Pulling image "nginx:stable-alpine"
7m58s Normal Pulled pod/volume-test Successfully pulled image "nginx:stable-alpine"
7m58s Normal Created pod/volume-test Created container volume-test
7m57s Normal Started pod/volume-test Started container volume-test
There is not much in the Kubernetes event stream; it simply shows the process of the PV being scheduled, successfully created, and bound.
How the PV gets deleted#
I’ll leave that for you to discover yourself. Hint: it’s roughly the reverse of what we’ve seen here. Hit me up on Twitter if you need more clues :D
This is everything I gathered after reading the Go code and Kubernetes documentation for a few hours. I hope you find it as beneficial as I did. If you get creative enough, you might be able to write your own Kubernetes storage external provisioner to suit your needs, maybe one that invokes Lambda API requests here and there and sends you notifications :D
Disclaimer: I am not associated with the Kubernetes community or with Rancher. If you find anything false or inaccurate in this article, feel free to reach out.
If you have any questions, additions, or corrections, feel free to reach me on Twitter at @sdil :)