ShuangLu
7 min read · Mar 27, 2022


What happens when volumeManager in the kubelet starts?

TL;DR

The volumeManager is initialized together with the kubelet and is also started by it. On start, it launches three goroutines that maintain the ‘desiredStateOfWorld’ and ‘actualStateOfWorld’ objects and ‘reconcile’ the volumes on the node toward the desired state.

The code referenced here is v1.21.2, and some of my understanding might be incorrect. Feel free to leave a comment.

Important Structs

actualStateOfWorld
  • attachedVolumes: AttachedVolume represents a volume the kubelet volume manager believes to be successfully attached to a node it is managing. Volume types that do not implement an attacher are assumed to be in this state.
attachedVolumes
  • mountedPods: The mountedPod object represents a pod for which the kubelet volume manager believes the underlying volume has been successfully mounted.
mountedPods
desiredStateOfWorld
  • volumeToMount: The volume object represents a volume that should be attached to this node and mounted to podsToMount.
volumeToMount
  • podToMount: The pod object represents a pod that references the underlying volume and should mount it once it is attached.
podToMount
MountedVolume
volumeToMount
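To make the shape of the ‘asw’ cache concrete, here is a minimal sketch of ‘attachedVolumes’ with its nested ‘mountedPods’, and of how a call like ‘GetAllMountedVolumes’ could flatten them into pod/volume pairs. The type and field definitions are simplified assumptions for illustration, not the kubelet's real v1.21.2 structs.

```go
package main

import (
	"fmt"
	"sort"
)

// attachedVolume is a simplified, hypothetical stand-in for an entry in the
// actual state of world: a volume plus the pods that have it mounted.
type attachedVolume struct {
	mountedPods map[string]bool // keyed by pod name
}

// actualStateOfWorld holds every attached volume, keyed by volume name.
type actualStateOfWorld struct {
	attachedVolumes map[string]*attachedVolume
}

// mountedVolume is one pod/volume combination.
type mountedVolume struct {
	podName    string
	volumeName string
}

// getAllMountedVolumes flattens the nested maps into pod/volume pairs.
// The result is sorted only to make the output deterministic.
func (asw *actualStateOfWorld) getAllMountedVolumes() []mountedVolume {
	var out []mountedVolume
	for volName, vol := range asw.attachedVolumes {
		for podName := range vol.mountedPods {
			out = append(out, mountedVolume{podName: podName, volumeName: volName})
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].volumeName != out[j].volumeName {
			return out[i].volumeName < out[j].volumeName
		}
		return out[i].podName < out[j].podName
	})
	return out
}

func main() {
	asw := &actualStateOfWorld{attachedVolumes: map[string]*attachedVolume{
		"data":   {mountedPods: map[string]bool{"web": true, "worker": true}},
		"config": {mountedPods: map[string]bool{"web": true}},
	}}
	for _, mv := range asw.getAllMountedVolumes() {
		fmt.Printf("%s is mounted by %s\n", mv.volumeName, mv.podName)
	}
}
```

The ‘dsw’ side has the same two-level shape, with ‘volumesToMount’ on the outside and ‘podsToMount’ on the inside.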

Stages

Initialization

  • The “volumeManager” (vm) is initialized when the kubelet starts.
  • During initialization, it creates two cache objects, “desiredStateOfWorld” (dsw) and “actualStateOfWorld” (asw), and one ‘operationExecutor’, which starts async attach/detach/mount/unmount operations.
  • It also initializes the ‘desiredStateOfWorldPopulator’ (dswp), which populates the dsw using the kubelet PodManager, and the ‘reconciler’, which reconciles the dsw with the asw by triggering attach, detach, mount, and unmount operations through the operationExecutor.
Initialization

Run

  • ‘Run’ of the volumeManager is called in the ‘Run’ of the kubelet.
  • There are three async calls in ‘Run’.
Run
volumePluginMgr.Run
dswp.Run

findAndAddNewPods

  1. It creates a map, ‘mountedVolumesForPod’, from unique pod name to outer volume name to MountedVolume.
  2. It calls ‘GetAllMountedVolumes’ of ‘asw’ to retrieve all mounted volumes. Each one is a combination of the ‘pod’ and ‘volume’ objects defined in ‘operationExecutor’. It then iterates over the mounted volumes to fill ‘mountedVolumesForPod’.
  3. It iterates over the pods returned by ‘podManager’ and runs ‘processPodVolumes’ on each to add its volumes to the ‘volumesToMount’ of ‘dsw’ and to update the ‘podsToMount’ under each ‘volumesToMount’ entry.
dswp.findAndAddNewPods
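Step 3 above can be sketched with plain maps. The ‘pod’ and ‘desiredState’ types below are hypothetical simplifications; the real ‘processPodVolumes’ performs additional checks that are left out here.

```go
package main

import "fmt"

// pod is a simplified stand-in for a pod known to the podManager.
type pod struct {
	name    string
	volumes []string
}

// desiredState maps volume name -> set of pod names that should mount it,
// loosely modelling 'volumesToMount' and the 'podsToMount' under each entry.
type desiredState map[string]map[string]bool

// processPodVolumes adds every volume of p to dsw and registers p under it.
func processPodVolumes(dsw desiredState, p pod) {
	for _, vol := range p.volumes {
		if dsw[vol] == nil {
			dsw[vol] = map[string]bool{}
		}
		dsw[vol][p.name] = true
	}
}

// findAndAddNewPods walks all known pods and populates the desired state.
func findAndAddNewPods(dsw desiredState, pods []pod) {
	for _, p := range pods {
		processPodVolumes(dsw, p)
	}
}

func main() {
	dsw := desiredState{}
	findAndAddNewPods(dsw, []pod{
		{name: "web", volumes: []string{"config", "data"}},
		{name: "worker", volumes: []string{"data"}},
	})
	// One pod wants "config", two pods want "data".
	fmt.Println(len(dsw["config"]), len(dsw["data"]))
}
```

Because pods are stored as a set under each volume, running the function repeatedly with the same pods leaves the desired state unchanged, which suits a loop that runs periodically.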

findAndRemoveDeletedPods

  1. It calls ‘GetVolumesToMount’ of ‘dsw’ to retrieve every ‘volumeToMount’ and iterates over them.
  2. In the iteration, for a pod that still exists in podManager, if the volume has no attachable plugin, it sets ‘pluginIsAttachable’ of the ‘volumesToMount’ entry to false. If the pod is terminated and ‘keepTerminatedPodVolumes’ is enabled, it moves on to the next item.
  3. For a pod that no longer exists in podManager, or one that is terminated while ‘keepTerminatedPodVolumes’ is disabled, it calls ‘GetPods’ to fetch the running pods from the kubeContainerRuntime.
  4. It iterates over the pods from the container runtime and checks for running containers. If any container is still running, the ‘dswp’ exits the loop and moves on to the next ‘volumeToMount’. If no container is running, it checks whether the volume is in the ‘attachedVolumes’ of ‘asw’. If the remaining requirements are met and the pod can be removed, it deletes the corresponding ‘podsToMount’ object from ‘volumesToMount’, and if no other pod refers to the volume, the volume object is deleted from ‘volumesToMount’ as well.
dswp.findAndRemoveDeletedPods
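The removal logic above can be reduced to a small sketch. It assumes the pod manager and container runtime lookups have already been collapsed into two sets, and it leaves out the ‘keepTerminatedPodVolumes’ and ‘asw’ checks, so it shows only the core rule: drop a pod from a volume when the pod is gone and none of its containers are running, then drop the volume when no pod is left.

```go
package main

import "fmt"

// desiredState maps volume name -> set of pod names that should mount it.
type desiredState map[string]map[string]bool

// findAndRemoveDeletedPods prunes dsw.
// podExists: pods still known to the pod manager (assumed precomputed).
// hasRunningContainers: pods the runtime still reports as running (assumed precomputed).
func findAndRemoveDeletedPods(dsw desiredState, podExists, hasRunningContainers map[string]bool) {
	for vol, pods := range dsw {
		for podName := range pods {
			if podExists[podName] {
				continue // pod is still wanted
			}
			if hasRunningContainers[podName] {
				continue // containers may still use the volume; retry later
			}
			delete(pods, podName)
		}
		if len(pods) == 0 {
			delete(dsw, vol) // no pod references the volume any more
		}
	}
}

func main() {
	dsw := desiredState{
		"data":   {"web": true, "worker": true, "batch": true},
		"config": {"web": true},
	}
	// "web" was deleted and has stopped; "batch" was deleted but still runs.
	findAndRemoveDeletedPods(dsw,
		map[string]bool{"worker": true},
		map[string]bool{"batch": true})
	fmt.Println(len(dsw), len(dsw["data"]))
}
```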
  • The third one calls ‘Run’ of the ‘reconciler’, which checks that volumes that should be mounted are mounted and volumes that should be unmounted are unmounted. ‘reconcile’ executes three functions: ‘unmountVolumes’, ‘mountAttachVolumes’ and ‘unmountDetachDevices’.
reconciler.Run

unmountVolumes

  1. It calls ‘GetAllMountedVolumes’ of ‘asw’ to retrieve all mounted volumes. Each one is a combination of the ‘pod’ and ‘volume’ objects defined in ‘operationExecutor’. It then iterates over them.
  2. For each volume and pod, it checks whether the volume and the pod exist in the ‘volumesToMount’ of ‘dsw’. If not, it calls ‘UnmountVolume’ of ‘operationExecutor’ to unmount the volume.
  3. Depending on whether the volume is a filesystem volume, the ‘operationExecutor’ calls one of two functions to generate the ‘UnmountVolume function needed to perform the unmount of a volume plugin’. For a ‘filesystem’ volume, the ‘operationGenerator’ finds the volume plugin from ‘volumeToUnmount’ and creates the corresponding ‘volumeUnmounter’. For example, for a configmap volume, the implementation in ‘configmap’ is invoked. With this unmounter it builds an ‘unmountVolumeFunc’, which cleans up the directory, uses the unmounter to ‘TearDown’ the volume (logging ‘UnmountVolume.TearDown succeeded xxxx’), and removes the volume/pod from the ‘attachedVolumes’ or ‘mountedPods’ of ‘asw’. The ‘operationExecutor’ then calls ‘Run’ of ‘pendingOperations’ to execute the ‘unmountVolumeFunc’, which prevents concurrent operations on the same volume.
reconciler.unmountVolumes
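The two ideas in these steps, diffing the actual state against the desired state and refusing to start a second operation on a volume while one is pending, can be sketched as below. The ‘pendingOperations’ type here is a deliberately small, hypothetical version keyed by volume name only; it is not the kubelet's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// mountedVolume pairs a pod with a volume it has mounted (simplified).
type mountedVolume struct{ pod, volume string }

// desiredState maps volume name -> set of pods that should mount it.
type desiredState map[string]map[string]bool

// volumesToUnmount returns the mounted pairs that are no longer desired.
func volumesToUnmount(mounted []mountedVolume, dsw desiredState) []mountedVolume {
	var out []mountedVolume
	for _, mv := range mounted {
		if !dsw[mv.volume][mv.pod] {
			out = append(out, mv)
		}
	}
	return out
}

var errAlreadyExists = errors.New("operation already pending for this volume")

// pendingOperations allows at most one in-flight operation per volume.
type pendingOperations struct {
	mu      sync.Mutex
	running map[string]bool
}

func newPendingOperations() *pendingOperations {
	return &pendingOperations{running: map[string]bool{}}
}

// run starts op in a goroutine unless the volume already has a pending operation.
func (p *pendingOperations) run(volume string, op func()) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.running[volume] {
		return errAlreadyExists
	}
	p.running[volume] = true
	go func() {
		defer func() {
			p.mu.Lock()
			delete(p.running, volume) // allow the next operation on this volume
			p.mu.Unlock()
		}()
		op()
	}()
	return nil
}

func main() {
	mounted := []mountedVolume{{"web", "data"}, {"old-job", "data"}}
	dsw := desiredState{"data": {"web": true}}

	ops := newPendingOperations()
	var wg sync.WaitGroup
	for _, mv := range volumesToUnmount(mounted, dsw) {
		mv := mv
		wg.Add(1)
		err := ops.run(mv.volume, func() {
			defer wg.Done()
			fmt.Printf("unmounting %s from %s\n", mv.volume, mv.pod)
		})
		if err != nil {
			wg.Done()
			fmt.Println("skipped:", err)
		}
	}
	wg.Wait()
}
```

Rejecting the second operation instead of queueing it works in a reconcile loop because the next pass simply tries again.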

mountAttachVolumes

  1. It calls ‘GetVolumesToMount’ of ‘dsw’ to retrieve all ‘volumesToMount’ and iterates over them.
  2. For each volume and pod, it checks whether the volume and the pod exist in the ‘attachedVolumes’ of the ‘asw’. If the volume doesn’t exist, the ‘asw’ returns ‘newVolumeNotAttachedError’; otherwise it checks whether the specific pod exists and returns a result according to its state.
  3. When ‘asw’ returns ‘newVolumeNotAttachedError’, the ‘reconciler’ checks whether ‘controllerAttachDetachEnabled’ is enabled or ‘PluginIsAttachable’ of the ‘volumeToMount’ is false. If either condition holds, the ‘reconciler’ calls ‘operationExecutor’ to execute ‘VerifyControllerAttachedVolume’. The ‘operationExecutor’ generates a function called ‘verifyControllerAttachedVolumeFunc’ to do the actual work. In this function, if ‘PluginIsAttachable’ of the ‘volumeToMount’ is false, the volume and devicePath details are written to the ‘attachedVolume’ of ‘asw’ by calling ‘AttachVolume’. If ‘ReportedInUse’ of the ‘volumeToMount’ is false, the ‘operationExecutor’ returns the error ‘Volume has not been added to the list of VolumesInUse in the node’s volume status’. Otherwise, the ‘operationExecutor’ fetches the node object from the apiserver and iterates over the volumes in ‘VolumesAttached’ of the node status. If an attached volume matches the ‘VolumeName’ of the ‘volumeToMount’, the volume is marked as attached in ‘asw’ via ‘MarkVolumeAsAttached’ and ‘Controller attach succeeded’ is logged.
  4. When ‘controllerAttachDetachEnabled’ is disabled and ‘PluginIsAttachable’ of the ‘volumeToMount’ is true, the ‘operationExecutor’ calls ‘AttachVolume’, which in turn calls ‘GenerateAttachVolumeFunc’. ‘GenerateAttachVolumeFunc’ finds the ‘attachableVolumePlugin’ and creates a ‘volumeAttacher’, which, like the ‘volumeUnmounter’ mentioned previously, is implemented by each volume plugin. The ‘volumeAttacher’ is executed to attach the volume, and on success ‘AttachVolume.Attach succeeded’ is logged. Finally, ‘MarkVolumeAsAttached’ of ‘asw’ is called.
  5. When ‘asw’ reports that the volume is not mounted, or returns an error that satisfies ‘IsRemountRequiredError’, the ‘operationExecutor’ calls ‘mountVolume’ to execute the mount operation, which calls ‘GenerateMountVolumeFunc’ for a filesystem volume.
  6. ‘GenerateMountVolumeFunc’ finds the ‘volumePlugin’ from the ‘VolumeSpec’ of the ‘volumeToMount’ and creates ‘mountVolumeFunc’. This function validates the nodeAffinity, creates a new mounter, creates a volumeAttacher if there is an ‘attachableVolumePlugin’, and creates a volumeDeviceMounter if there is a ‘deviceMountableVolumePlugin’. If there is a ‘volumeAttacher’, it calls WaitForAttach, which validates the volume status from the OS. If there is a ‘volumeDeviceMounter’ and the result of ‘GetDeviceMountState’ is not ‘DeviceGloballyMounted’, it calls ‘MountDevice’ of the ‘volumeDeviceMounter’, which invokes the plugin’s own implementation to mount the volume. If this succeeds, it marks the device as mounted in ‘asw’ by calling ‘MarkDeviceAsMounted’. Finally, it calls the plugin’s ‘SetUp’ implementation and, on success, ‘MarkVolumeAsMounted’ in ‘asw’.
  7. Once the mount function has been generated, the ‘operationExecutor’ returns the result of ‘Run’ of ‘pendingOperations’, called with the generated function as a parameter.
  8. When ‘asw’ returns an error that satisfies ‘IsFSResizeRequiredError’, the ‘operationExecutor’ calls ‘ExpandInUseVolume’ to resize the filesystem.
reconciler.mountAttachVolumes
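The branching described in the steps above can be condensed into a single decision function. This is a sketch under stated assumptions: the ‘volumeState’ values are a hypothetical encoding of what ‘asw’ reports (the real code works with error values), and the returned strings simply name the ‘operationExecutor’ call that the text says is chosen.

```go
package main

import "fmt"

// volumeState is a hypothetical summary of what the actual state of world
// reports for a volume/pod pair.
type volumeState int

const (
	notAttached volumeState = iota
	notMounted
	remountRequired
	fsResizeRequired
	mounted
)

// action names the operation the reconciler would request.
type action string

const (
	verifyControllerAttachedVolume action = "VerifyControllerAttachedVolume"
	attachVolume                   action = "AttachVolume"
	mountVolume                    action = "MountVolume"
	expandInUseVolume              action = "ExpandInUseVolume"
	noAction                       action = ""
)

// nextAction picks the operation for one volume/pod pair.
func nextAction(state volumeState, controllerAttachDetachEnabled, pluginIsAttachable bool) action {
	switch state {
	case notAttached:
		// The controller attaches, or the plugin has no attacher:
		// only verify instead of attaching from the kubelet.
		if controllerAttachDetachEnabled || !pluginIsAttachable {
			return verifyControllerAttachedVolume
		}
		return attachVolume
	case notMounted, remountRequired:
		return mountVolume
	case fsResizeRequired:
		return expandInUseVolume
	default:
		return noAction
	}
}

func main() {
	fmt.Println(nextAction(notAttached, true, true))
	fmt.Println(nextAction(notAttached, false, true))
	fmt.Println(nextAction(remountRequired, true, true))
	fmt.Println(nextAction(fsResizeRequired, true, true))
}
```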
