In this blog post, we describe the Kubernetes Volume Controller (KVC), an open source project we've developed that provides basic volume and data management in Kubernetes, tailored toward machine learning (ML) workloads and pipelines.
Data is a critical component of ML workloads and pipelines. Typically, data scientists and ML practitioners handle data using the existing primitives available through a scheduling system such as Kubernetes. However, users still need to keep track of the data as well as the relationship between the data and the primitives used. Data from multiple sources might be required to run a workload or pipeline. In some cases (e.g., hyperparameter tuning), the data might need to be replicated or made available on a subset of the compute nodes in a cluster. Ensuring the data is available on the nodes where the ML task is scheduled is also cumbersome. Moreover, the user experience with any software that alleviates these issues should be seamless: it should integrate well with existing ML workflows and should not hinder users' progress.
For example, to enable the execution of an ML workload for a user group, cluster operators might manually download a frequently used dataset to a subset of compute nodes in the cluster and label those nodes to indicate the presence of that dataset. A data scientist who needs that dataset in their workload then has to keep track of these labels and make sure their ML workload lands on a compute node where the required dataset is available. The same holds true for frequently used models. If the data scientist wants to explore a new dataset or model, this causes further difficulties and delays. Spending time on the orchestration of such a workflow is a drain on a data scientist's productivity.
We believe data scientists, ML practitioners, and cluster operators would be happy to offload these systems-level issues in ML to the scheduling substrate when possible. The goal of the Kubernetes Volume Controller (KVC) is to solve these issues for ML workloads and pipelines on the Kubernetes container orchestration system.
KVC provides a single interface to manage data from different data sources in a Kubernetes cluster using existing primitives such as API extension capabilities and volumes. It establishes a relationship between data and volumes and provides a way to abstract the details away from the user. When using KVC, users are expected to only interact with a single resource type in Kubernetes without having to worry about other underlying complexities.
Kubernetes natively supports a variety of volumes backed by different sources. However, data management for ML workloads on Kubernetes gives rise to several challenges with respect to user experience and system software, as illustrated by the scenarios described above.
KVC leverages the operator pattern in Kubernetes to satisfy the requirements specified above for data management for ML workloads. It consists of a custom resource definition (CRD) and a custom controller that drives the current state of a KVC custom resource (CR), reported in its status, toward the desired state declared in its spec. An example CR is shown in Figure 1. Each CR can contain one or more VolumeConfigs from different data sources, along with the metadata required to establish a relationship between volume and data and the information required to track and manage it. Each VolumeConfig contains an ID, the number of replicas required of this particular data, a data source type, options specific to the data source type, and labels to annotate the data. These labels can also be used to retrieve and search for a specific dataset in a cluster. The full schema for each data source type is described here.
Figure 1. Example KVC CR spec.
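For concreteness, here is a sketch of what such a CR might look like for an S3-backed dataset. The API group, kind, and option names (e.g., `sourceURL`, `awsCredentialsSecretName`) are illustrative assumptions based on the schema described above; consult the KVC repository for the exact fields.

```yaml
# Hypothetical KVC custom resource; field and option names are illustrative.
apiVersion: kvc.kubeflow.org/v1
kind: VolumeManager
metadata:
  name: kvc-example
spec:
  volumeConfigs:
    - id: vol1                    # ID for this VolumeConfig
      replicas: 2                 # number of nodes the data should be replicated to
      sourceType: S3              # data source type
      labels:                     # labels to annotate (and later search for) the data
        dataset: example-dataset
      options:                    # options specific to the S3 source type (assumed names)
        sourceURL: s3://my-bucket/my-dataset/
        awsCredentialsSecretName: aws-creds
```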
When created, the CR goes into a Pending state. The custom controller drives the execution of this CR to the desired state (i.e., the Running state) by creating the appropriate sub-resources and managing the data transfer when required. Depending on the data source type, the custom controller either creates persistent volumes (PVs) and persistent volume claims (PVCs), or creates a host-path volume and exposes the path along with node affinity details to guide the scheduling of pods for data gravity. An example status of a KVC CR can be seen in Figure 2.
The status of a CR gives the details of the current state of the resource. If everything executed successfully, the controller updates the status with a Running state and an array of volume statuses that map one-to-one to the array of VolumeConfigs. If there were any errors, the error is bubbled up in the CR status along with a corresponding verbose message to help the user debug the CR.
Figure 2. Example KVC CR status.
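A status for a host-path-backed CR might look roughly like the following sketch; the exact field names and the node-affinity label key are assumptions for illustration, not the definitive KVC schema.

```yaml
# Hypothetical KVC CR status; field names and values are illustrative.
status:
  state: Running
  message: successfully deployed all sub-resources
  volumes:
    - id: vol1
      volumeSource:
        hostPath:
          path: /var/datasets/kvc-resource-example  # node-local path where the data was placed
      nodeAffinity:                                 # guides pods to nodes holding the data
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kvc.kubeflow.org/kvc-example # assumed label key set by the controller
                  operator: Exists
      message: success
```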
The node affinity information provided in the CR status can be used as-is in a pod spec along with the host path to access the data. For example, to use the CR status specified in Figure 2, the node affinity details can be added in the node affinity field and the host path details can be added in the volumes field of a pod spec, respectively.
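As a sketch of this usage, a pod spec consuming such a volume could look like the following; the label key, host path, and container image below are placeholders standing in for the values a real CR status would report.

```yaml
# Hypothetical pod spec consuming a KVC-managed host-path volume.
apiVersion: v1
kind: Pod
metadata:
  name: kvc-consumer
spec:
  affinity:
    nodeAffinity:
      # Paste the nodeAffinity block from the CR status here (placeholder values below).
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kvc.kubeflow.org/kvc-example
                operator: Exists
  containers:
    - name: trainer
      image: my-training-image:latest              # placeholder image
      volumeMounts:
        - name: dataset
          mountPath: /data                         # where the container sees the data
  volumes:
    - name: dataset
      hostPath:
        path: /var/datasets/kvc-resource-example   # host path from the CR status
```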
If the data is in S3, a user can create a KVC CR with the S3 data source type and the additional metadata required to access the data. KVC will then provision the data on a number of nodes equal to the requested number of replicas and provide the node affinity details along with the host path on those nodes.
Figure 3. Example KVC Workflow for S3 Data Source Type
Figure 3 shows an example of how the KVC custom controller drives the execution for the S3 data source type. When a CR of the S3 data source type is created with the location and the expected number of replicas, the controller chooses a set of nodes, deploys a pod to each of those nodes, and downloads the data from the S3 location provided in the CR. If the download is successful, the CR status is updated with the appropriate volume source and the node affinity required to guide tasks onto nodes where the data is available. Otherwise, an appropriate error is propagated to the CR status.
If the data is located in an NFS share, the user can create a KVC CR with the SourceType as NFS and provide the NFS server IP and an exported path. The PV and PVC pair provided in the status can be used in a pod spec to mount the data.
Figure 4. Example KVC workflow for NFS Source Type
Figure 4 illustrates an example flow for the NFS source type. When a CR is created with NFS as the source type, the KVC controller creates an NFS PV and a PVC using the server endpoint and exported path provided in the CR and exposes them via the CR status. If there is any error in the process, an appropriate error is updated in the CR status.
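To make the NFS flow concrete, here is a hedged sketch of such a CR; the option names (`server`, `path`) are assumptions based on the description above, and the addresses are placeholders.

```yaml
# Hypothetical KVC CR for an NFS share; field and option names are illustrative.
apiVersion: kvc.kubeflow.org/v1
kind: VolumeManager
metadata:
  name: kvc-nfs-example
spec:
  volumeConfigs:
    - id: nfsvol
      replicas: 1
      sourceType: NFS
      options:
        server: 10.1.2.3        # NFS server IP (placeholder)
        path: /exported/data    # exported path (placeholder)
```

Once the CR reaches the Running state, the PVC exposed in its status can be referenced from a pod spec through a standard `persistentVolumeClaim` volume source.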
Every data source type must implement the DataHandler interface, and additional source types can be added by implementing the same interface. An example of how to implement the interface can be seen in this pull request.
Read the developer manual in the KVC repository for more information on contributing. Provide feedback, share ideas, and report bugs by opening and commenting on issues. You can get in touch with us in the Kubeflow community Slack channel or by emailing the kubeflow-discuss mailing list. We look forward to hearing from you!
We thank Elson Rodriguez, Jeremy Lewi, Nan Liu, Jose Aguirre, Scott Leishman and Jason Knight for providing feedback on this project.