Install Alluxio on Kubernetes


This documentation shows how to install Alluxio on Kubernetes via Operator, a Kubernetes extension for managing applications.

Prerequisites

  • Kubernetes
    • A Kubernetes cluster with version at least 1.19, with feature gates enabled
    • Ensure the cluster’s Kubernetes Network Policy allows for connectivity between applications (Alluxio clients) and the Alluxio Pods on the defined ports
    • The Kubernetes cluster has helm 3 with version at least 3.6.0 installed
    • An image registry for storing and managing container images
  • Permissions. Reference: Using RBAC Authorization
    • Permission to create CRD (Custom Resource Definition)
    • Permission to create ServiceAccount, ClusterRole, and ClusterRoleBinding for the operator pod
    • Permission to create the namespace that the operator will run in
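
Before moving on, the version requirements above can be checked quickly from any machine with access to the cluster; the exact output format depends on your kubectl and helm versions.

# confirm the Kubernetes server version is at least 1.19
$ kubectl version

# confirm helm 3 is installed with version at least 3.6.0
$ helm version --short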

Preparation

Download the files for Alluxio operator and Alluxio cluster

  • alluxio-operator-1.2.0-helmchart.tgz is the helm chart for deploying Alluxio operator
  • alluxio-k8s-operator-1.2.0-docker.tar is the docker image for Alluxio operator
  • alluxio-csi-1.2.0-docker.tar is the docker image for Alluxio CSI, which is required by default
  • alluxio-enterprise-AI-3.2-5.2.0-docker.tar is the docker image for Alluxio

Upload the images to an image registry

This example shows how to upload the Alluxio operator image. Repeat these steps for all the images above.

# load the image to local
$ docker load -i alluxio-k8s-operator-1.2.0-docker.tar

# retag the image with your private registry
$ docker tag alluxio/k8s-operator:1.2.0 <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/k8s-operator:1.2.0

# push to the remote registry
$ docker push <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/k8s-operator:1.2.0
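
The remaining images follow the same load, tag, and push steps. The source tags below are assumptions based on the image names used in the configuration files later in this guide; check the output of docker load for the actual tags.

# CSI image (source tag assumed; see the operator configuration below)
$ docker load -i alluxio-csi-1.2.0-docker.tar
$ docker tag alluxio/csi:1.2.0 <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/csi:1.2.0
$ docker push <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/csi:1.2.0

# Alluxio image (source tag assumed; see the cluster configuration below)
$ docker load -i alluxio-enterprise-AI-3.2-5.2.0-docker.tar
$ docker tag alluxio/alluxio-enterprise:AI-3.2-5.2.0 <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/alluxio-enterprise:AI-3.2-5.2.0
$ docker push <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/alluxio-enterprise:AI-3.2-5.2.0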

Extract the helm chart for operator

# the command will extract the files to the directory alluxio-operator/
$ tar zxf alluxio-operator-1.2.0-helmchart.tgz

Prepare configuration files

  • For the operator, put the configurations in alluxio-operator/alluxio-operator.yaml
image: <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/k8s-operator
imageTag: 1.2.0
alluxio-csi:
  image: <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/csi
  imageTag: 1.2.0
  • For the Alluxio cluster, put the configurations in alluxio-operator/alluxio-cluster.yaml to describe the cluster. To create a standard cluster, you can use the minimal configuration shown in the Configuration section below. The properties in the .spec.properties field will be passed to Alluxio processes via the alluxio-site.properties configuration file.
  • To mount an external storage to the Alluxio cluster, put the configurations in alluxio-operator/ufs.yaml. The example will mount an existing S3 path to Alluxio. For more information, please refer to Storage Overview.
apiVersion: k8s-operator.alluxio.com/v1
kind: UnderFileSystem
metadata:
  name: alluxio-s3
spec:
  alluxioCluster: alluxio
  path: s3://my-bucket/path/to/mount
  mountPath: /s3
  mountOptions:
    s3a.accessKeyId: xxx
    s3a.secretKey: xxx
    alluxio.underfs.s3.region: us-east-1

Deployment

Deploy Alluxio operator

# the last parameter is the directory to the helm chart
$ helm install operator -f alluxio-operator/alluxio-operator.yaml alluxio-operator
NAME: operator
LAST DEPLOYED: Wed May 15 17:32:34 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

# verify if the operator is running as expected
$ kubectl get pod -n alluxio-operator
NAME                                              READY   STATUS    RESTARTS   AGE
alluxio-controller-6b449d8b68-njx7f               1/1     Running   0          45s
operator-alluxio-csi-controller-765f9fd65-drjm4   2/2     Running   0          45s
operator-alluxio-csi-nodeplugin-ks262             2/2     Running   0          45s
operator-alluxio-csi-nodeplugin-vk8r4             2/2     Running   0          45s
ufs-controller-65f7c84cbd-kll8q                   1/1     Running   0          45s
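
The operator also registers the custom resource definitions used in the rest of this guide. As an optional check (the exact CRD names may differ between versions), they can be listed with:

# the AlluxioCluster and UnderFileSystem CRDs should appear in the output
$ kubectl get crd | grep alluxio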

Deploy Alluxio

$ kubectl create -f alluxio-operator/alluxio-cluster.yaml
alluxiocluster.k8s-operator.alluxio.com/alluxio created

# the cluster will be starting
$ kubectl get pod
NAME                                          READY   STATUS              RESTARTS   AGE
alluxio-etcd-0                                0/1     ContainerCreating   0          7s
alluxio-etcd-1                                0/1     ContainerCreating   0          7s
alluxio-etcd-2                                0/1     ContainerCreating   0          7s
alluxio-master-0                              0/1     Init:0/1            0          7s
alluxio-monitor-grafana-847fd46f4b-84wgg      0/1     Running             0          7s
alluxio-monitor-prometheus-778547fd75-rh6r6   1/1     Running             0          7s
alluxio-worker-76c846bfb6-2jkmr               0/1     Init:0/2            0          7s
alluxio-worker-76c846bfb6-nqldm               0/1     Init:0/2            0          7s

# check the status of the cluster
$ kubectl get alluxiocluster
NAME      CLUSTERPHASE   AGE
alluxio   Ready          2m18s

# and check the running pods after the cluster is ready
$ kubectl get pod
NAME                                          READY   STATUS    RESTARTS   AGE
alluxio-etcd-0                                1/1     Running   0          2m3s
alluxio-etcd-1                                1/1     Running   0          2m3s
alluxio-etcd-2                                1/1     Running   0          2m3s
alluxio-master-0                              1/1     Running   0          2m3s
alluxio-monitor-grafana-7b9477d66-mmcc5       1/1     Running   0          2m3s
alluxio-monitor-prometheus-78dbb89994-xxr4c   1/1     Running   0          2m3s
alluxio-worker-85fd45db46-c7n9p               1/1     Running   0          2m3s
alluxio-worker-85fd45db46-sqv2c               1/1     Running   0          2m3s

Note that in Alluxio 3.x, the “master” component is no longer on the critical I/O path; it is a stateless component that only serves jobs such as distributed load.
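
If the cluster stays in a phase other than Ready for an unusually long time, the custom resource records status and events that usually point at the failing component; this is a generic Kubernetes troubleshooting step rather than a required part of the installation.

# inspect the status and events recorded on the Alluxio cluster resource
$ kubectl describe alluxiocluster alluxio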

Mount storage to Alluxio

$ kubectl create -f alluxio-operator/ufs.yaml
underfilesystem.k8s-operator.alluxio.com/alluxio-s3 created

# verify the status of the storage
$ kubectl get ufs
NAME         PHASE   AGE
alluxio-s3   Ready   46s

# also check the mount table via Alluxio command line
$ kubectl exec -it alluxio-master-0 -- alluxio mount list 2>/dev/null
Listing all mount points
s3://my-bucket/path/to/mount  on  /s3/ properties={s3a.secretKey=xxx, alluxio.underfs.s3.region=us-east-1, s3a.accessKeyId=xxx}
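
Optionally, the mount can be exercised by listing the mounted path through Alluxio. The alluxio fs ls command used here is an assumption and may differ between CLI versions.

# list the mounted S3 path through Alluxio to confirm the UFS is reachable
$ kubectl exec -it alluxio-master-0 -- alluxio fs ls /s3 2>/dev/null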

Configuration

Minimal configuration

Starting with the minimal configuration, we can build a standard cluster:

apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
metadata:
  name: alluxio
spec:
  image: <YOUR.PRIVATE.REGISTRY.HERE>/alluxio/alluxio-enterprise
  imageTag: AI-3.2-5.2.0
  properties:
    alluxio.license: "xxx"

  worker:
    count: 2

  pagestore:
    quota: 1000Gi
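
To confirm that the properties in .spec.properties were rendered into alluxio-site.properties, the file can be read back from a running pod. The path below relies on the default /opt/alluxio/conf mount described later in this guide.

# print the generated alluxio-site.properties from the master pod
$ kubectl exec -it alluxio-master-0 -- cat /opt/alluxio/conf/alluxio-site.properties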

Common use cases

Change the resource limitations

For every component, such as the worker, master, and FUSE, the resources can be changed with the following configuration:

apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  worker:
    resources:
      limits:
        cpu: "12"
        memory: "36Gi"
      requests:
        cpu: "1"
        memory: "32Gi"
    jvmOptions:
      - "-Xmx22g"
      - "-Xms22g"
      - "-XX:MaxDirectMemorySize=10g"
  • The container can never use more resources than the limits, and the requests are used during scheduling. For more information, please refer to Resource Management for Pods and Containers
  • The memory limit should be slightly higher than the sum of the heap size (-Xmx) and the direct memory size (-XX:MaxDirectMemorySize) to avoid out-of-memory problems.
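
After the cluster picks up the new settings, the effective limits and requests can be read back from a running worker pod. The pod name below is taken from the earlier example output, so the random suffix will differ in your cluster.

# show the resources applied to the worker container
$ kubectl get pod alluxio-worker-85fd45db46-c7n9p -o jsonpath='{.spec.containers[0].resources}'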

Use PVC for page store

apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  pagestore:
    type: persistentVolumeClaim
    storageClass: ""
    quota: 1000Gi
  • The PVC will be created by the operator
  • The storageClass defaults to standard, but can be set to an empty string for static binding
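
Once the cluster is running, the claims created by the operator can be checked; the claim names are managed by the operator and may vary between versions.

# confirm the page store PVCs exist and are bound
$ kubectl get pvc | grep alluxio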

Mount NAS or other host path

apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  hostPaths:
    worker:
      /mnt/nas: /ufs/data
    fuse:
      /mnt/nas: /ufs/data
  • The key is the host path on the node, and the value is the mounted path in the container
  • If using a NAS as the UFS, the same path needs to be mounted to both the worker and FUSE processes so that FUSE can fall back if any error occurs
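
After the cluster is deployed with this configuration, the mount can be verified from inside a worker container. The pod name is taken from the earlier example output and its suffix will differ in your cluster.

# confirm the NAS path is visible inside the worker container
$ kubectl exec -it alluxio-worker-85fd45db46-c7n9p -- ls /ufs/data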

Mount custom config maps

apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  configMaps:
    custom-config-map: /etc/custom
  • The key is the name of the ConfigMap, and the value is the mounted path in the container
  • /opt/alluxio/conf is already mounted by default, so custom config maps need to be mounted to other paths
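
The ConfigMap itself is assumed not to be created by the operator, so it should exist in the same namespace before the cluster references it; <LOCAL_CONFIG_DIR> below is only a placeholder for your own files.

# create the ConfigMap referenced above from a local directory
$ kubectl create configmap custom-config-map --from-file=<LOCAL_CONFIG_DIR>

# after the cluster is running, the files should appear under the mounted path
$ kubectl exec -it alluxio-master-0 -- ls /etc/custom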

Use the root user

The FUSE pod always runs as the root user. The other processes run as the user with uid 1000 by default; inside the container, this user is named alluxio. To switch to the root user, use the following configuration:

apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  user: 0
  group: 0
  fsGroup: 0
  • Sometimes it is enough to specify only .spec.fsGroup = 0, when the files just need to be accessible to the root group
  • The ownership of mounted host paths, such as the page store path and the log path, will be transferred to root when switching to the root user.
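
A quick way to confirm the change is to check the user inside a running container after the cluster restarts; the id utility is assumed to be available in the image.

# should report uid=0 (root) after the change takes effect
$ kubectl exec -it alluxio-master-0 -- id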