Rook Ceph Continued…
There are more moving parts to Ceph that you’ll probably want to set up but that aren’t strictly needed. I’ll document them here. Feel free to skip anything in this section that you don’t care about.
Snapshot Classes
For some backup utilities, or even for manual backups, you’ll probably want to be able to create point-in-time snapshots of your PVCs with COW (copy-on-write) semantics. Ceph supports this natively via the .snap directories that exist in every directory of the filesystem (hidden even from ls -al). However, it takes a bit more setup to do this “the Kubernetes way”.
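You can see the native mechanism directly from any pod (or host) with a CephFS mount. The /mnt/data path below is a hypothetical mount point:

```shell
# .snap is never returned by directory listings, even with -a...
ls -a /mnt/data
# ...but it can be entered and listed directly, and shows any snapshots
# that have been taken of this directory:
ls /mnt/data/.snap
```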
First, we need to install the optional “external-snapshotter” manifests to help Kubernetes understand what a snapshot is and how to manage snapshot resources. These manifests are official resources that are agnostic to the CSI driver used, but do not always come with Kubernetes by default.
# Install snapshot CRDs
kubectl kustomize https://github.com/kubernetes-csi/external-snapshotter/client/config/crd | kubectl create -f -
# Install the CSI-agnostic snapshot controller
kubectl -n kube-system kustomize https://github.com/kubernetes-csi/external-snapshotter/deploy/kubernetes/snapshot-controller | kubectl create -f -
Now that the snapshotter is installed, the following YAML file will create a
VolumeSnapshotClass that will tell Kubernetes how to take these point-in-time
snapshots:
# kubectl apply -f <file>
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: rook-fs-snap
driver: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph # namespace where the Rook cluster is running
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
deletionPolicy: Delete
Then, you’ll be able to create snapshots of PVCs using a manifest like this:
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: cephfs-pvc-test-snap
  namespace: default
spec:
  volumeSnapshotClassName: rook-fs-snap
  source:
    persistentVolumeClaimName: cephfs-pvc-test
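Snapshots are taken asynchronously, so before restoring from one you may want to wait for its readyToUse flag to flip (the jsonpath form of kubectl wait requires kubectl 1.23 or later):

```shell
# Block until the snapshot is ready to restore from, or time out
kubectl -n default wait volumesnapshot/cephfs-pvc-test-snap \
  --for=jsonpath='{.status.readyToUse}'=true --timeout=120s
```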
… and create a new PVC to use that snapshot like this:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc-test-snap-view
  namespace: default
spec:
  # You'll likely also want storageClassName set to the same CephFS
  # storage class as the original PVC; otherwise the cluster default
  # storage class is used, which may not be able to restore the snapshot.
  dataSource:
    name: cephfs-pvc-test-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 16Ti
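If you want to poke around inside the restored view, a throwaway pod can mount it read-only. This is just a sketch; the pod name and image are arbitrary:

```yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: snap-browser # arbitrary name, for illustration only
  namespace: default
spec:
  containers:
    - name: shell
      image: busybox:stable
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: snap
          mountPath: /snap
          readOnly: true
  volumes:
    - name: snap
      persistentVolumeClaim:
        claimName: cephfs-pvc-test-snap-view
        readOnly: true
```

Then kubectl exec -it snap-browser -- ls /snap will show the snapshot’s contents.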
Block Storage
If you plan on running virtual machines, have specific filesystem requirements, and so on, you’ll likely want to set up block storage. The following YAML will set up a Ceph RBD pool backed by SSD storage:
# kubectl apply -f <file>
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: vm-storage-ssd
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
    requireSafeReplicaSize: true
  deviceClass: ssd
This one will set up a block-based storage class to consume the
CephBlockPool:
# kubectl apply -f <file>
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vm-storage-ssd
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  clusterID: rook-ceph
  # Ceph pool into which the RBD image shall be created
  pool: vm-storage-ssd
  # RBD image format. Defaults to "2".
  imageFormat: "2"
  # For 5.4 or later kernels:
  imageFeatures: layering,fast-diff,object-map,deep-flatten,exclusive-lock
  # The secrets contain Ceph admin credentials.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, csi-provisioner
  # will default to `ext4`. Note that `xfs` is not recommended due to a potential deadlock
  # in hyperconverged settings where the volume is mounted on the same node as the OSDs.
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
# Optional, if you want dynamic resize for PVCs.
# For now only ext3, ext4, and xfs resize support is provided, as in Kubernetes itself.
allowVolumeExpansion: true
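For VM disks specifically, you’d typically request the volume in raw block mode rather than letting the CSI driver format it with a filesystem. A sketch of such a PVC against the class above (the name and size are arbitrary):

```yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-disk-0 # arbitrary name, for illustration only
  namespace: default
spec:
  storageClassName: vm-storage-ssd
  accessModes:
    - ReadWriteOnce
  # Block exposes a raw device to the pod/VM; omit this (or use
  # Filesystem) to get an ext4 filesystem per the class above.
  volumeMode: Block
  resources:
    requests:
      storage: 32Gi
```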
NFS Server
The following YAML will provision the Ceph NFS gateway, which will allow you to mount anything in CephFS via NFS. This is exceptionally useful for troubleshooting deployments, as it’s much easier to port forward, does not require authentication (beyond IP address-based access controls), and doesn’t require any special drivers.
# kubectl apply -f <file>
---
apiVersion: ceph.rook.io/v1
kind: CephNFS
metadata:
  name: nfs-server
  namespace: rook-ceph
spec:
  server:
    active: 1
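Once applied, Rook should spin up an NFS-Ganesha server pod. A quick check (the app=rook-ceph-nfs label matches current Rook releases, but verify against yours):

```shell
kubectl -n rook-ceph get cephnfs nfs-server
kubectl -n rook-ceph get pods -l app=rook-ceph-nfs
```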
Please keep in mind that the usual NFS limitations apply. NFS is NOT a POSIX-compliant filesystem, and if your use case requires POSIX compliance, you’re going to have a bad time. This is simply a means of accessing your PVCs as ordinary document stores. File locks may not be respected, and I/O on an open file will fail if that file is deleted. That sounds like intuitive behavior, but it is not how POSIX filesystems work: normally a deleted file remains readable and writable through any handles that are still open, and software that relies on this can break in real, production-visible ways over NFS.
You’ll also need a service to access this NFS server. Here is an example:
# kubectl apply -f <file>
---
apiVersion: v1
kind: Service
metadata:
  name: nfs-server
  namespace: rook-ceph
spec:
  ports:
    - name: nfs
      port: 2049
  type: LoadBalancer
  loadBalancerIP: 10.3.0.192 # Set this to your IP address
  externalTrafficPolicy: Local
  selector:
    # Use the name of the CephNFS here
    ceph_nfs: nfs-server
    # It is safest to send clients to a single NFS server instance. Instance "a" always exists.
    # ^-- Please pay attention to this!!!
    # NFS can get weird in high availability deployments!!!
    # I'm tired of cleaning up messes caused by poor NFS deployments...
    instance: a
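After the Service is up, you can grab the external IP that clients will mount against later:

```shell
# The EXTERNAL-IP column is the address to use in mount commands
kubectl -n rook-ceph get svc nfs-server
```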
Now, to add an NFS share for the cephfs-pvc-test PVC we created earlier, run the following in your shell (the ceph commands need to run somewhere the Ceph CLI is configured, such as the rook-ceph-tools pod):
# Get the path to the named PVC within the CephFS cluster:
pvc_name="cephfs-pvc-test"
pv_name="$(kubectl get pvc "$pvc_name" --template '{{index . "spec" "volumeName"}}')"
subvol_path="$(kubectl get pv "$pv_name" --template '{{index . "spec" "csi" "volumeAttributes" "subvolumePath"}}')"
# Create an NFS export for the PVC path:
ceph nfs export create cephfs nfs-server /test kubefs "$subvol_path"
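If you want to double-check what got created, recent Ceph releases let you list and inspect exports from the same toolbox shell (older releases used ceph nfs export get instead of info):

```shell
# List exports belonging to this NFS cluster
ceph nfs export ls nfs-server
# Show the full export spec, including the CephFS path it maps to
ceph nfs export info nfs-server /test
```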
Assuming you got no errors, you can mount the share with a command like this:
sudo mount -t nfs -o nfsvers=4.1,proto=tcp <nfs_service_ip>:/test /path/to/your/mountpoint
Now to clean up, run the following commands:
sudo umount /path/to/your/mountpoint
ceph nfs export rm nfs-server /test
kubectl delete pvc cephfs-pvc-test
Dashboard Access
You can add a Service for accessing the dashboard outside the cluster as documented in the official docs. If you don’t want to expose the dashboard outside the cluster network, I recommend using an alias like this:
Linux (wl-clipboard):
alias rook-dash='kubectl get secret -n rook-ceph rook-ceph-dashboard-password --template '\''{{index . "data" "password"}}'\'' | base64 -d | wl-copy; echo '\''Copied password to clipboard'\''; kubectl port-forward -n rook-ceph service/rook-ceph-mgr-dashboard --address 127.0.0.1 8443:8443 & xdg-open '\''https://127.0.0.1:8443/'\''; fg'
Linux (xclip):
alias rook-dash='kubectl get secret -n rook-ceph rook-ceph-dashboard-password --template '\''{{index . "data" "password"}}'\'' | base64 -d | xclip -selection clipboard; echo '\''Copied password to clipboard'\''; kubectl port-forward -n rook-ceph service/rook-ceph-mgr-dashboard --address 127.0.0.1 8443:8443 & xdg-open '\''https://127.0.0.1:8443/'\''; fg'
Mac:
alias rook-dash='kubectl get secret -n rook-ceph rook-ceph-dashboard-password --template '\''{{index . "data" "password"}}'\'' | base64 -d | pbcopy; echo '\''Copied password to clipboard'\''; kubectl port-forward -n rook-ceph service/rook-ceph-mgr-dashboard --address 127.0.0.1 8443:8443 & open '\''https://127.0.0.1:8443/'\''; fg'
When run, this alias will pull the auto-generated admin secret from the cluster and copy it to your clipboard, then start a port forward and open the dashboard in your default browser.