Portworx Tutorial : Demonstrate HA Cassandra Stateful Application

Portworx is a popular Kubernetes persistent storage and Docker storage solution. It’s a clustered block storage solution and provides a Cloud-Native layer from which containerized stateful applications programmatically consume block, file, and object storage services directly through the scheduler.

With Portworx, you can manage any database or stateful service on any infrastructure using any container scheduler. You get a single data management layer for all of your stateful services, no matter where they run.

In this post, we will learn how to deploy Cassandra to Kubernetes and use Portworx Volumes to provide HA capability:

Install and configure Portworx
Use the Portworx storage class to create a PVC with 2 replicas of the data
Use a simple YAML file to deploy Cassandra using this storage class
Validate data persistence by deleting the Cassandra pod

First, we will deploy Cassandra in a StatefulSet with a single node (replicas=1) to show the basics of node failover. We will create sample data, force Cassandra to flush the data to disk, and then fail over the Cassandra pod and show how it comes back up with its data intact. Then we will scale the cluster to 3 nodes and show how a volume is created dynamically for each one.

Quick Snapshot

Step #1: Validate Kubernetes
Step #2: Install Portworx
Step #3: Create StorageClass
Step #4: Deploy Cassandra
Step #5: Create a Cassandra Database
Step #6: Delete Cassandra Instance
Step #7: Verify data is still available
Step #8: Scale the cluster

Step #1: Validate Kubernetes

Use kubectl get nodes to check if the Kubernetes nodes are ready.
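As a minimal sketch, the readiness check can be wrapped in a small helper that counts nodes whose STATUS column is exactly Ready. The function name count_ready_nodes is hypothetical and only used for illustration.

```shell
# Hypothetical helper: print how many nodes report STATUS "Ready".
# A cordoned node shows "Ready,SchedulingDisabled" and is not counted.
count_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

Calling count_ready_nodes should print the same number as the total number of nodes in the cluster before you continue.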

Step #2: Install Portworx

Portworx requires at least 2 to 3 nodes in the cluster to have dedicated storage for use. It will then carve out virtual volumes from these storage pools. In this example, we use a 20GB block device that exists on each node.

Image – Choose the device to install portworx

In the install command, note the following parameters:

c=px-demo specifies the cluster name
b=true specifies to use internal etcd
kbVer=${VER} specifies the Kubernetes version
s=/dev/vdb specifies the block device to use
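The install command itself is not reproduced in the text, so the following is only a sketch of how the four parameters might be combined into a spec URL. It assumes the Portworx spec generator at install.portworx.com accepts them as query parameters and that the version parameter is spelled in lowercase (kbver); both are assumptions to verify against the Portworx documentation. The helper name px_spec_url is hypothetical.

```shell
# Hypothetical helper: build a spec URL from the parameters listed above.
# ASSUMPTION: base URL and query-string form are not given in the text.
px_spec_url() {
  # $1 = Kubernetes version, $2 = cluster name, $3 = block device
  printf 'https://install.portworx.com/?kbver=%s&b=true&s=%s&c=%s\n' "$1" "$3" "$2"
}
```

Under these assumptions, the spec could be fetched with curl -L -s -o px-spec.yaml "$(px_spec_url "$VER" px-demo /dev/vdb)" and applied with kubectl apply -f px-spec.yaml.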

Use kubectl get pods -n kube-system -l name=portworx -o wide to check that the Portworx pods are ready and in the Running state.

You can also check the cluster status using the pxctl command.
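One way to run pxctl is through kubectl exec on one of the Portworx pods. The sketch below assumes the pxctl binary lives at /opt/pwx/bin/pxctl inside the pod and reuses the name=portworx label from the previous command. The helper name px_status is hypothetical.

```shell
# Hypothetical helper: run "pxctl status" inside the first Portworx pod.
px_status() {
  # Pick the first pod carrying the name=portworx label.
  px_pod=$(kubectl get pods -n kube-system -l name=portworx \
    -o jsonpath='{.items[0].metadata.name}')
  # ASSUMPTION: pxctl is installed at /opt/pwx/bin/pxctl in the pod.
  kubectl exec "$px_pod" -n kube-system -- /opt/pwx/bin/pxctl status
}
```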

Now that the Portworx cluster is ready, we can proceed to the next step.

Step #3: Create StorageClass

StorageClass provides a way to describe the “classes” of storage. Various classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators.

The right storage class depends on the needs of the business application. For our scenario, we define a storage class with a replication factor of 2 to accelerate Cassandra node recovery, and we also define a group name for Cassandra so that we can take 3DSnapshots.

Refer to the Portworx documentation for a full list of supported parameters for Portworx volumes.

Create the storage class using the kubectl create command.
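The storage class definition is not shown in the text, so here is a minimal sketch of what it might look like. The name px-storageclass matches the one referenced by the StatefulSet manifest in this post, and repl: "2" comes from the description above. The provisioner (the in-tree kubernetes.io/portworx-volume) and the group name cassandra_vg are assumptions for illustration.

```shell
# Write a sketch of the StorageClass to a local file.
cat > px-storageclass.yaml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: px-storageclass        # name used by the StatefulSet manifest
provisioner: kubernetes.io/portworx-volume   # ASSUMPTION: in-tree Portworx provisioner
parameters:
  repl: "2"                    # replication factor of 2, as described above
  group: "cassandra_vg"        # ASSUMPTION: hypothetical volume group name
EOF
# Then create it with: kubectl create -f px-storageclass.yaml
```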

In production environments, you would also add the "fg=true" parameter to your StorageClass. It ensures that Portworx places each Cassandra volume and its replicas on separate nodes, so that after a node failure a Cassandra pod never fails over to a node where another Cassandra pod is already running. To enable this feature with a group of 3 volumes and 2 replicas each, you need a minimum of 6 worker nodes (3 volumes x 2 replicas).

Now that the StorageClass is ready, let’s deploy Cassandra on the cluster.

Step #4: Deploy Cassandra

In this step, we are going to deploy Cassandra using a StatefulSet, starting with a single node that we will later scale to 3. A StatefulSet is used to manage stateful applications: it maintains a sticky identity for each of its Pods. Kubernetes keeps this persistent identifier across any rescheduling.

Create the Cassandra StatefulSet below, which uses the Portworx storage class created in the earlier step.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
    - port: 9042
  selector:
    app: cassandra
---
# apps/v1beta1 was removed in Kubernetes 1.16; apps/v1 requires spec.selector
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 1
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      # Use the stork scheduler to enable more efficient placement of the pods
      schedulerName: stork
      containers:
        - name: cassandra
          image: gcr.io/google-samples/cassandra:v14
          imagePullPolicy: Always
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          resources:
            limits:
              cpu: "500m"
              memory: 1Gi
            requests:
              cpu: "500m"
              memory: 1Gi
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
          env:
            - name: MAX_HEAP_SIZE
              value: 512M
            - name: HEAP_NEWSIZE
              value: 100M
            - name: CASSANDRA_SEEDS
              value: "cassandra-0.cassandra.default.svc.cluster.local"
            - name: CASSANDRA_CLUSTER_NAME
              value: "K8Demo"
            - name: CASSANDRA_DC
              value: "DC1-K8Demo"
            - name: CASSANDRA_RACK
              value: "Rack1-K8Demo"
            - name: CASSANDRA_AUTO_BOOTSTRAP
              value: "false"
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          readinessProbe:
            exec:
              command:
                - /bin/bash
                - -c
                - /ready-probe.sh
            initialDelaySeconds: 15
            timeoutSeconds: 5
          # These volume mounts are persistent. They are like inline claims,
          # but not exactly because the names need to match exactly one of
          # the stateful pod volumes.
          volumeMounts:
            - name: cassandra-data
              mountPath: /cassandra_data
  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data
        annotations:
          volume.beta.kubernetes.io/storage-class: px-storageclass
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: cqlsh
spec:
  containers:
    - name: cqlsh
      image: mikewright/cqlsh
      command:
        - sh
        - -c
        - "exec tail -f /dev/null"

Create the StatefulSet using the kubectl create command.

Use the kubectl get pods command to check that the pod is READY.
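Instead of re-running kubectl get pods by hand, the check can be scripted. The sketch below polls the ready flag of the pod's first container. The helper name wait_for_pod_ready and the 5-second polling interval are arbitrary choices for illustration.

```shell
# Hypothetical helper: poll until the pod's first container reports ready.
wait_for_pod_ready() {
  # $1 = pod name, $2 = maximum number of attempts (5 seconds apart)
  attempts=0
  until [ "$(kubectl get pod "$1" \
      -o jsonpath='{.status.containerStatuses[0].ready}' 2>/dev/null)" = "true" ]; do
    attempts=$((attempts + 1))
    # Give up with a non-zero status once the attempt budget is used.
    [ "$attempts" -ge "$2" ] && return 1
    sleep 5
  done
}
```

For example, wait_for_pod_ready cassandra-0 60 waits up to about five minutes for the first Cassandra pod.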

As an optional step, you can use the pxctl command line to inspect the Portworx volume underlying the Cassandra pod that we have created.

From the output, note the following:

State indicates whether the volume is attached and shows the node on which it is attached. This is the node where the Kubernetes pod is running.
HA shows the number of configured replicas for this volume.
Labels shows the name of the PVC for this volume.
Replica sets on nodes shows the Portworx nodes on which the volume is replicated.
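A sketch of how that inspection might be scripted: look up the volume bound to the Cassandra PVC, then pass it to pxctl volume inspect inside a Portworx pod. The PVC name cassandra-data-cassandra-0 follows the StatefulSet naming pattern (claim template name plus pod name). The pxctl path /opt/pwx/bin/pxctl and the helper name are assumptions.

```shell
# Hypothetical helper: inspect the Portworx volume behind cassandra-0.
inspect_cassandra_volume() {
  # StatefulSet PVCs are named <claim-template>-<pod>.
  vol=$(kubectl get pvc cassandra-data-cassandra-0 -o jsonpath='{.spec.volumeName}')
  px_pod=$(kubectl get pods -n kube-system -l name=portworx \
    -o jsonpath='{.items[0].metadata.name}')
  # ASSUMPTION: pxctl is installed at /opt/pwx/bin/pxctl in the pod.
  kubectl exec "$px_pod" -n kube-system -- /opt/pwx/bin/pxctl volume inspect "$vol"
}
```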

Now that we have Cassandra ready, we can create a sample database and populate some data.

Step #5: Create a Cassandra Database

Initialize a sample database on our Cassandra instance using CQL commands.

The next step is to create a keyspace with a replication factor of 3 and insert some sample data:

Image – Create a keyspace and insert sample data

Once the data is inserted, query the table to confirm that the rows were created.
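The exact CQL statements are only shown as an image, so the following is a sketch rather than the original commands. It runs statements through the cqlsh pod defined in the manifest, using the cqlsh -e option to execute a statement and exit. The keyspace demo, the table features, and the inserted row are hypothetical. Depending on the cqlsh and Cassandra versions, an extra --cqlversion argument may be needed.

```shell
# Hypothetical helper: run CQL from the cqlsh pod against cassandra-0.
run_cql() {
  # $1 = CQL statement to execute
  kubectl exec cqlsh -- cqlsh cassandra-0.cassandra.default.svc.cluster.local -e "$1"
}

# Create a keyspace with replication factor 3, add one row, and read it back.
# Keyspace, table, and values are illustrative only.
seed_demo_data() {
  run_cql "CREATE KEYSPACE IF NOT EXISTS demo WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};" &&
  run_cql "CREATE TABLE IF NOT EXISTS demo.features (id varchar PRIMARY KEY, name varchar);" &&
  run_cql "INSERT INTO demo.features (id, name) VALUES ('px-1', 'snapshots');" &&
  run_cql "SELECT id, name FROM demo.features;"
}
```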

Now that the records are created, we can check whether failover works properly. Before that, we have to flush the in-memory data to disk (using the nodetool flush command) so that when Cassandra starts on another node it has access to the data that was just written. Cassandra keeps recent writes in in-memory memtables and writes them out to disk only when a flush is triggered.
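The flush can be issued from outside the pod with kubectl exec. A minimal sketch, with a hypothetical helper name:

```shell
# Hypothetical helper: force Cassandra to flush memtables to disk.
flush_cassandra() {
  # $1 = pod name (defaults to cassandra-0)
  kubectl exec "${1:-cassandra-0}" -- nodetool flush
}
```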

Step #6: Delete Cassandra Instance

Let us simulate a failure by cordoning the node where Cassandra is running and then deleting the Cassandra pod. The pod will then be rescheduled onto one of the nodes that hold a replica of the data.
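The two actions can be sketched as a single helper: look up the node hosting cassandra-0, cordon it, delete the pod, and print the node name so it can be uncordoned afterwards. The helper name is hypothetical.

```shell
# Hypothetical helper: cordon the node running cassandra-0, then delete the pod.
simulate_node_failure() {
  # Find the node that currently hosts the pod.
  node=$(kubectl get pod cassandra-0 -o jsonpath='{.spec.nodeName}')
  kubectl cordon "$node" >/dev/null &&
    kubectl delete pod cassandra-0 >/dev/null &&
    echo "$node"   # printed so the caller can run "kubectl uncordon" later
}
```

A typical use is NODE=$(simulate_node_failure), followed later by kubectl uncordon "$NODE".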

Once the Cassandra pod is deleted, Kubernetes starts to create a new Cassandra pod on another node. Use kubectl get pods to verify: when the pod comes back up, it will be in the Running and READY (1/1) state.

Image – Verify replacement pod starts running

Also, we have to uncordon the node before the next step.

Now that the new Cassandra pod is running, let’s check if the database we previously created is still intact.

Step #7: Verify data is still available

Let’s start a CQL Shell session and validate if the data is available.

Image – Verify if data is still available

Congrats! Our data is intact and survived the node failure.

Step #8: Scale the cluster

We will scale our Cassandra StatefulSet to 3 replicas using the kubectl scale command.

You can watch the pods being added with kubectl get pods -w.

It will take a minute or two for all three Cassandra nodes to come online and discover each other.
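As a sketch, scaling and the follow-up check could look like this. The second helper counts ring members that nodetool status reports as UN (Up/Normal), which is one way to tell when all three nodes have joined. Both helper names are hypothetical.

```shell
# Hypothetical helper: scale the Cassandra StatefulSet.
scale_cassandra() {
  # $1 = desired number of Cassandra pods
  kubectl scale statefulset cassandra --replicas="$1"
}

# Hypothetical helper: count ring members in the Up/Normal state.
count_up_normal() {
  # nodetool status prefixes healthy nodes with "UN".
  kubectl exec cassandra-0 -- nodetool status | grep -c '^UN'
}
```

After scale_cassandra 3, count_up_normal should eventually print 3.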


Author: Karthik
Publisher: Upnxtblog
