Wednesday, July 13, 2016
Thousand Instances of Cassandra using Kubernetes Pet Set
Editor’s note: this post is part of a series of in-depth articles on what’s new in Kubernetes 1.3
Running The Greek Pet Monster Races
For the Kubernetes 1.3 launch, we wanted to put the new Pet Set through its paces. By testing a thousand instances of Cassandra, we could make sure that Kubernetes 1.3 was production ready. Read on for how we adapted Cassandra to Kubernetes, and had our largest deployment ever.
It’s fairly straightforward to use containers with basic stateful applications today. Using a persistent volume, you can mount a disk in a pod, and ensure that your data lasts beyond the life of your pod. However, with deployments of distributed stateful applications, things can become more tricky. With Kubernetes 1.3, the new Pet Set component makes everything much easier. To test this new feature out at scale, we decided to host the Greek Pet Monster Races! We raced Centaurs and other Ancient Greek Monsters over hundreds of thousands of races across multiple availability zones.
As many of you know Kubernetes is from the Ancient Greek: κυβερνήτης. This means helmsman, pilot, steersman, or ship master. So in order to keep track of race results, we needed a data store, and we choose Cassandra. Κασσάνδρα, Cassandra who was the daughter of King of Priam and Queen Hecuba of Troy. With multiple references to the ancient Greek language, we thought it would be appropriate to race ancient Greek monsters.
From there the story kinda goes sideways because Cassandra was actually the Pets as well. Read on and we will explain.
One of the new exciting features in Kubernetes 1.3 is Pet Set. In order to organize the deployment of containers inside of Kubernetes, different deployment mechanisms are available. Examples of these components include Resource Controllers and Daemon Set. Pet Sets is a new feature that delivers the capability to deploy containers, as Pets, inside of Kubernetes. Pet Sets provide a guarantee of identity for various aspects of the pet / pod deployment: DNS name, consistent storage, and ordered pod indexing. Previously, using components like Deployments and Replication Controllers, would only deploy an application with a weak uncoupled identity. A weak identity is great for managing applications such as microservices, where service discovery is important, the application is stateless, and the naming of individual pods does not matter. Many software applications do require strong identity, including many different types of distributed stateful systems. Cassandra is a great example of a distributed application that requires consistent network identity, and stable storage.
Pet Sets provides the following capabilities:
- A stable hostname, available to others in DNS. Number is based off of the Pet Set name and starts at zero. For example cassandra-0.
- An ordinal index of Pets. 0, 1, 2, 3, etc.
- Stable storage linked to the ordinal and hostname of the Pet.
- Peer discovery is available via DNS. With Cassandra the names of the peers are known before the Pets are created.
- Startup and Teardown ordering. Which numbered Pet is going to be created next is known, and which Pet will be destroyed upon reducing the Pet Set size. This feature is useful for such admin tasks as draining data from a Pet, when reducing the size of a cluster.
If your application has one or more of these requirements, then it may be a candidate for Pet Set.
A relevant analogy is that a Pet Set is composed of Pet dogs. If you have a white, brown or black dog and the brown dog runs away, you can replace it with another brown dog no one would notice. If over time you can keep replacing your dogs with only white dogs then someone would notice. Pet Set allows your application to maintain the unique identity or hair color of your Pets.
Example workloads for Pet Set:
- Clustered software like Cassandra, Zookeeper, etcd, or Elastic require stable membership.
- Databases like MySQL or PostgreSQL that require a single instance attached to a persistent volume at any time.
Only use Pet Set if your application requires some or all of these properties. Managing pods as stateless replicas is vastly easier.
So back to our races!
As we have mentioned, Cassandra was a perfect candidate to deploy via a Pet Set. A Pet Set is much like a Replica Controller with a few new bells and whistles. Here’s an example YAML manifest:
Headless service to provide DNS lookup
apiVersion: v1
kind: Service
metadata:
labels:
app: cassandra
name: cassandra
spec:
clusterIP: None
ports:
- port: 9042
selector:
app: cassandra-data
----
# new API name
apiVersion: "apps/v1alpha1"
kind: PetSet
metadata:
name: cassandra
spec:
serviceName: cassandra
# replicas are the same as used by Replication Controllers
# except pets are deployed in order 0, 1, 2, 3, etc
replicas: 5
template:
metadata:
annotations:
pod.alpha.kubernetes.io/initialized: "true"
labels:
app: cassandra-data
spec:
# just as other component in Kubernetes one
# or more containers are deployed
containers:
- name: cassandra
image: "cassandra-debian:v1.1"
imagePullPolicy: Always
ports:
- containerPort: 7000
name: intra-node
- containerPort: 7199
name: jmx
- containerPort: 9042
name: cql
resources:
limits:
cpu: "4"
memory: 11Gi
requests:
cpu: "4"
memory: 11Gi
securityContext:
privileged: true
env:
- name: MAX\_HEAP\_SIZE
value: 8192M
- name: HEAP\_NEWSIZE
value: 2048M
# this is relying on guaranteed network identity of Pet Sets, we
# will know the name of the Pets / Pod before they are created
- name: CASSANDRA\_SEEDS
value: "cassandra-0.cassandra.default.svc.cluster.local,cassandra-1.cassandra.default.svc.cluster.local"
- name: CASSANDRA\_CLUSTER\_NAME
value: "OneKDemo"
- name: CASSANDRA\_DC
value: "DC1-Data"
- name: CASSANDRA\_RACK
value: "OneKDemo-Rack1-Data"
- name: CASSANDRA\_AUTO\_BOOTSTRAP
value: "false"
# this variable is used by the read-probe looking
# for the IP Address in a `nodetool status` command
- name: POD\_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
readinessProbe:
exec:
command:
- /bin/bash
- -c
- /ready-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
# These volume mounts are persistent. They are like inline claims,
# but not exactly because the names need to match exactly one of
# the pet volumes.
volumeMounts:
- name: cassandra-data
mountPath: /cassandra\_data
# These are converted to volume claims by the controller
# and mounted at the paths mentioned above. Storage can be automatically
# created for the Pets depending on the cloud environment.
volumeClaimTemplates:
- metadata:
name: cassandra-data
annotations:
volume.alpha.kubernetes.io/storage-class: anything
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 380Gi
You may notice that these containers are on the rather large size, and it is not unusual to run Cassandra in production with 8 CPU and 16GB of ram. There are two key new features that you will notice above; dynamic volume provisioning, and of course Pet Set. The above manifest will create 5 Cassandra Pets / Pods starting with the number 0: cassandra-data-0, cassandra-data-1, etc.
In order to generate data for the races, we used another Kubernetes feature called Jobs. Simple python code was written to generate the random speed of the monster for every second of the race. Then that data, position information, winners, other data points, and metrics were stored in Cassandra. To visualize the data, we used JHipster to generate a AngularJS UI with Java services, and then used D3 for graphing.
An example of one of the Jobs:
apiVersion: batch/v1
kind: Job
metadata:
name: pet-race-giants
labels:
name: pet-races
spec:
parallelism: 2
completions: 4
template:
metadata:
name: pet-race-giants
labels:
name: pet-races
spec:
containers:
- name: pet-race-giants
image: py3numpy-job:v1.0
command: ["pet-race-job", --length=100", "--pet=Giants", "--scale=3"]
resources:
limits:
cpu: "2"
requests:
cpu: "2"
restartPolicy: Never
Since we are talking about Monsters, we had to go big. We deployed 1,009 minion nodes to Google Compute Engine (GCE), spread across 4 zones, running a custom version of the Kubernetes 1.3 beta. We ran this demo on beta code since the demo was being set up before the 1.3 release date. For the minion nodes, GCE virtual machine n1-standard-8 machine size was chosen, which is vm with 8 virtual CPUs and 30GB of memory. It would allow for a single instance of Cassandra to run on one node, which is recommended for disk I/O.
Then the pets were deployed! One thousand of them, in two different Cassandra Data Centers. Cassandra distributed architecture is specifically tailored for multiple-data center deployment. Often multiple Cassandra data centers are deployed inside the same physical or virtual data center, in order to separate workloads. Data is replicated across all data centers, but workloads can be different between data centers and thus application tuning can be different. Data centers named ‘DC1-Analytics’ and ‘DC1-Data’ where deployed with 500 pets each. The race data was created by the python Batch Jobs connected to DC1-Data, and the JHipster UI was connected DC1-Analytics.
Here are the final numbers:
- 8,072 Cores. The master used 24, minion nodes used the rest
- 1,009 IP addresses
- 1,009 routes setup by Kubernetes on Google Cloud Platform
- 100,510 GB persistent disk used by the Minions and the Master
- 380,020 GB SSD disk persistent disk. 20 GB for the master and 340 GB per Cassandra Pet.
- 1,000 deployed instances of Cassandra
Yes we deployed 1,000 pets, but one really did not want to join the party! Technically with the Cassandra setup, we could have lost 333 nodes without service or data loss.
Limitations with Pet Sets in 1.3 Release
- Pet Set is an alpha resource not available in any Kubernetes release prior to 1.3.
- The storage for a given pet must either be provisioned by a dynamic storage provisioner based on the requested storage class, or pre-provisioned by an admin.
- Deleting the Pet Set will not delete any pets or Pet storage. You will need to delete your Pets and possibly its storage by hand.
- All Pet Sets currently require a “governing service”, or a Service responsible for the network identity of the pets. The user is responsible for this Service.
- Updating an existing Pet Set is currently a manual process. You either need to deploy a new Pet Set with the new image version or orphan Pets one by one and update their image, which will join them back to the cluster.
Resources and References
- The source code for the demo is available on GitHub: (Pet Set examples will be merged into the Kubernetes Cassandra Examples).
- More information about Jobs
- Documentation for Pet Set
- Image credits: Cassandra image and Cyclops image
– Chris Love, Senior DevOps Open Source Consultant for Datapipe. Twitter @chrislovecnm
- Introducing kustomize; Template-free Configuration Customization for Kubernetes May 29
- Getting to Know Kubevirt May 22
- Gardener - The Kubernetes Botanist May 17
- Docs are Migrating from Jekyll to Hugo May 5
- Announcing Kubeflow 0.1 May 4
- Current State of Policy in Kubernetes May 2
- Developing on Kubernetes May 1
- Zero-downtime Deployment in Kubernetes with Jenkins Apr 30
- Kubernetes Community - Top of the Open Source Charts in 2017 Apr 25
- Local Persistent Volumes for Kubernetes Goes Beta Apr 13
- Container Storage Interface (CSI) for Kubernetes Goes Beta Apr 10
- Fixing the Subpath Volume Vulnerability in Kubernetes Apr 4
- Principles of Container-based Application Design Mar 15
- Expanding User Support with Office Hours Mar 14
- How to Integrate RollingUpdate Strategy for TPR in Kubernetes Mar 13
- Apache Spark 2.3 with Native Kubernetes Support Mar 6
- Kubernetes: First Beta Version of Kubernetes 1.10 is Here Mar 2
- Reporting Errors from Control Plane to Applications Using Kubernetes Events Jan 25
- Core Workloads API GA Jan 15
- Introducing client-go version 6 Jan 12
- Extensible Admission is Beta Jan 11
- Introducing Container Storage Interface (CSI) Alpha for Kubernetes Jan 10
- Kubernetes v1.9 releases beta support for Windows Server Containers Jan 9
- Five Days of Kubernetes 1.9 Jan 8
- Introducing Kubeflow - A Composable, Portable, Scalable ML Stack Built for Kubernetes Dec 21
- Kubernetes 1.9: Apps Workloads GA and Expanded Ecosystem Dec 15
- Using eBPF in Kubernetes Dec 7
- PaddlePaddle Fluid: Elastic Deep Learning on Kubernetes Dec 6
- Autoscaling in Kubernetes Nov 17
- Certified Kubernetes Conformance Program: Launch Celebration Round Up Nov 16
- Kubernetes is Still Hard (for Developers) Nov 15
- Securing Software Supply Chain with Grafeas Nov 3
- Containerd Brings More Container Runtime Options for Kubernetes Nov 2
- Kubernetes the Easy Way Nov 1
- Enforcing Network Policies in Kubernetes Oct 30
- Using RBAC, Generally Available in Kubernetes v1.8 Oct 28
- It Takes a Village to Raise a Kubernetes Oct 26
- kubeadm v1.8 Released: Introducing Easy Upgrades for Kubernetes Clusters Oct 25
- Five Days of Kubernetes 1.8 Oct 24
- Introducing Software Certification for Kubernetes Oct 19
- Request Routing and Policy Management with the Istio Service Mesh Oct 10
- Kubernetes Community Steering Committee Election Results Oct 5
- Kubernetes 1.8: Security, Workloads and Feature Depth Sep 29
- Kubernetes StatefulSets & DaemonSets Updates Sep 27
- Introducing the Resource Management Working Group Sep 21
- Windows Networking at Parity with Linux for Kubernetes Sep 8
- Kubernetes Meets High-Performance Computing Aug 22
- High Performance Networking with EC2 Virtual Private Clouds Aug 11
- Kompose Helps Developers Move Docker Compose Files to Kubernetes Aug 10
- Happy Second Birthday: A Kubernetes Retrospective Jul 28
- How Watson Health Cloud Deploys Applications with Kubernetes Jul 14
- Kubernetes 1.7: Security Hardening, Stateful Application Updates and Extensibility Jun 30
- Draft: Kubernetes container development made easy May 31
- Managing microservices with the Istio service mesh May 31
- Kubespray Ansible Playbooks foster Collaborative Kubernetes Ops May 19
- Kubernetes: a monitoring guide May 19
- Dancing at the Lip of a Volcano: The Kubernetes Security Process - Explained May 18
- How Bitmovin is Doing Multi-Stage Canary Deployments with Kubernetes in the Cloud and On-Prem Apr 21
- RBAC Support in Kubernetes Apr 6
- Configuring Private DNS Zones and Upstream Nameservers in Kubernetes Apr 4
- Advanced Scheduling in Kubernetes Mar 31
- Scalability updates in Kubernetes 1.6: 5,000 node and 150,000 pod clusters Mar 30
- Five Days of Kubernetes 1.6 Mar 29
- Dynamic Provisioning and Storage Classes in Kubernetes Mar 29
- Kubernetes 1.6: Multi-user, Multi-workloads at Scale Mar 28
- The K8sPort: Engaging Kubernetes Community One Activity at a Time Mar 24
- Deploying PostgreSQL Clusters using StatefulSets Feb 24
- Containers as a Service, the foundation for next generation PaaS Feb 21
- Inside JD.com's Shift to Kubernetes from OpenStack Feb 10
- Run Deep Learning with PaddlePaddle on Kubernetes Feb 8
- Highly Available Kubernetes Clusters Feb 2
- Running MongoDB on Kubernetes with StatefulSets Jan 30
- Fission: Serverless Functions as a Service for Kubernetes Jan 30
- How we run Kubernetes in Kubernetes aka Kubeception Jan 20
- Scaling Kubernetes deployments with Policy-Based Networking Jan 19
- A Stronger Foundation for Creating and Managing Kubernetes Clusters Jan 12
- Kubernetes UX Survey Infographic Jan 9
- Kubernetes supports OpenAPI Dec 23
- Cluster Federation in Kubernetes 1.5 Dec 22
- Windows Server Support Comes to Kubernetes Dec 21
- StatefulSet: Run and Scale Stateful Applications Easily in Kubernetes Dec 20
- Introducing Container Runtime Interface (CRI) in Kubernetes Dec 19
- Five Days of Kubernetes 1.5 Dec 19
- Kubernetes 1.5: Supporting Production Workloads Dec 13
- From Network Policies to Security Policies Dec 8
- Kompose: a tool to go from Docker-compose to Kubernetes Nov 22
- Kubernetes Containers Logging and Monitoring with Sematext Nov 18
- Visualize Kubelet Performance with Node Dashboard Nov 17
- CNCF Partners With The Linux Foundation To Launch New Kubernetes Certification, Training and Managed Service Provider Program Nov 8
- Modernizing the Skytap Cloud Micro-Service Architecture with Kubernetes Nov 7
- Bringing Kubernetes Support to Azure Container Service Nov 7
- Tail Kubernetes with Stern Oct 31
- Introducing Kubernetes Service Partners program and a redesigned Partners page Oct 31
- How We Architected and Run Kubernetes on OpenStack at Scale at Yahoo! JAPAN Oct 24
- Building Globally Distributed Services using Kubernetes Cluster Federation Oct 14
- Helm Charts: making it simple to package and deploy common applications on Kubernetes Oct 10
- Dynamic Provisioning and Storage Classes in Kubernetes Oct 7
- How we improved Kubernetes Dashboard UI in 1.4 for your production needs Oct 3
- How we made Kubernetes insanely easy to install Sep 28
- How Qbox Saved 50% per Month on AWS Bills Using Kubernetes and Supergiant Sep 27
- Kubernetes 1.4: Making it easy to run on Kubernetes anywhere Sep 26
- High performance network policies in Kubernetes clusters Sep 21
- Creating a PostgreSQL Cluster using Helm Sep 9
- Deploying to Multiple Kubernetes Clusters with kit Sep 6
- Cloud Native Application Interfaces Sep 1
- Security Best Practices for Kubernetes Deployment Aug 31
- Scaling Stateful Applications using Kubernetes Pet Sets and FlexVolumes with Datera Elastic Data Fabric Aug 29
- SIG Apps: build apps for and operate them in Kubernetes Aug 16
- Kubernetes Namespaces: use cases and insights Aug 16
- Create a Couchbase cluster using Kubernetes Aug 15
- Challenges of a Remotely Managed, On-Premises, Bare-Metal Kubernetes Cluster Aug 2
- Why OpenStack's embrace of Kubernetes is great for both communities Jul 26
- The Bet on Kubernetes, a Red Hat Perspective Jul 21
- Happy Birthday Kubernetes. Oh, the places you’ll go! Jul 21
- A Very Happy Birthday Kubernetes Jul 21
- Bringing End-to-End Kubernetes Testing to Azure (Part 2) Jul 18
- Steering an Automation Platform at Wercker with Kubernetes Jul 15
- Dashboard - Full Featured Web Interface for Kubernetes Jul 15
- Cross Cluster Services - Achieving Higher Availability for your Kubernetes Applications Jul 14
- Citrix + Kubernetes = A Home Run Jul 14
- Thousand Instances of Cassandra using Kubernetes Pet Set Jul 13
- Stateful Applications in Containers!? Kubernetes 1.3 Says “Yes!” Jul 13
- Kubernetes in Rancher: the further evolution Jul 12
- Autoscaling in Kubernetes Jul 12
- rktnetes brings rkt container engine to Kubernetes Jul 11
- Minikube: easily run Kubernetes locally Jul 11
- Five Days of Kubernetes 1.3 Jul 11
- Updates to Performance and Scalability in Kubernetes 1.3 -- 2,000 node 60,000 pod clusters Jul 7
- Kubernetes 1.3: Bridging Cloud Native and Enterprise Workloads Jul 6
- Container Design Patterns Jun 21
- The Illustrated Children's Guide to Kubernetes Jun 9
- Bringing End-to-End Kubernetes Testing to Azure (Part 1) Jun 6
- Hypernetes: Bringing Security and Multi-tenancy to Kubernetes May 24
- CoreOS Fest 2016: CoreOS and Kubernetes Community meet in Berlin (& San Francisco) May 3
- Introducing the Kubernetes OpenStack Special Interest Group Apr 22
- SIG-UI: the place for building awesome user interfaces for Kubernetes Apr 20
- SIG-ClusterOps: Promote operability and interoperability of Kubernetes clusters Apr 19
- SIG-Networking: Kubernetes Network Policy APIs Coming in 1.3 Apr 18
- How to deploy secure, auditable, and reproducible Kubernetes clusters on AWS Apr 15
- Container survey results - March 2016 Apr 8
- Adding Support for Kubernetes in Rancher Apr 8
- Configuration management with Containers Apr 4
- Using Deployment objects with Kubernetes 1.2 Apr 1
- Kubernetes 1.2 and simplifying advanced networking with Ingress Mar 31
- Using Spark and Zeppelin to process big data on Kubernetes 1.2 Mar 30
- Building highly available applications using Kubernetes new multi-zone clusters (a.k.a. 'Ubernetes Lite') Mar 29
- AppFormix: Helping Enterprises Operationalize Kubernetes Mar 29
- How container metadata changes your point of view Mar 28
- Five Days of Kubernetes 1.2 Mar 28
- 1000 nodes and beyond: updates to Kubernetes performance and scalability in 1.2 Mar 28
- Scaling neural network image classification using Kubernetes with TensorFlow Serving Mar 23
- Kubernetes 1.2: Even more performance upgrades, plus easier application deployment and management Mar 17
- Kubernetes in the Enterprise with Fujitsu’s Cloud Load Control Mar 11
- ElasticBox introduces ElasticKube to help manage Kubernetes within the enterprise Mar 11
- State of the Container World, February 2016 Mar 1
- Kubernetes Community Meeting Notes - 20160225 Mar 1
- KubeCon EU 2016: Kubernetes Community in London Feb 24
- Kubernetes Community Meeting Notes - 20160218 Feb 23
- Kubernetes Community Meeting Notes - 20160211 Feb 16
- ShareThis: Kubernetes In Production Feb 11
- Kubernetes Community Meeting Notes - 20160204 Feb 9
- Kubernetes Community Meeting Notes - 20160128 Feb 2
- State of the Container World, January 2016 Feb 1
- Kubernetes Community Meeting Notes - 20160121 Jan 28
- Kubernetes Community Meeting Notes - 20160114 Jan 28
- Why Kubernetes doesn’t use libnetwork Jan 14
- Simple leader election with Kubernetes and Docker Jan 11
- Creating a Raspberry Pi cluster running Kubernetes, the installation (Part 2) Dec 22
- Managing Kubernetes Pods, Services and Replication Controllers with Puppet Dec 17
- How Weave built a multi-deployment solution for Scope using Kubernetes Dec 12
- Creating a Raspberry Pi cluster running Kubernetes, the shopping list (Part 1) Nov 25
- Monitoring Kubernetes with Sysdig Nov 19
- One million requests per second: Dependable and dynamic distributed systems at scale Nov 11
- Kubernetes 1.1 Performance upgrades, improved tooling and a growing community Nov 9
- Kubernetes as Foundation for Cloud Native PaaS Nov 3
- Some things you didn’t know about kubectl Oct 28
- Kubernetes Performance Measurements and Roadmap Sep 10
- Using Kubernetes Namespaces to Manage Environments Aug 28
- Weekly Kubernetes Community Hangout Notes - July 31 2015 Aug 4
- The Growing Kubernetes Ecosystem Jul 24
- Weekly Kubernetes Community Hangout Notes - July 17 2015 Jul 23
- Strong, Simple SSL for Kubernetes Services Jul 14
- Weekly Kubernetes Community Hangout Notes - July 10 2015 Jul 13
- Announcing the First Kubernetes Enterprise Training Course Jul 8
- Kubernetes 1.0 Launch Event at OSCON Jul 2
- How did the Quake demo from DockerCon Work? Jul 2
- The Distributed System ToolKit: Patterns for Composite Containers Jun 29
- Slides: Cluster Management with Kubernetes, talk given at the University of Edinburgh Jun 26
- Cluster Level Logging with Kubernetes Jun 11
- Weekly Kubernetes Community Hangout Notes - May 22 2015 Jun 2
- Kubernetes on OpenStack May 19
- Weekly Kubernetes Community Hangout Notes - May 15 2015 May 18
- Docker and Kubernetes and AppC May 18
- Kubernetes Release: 0.17.0 May 15
- Resource Usage Monitoring in Kubernetes May 12
- Weekly Kubernetes Community Hangout Notes - May 1 2015 May 11
- Kubernetes Release: 0.16.0 May 11
- AppC Support for Kubernetes through RKT May 4
- Weekly Kubernetes Community Hangout Notes - April 24 2015 Apr 30
- Borg: The Predecessor to Kubernetes Apr 23
- Kubernetes and the Mesosphere DCOS Apr 22
- Weekly Kubernetes Community Hangout Notes - April 17 2015 Apr 17
- Kubernetes Release: 0.15.0 Apr 16
- Introducing Kubernetes API Version v1beta3 Apr 16
- Weekly Kubernetes Community Hangout Notes - April 10 2015 Apr 11
- Faster than a speeding Latte Apr 6
- Weekly Kubernetes Community Hangout Notes - April 3 2015 Apr 4
- Paricipate in a Kubernetes User Experience Study Mar 31
- Weekly Kubernetes Community Hangout Notes - March 27 2015 Mar 28
- Kubernetes Gathering Videos Mar 23
- Welcome to the Kubernetes Blog! Mar 20