Docker Meet Cassandra. Cassandra Meet Docker.
After spending quite a bit of time learning Docker, and after hearing strong community interest in the technology even though few have played with it, I figured it'd be best to share what I've learned. Hopefully the knowledge transfer helps newcomers get up and running with Cassandra in a concise, yet deeply informed manner.
A few years ago I finally started playing with Docker by way of Vagrant. That entire experience was weird. Don’t do it.
Later, Docker Compose was released, all the roadblocks I had previously encountered immediately melted away, and the power of Docker became very clear to me. Since then I've been like my cat, but instead of "Tuna Tuna Tuna Tuna" it's more like: "Docker Docker Docker Docker."
But the more I spoke about Docker and asked around about Docker, the sadder I became since:
- Few really used Docker.
- Fewer had even heard of Docker Compose.
- Everyone was worried about how Docker performance would be in production.
- Some were waiting for the Mesos and Kubernetes war to play out.
- Kubernetes won by the way. Read any news around Docker-Kubernetes and AWS-Kubernetes to make your own judgements.
Within The Last Pickle, I advocate for Docker as best I can. Development project? "Why not use Docker?" Quick test? "cough Docker cough." Want to learn everything you can about Grafana, Graphite, and monitoring dashboards? "Okay, Docker it is!"
About a year later, we’re here and guess what? Now you get to be here with me as well! :tada:
Docker Cassandra Bootstrap
In October, Nick Bailey invited me to present at the local Austin Cassandra Users Meetup, and I figured this was the time to consolidate my recent learnings into a simplified project. Since I had already spent time on such an intricate setup, I could save others that time and give them a clean environment they could play with, develop on, then move into production.
That’s how the docker-cassandra-bootstrap project was born.
I will stay away from how to run the Docker Cassandra Bootstrap project within this blog post since the instructions are already within the GitHub project's README.md. Instead, I'll focus on the individual components: what's hidden in which nooks, and which stubs are commented out in which crannies for future use and development.
docker-compose.yml
The docker-compose.yml is the core building block for any Docker Compose project.
Building
What Docker provided was the Dockerfile, which defines the images that containers run from. (Note the distinction: containers are running instances of images.) Building an image using Docker was pretty straightforward:
docker build .
However, building many images would require tagging and additional `docker` parameters. That gets confusing really quickly and definitely isn't user-friendly.
Instead, Docker Compose lets you build entire ecosystems of services with a simple command:
docker-compose build
Now with Docker Compose, you don't have to keep track of image tagging, Dockerfile location, or anything else that I gratefully have too little experience with. Instead, images are defined by the `image` (from Docker Hub) and `build` (from a local Dockerfile) parameters within a Docker Compose service.
Environmental Variables
Along with the simplification of image definitions, Docker Compose introduces the `env_file` parameter, which points to a not-quite-bash environmental variable definition file. It's slightly different: bash commands will not resolve, you can't use an envar within another envar's definition, and you shouldn't use quotes since those will be considered part of the envar's value. While these env_files come with their own limitations, they mean I no longer have to write ugly, long, complicated lines like:
docker -e KEY=VAL -e KEY2=VAL2 -e WHAT=AMI -e DOING=NOW ...
Through the use of multiple docker-compose.yml files, one can create:
- An `env_file` called `settings.env` which has all generalized settings.
- An `env_file` called `dev.env` which has dev-specific settings.
- An `env_file` called `prod.env` which has production-specific settings.
- An `env_file` called `nz.env` which will define variables to flip all gifs by 180 degrees to counteract the fact that New Zealanders are upside down.
At the end of the day, the filenames, segregation, and environmental variable values are for you to use and define within your own ecosystem.
But that's getting ahead of ourselves. Just know that you can place whatever environmental variables you want in these files as you create production-specific env_files which may not get used today, but will be utilized when you move into production.
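As a rough sketch of how this layering can work (the file and service names here are illustrative, matching the hypothetical list above, and not copied from the project), the base docker-compose.yml might contain:

some-service:
  env_file:
    - settings.env

And a docker-compose.prod.yml override, layered on with `docker-compose -f docker-compose.yml -f docker-compose.prod.yml up`, might contain:

some-service:
  env_file:
    - settings.env
    - prod.env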
Volumes
Within Docker, everything within a container that is not stored in `volumes` is temporary. This means that if we launch a new container using any given static Docker image, we can manipulate multiple aspects of system configuration, data, file placement, etc., without any concern about changing our stable static environment. If something ever breaks, we simply kill the container, launch a new container based off the same image, and we're back to our familiar static and stable state. However, if we ever want to persist data, storing this valuable data within Docker `volumes` ensures that the data is accessible across container restarts.
The above statement probably makes up about 90% of my love for Docker (image layer caching probably makes up a majority of the remaining 10%).
What my declaration of love means is that while a container is running: it is your pet. It does what you ask of it (unless it’s a cat) and it will live with you by your side crunching on the code you fed it. But once your pet has finished processing the provided code, it will vanish into the ether and you can replace it like you would cattle.
This methodology is a perfect fit for short-lived microservice workers. However, if you want to persist data, or provide configuration files to the worker microservices, you'll want to use `volumes`.
While you can use named volumes to allow Docker to handle the data location for you, not using named volumes will put you in full control of where the data directory will resolve to, can provide performance benefits, and will remove one layer of indirection in case anything should go wrong.
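As a sketch, a bind-mounted (non-named) volume for Cassandra's data directory could look like this, where the host-side ./data/cassandra path is an illustrative choice rather than the project's verbatim definition:

cassandra:
  image: cassandra:3.11
  volumes:
    # Bind mount: we choose exactly where on the host the data lives,
    # and the data survives container restarts and replacements.
    - ./data/cassandra:/var/lib/cassandra
    # The named-volume alternative would instead read:
    # - cassandra_data:/var/lib/cassandra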
When people ask about the performance of Docker in production, `volumes` are the key component. If your service relies on disk access, use `volumes` to ensure a higher level of performance. For all other cases, if there even is a CPU-based performance hit, it should be mild, and you can always scale horizontally. The ultimate performance benefit of using Docker is that containerized applications are extremely effective at horizontal scaling. Also, consider the time and effort costs to test, deploy, and roll back any production changes when using containers. While this isn't strictly performance related, it does increase codebase velocity, which may be as valuable as raw performance metrics.
Entrypoints
Back to looking at the docker-compose.yml: Docker `entrypoints` define which program will be executed when a container begins running. You can think of SSH's entrypoint as being `bash` or `zsh`.
Under the `cqlsh` service, the default `entrypoint` is overwritten by `cqlsh cassandra`, where `cassandra` is the name of the Cassandra Docker Compose service. This means that we want to use the `cassandra:3.11` image, but not the bash script that sets up the cassandra.yaml and other Cassandra settings. Instead, the service will utilize `cassandra:3.11`'s image and start the container with `cqlsh cassandra`. This allows the following shorthand command to be run, all within a Docker container, without any local dependencies other than Docker and Docker Compose:
docker-compose run cqlsh
The above command starts up the `cqlsh` service, calls the `cqlsh` binary, and provides the `cassandra` hostname as the contact point.
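A minimal sketch of such a service definition follows; the project's actual docker-compose.yml may differ in details:

cqlsh:
  image: cassandra:3.11
  # Skip the image's default entrypoint (which would boot a Cassandra node)
  # and run cqlsh against the `cassandra` service instead.
  entrypoint: cqlsh cassandra
  links:
    - cassandra:cassandra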
The `nodetool` service is a good example of creating a shorthand command for an otherwise complicated process. Instead of having:
docker-compose run --entrypoint bash cassandra
$ nodetool -h cassandra -u cassandraUser -pw cassandraPass status
We can simply run:
docker-compose run nodetool status
Any parameters following the service name are appended to the defined `entrypoint`'s command and replace the service's `command` parameter. For the `nodetool` service the default `command` is `help`, but in the above line, the `command` appended to the `entrypoint` is `status`.
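Pieced together from the commands above, the `nodetool` service plausibly looks something like this (the credentials are the placeholder values from the long-form command, not real ones):

nodetool:
  image: cassandra:3.11
  # The entrypoint bakes in the host and authentication parameters...
  entrypoint: nodetool -h cassandra -u cassandraUser -pw cassandraPass
  # ...and the command is appended to it, yielding `nodetool ... help` by default.
  command: help
  links:
    - cassandra:cassandra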
Links
If we have a service that relies on communication with another service, the way the `cassandra-reaper` service must be in contact with the `cassandra` service that Reaper will be monitoring, we can use the `links` service parameter to define both the hostname and the service name within the link.
I like to be both simple and explicit, which is why I use the same name for the hostname and service name like:
links:
- cassandra:cassandra
The above line allows the `cassandra-reaper` service's container to contact the `cassandra` service by way of the `cassandra` hostname.
Ports
Ports are a way to expose a service’s port to another service, or bind a service’s port to a local port.
Because the `cassandra-reaper` service is meant to be used via its web UI, we bind the service's `8080` and `8081` ports from within the service onto the local machine's `8080` and `8081` ports using the following lines:
ports:
- "8080:8080"
- "8081:8081"
This means that if we visit http://localhost:8080/webui/ from our local machine, code running within the container will service that web request.
Restart
The `restart` parameter is more of a Docker Compose-specific scheduler directive that dictates what a service should do if its container is abruptly terminated.
In the case of the `grafana` and `logspout` services, if Grafana or Logspout ever dies, the containers will exit and the `grafana` or `logspout` services will automatically restart and come back online.
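In docker-compose.yml terms, this is a single line per service (`grafana` used as the example here):

grafana:
  image: grafana/grafana
  # Bring the container back up automatically whenever it exits.
  restart: always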
While this parameter may be ideal for some microservices, it may not be ideal for services that power data stores.
The `cassandra` Service
The docker-compose.yml defines the `cassandra` service as having a few mounted configuration files; a sketch of these mounts follows the two lists below.
The two configuration files that are enabled by default include configurations for:
- collectd.
- Prometheus JMX exporter.
The two configuration files that are disabled by default are for:
- The Graphite metrics reporter.
- The Filebeat log reporter for ELK.
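A sketch of what these mounts can look like; the container-side paths are assumptions for illustration, not copied from the project:

cassandra:
  build: ./cassandra
  volumes:
    # Enabled by default:
    - ./cassandra/config/collectd.cassandra.conf:/etc/collectd/collectd.conf
    - ./cassandra/config/prometheus.yml:/etc/cassandra/prometheus.yml
    # Disabled by default; uncomment for Graphite or ELK reporting:
    # - ./cassandra/config/graphite.cassandra.yaml:/etc/cassandra/graphite.cassandra.yaml
    # - ./cassandra/config/filebeat.yml:/etc/filebeat/filebeat.yml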
collectd
The cassandra/config/collectd.cassandra.conf configuration file loads a few plugins that TLP has found to be useful for enterprise metric dashboards.
A few packages need to be installed for collectd to be fully functional with the referenced plugins. The service must also be started from within cassandra/docker-entrypoint.sh, or simply by using `service collectd start` on bare hardware.
The metrics that are collected include information on:
- Disk.
- CPU load.
- Network traffic.
- Memory usage.
- System logs.
collectd is then configured to write to a Prometheus backend by default. Commented code is included for writing to a Graphite backend.
For further information on each of the plugins, visit the collectd wiki.
Prometheus JMX Exporter
The cassandra/config/prometheus.yml configuration file defines the JMX endpoints that will be collected by the JMX exporter and exposed via a REST API for Prometheus ingestion.
The JMX exporter javaagent jar needs to be included in the image and referenced within cassandra-env.sh.
Only the most important enterprise-centric metrics are being collected for Prometheus ingestion.
The resulting `name` formats do not follow the standard Prometheus naming scheme, but instead something between the Graphite dot-naming scheme and the use of Prometheus-styled labels when relevant. Do note that Prometheus will automatically convert all dot-separated metrics to underscore-separated metrics, since Prometheus does not understand the concept of hierarchical metric keys.
Graphite Metrics Reporter
The cassandra/config/graphite.cassandra.yaml configuration file is included and commented out by default.
For reporting to work correctly, this file requires the following jars to be placed into Cassandra's lib directory, with Java 8 installed:
- cassandra/lib/metrics-core-3.1.2.jar
- cassandra/lib/metrics-graphite-3.1.2.jar
- cassandra/lib/reporter-config-base-3.0.3.jar
- cassandra/lib/reporter-config3-3.0.3.jar
Once properly enabled, the `cassandra` service can ship enterprise-centric metrics to a Graphite host. However, we did not include a `graphite` service within this project because our experience with Prometheus has been smoother than dealing with Graphite, and because Prometheus seems to scale better than Graphite in production environments. Recent Grafana releases have also added plenty of Prometheus-centric features and support.
Filebeat Log Reporter
The cassandra/config/filebeat.yml configuration file is set up to contact a pre-existing ELK stack.
The Filebeat package is required and can be started within a Docker container by referencing the Filebeat service from within cassandra-env.sh.
The Filebeat configuration file is by no means complete (assume it is incorrect!), but it does provide a good starting point for learning how to ingest log files into Logstash using Filebeat. Please consider this configuration file fully experimental.
JMX Authentication
Because Reaper for Apache Cassandra will be contacting the JMX port of this `cassandra` service, we also need to add authentication files for JMX in two locations and set the `LOCAL_JMX` variable to `no` to expose the JMX port externally while requiring authentication.
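A sketch of the relevant additions to the `cassandra` service; the container-side file locations are assumptions rather than the project's exact paths:

cassandra:
  environment:
    # With LOCAL_JMX=no, cassandra-env.sh binds JMX beyond localhost
    # and requires authentication.
    - LOCAL_JMX=no
  volumes:
    # The two JMX authentication files (illustrative container paths):
    - ./cassandra/jmxremote.password:/etc/cassandra/jmxremote.password
    - ./cassandra/jmxremote.access:/etc/cassandra/jmxremote.access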
The `cqlsh` Service
The `cqlsh` service is a helper service that simply uses the `cassandra:3.11` image hosted on Docker Hub to provide the cqlsh binary, while mounting the local ./cassandra/schema.cql file into the container for simple schema creation and data querying.
The defined setup allows us to run the following command with ease:
docker-compose run cqlsh -f /schema.cql
The `nodetool` Service
The `nodetool` service is another helper service that is provided to automatically fill in the host, username, and password parameters.
By default, it includes the `help` command to provide a list of options for `nodetool`. Running the following command will contact the `cassandra` node, automatically authenticate, and show a list of options:
docker-compose run nodetool
By including an additional parameter on the command line, we will overwrite the `help` command and run the requested command instead, like `status`:
docker-compose run nodetool status
The `cassandra-reaper` Service
The `cassandra-reaper` service does not use a locally built and customized image, but instead uses an image hosted on Docker Hub. The image specifically used is `ab0fff2`. However, you can choose to use the `latest` release or `master` image if you want the bleeding edge version.
The configuration is all handled via environmental variables for easy Docker consumption within cassandra-reaper/cassandra-reaper.env. For a list of all configuration options, see the Reaper documentation.
The ports that are exposed onto `localhost` are `8080` and `8081` for the web UI and administration UI, respectively.
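Putting this section together, the service definition is roughly the following; the Docker Hub repository name is an assumption on my part, while the tag, env file, and ports come from the text above:

cassandra-reaper:
  image: thelastpickle/cassandra-reaper:ab0fff2
  env_file:
    - ./cassandra-reaper/cassandra-reaper.env
  ports:
    - "8080:8080"   # web UI
    - "8081:8081"   # administration UI
  links:
    - cassandra:cassandra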
The `grafana` Service
The `grafana` service uses the grafana/grafana image from Docker Hub and is configured using grafana/grafana.env. For a list of all configuration options available via environmental variables, visit the Grafana documentation.
The two scripts included in grafana/bin/ are referenced in the README.md and are used to create data sources that rely on the Prometheus data store and to upload all the grafana/dashboard JSON files into Grafana.
The JSON files were meticulously created for two of our enterprise customers' production deployments (shared here with permission). They include 7 dashboards that highlight specific Cassandra metrics in a drill-down fashion:
- Overview
- Read Path
- Write Path
- Client Connections
- Alerts
- Reaper
- Big Picture
At TLP we typically start off with the Overview dashboard. Depending on our metrics we'll switch to the Read Path, Write Path, or Client Connections dashboards for further investigation. Do keep in mind that while these dashboards are a work in progress, they do include proper x/y-axis labeling, units, and tooltips/descriptions. The tooltips/descriptions provide info in the following order, when relevant:
- Description
- Values
- False Positives
- Required Actions
- Warning
The text is meant to accompany any alerts that are triggered via the auto-generated Alerts dashboard. This way Slack, PagerDuty, or email alerts contain some context on why the alert was triggered, which culprits are most likely involved, and how to resolve the alert.
Do note that while the other dashboards may have triggers set up, only the Alerts dashboard will fire the triggers since all the dashboards make use of Templating Variables for easy Environment, Data Center, and Host selection, which Grafana alerts do not yet support.
If there are ever any issues around repair, take a look at the Reaper dashboard as well to monitor the Reaper for Apache Cassandra service.
The Big Picture dashboard provides a 30,000-foot view of the cluster and is nice to reason about, but ultimately provides little value beyond showing the high-level trends of the Cassandra deployment being monitored.
The `logspout` Service
The `logspout` service is a unique service across multiple aspects. While this section covers some of the special use cases, feel free to skim it, but do try to take in as much information as possible since these sorts of use cases will arise in your future docker-compose.yml creations, even if not today.
At the top of the docker-compose.yml under the `logspout` section, we define a `build` parameter. This `build` parameter references logspout/Dockerfile, which has no real information since the `FROM` image that our local image refers to uses a few `ONBUILD` commands. These commands are defined in the parent image and make use of logspout/modules.go. Our custom logspout/modules.go installs the required Logstash dependencies for use with pre-existing Logstash deployments.
While logspout/build.sh was not required to be duplicated since the parent image already had the file pre-baked, I did so for safekeeping.
The logspout/logspout.env file includes a few commented-out settings I felt would be interesting to look at if I were to experiment with Logstash in the future. These might be a good starting point for further investigation.
The `logspout` service also uses the `restart: always` setting to ensure that any possible issues with the log redirection service will automatically be resolved by restarting the service immediately after failure.
Within the `logspout` service, we redirect the container's port `80` to our localhost's port `8000`. This allows us to `curl http://localhost:8000/logs` from our local machine and grab the logs that the container exposes via its own REST API on port `80`.
In order for any of the `logspout` container magic to work, we need to bind our localhost's /var/run/docker.sock into the container as a read-only mount. Even though the mount is read-only, there are still security concerns with such an operation. Since this line is required to allow the logging redirection to occur, I've included two links that further clarify the security risks of including this container within production environments:
- https://raesene.github.io/blog/2016/03/06/The-Dangers-Of-Docker.sock/
- http://stackoverflow.com/questions/40844197
Perhaps in the future the Docker/Moby team will provide a more secure workaround.
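Combining the pieces above, the relevant parts of the `logspout` service look roughly like this sketch:

logspout:
  build: ./logspout
  restart: always
  ports:
    # Container port 80 (Logspout's REST API) bound to localhost:8000.
    - "8000:80"
  volumes:
    # Read-only, but see the links above for the security implications.
    - /var/run/docker.sock:/var/run/docker.sock:ro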
The last line that has not been mentioned is the `command` option, which is commented out by default within the docker-compose.yml. By uncommenting the `command` option, we stop using the default `command` provided in the parent Dockerfile, which exposes the logs via a REST API, and instead send our logs to the PaperTrail website, which provides 100 MB/month of hosted logs for free.
In order to isolate checked-in code from each developer's `$PAPERTRAIL_PORT`, each developer must locally create a copy of .env.template within their project's directory under the filename .env. Within this .env file we can define environmental variables that will be used by docker-compose.yml. Once you have an account with PaperTrail, set `PAPERTRAIL_PORT` to the port assigned by PaperTrail to begin seeing your logs within PaperTrail.com. And since .env is set up to be ignored via .gitignore, we do not have to worry about sending multiple developers' logs to the same location.
The `prometheus` Service
The `prometheus` service is one of the simpler services within this Docker Compose environment.
The `prometheus` service, sketched below:
- Uses a Docker Hub image.
- Will require access to both the `cassandra` and `cassandra-reaper` services to monitor their processes.
- Will expose the container's port `9090` onto localhost's port `9090`.
- Will persist all data within the container's /prometheus directory onto our local ./data/prometheus directory.
- Will configure the container using a locally-stored copy of the prometheus.yml.
The prometheus.yml defines which REST endpoints Prometheus will scrape to gather metrics, as well as a few other documented configurations. Prometheus will collect metrics from the following services:
- Prometheus.
- Cassandra.
- Cassandra internal metrics.
- collectd metrics.
- Reaper for Apache Cassandra.
The `pickle-factory` Sample Write Application Service
The `pickle-factory` service, which is a misnomer and should have been "pickle-farm" in hindsight, is a sample application that shows an ideal write pattern.
For production and development purposes, the pickle-factory/Dockerfile uses the line `COPY . .` to copy all of the "microservice" code into the Docker image. The docker-compose.yml uses the `volume` key to overwrite any pre-baked code to allow for a simpler development workflow.
The ideal development-to-production workflow is to modify the pickle-factory/factory.py file directly and have each saved copy refreshed within the running container. Once all changes are tested and committed to `master`, a continuous integration (CI) job builds the new Docker images and pushes those images to Docker Hub for both development and production consumption. The next developer to modify pickle-factory/factory.py will grab the latest image but use their local copy of pickle-factory/factory.py. The next time the production image gets updated, it will include the latest copy of pickle-factory/factory.py, as found in master, directly in the Docker image.
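A sketch of the development half of that workflow, where the bind mount shadows the code baked in by `COPY . .`; the container-side path mirrors the /usr/src/app directory mentioned for pickle-shop below and is an assumption here:

pickle-factory:
  build: ./pickle-factory
  volumes:
    # Development: local code shadows the image's baked-in copy,
    # so edits to factory.py are picked up without rebuilding.
    - ./pickle-factory:/usr/src/app
  links:
    - cassandra:cassandra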
Included in the Dockerfile is commented-out code for installing `gosu`, a `sudo` replacement for Docker written in Go. `gosu` allows the main user to spawn another process under a non-root user to better contain attackers. Theoretically, if an attacker gains access to the `pickle-factory` container, they would do so under a non-privileged user and would not be able to break out of the container into the Docker internals and out onto the host machine. Practically, `gosu` limits the possible attack surface of a Docker container. To fully use the functionality `gosu` provides, pickle-factory/docker-entrypoint.sh needs to be modified as well.
The pickle-factory/factory.py makes a few important design decisions:
- Grabs configurations from environmental variables instead of configuration files for easier Docker-first configurations.
- The `Cluster()` object uses the `DCAwareRoundRobinPolicy` in preparation for multiple data centers.
- The C-based `LibevConnection` event loop provider is used since it's more performant than the Python-native default event loop provider.
- All statements are prepared in advance to minimize request payload sizes.
  - In the background, the Cassandra nodes will send the prepared statements around to each node in the cluster.
  - The statements will be indexed via a simple integer.
  - All subsequent requests will map the simple integer to the pre-parsed CQL statement.
- Employee records are generated and written asynchronously instead of synchronously in an effort to improve throughput.
- Workforce data is denormalized and written asynchronously with a max-buffer to ensure we don't overload Cassandra with too many in-flight requests.
- If any request did not complete successfully, the `future.result()` call will throw the matching exception.
The `pickle-shop` Sample Read Application Service
Much like the `pickle-factory` microservice, the `pickle-shop` service follows a similar workflow. However, the `pickle-shop` service definition within docker-compose.yml will not overwrite the application code within /usr/src/app, and instead requires rebuilding the local image by using:
docker-compose build
This workflow was chosen as a way to differentiate between a development workflow (`pickle-factory`) and a production workflow (`pickle-shop`). For production images, all code should ideally be baked into the Docker image and shipped without any external dependencies, such as a codebase dependency.
pickle-shop/shop.py follows the same overall flow as pickle-factory/factory.py. After preparing the statements, the read-heavy “microservice” completes the following actions:
- Performs a synchronous read request to grab all the employee IDs.
- Performs 10 asynchronous read queries for 10 employees.
- Processes all the results at roughly the same time.
If using a non-asynchronous driver, one would probably follow the alternate workflow:
- Perform 1 synchronous read query for 1 of 10 employees.
- Process the results of the employee query.
- Repeat.
However, following the synchronous workflow will take roughly:
O(N), where N is the number of employees waiting to be processed.
Following the asynchronous workflow will instead take roughly:
O(max(N)), where max(N) is the longest time any single employee query takes.
Simplified, with respect to the number of employees, the asynchronous workflow takes roughly:
O(1)
Conclusion
Over the course of this post, we have covered:
- Docker
- Docker Compose
- A few docker-compose.yml settings.
- Dockerized Cassandra.
- Helper Docker Compose services.
- Dockerized Reaper for Apache Cassandra.
- Dockerized Grafana.
- Logspout for Docker-driven external logging.
- Dockerized Prometheus.
- A sample write-heavy asynchronous application using Cassandra.
- A sample read-heavy asynchronous application using Cassandra.
TLP's hope is that the above documentation and minimal Docker Compose ecosystem will provide a starting point for future community proofs of concept utilizing Apache Cassandra. With this project, each developer can create, maintain, and develop within their own local environment without any external overhead other than Docker and Docker Compose.
Once the POC has reached a place of stability, simply adding the project to a Continuous Integration workflow to publish tested images will allow for proper release management. After that point, the responsibility typically falls on the DevOps team to grab the latest Docker image, replace any previously existing Docker containers, and launch the new Docker image. This would all occur without any complicated hand-off consisting of dependencies, configuration files (since we're using environmental variables), and OS-specific requirements.
Hopefully you will find Docker Compose to be as powerful as I’ve found it to be! Best of luck in cranking out the new POC you’ve been itching to create!