After having spent quite a bit of time learning Docker and after hearing strong community interest for the technology even though few have played with it, I figured it’d be best to share what I’ve learned. Hopefully the knowledge transfer helps newcomers get up and running with Cassandra in a concise, yet deeply informed manner.
A few years ago I finally started playing with Docker by way of Vagrant. That entire experience was weird. Don’t do it.
Later Docker Compose was released and all the roadblocks I previously encountered immediately melted away and the power of Docker was made very aware to me. Since then I’ve been like my cat, but instead of “Tuna Tuna Tuna Tuna” it’s more like: “Docker Docker Docker Docker.”
But the more I spoke about Docker and asked around about Docker, the sadder I became since:
- Few really used Docker.
- Fewer had even heard of Docker Compose.
- Everyone was worried about how Docker performance would be in production.
- Some were waiting for the Mesos and Kubernetes war to play out.
- Kubernetes won by the way. Read any news around Docker-Kubernetes and AWS-Kubernetes to make your own judgements.
Within The Last Pickle, I advocate for Docker as best I can. Development project? “Why not use Docker?” Quick test? “cough Docker cough.” Want to learn everything you can about Grafana, Graphite, and monitoring dashboards? “Okay, Docker it is!”
About a year later, we’re here and guess what? Now you get to be here with me as well! :tada:
Docker Cassandra Bootstrap
In October, Nick Bailey invited me to present at the local Austin Cassandra Users Meetup and I figured this was the time to consolidate my recent knowledge and learnings into a simplified project. Since I had already spent time on such an intricate project, I could save others time and give them a clean environment they could play with, develop on, then move into production.
That’s how the docker-cassandra-bootstrap project was born.
I will stay away from how to run the Docker Cassandra Bootstrap project within this blog post since the instructions are already within the GitHub project’s README.md. Instead, I’ll focus on the individual components, what’s hidden in which nooks, and which stubs are commented out in which crannies for future use and development.
The docker-compose.yml is the core building block for any Docker Compose project.
What Docker provided was the Dockerfile, which allowed image definitions to run containers from. (Note the differentiation: containers are images that are running.) Building an image using Docker was pretty straightforward:
docker build .
However, building many images would require tagging and additional docker parameters. And that can get confusing really quickly and definitely isn’t user-friendly.
Instead, Docker Compose lets you build entire ecosystems of services with a single docker-compose build command.
Now with Docker Compose, you don’t have to keep track of image tagging, Dockerfile location, or anything else that I gratefully have too little experience with. Instead, images are defined by the image (from Docker Hub) and build (from a local Dockerfile) parameters within a Docker Compose service.
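As a quick sketch of the two styles (the layout here is illustrative, not copied verbatim from the project):

```yaml
services:
  cassandra:
    image: cassandra:3.11      # pulled from Docker Hub
  pickle-factory:
    build: ./pickle-factory    # built from a local Dockerfile
```

Docker Compose handles the tagging and rebuild logic for both styles behind a single command.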
Along with the simplification of image definitions, Docker Compose introduces the env_file parameter, which takes a not-quite-bash environment-variable definition file. It’s slightly different: bash commands will not resolve, you can’t use an envar within another’s definition, and you shouldn’t use quotes since those will be considered part of the envar’s value. While these env_files come with their own limitations, they mean I no longer have to write ugly, long, complicated lines like:
docker -e KEY=VAL -e KEY2=VAL2 -e WHAT=AMI -e DOING=NOW ...
Through the use of multiple env_files, one can create:
- settings.env, which has all generalized settings.
- dev.env, which has dev-specific settings.
- prod.env, which has production-specific settings.
- nz.env, which will define variables to flip all gifs by 180 degrees to counteract the fact that New Zealanders are upside down.
At the end of the day, the filenames, segregation, and environmental variable values are for you to use and define within your own ecosystem.
But that’s getting ahead of ourselves. Just know that you can place whatever environmental variables you want in these files as you create production-specific env_files which may not get used today, but will be utilized when you move into production.
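As an illustrative sketch (the variable names are just examples, not the project’s actual settings), an env_file and its wiring might look like:

```yaml
# settings.env is not YAML — it holds plain KEY=value lines
# (no quotes, no $VAR expansion, no bash commands):
#
#   CASSANDRA_CLUSTER_NAME=pickle-cluster
#   MAX_HEAP_SIZE=1G
#
# docker-compose.yml fragment layering a general file with a dev-specific one:
services:
  cassandra:
    image: cassandra:3.11
    env_file:
      - settings.env
      - dev.env
```

Later files in the list win on conflicts, which is what makes the dev/prod split workable.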
Within Docker, everything in a container that is not stored in volumes is temporary. This means that if we launch a new container from any given static Docker image, we can manipulate multiple aspects of system configuration, data, file placement, etc., without worrying about changing our stable static environment. If something ever breaks, we simply kill the container, launch a new container based off the same image, and we’re back to our familiar static and stable state. However, if we ever want to persist data, storing this valuable data within Docker volumes ensures that the data is accessible across container restarts.
The above statement probably makes up about 90% of my love for Docker (image layer caching probably makes up a majority of the remaining 10%).
What my declaration of love means is that while a container is running: it is your pet. It does what you ask of it (unless it’s a cat) and it will live with you by your side crunching on the code you fed it. But once your pet has finished processing the provided code, it will vanish into the ether and you can replace it like you would cattle.
This methodology is a perfect fit for short-lived microservice workers. However, if you want to persist data, or provide configuration files to the worker microservices, you’ll want to use volumes.
While you can use named volumes to allow Docker to handle the data location for you, not using named volumes will put you in full control of where the data directory will resolve to, can provide performance benefits, and will remove one layer of indirection in case anything should go wrong.
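A sketch of the difference, assuming Cassandra’s default data directory (the host path is a made-up example):

```yaml
services:
  cassandra:
    image: cassandra:3.11
    volumes:
      # Bind mount: you control exactly where the data lives on the host.
      - ./data/cassandra:/var/lib/cassandra
      # Named volume alternative: Docker manages the location for you.
      # - cassandra_data:/var/lib/cassandra

# Named volumes must also be declared at the top level:
# volumes:
#   cassandra_data:
```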
When people ask about the performance of Docker in production, volumes are the key component. If your service relies on disk access, use volumes to ensure a higher level of performance. For all other cases, if there even is a CPU-based performance hit, it should be mild, and you can always scale horizontally. The ultimate performance benefit of using Docker is that containerized applications are extremely effective at horizontal scaling. Also, consider the time and effort costs to test, deploy, and roll back any production changes when using containers. Although this isn’t strictly performance-related, it does increase codebase velocity, which may be as valuable as raw performance metrics.
Back to looking at the docker-compose.yml: entrypoints define which program will be executed when a machine begins running. (You can think of SSH’s entrypoint as being sshd.) For the cqlsh service, the default entrypoint is overwritten by cqlsh cassandra, where cassandra is the name of the Cassandra Docker Compose service. This means that we want to use the cassandra:3.11 image, but not the bash script that sets up the cassandra.yaml and other Cassandra settings. Instead, the service will utilize cassandra:3.11’s image and start the container with cqlsh cassandra. This allows the following shorthand command to be run, all within a Docker container, without any local dependencies other than Docker and Docker Compose:
docker-compose run cqlsh
The above command starts up the cqlsh service, calls the cqlsh binary, and provides the cassandra hostname as the contact point.
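A sketch of what such a helper service definition might look like (the project’s actual definition may differ slightly):

```yaml
services:
  cqlsh:
    image: cassandra:3.11        # reused only for its cqlsh binary
    entrypoint: cqlsh cassandra  # skip the image's Cassandra startup script
    links:
      - cassandra:cassandra      # make the cassandra hostname resolvable
```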
The nodetool service is a good example of creating a shorthand command for an otherwise complicated process. Instead of having:
docker-compose run --entrypoint bash cassandra
$ nodetool -h cassandra -u cassandraUser -pw cassandraPass status
We can simply run:
docker-compose run nodetool status
Any parameters following the service name are appended to the defined entrypoint and replace the service’s command parameter. For the nodetool service the default command is help, but in the above line status is the command that is appended to the entrypoint.
If we have a service that relies on communication with another service, the way the cassandra-reaper service must be in contact with the cassandra service that Reaper will be monitoring, we can use the links service parameter to define both the hostname and service name within this link. I like to be both simple and explicit, which is why I use the same name for the hostname and service name, like:
links:
  - cassandra:cassandra
The above line will allow the cassandra-reaper service’s container to contact the cassandra service by way of the cassandra hostname.
Ports are a way to expose a service’s port to another service, or bind a service’s port to a local port.
Because the cassandra-reaper service is meant to be used via its web UI, we bind the service’s 8080 and 8081 ports onto the local machine’s 8080 and 8081 ports using the following lines:
ports:
  - "8080:8080"
  - "8081:8081"
This means if we visit http://localhost:8080/webui/ from our local machine, we’ll be processing code from within a container to service that web request.
The restart parameter is more of a Docker Compose-specific scheduler setting that dictates what a service should do if its container is abruptly terminated.
In the case of the grafana and logspout services, if Grafana or Logspout ever die, the containers will exit and the grafana and logspout services will automatically restart and come back online.
While this parameter may be ideal for some microservices, it may not be ideal for services that power data stores.
The docker-compose.yml defines the cassandra service as having a few mounted configuration files.
The two configuration files that are enabled by default include configurations for:
- The collectd metrics collector.
- The Prometheus JMX exporter.
The two configuration files that are disabled by default are for:
- The Graphite metrics reporter.
- The Filebeat log reporter for ELK.
The cassandra/config/collectd.cassandra.conf configuration file loads a few plugins that TLP has found to be useful for enterprise metric dashboards.
A few packages need to be installed for collectd to be fully functional with the referenced plugins. The service must also be started from within cassandra/docker-entrypoint.sh, or simply with service collectd start on hardware.
The metrics that are collected include information on:
- CPU load.
- Network traffic.
- Memory usage.
- System logs.
collectd is then configured to write to a Prometheus backend by default. Commented code is included for writing to a Graphite backend.
For further information on each of the plugins, visit the collectd wiki.
Prometheus JMX Exporter
The cassandra/config/prometheus.yml configuration file defines the JMX endpoints that will be collected by the JMX exporter and exposed via a REST API for Prometheus ingestion.
The following jar needs to be used and referenced within cassandra-env.sh:
Only the most important enterprise-centric metrics are collected for Prometheus ingestion. The metric name formats do not follow the standard Prometheus naming scheme, but instead something between the Graphite dot-naming scheme and the use of Prometheus-styled labels when relevant. Do note that Prometheus will automatically convert all dot-separated metrics to underscore-separated metrics since Prometheus does not understand the concept of hierarchical metric keys.
Graphite Metrics Reporter
The cassandra/config/graphite.cassandra.yaml configuration file is included and commented out by default.
For reporting to work correctly, this file requires the following jars to be placed into Cassandra’s lib directory and have Java 8 installed:
Once properly enabled, the cassandra service can report enterprise-centric metrics to a Graphite host. However, we did not include a graphite service within this project because our experience with Prometheus has been smoother than dealing with Graphite and because Prometheus seems to scale better than Graphite in production environments. Recent Grafana releases have also added plenty of Prometheus-centric features and support.
Filebeat Log Reporter
The cassandra/config/filebeat.yml configuration file is set up to contact a pre-existing ELK stack.
The Filebeat package is required and can be started within a Docker container by referencing the Filebeat service from within cassandra-env.sh.
The Filebeat configuration file is by no means complete (assume it is incorrect!), but it does provide a good starting point for learning how to ingest log files into Logstash using Filebeat. Please consider this configuration file fully experimental.
Because Reaper for Apache Cassandra will be contacting the JMX port of this cassandra service, we will also need to add authentication files for JMX in two locations and set the LOCAL_JMX variable to no to expose the JMX port externally while requiring authentication.
The cqlsh service is a helper service that simply uses the cassandra:3.11 image hosted on Docker Hub to provide the cqlsh binary, while mounting the local ./cassandra/schema.cql file into the container for simple schema creation and data querying.
The defined setup allows us to run the following command with ease:
docker-compose run cqlsh -f /schema.cql
The nodetool service is another helper service that is provided to automatically fill in the host, username, and password parameters. By default, it includes the help command to provide a list of options for nodetool. Running the following command will contact the cassandra node, automatically authenticate, and show a list of options:
docker-compose run nodetool
By including an additional parameter on the command line we will overwrite the default command and run the requested command instead, like:
docker-compose run nodetool status
The cassandra-reaper service does not use a locally built and customized image, but instead uses an image hosted on Docker Hub. The specific image used is tagged ab0fff2. However, you can choose the latest release or the master image if you want the bleeding edge version.
The configuration is all handled via environmental variables for easy Docker consumption within cassandra-reaper/cassandra-reaper.env. For a list of all configuration options, see the Reaper documentation.
The ports that are exposed onto localhost are 8080 and 8081 for the web UI and administration UI, respectively.
The grafana service uses the grafana/grafana image from Docker Hub and is configured using grafana/grafana.env. For a list of all configuration options via environmental variables, visit the Grafana documentation.
The two scripts included in grafana/bin/ are referenced in the README.md and are used to create data sources that rely on the Prometheus data store and to upload all the grafana/dashboard JSON files into Grafana.
The JSON files were meticulously created for two of our enterprise customers’ production deployments (shared here with permission). They include 7 dashboards that highlight specific Cassandra metrics in a drill-down fashion:
- Overview
- Read Path
- Write Path
- Client Connections
- Alerts
- Reaper
- Big Picture
At TLP we typically start off with the Overview dashboard. Depending on our metrics, we’ll switch to the Read Path, Write Path, or Client Connections dashboards for further investigation. Do keep in mind that while these dashboards are a work in progress, they do include proper x/y-axis labeling, units, and tooltips/descriptions. The tooltips/descriptions should provide info in the following order, when relevant:
- False Positives
- Required Actions
The text is meant to accompany any alerts that are triggered via the auto-generated Alerts dashboard. This way Slack, PagerDuty, or email alerts contain some context into why the alert was triggered, what most likely culprits may be involved, and how to resolve the alert.
Do note that while the other dashboards may have triggers set up, only the Alerts dashboard will fire the triggers since all the dashboards make use of Templating Variables for easy Environment, Data Center, and Host selection, which Grafana alerts do not yet support.
If there are ever any issues around repair, take a look at the Reaper dashboard as well to monitor the Reaper for Apache Cassandra service.
The Big Picture dashboard provides a 30,000 foot view of the cluster and is nice to reason about, but ultimately provides very little value other than showing the high-level trends of the Cassandra deployment being monitored.
The logspout service is a unique service in multiple respects. While this section will cover some of its special use cases, feel free to skim it, but do try to absorb as much information as possible since these sorts of use cases will arise in your future docker-compose.yml creations, even if not today.
At the top of the docker-compose.yml, under the logspout section, we define a build parameter. This build parameter references logspout/Dockerfile, which has no real information since the FROM image that our local image refers to uses a few ONBUILD commands. These commands are defined in the parent image and make use of logspout/modules.go. Our custom logspout/modules.go installs the required Logstash dependencies for use with pre-existing Logstash deployments.
While logspout/build.sh was not required to be duplicated since the parent image already had the file pre-baked, I did so for safekeeping.
The logspout/logspout.env file includes a few commented-out settings I felt would be interesting to look at if I were to experiment with Logstash in the future. These might be a good starting point for further investigation.
The logspout service also uses the restart: always setting to ensure that any possible issues with the log redirection service will automatically be resolved by restarting the service immediately after failure.
Within the logspout service, we are redirecting the container’s port 80 to our localhost’s port 8000. This allows us to curl http://localhost:8000/logs from our local machine and grab the logs that the container exposes on its own REST API under port 80.
In order for any of the logspout container magic to work, we need to bind our localhost’s /var/run/docker.sock into the container as a read-only mount. Even though the mount is read-only, there are still security concerns with doing such an operation. Since this line is required for allowing this logging redirection to occur, I did include two links to further clarify the security risks involved with including this container within production environments:
Perhaps in the future the Docker/Moby team will provide a more secure workaround.
The last line that has not been mentioned is the command option, which is commented out by default within the docker-compose.yml. By uncommenting the command option, we stop using the default command provided in the parent Dockerfile, which exposes the logs via a REST API, and we begin to send our logs to the PaperTrail website, which provides 100 MB/month of hosted logs for free.
In order to isolate checked-in code from each developer’s $PAPERTRAIL_PORT, each developer must locally create a copy of .env.template within their project’s directory under the filename .env. Within this .env file we can define environmental variables that will be used by docker-compose.yml. Once you have an account for PaperTrail, set PAPERTRAIL_PORT to the port assigned by PaperTrail to begin seeing your logs within PaperTrail.com. And since .env is set up to be ignored via .gitignore, we do not have to worry about sending multiple developers’ logs to the same location.
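A sketch of such a .env file (the port value is a placeholder — PaperTrail assigns the real one when you create a log destination):

```
# .env — git-ignored, one copy per developer
PAPERTRAIL_PORT=12345
```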
The prometheus service is one of the simpler services within this Docker Compose environment. It:
- Uses a Docker Hub image.
- Will require access to both the cassandra and cassandra-reaper services to monitor their processes.
- Will expose the container’s port 9090 onto localhost’s port 9090.
- Will persist all data within the container’s /prometheus directory onto a local volume mount.
- Will configure the container using a locally-stored copy of the prometheus.yml.
The prometheus.yml defines which REST endpoints Prometheus will consume to gather metrics from as well as a few other documented configurations. Prometheus will collect metrics from the following services:
- Cassandra internal metrics.
- collectd metrics.
- Reaper for Apache Cassandra.
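A minimal sketch of what such a scrape configuration could look like — the job names and ports here are assumptions for illustration, not the project’s actual values:

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: cassandra               # Prometheus JMX exporter REST endpoint
    static_configs:
      - targets: ["cassandra:7070"]
  - job_name: collectd                # collectd's Prometheus write plugin
    static_configs:
      - targets: ["cassandra:9103"]
  - job_name: reaper                  # Reaper for Apache Cassandra
    static_configs:
      - targets: ["cassandra-reaper:8081"]
```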
pickle-factory Sample Write Application Service
pickle-factory, which is a misnomer and should be “pickle-farm” in hindsight, is a sample application that shows an ideal write pattern.
For production and development purposes, the pickle-factory/Dockerfile uses the line COPY . . to copy all of the “microservice’s” code into the Docker image. The docker-compose.yml uses the volumes key to overwrite any pre-baked code to allow for a simpler development workflow.
The ideal development-to-production workflow would be to modify the pickle-factory/factory.py file directly and have each saved copy refreshed within the running container. Once all changes are tested and committed to master, a continuous integration (CI) job builds the new Docker images and pushes them to Docker Hub for both development and production consumption. The next developer to modify pickle-factory/factory.py will grab the latest image but use their local copy of pickle-factory/factory.py. The next time the production image gets updated it will include the latest copy of pickle-factory/factory.py, as found in master, directly in the Docker image.
Included in the Dockerfile is commented-out code for installing gosu, a sudo replacement for Docker written in Go. gosu allows the main user to spawn another process under a non-root user to better contain attackers. Theoretically, if an attacker gains access into the pickle-factory container, they would do so under a non-privileged user and not be able to potentially break out of the container into the Docker internals and onto the host machine. Practically, gosu limits the possible attack surface on a Docker container. To fully use the provided functionality of gosu, pickle-factory/docker-entrypoint.sh needs to be modified as well.
The pickle-factory/factory.py makes a few important design decisions:
- Grabs configurations from environmental variables instead of configuration files for easier Docker-first configurations.
- The Cluster() object uses the DCAwareRoundRobinPolicy in preparation for multiple data centers.
- The C-based LibevConnection event loop provider is used since it’s more performant than the Python-native default event loop provider.
- All statements are prepared in advance to minimize request payload sizes.
- In the background, the Cassandra nodes will send the statements around to each node in the cluster.
- The statements will be indexed via a simple integer.
- All subsequent requests will map the simple integer to the pre-parsed CQL statement.
- Employee records are generated and written asynchronously instead of synchronously in an effort to improve throughput.
- Workforce data is denormalized and written asynchronously with a max-buffer to ensure we don’t overload Cassandra with too many in-flight requests.
- If any requests did not complete successfully, the future.result() call will throw the matching exception.
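The max-buffer idea above can be sketched without a live cluster by standing in for the driver’s execute_async() with a thread pool — write_record and MAX_IN_FLIGHT are made-up names for illustration, not the project’s actual code:

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Stand-in for session.execute_async(); a real application would submit a
# prepared statement through the DataStax Python driver instead.
executor = ThreadPoolExecutor(max_workers=4)

def write_record(record):
    return ("ok", record)

MAX_IN_FLIGHT = 8  # hypothetical max-buffer size
in_flight = set()
results = []

for i in range(100):
    # Block whenever the buffer is full so the cluster never sees more
    # than MAX_IN_FLIGHT concurrent requests.
    if len(in_flight) >= MAX_IN_FLIGHT:
        done, in_flight = wait(in_flight, return_when=FIRST_COMPLETED)
        results.extend(f.result() for f in done)  # result() re-raises failures
    in_flight.add(executor.submit(write_record, {"employee_id": i}))

# Drain whatever is still in flight.
done, _ = wait(in_flight)
results.extend(f.result() for f in done)
```

Calling result() on each completed future is what surfaces any write exception, mirroring the future.result() behavior described above.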
pickle-shop Sample Read Application Service
Much like the pickle-factory microservice, the pickle-shop service follows a similar workflow. However, the pickle-shop service definition within docker-compose.yml will not overwrite the application code within /usr/src/app and instead requires rebuilding the local image with docker-compose build.
This workflow was chosen as a way to differentiate between a development workflow (pickle-factory) and a production workflow (pickle-shop). For production images, all code should ideally be baked into the Docker image and shipped without any external dependencies such as a codebase dependency.
The sample read application:
- Performs a synchronous read request to grab all the employee IDs.
- Performs 10 asynchronous read queries, one for each of 10 employees.
- Processes all results at roughly the same time.
If using a non-asynchronous driver, one would probably follow the alternate workflow:
- Perform 1 synchronous read query for 1 of the 10 employees.
- Process the results of that employee query.
- Repeat for each remaining employee.
However, following the synchronous workflow will take roughly:
O(N), where N is the number of employees waiting to be processed.
Following the asynchronous workflow will roughly take:
O(max(N)), where max(N) is the maximum amount of time a single employee query will take.
Simplified, the asynchronous workflow will roughly take constant time with respect to the number of employees.
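The difference can be sketched with plain asyncio standing in for the driver’s asynchronous reads — read_employee and the returned record shape are made up for illustration:

```python
import asyncio

async def read_employee(employee_id):
    # Stand-in for an asynchronous Cassandra read; a real application would
    # await the result of the DataStax driver's execute_async() instead.
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"employee_id": employee_id, "name": f"employee-{employee_id}"}

async def read_all(employee_ids):
    # Fire all queries at once and process the results together: the total
    # latency is roughly the slowest single query, not the sum of them all.
    return await asyncio.gather(*(read_employee(i) for i in employee_ids))

employees = asyncio.run(read_all(range(10)))
```

Swapping gather() for a sequential loop of awaits reproduces the synchronous O(N) workflow described above.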
By the end of this post, we have covered:
- Docker Compose
- A few docker-compose.yml settings.
- Dockerized Cassandra.
- Helper Docker Compose services.
- Dockerized Reaper for Apache Cassandra.
- Dockerized Grafana.
- Logspout for Docker-driven external logging.
- Dockerized Prometheus.
- A sample write-heavy asynchronous application using Cassandra.
- A sample read-heavy asynchronous application using Cassandra.
TLP’s hope is that the above documentation and minimal Docker Compose ecosystem will provide a starting point for future community Proof of Concepts utilizing Apache Cassandra. With this project, each developer can create, maintain, and develop within their own local environment without any external overhead other than Docker and Docker Compose.
Once the POC has reached a place of stability, simply adding the project to a Continuous Integration workflow to publish tested images will allow for proper Release Management. After that point, the responsibility typically falls onto the DevOps team to grab the latest Docker image, replace any previously existing Docker containers, and launch the new Docker image. This would all occur without any complicated hand-off consisting of dependencies, configuration files (since we’re using environmental variables), or OS-specific requirements.
Hopefully you will find Docker Compose to be as powerful as I’ve found it to be! Best of luck in cranking out the new POC you’ve been itching to create!