Apache Cassandra has a feature called Read Repair Chance that we always recommend our clients to disable. It is often an additional ~20% internal read load cost on your cluster that serves little purpose and provides no guarantees.
Recently, we’ve performed a health check on a cluster that was having transient performance issues. One of the main tables quickly caught our attention: it had 135 columns and latencies were suboptimal. We suspected the number of columns to be causing extra latencies and created some stress profiles to verify this theory, answering the following question: What is the impact of having lots of columns in an Apache Cassandra table?
As Apache Cassandra consultants, we get to review a lot of data models. Best practices claim that the number of tables in a cluster should not exceed one hundred. But we rarely see proper benchmarks evidencing the impact of excessive tables on performance. In this blog post, we’ll discuss the potential impacts of large data models and run benchmarks to verify our assumptions.
DataStax Astra is a database-as-a-service (DBaaS) for Apache Cassandra. It is available on AWS, GCP and Azure. Starting with version 2.1, Reaper can use DataStax Astra as a serverless storage backend for its data. In this post, we will walk you through the steps for setting it up in just a few minutes.
Reaper is our tool for managing repairs for Apache Cassandra.
tlp-stress is our tool for benchmarking Apache Cassandra clusters.
tlp-cluster is our tool for quickly provisioning Apache Cassandra clusters for test purposes.