The de facto tool for modeling and testing workloads on Cassandra is cassandra-stress. It is widely known, appears in numerous blog posts to illustrate performance testing on Cassandra, and is often recommended for stress testing specific data models. In theory, there is no reason why cassandra-stress couldn’t fit your performance testing needs. In practice, it has some caveats when modeling real workloads, the most important of which we will cover in this blog post.
In our first post about TimeWindowCompactionStrategy, Alex Dejanovski discussed use cases and the reasons for its introduction in 3.0.8 as a replacement for DateTieredCompactionStrategy. In our experience switching production environments storing time series data to TWCS, we have seen the performance of many production systems improve dramatically.
In this post we’ll explore a new compaction strategy available in Apache Cassandra. We’ll dig into its use cases and limitations, and share our experiences of using it with various production clusters.
In this post I’ll introduce you to an advanced option in Apache Cassandra called user defined compaction. As the name implies, this is a process by which we explicitly tell Cassandra to create a compaction task for one or more SSTables. This task is then handed off to the Cassandra runtime to be executed like any other compaction.