In this blog post we will take a look at consistency mechanisms in Apache Cassandra. There are three reasonably well documented features serving this purpose:
We’re delighted to introduce cassandra-reaper.io, the dedicated site for the open source Reaper project! Since we adopted Reaper from the incredible folks at Spotify, we’ve added a significant number of features, expanded the supported versions past 2.0, added support for incremental repair, and added a Cassandra backend to simplify operations.
A handy feature was silently added to Apache Cassandra’s
nodetool just over a year ago. The feature added was the
-j (jobs) option. This little gem controls the number of compaction threads to use when running either a
upgradesstables. The option was added to
nodetool via CASSANDRA-11179 to version 3.5. It has been back ported to Apache Cassandra versions 2.1.14, 2.2.6, and 3.5.
One of the big challenges people face when starting out working with Cassandra and time series data is understanding the impact of how your write workload will affect your cluster. Writing too quickly to a single partition can create hot spots that limit your ability to scale out. Partitions that get too large can lead to issues with repair, streaming, and read performance. Reading from the middle of a large partition carries a lot of overhead, and results in increased GC pressure. Cassandra 4.0 should improve the performance of large partitions, but it won’t fully solve the other issues I’ve already mentioned. For the foreseeable future, we will need to consider their performance impact and plan for them accordingly.