This blog post describes how to monitor Apache Cassandra using the Intel Snap open source telemetry framework. The document also covers some introductory knowledge on how monitoring in Cassandra works. It will use Apache Cassandra 3.0.10 and the resulting monitoring metrics will be visualised using Grafana. Docker containers will be used for Intel Snap and Grafana.
The amount of metrics your platform is collecting can overwhelm a metrics system. This is a common problem as many of today’s metrics solutions like Graphite do not scale successfully. If you don’t have the option to use a metrics backend that can scale, like DataDog, you’re left trying to find a way to cut back the number of metrics you’re collecting. This blog goes through some customisations that provide improvements alleviating Graphite’s inability to scale. It describes how to install and use customisations made to the metrics and metrics-reporter-config libraries used in Cassandra.
Compaction in Apache Cassandra isn’t usually the first (or second) topic that gets discussed when it’s time to start optimizing your system. Most of the time we focus on data modeling and query patterns. An incorrect data model can turn a single query into hundreds of queries, resulting in increased latency, decreased throughput, and missed SLAs. If you’re using spinning disks the problem is magnified by time consuming disk seeks.
That said, compaction is also an incredibly important process. Understanding how a compaction strategy complements your data model can have a significant impact on your application’s performance. For instance, in Alex Dejanovski’s post on TimeWindowCompactionStrategy, he shows how a simple change to the compaction strategy can significantly decrease disk usage. As he demonstrated, a cluster mainly concerned with high rates of TTL’ed time series data can achieve major space savings and significantly improved performance. Knowing how each compaction strategy works in detail will help you make the right choice for your data model and access patterns. Likewise, knowing the nuance of compaction in general can help you understand why the system isn’t behaving as you’d expect when there’s a problem. In this post we’ll discuss some of the nuance of compaction, which will help you better know your database.
Apache Cassandra can store data on disk in an orderly fashion, which makes it great for time series. As you may have seen in numerous tutorials, to get the last 10 rows of a time series, just use a descending clustering order and add a LIMIT 10 clause. Simple and efficient!
Well if we take a closer look, it might not be as efficient as one would think, which we will cover in this blog post.