Custom commands in cstar
Welcome to the next part of the cstar post series. The previous post introduced cstar and showed how it can run simple shell commands using various execution strategies. In this post, we will teach you how to build more complex custom commands.
Basic Custom Commands
Out of the box, cstar comes with three commands:
$ cstar
usage: cstar [-h] {continue,cleanup-jobs,run} ...
cstar
positional arguments:
{continue,cleanup-jobs,run}
continue Continue a previously created job (*)
cleanup-jobs Cleanup old finished jobs and exit (*)
run Run an arbitrary shell command
Custom commands allow extending these three with anything one might find useful. Adding a custom command to cstar is as easy as placing a file in ~/.cstar/commands or /etc/cstar/commands. For example, we can create ~/.cstar/commands/status that looks like this:
#!/usr/bin/env bash
nodetool status
With this file in place, cstar now features a brand new status command:
$ cstar
usage: cstar [-h] {continue,cleanup-jobs,run,status} ...
cstar
positional arguments:
{continue,cleanup-jobs,run,status}
continue Continue a previously created job (*)
cleanup-jobs Cleanup old finished jobs and exit (*)
run Run an arbitrary shell command
status
A command like this allows us to stop using:
cstar run --command "nodetool status" --seed-host <host_ip>
And use a shorter version instead:
cstar status --seed-host <host_ip>
We can also declare the command description and default values for cstar’s options in the command file. We can do this by including commented lines with a special prefix. For example, we can include the following lines in our ~/.cstar/commands/status file:
#!/usr/bin/env bash
# C* cluster-parallel: true
# C* dc-parallel: true
# C* strategy: all
# C* description: Run nodetool status
nodetool status
Once we do this, the status command will show up with a proper description in cstar’s help, and running cstar status --seed-host <host_ip> will be equivalent to:
cstar status --seed-host <host_ip> --cluster-parallel --dc-parallel --strategy all
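To double-check which defaults a command file declares, the special comment lines can be pulled out with standard tools. The following is a minimal sketch and not how cstar itself reads the file; the function name list_cstar_properties is hypothetical, and it only assumes the "# C* " prefix shown above.

```shell
# Print the "key: value" part of every "# C* " line in a command file.
# list_cstar_properties is a hypothetical helper, not part of cstar.
list_cstar_properties() {
    # -n suppresses the default output; the p flag prints only the lines
    # on which the prefix was found and stripped
    sed -n 's/^# C\* //p' "$1"
}
```

For example, list_cstar_properties ~/.cstar/commands/status would list the declared properties of the status command, one per line.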
When cstar begins the execution of a command, it will print a unique ID of the job being run. This ID is needed for resuming a job, but more on this later. We also need the job ID to examine the output of the commands. We can find the output in:
$ ~/.cstar/jobs/<job_id>/<hostname>/out
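With many hosts, opening each output file by hand gets tedious. Below is a minimal sketch of a helper that prints the output of every host for one job. The name show_job_output is hypothetical, and the sketch assumes that out is a regular file inside each host directory, following the path layout above.

```shell
# Print the captured output of every host for one job.
# show_job_output is a hypothetical helper; the directory layout
# <jobs_dir>/<job_id>/<hostname>/out is the one described above.
show_job_output() {
    jobs_dir=$1
    job_id=$2
    for out_file in "$jobs_dir/$job_id"/*/out; do
        # Skip the unexpanded glob when the job has no host directories
        [ -f "$out_file" ] || continue
        host=$(basename "$(dirname "$out_file")")
        echo "=== $host ==="
        cat "$out_file"
    done
}
```

It could be called as show_job_output ~/.cstar/jobs <job_id>.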
Parametrized Custom Commands
When creating custom commands, cstar allows declaring custom arguments as well. We will explain this feature by introducing a command that deletes snapshots older than a given number of days.
We will create a new file, ~/.cstar/commands/clear-snapshots, that will start like this:
#!/usr/bin/env bash
# C* cluster-parallel: true
# C* dc-parallel: true
# C* strategy: all
# C* description: Clear snapshots older than given number of days
# C* argument: {"option":"--days", "name":"DAYS", "description":"Snapshots older than this many days will be deleted", "default":"7", "required": false}
The new element here is the last line, starting with the # C* argument: prefix. Upon seeing this prefix, cstar will parse the remainder of the line as a JSON payload describing the custom argument. In the case above, cstar will:
- Use --days as the name of the argument.
- Save the value of this argument into a variable named DAYS. We will see how to access this in a bit.
- Associate a description with this argument.
- Use 7 as a default value.
- Not require this option.
With this file in place, cstar already features the command in its help:
$ cstar
usage: cstar [-h] {continue,cleanup-jobs,run,status,clear-snapshots} ...
cstar
positional arguments:
{continue,cleanup-jobs,run,status,clear-snapshots}
continue Continue a previously created job (*)
cleanup-jobs Cleanup old finished jobs and exit (*)
run Run an arbitrary shell command
status Run nodetool status
clear-snapshots Clear snapshots older than given number of days
$ cstar clear-snapshots --help
usage: cstar clear-snapshots [-h] [--days DAYS]
[--seed-host [SEED_HOST [SEED_HOST ...]]]
...
<other default options omitted>
optional arguments:
-h, --help show this help message and exit
--days DAYS
Snapshots older than this many days will be deleted
--seed-host [SEED_HOST [SEED_HOST ...]]
One or more hosts to use as seeds for the cluster
...
<other default options omitted>
Now we need to add the command which will actually clear the snapshots. This command needs to do three things:
- Find the snapshots that are older than a given number of days.
  - We will use the -mtime filter of the find utility: find /var/lib/cassandra/data/*/*/snapshots/ -mtime +"$DAYS" -type d
  - Note we are using "$DAYS" to reference the value of the custom argument.
- Extract the snapshot names from the findings.
  - We got absolute paths to the directories found. Snapshot names are the last portion of these paths. Also, we will make sure to keep each snapshot name only once: sed -e 's#.*/##' | sort -u
- Invoke nodetool clearsnapshot -t <snapshot_name> to clear each of the snapshots.
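The name-extraction step can be tried on its own, without a Cassandra node. The sketch below feeds a few made-up paths (hypothetical keyspace, table and snapshot names) through the same sed and sort steps; snapshot_names is a hypothetical wrapper used only for illustration.

```shell
# Reduce a list of snapshot directory paths (one per line on stdin)
# to unique snapshot names, using the same sed and sort steps.
snapshot_names() {
    sed -e 's#.*/##' | sort -u
}

# Hypothetical paths, in the shape find would print them
printf '%s\n' \
    /var/lib/cassandra/data/ks1/t1/snapshots/backup-a \
    /var/lib/cassandra/data/ks1/t2/snapshots/backup-a \
    /var/lib/cassandra/data/ks2/t1/snapshots/backup-b |
    snapshot_names
```

The snapshot backup-a exists under two tables but is listed only once, next to backup-b, so each snapshot name is handed to nodetool clearsnapshot a single time.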
Putting this all together, the clear-snapshots file will look like this:
#!/usr/bin/env bash
# C* cluster-parallel: true
# C* dc-parallel: true
# C* strategy: all
# C* description: Clear snapshots older than given number of days
# C* argument: {"option":"--days", "name":"DAYS", "description":"Snapshots older than this many days will be deleted", "default":"7", "required": false}
find /var/lib/cassandra/data/*/*/snapshots/ -mtime +"$DAYS" -type d |\
sed -e 's#.*/##' |\
sort -u |\
while read line; do nodetool clearsnapshot -t "${line}"; done
We can now run the clear-snapshots command like this:
$ cstar clear-snapshots --days 2 --seed-host <seed_host>
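Because the value of --days ends up inside a find expression, a command file might reject anything that is not a whole number before doing any work. The post does not say whether cstar validates argument values, so this minimal sketch keeps the check in the script itself; is_whole_number is a hypothetical helper name.

```shell
# Succeed only when the argument consists solely of digits.
# is_whole_number is a hypothetical helper, not part of cstar.
is_whole_number() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;  # empty, or contains a non-digit
        *) return 0 ;;
    esac
}

# DAYS is normally provided by cstar; default it here so the sketch runs on its own
DAYS=${DAYS:-7}
if ! is_whole_number "$DAYS"; then
    echo "--days must be a whole number, got: $DAYS" >&2
    exit 1
fi
```

Placed before the find pipeline, a guard like this makes a mistyped value such as 7d fail early instead of producing a find error on every node.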
Complex Custom Commands
One of the main reasons we consider cstar so useful is that the custom commands can be arbitrary shell scripts, not just the one-liners we have seen so far. To illustrate this, we are going to share two relatively complicated commands.
Upgrading Cassandra Version
The first command will cover a rolling upgrade of the Cassandra version. Generally speaking, the upgrade should happen as quickly as possible and with as little downtime as possible. This is the ideal application of cstar’s topology strategy: it will execute the upgrade on as many nodes as possible while ensuring a quorum of replicas stays up at any moment. The upgrade of each node should then follow these steps:
- Create snapshots to allow rollback if the need arises.
- Upgrade the Cassandra installation.
- Restart the Cassandra process.
- Check the upgrade happened successfully.
Clearing the snapshots or upgrading SSTables should not be part of the upgrade itself. Snapshots, being just hardlinks, will not consume excessive space, and Cassandra (in most cases) can operate with older SSTable versions. Once all nodes are upgraded, these actions are easy enough to perform with dedicated cstar commands.
The ~/.cstar/commands/upgrade command might look like this:
#!/usr/bin/env bash
# C* cluster-parallel: true
# C* dc-parallel: true
# C* strategy: topology
# C* description: Upgrade Cassandra package to given target version
# C* argument: {"option":"--snapshot-name", "name":"SNAPSHOT_NAME", "description":"Name of pre-upgrade snapshot", "default":"preupgrade", "required": false}
# C* argument: {"option":"--target-version", "name":"VERSION", "description":"Target version", "required": true}
# -x prints the executed commands to standard error
# -e fails the entire script if any of the commands fails
# -u fails the script if any of the variables is not bound
# -o pipefail instructs the interpreter to return the right-most non-zero status of a piped command in case of failure
set -xeuo pipefail
# exit if a node is already on the target version
if [[ $(nodetool version) == *$VERSION ]]; then
    exit 0
fi
# create Cassandra snapshots to allow rollback in case of problems
nodetool clearsnapshot -t "$SNAPSHOT_NAME"
nodetool snapshot -t "$SNAPSHOT_NAME"
# upgrade Cassandra version
sudo apt-get install -y cassandra="$VERSION"
# gently stop the cassandra process
nodetool drain && sleep 5 && sudo service cassandra stop
# start the Cassandra process again
sudo service cassandra start
# wait for Cassandra to start answering JMX queries
# (stop waiting as soon as nodetool succeeds, give up after 60 attempts)
for i in $(seq 60); do
    if nodetool version > /dev/null 2>&1; then
        break
    fi
    sleep 1s
done
# fail if the upgrade did not happen
if ! [[ $(nodetool version) == *$VERSION ]]; then
    exit 1
fi
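The wait-for-JMX step is a pattern that other custom commands are likely to need too. One way to factor it out, as a minimal sketch: a retry helper that runs a command once per second until it succeeds. The name wait_until is hypothetical, and the helper uses plain shell variables, so it would overwrite any existing variables named attempts or i.

```shell
# Retry a command once per second until it succeeds or the attempts run out.
# wait_until is a hypothetical helper name.
# Usage: wait_until <max_attempts> <command> [args...]
wait_until() {
    attempts=$1
    shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Discard both stdout and stderr; only the exit status matters
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

In the upgrade script, the waiting step could then be written as wait_until 60 nodetool version. Because the script runs with set -e, a timeout would abort the script with a non-zero status.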
When running this command, we can be extra-safe and use the --stop-after option:
$ cstar upgrade --seed-host <host_name> --target-version 3.11.2 --stop-after 1
This will instruct cstar to upgrade only one node and exit the execution. Once that happens, we can take our time to inspect the node to see if the upgrade went smoothly. When we are confident enough, we can resume the command. Output of each cstar command starts with a highlighted job identifier, which we can use with the continue command:
$ cstar continue <job_id>
Changing Compaction Strategy
The second command we would like to share performs a compaction strategy change in a rolling fashion.
Compaction configuration is a table property. Changing it requires executing an ALTER TABLE CQL statement, which takes effect immediately across the cluster. This means once we issue the statement, each node will react to the compaction change. The exact reaction depends on the change, but it generally translates to increased compaction activity. It is not always desirable to have this happen: compaction can be an intrusive process and affect the cluster performance.
Thanks to CASSANDRA-9965, there is a way of altering compaction configuration on a single node via JMX since Cassandra version 2.1.9. We can set the CompactionParametersJson MBean value and change the compaction configuration the node uses. Once we know how to change one node, we can have cstar do the same but across the whole cluster.
Once we change the compaction settings, we should also manage the aftermath. Even though the change is effective immediately, it might take a very long time until each SSTable undergoes a newly configured compaction. The best way of doing this is to trigger a major compaction and wait for it to finish. After a major compaction, all SSTables are organised according to the new compaction settings and there should not be any unexpected compaction activity afterwards.
While cstar is excellent at checking which nodes are up or down, it does not check other aspects of node health. It does not have the ability to monitor compaction activity. Therefore we should include the wait for major compaction in the command we are about to build. The command will then follow these steps:
- Stop any compactions that are currently happening.
- Set the CompactionParametersJson MBean to the new value.
  - We will use jmxterm for this and assume the JAR file is already present on the nodes.
- Run a major compaction to force Cassandra to organise SSTables according to the new setting and make cstar wait for the compactions to finish.
  - This step is not mandatory. Cassandra would re-compact the SSTables eventually.
  - Doing a major compaction will cost extra resources and possibly impact the node’s performance. We do not recommend doing this on all nodes in parallel.
  - We are taking advantage of the topology strategy which will guarantee a quorum of replicas free from this load at any time.
The ~/.cstar/commands/change-compaction command might look like this:
#! /bin/bash
# C* cluster-parallel: true
# C* dc-parallel: true
# C* strategy: topology
# C* description: Switch compaction strategy using jmxterm and perform a major compaction on a specific table
# C* argument: {"option":"--keyspace-name", "name":"KEYSPACE", "description":"Keyspace containing the target table", "required": true}
# C* argument: {"option":"--table", "name":"TABLE", "description":"Table to switch the compaction strategy on", "required": true}
# C* argument: {"option":"--compaction-parameters-json", "name":"COMPACTION_PARAMETERS_JSON", "description":"New compaction parameters", "required": true}
# C* argument: {"option":"--major-compaction-flags", "name":"MAJOR_COMPACTION_FLAGS", "description":"Flags to add to the major compaction command", "default":"", "required": false}
# C* argument: {"option":"--jmxterm-jar-location", "name":"JMXTERM_JAR", "description":"jmxterm jar location on disk", "required": true}
set -xeuo pipefail
echo "Switching compaction strategy on $KEYSPACE.$TABLE"
echo "Stopping running compactions"
nodetool stop COMPACTION
echo "Altering compaction through JMX..."
echo "set -b org.apache.cassandra.db:columnfamily=$TABLE,keyspace=$KEYSPACE,type=ColumnFamilies CompactionParametersJson $COMPACTION_PARAMETERS_JSON" | java -jar $JMXTERM_JAR --url 127.0.0.1:7199 -e
echo "Running a major compaction..."
nodetool compact ${KEYSPACE} ${TABLE} $MAJOR_COMPACTION_FLAGS
The command requires options specifying which keyspace and table to apply the change on. The jmxterm location and the new value for the compaction parameters are another two required arguments. The command also allows passing in flags to the major compaction. This is useful for cases when we are switching to SizeTieredCompactionStrategy, where the -s flag will instruct Cassandra to produce several size-tiered files instead of a single big file.
The nodetool compact command will not return until the major compaction finishes, so the execution on a node will not complete until then. Consequently, cstar will see this long execution and dutifully wait for it to complete before moving on to other nodes.
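A mistyped keyspace or table name in the MBean only surfaces once jmxterm runs on a node, so it can help to print the jmxterm line locally first. The sketch below builds the same set line as the script; jmxterm_set_line is a hypothetical helper name, and the MBean name and attribute are copied from the script above.

```shell
# Build the jmxterm "set" line for a table's compaction parameters.
# jmxterm_set_line is a hypothetical helper; the MBean name and attribute
# are the ones used in the change-compaction script.
jmxterm_set_line() {
    keyspace=$1
    table=$2
    params_json=$3
    printf 'set -b org.apache.cassandra.db:columnfamily=%s,keyspace=%s,type=ColumnFamilies CompactionParametersJson %s\n' \
        "$table" "$keyspace" "$params_json"
}
```

Printing the line with the intended keyspace, table and JSON gives a chance to review it before any node is touched.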
Here is an example of running this command:
$ cstar change-compaction --seed-host <host_name> --keyspace-name tlp_stress --table KeyValue --jmxterm-jar-location /usr/share/jmxterm-1.0.0-uber.jar --compaction-parameters-json "{\"class\":\"LeveledCompactionStrategy\",\"sstable_size_in_mb\":\"120\"}"
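The backslash-escaped JSON in this invocation is easy to get wrong. As far as the local shell is concerned, wrapping the JSON in single quotes produces exactly the same argument, which the following small sketch checks (the variable names are only for illustration):

```shell
# The escaped double-quoted form from the example above...
escaped="{\"class\":\"LeveledCompactionStrategy\",\"sstable_size_in_mb\":\"120\"}"
# ...and the same JSON written inside single quotes, with no escaping needed
single='{"class":"LeveledCompactionStrategy","sstable_size_in_mb":"120"}'

if [ "$escaped" = "$single" ]; then
    echo "identical"
fi
```

Since both strings are identical, --compaction-parameters-json could equally be given the single-quoted form.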
This command also benefits from the --stop-after option. Moreover, once all nodes are changed, we should not forget to persist the schema change by running the actual ALTER TABLE statement.
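The final ALTER TABLE statement can be derived from the same JSON that was passed to the command. The sketch below is one possible way to do it and rests on a loud assumption: the JSON is a flat object of string values with no embedded quotes, so swapping double quotes for single quotes yields a valid CQL map literal. The helper name alter_compaction_cql is hypothetical.

```shell
# Print an ALTER TABLE statement matching the given compaction parameters.
# alter_compaction_cql is a hypothetical helper. It assumes a flat JSON
# object of strings without embedded quotes.
alter_compaction_cql() {
    keyspace=$1
    table=$2
    # CQL map literals use single quotes where JSON uses double quotes
    cql_map=$(printf '%s' "$3" | tr '"' "'")
    printf 'ALTER TABLE %s.%s WITH compaction = %s;\n' "$keyspace" "$table" "$cql_map"
}
```

The printed statement can be reviewed and then executed once, for instance through cqlsh, after the last node has finished its major compaction.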
Conclusion
In this post we talked about cstar and its feature of adding custom commands. We have seen:
- How to add a simple command to execute nodetool status on all nodes at once.
- How to define custom parameters for our commands, which allowed us to build a command for deleting old snapshots.
- That the custom commands are essentially regular bash scripts and can include multiple statements. We used this feature to do a safe and fast Cassandra version upgrade.
- That the custom commands can call external utilities such as jmxterm, which we used to change compaction strategy for a table in a rolling fashion.
In the next post, we are going to look into cstar’s cousin called cstarpar. cstarpar differs in the way commands are executed on remote nodes and allows for heavier operations such as rolling reboots.