Zipkin Tracing
Apache Cassandra

About The Last Pickle


Work with clients
to deliver and improve

Apache Cassandra-based solutions

Based in
USA, New Zealand, Australia

microservices - DevOps - distributed tracing

zipkin

zipkin & cassandra

Apache Cassandra the data platform de jure
for the next evolution of software services


an enterprise moving ever towards
microservices and BASE architectures


the missing piece for many is tracing and profiling difficult-to-reproduce problems

Scaling Data


Apache Cassandra
data platform de jure
next evolution of software services

Scaling People


an enterprise moving ever towards
microservices and BASE architectures

architectural safety


  • zero exceptions
  • spot problems first
  • stable master
  • no user left out
  • plug-n-play services


tracing and profiling difficult-to-reproduce problems

Zipkin


an implementation of Google's Dapper paper

search traces

analyse one trace

realtime in browser

platform call graph

client    |    server



CS -->                        
                        --> SR

                        <-- SS
CR <--                        
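
As a rough illustration of what the four annotations give you (CS = client send, SR = server receive, SS = server send, CR = client receive), the sketch below derives timings from them. The class SpanTimings and its method names are hypothetical, not part of Zipkin or Brave, and timestamps are assumed to be in microseconds. Only differences taken on the same host are used, since client and server clocks may disagree.

```java
// Hypothetical helper for illustration only; not a Zipkin or Brave class.
final class SpanTimings {
    final long cs; // client send    (client clock, microseconds, assumed unit)
    final long sr; // server receive (server clock)
    final long ss; // server send    (server clock)
    final long cr; // client receive (client clock)

    SpanTimings(long cs, long sr, long ss, long cr) {
        this.cs = cs;
        this.sr = sr;
        this.ss = ss;
        this.cr = cr;
    }

    // Total time as seen by the caller: both timestamps come from the client clock.
    long clientDuration() {
        return cr - cs;
    }

    // Time spent inside the server: both timestamps come from the server clock.
    long serverDuration() {
        return ss - sr;
    }

    // What is left over: network transfer plus any queueing, in both directions.
    long networkAndQueueTime() {
        return clientDuration() - serverDuration();
    }

    public static void main(String[] args) {
        SpanTimings t = new SpanTimings(1_000, 1_200, 1_900, 2_050);
        System.out.println("client duration : " + t.clientDuration() + "us");
        System.out.println("server duration : " + t.serverDuration() + "us");
        System.out.println("network + queue : " + t.networkAndQueueTime() + "us");
    }
}
```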


simple http call


simple c* call


http call passing through headers
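
A minimal sketch of what "passing through headers" can look like, assuming the B3 header names used by Zipkin (X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId, X-B3-Sampled). The class B3Propagation and its methods are hypothetical; in practice an instrumentation library such as Brave does this for you. The trace id is carried over unchanged, the incoming span becomes the parent, and a fresh span id is generated for the outgoing call.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper for illustration only; not part of Zipkin or Brave.
final class B3Propagation {

    // Ids are rendered as 16 lower-case hex characters (64 bits).
    static String hex(long id) {
        return String.format("%016x", id);
    }

    // Builds the headers for a downstream call from the headers of the incoming request.
    static Map<String, String> childHeaders(Map<String, String> incoming) {
        Map<String, String> out = new LinkedHashMap<>();
        // Same trace for the whole request tree.
        out.put("X-B3-TraceId", incoming.get("X-B3-TraceId"));
        // The span we are in becomes the parent of the downstream span.
        out.put("X-B3-ParentSpanId", incoming.get("X-B3-SpanId"));
        // New id for the downstream span.
        out.put("X-B3-SpanId", hex(ThreadLocalRandom.current().nextLong()));
        // The sampling decision is made once and passed along untouched.
        String sampled = incoming.get("X-B3-Sampled");
        if (sampled != null) {
            out.put("X-B3-Sampled", sampled);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> incoming = new HashMap<>();
        incoming.put("X-B3-TraceId", "463ac35c9f6413ad");
        incoming.put("X-B3-SpanId", "a2fb4a1d1a96d312");
        incoming.put("X-B3-Sampled", "1");
        childHeaders(incoming).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```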


Tracing in C*

    CO-ORDINATOR NODE                    REPLICA NODE


 -->
    beginSession(..)
        trace(..)
        trace(..)
                                    --> initialiseMessage(..)
                                            trace(..)
                                            trace(..)
                                    <--
        trace(..)
    endSession(..)
 <--
                        

Zipkin in C*


  • visualisation
  • detailed timings
  • hierarchy and asynchronicity
  • zero tracing overhead

Two classes to override

public class ZipkinTracing extends Tracing
{..}

public class ZipkinTraceState extends TraceState
{..}


then run with the new tracing class enabled
bin/cassandra -Dcassandra.custom_tracing_class=..ZipkinTracing
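
The general pattern behind a flag like this, choosing an implementation by class name from a system property, can be sketched as follows. All names here (DemoTracing, ZipkinDemoTracing, the property demo.custom_tracing_class) are hypothetical stand-ins. Cassandra's real Tracing class has a much larger API and its loading code may differ.

```java
// Hypothetical stand-in for a pluggable base class; not Cassandra's Tracing.
abstract class DemoTracing {
    // Made-up property name, used only in this sketch.
    static final String PROPERTY = "demo.custom_tracing_class";

    abstract String name();

    // Instantiates the class named by the system property, or the default if unset or unusable.
    static DemoTracing load() {
        String className = System.getProperty(PROPERTY);
        if (className != null) {
            try {
                return (DemoTracing) Class.forName(className)
                                          .getDeclaredConstructor()
                                          .newInstance();
            } catch (ReflectiveOperationException | ClassCastException e) {
                System.err.println("Cannot use " + className + ", falling back to default: " + e);
            }
        }
        return new DefaultDemoTracing();
    }

    public static void main(String[] args) {
        System.out.println("without property: " + load().name());
        System.setProperty(PROPERTY, ZipkinDemoTracing.class.getName());
        System.out.println("with property   : " + load().name());
    }
}

// Used when no custom class is configured.
class DefaultDemoTracing extends DemoTracing {
    @Override
    String name() {
        return "default";
    }
}

// The plug-in: selected purely by configuration, no change to the caller.
class ZipkinDemoTracing extends DemoTracing {
    @Override
    String name() {
        return "zipkin";
    }
}
```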

Zipkin across C*

    CO-ORDINATOR NODE                    REPLICA NODE


 -->
    beginSession(..)
        trace(..)
        trace(..)
                   (zipkin headers) --> initialiseMessage(..)
                                            trace(..)
                                            trace(..)
                                    <--
        trace(..)
    endSession(..)
 <--
                        

Zipkin into C*


to the rescue

http call passing through headers




c* call using custom payload
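
A minimal sketch of packing trace identifiers into a custom payload on the client side. A custom payload is a map of string keys to byte buffers; with the DataStax Java driver 2.2 it is attached to a statement via setOutgoingPayload(..). The key name "zipkin" and the layout (trace id followed by span id, 8 bytes each) are assumptions made for this example, so check the cassandra-zipkin-tracing project for the format it actually expects. The class TracePayload is hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.Map;

// Hypothetical helper for illustration only.
final class TracePayload {
    // Assumed key name; verify against the server-side tracing implementation.
    static final String KEY = "zipkin";

    // Packs the two ids into a 16-byte buffer (assumed layout: trace id, then span id).
    static Map<String, ByteBuffer> encode(long traceId, long spanId) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(traceId).putLong(spanId);
        buf.flip(); // make the buffer readable from the start
        return Collections.singletonMap(KEY, buf);
    }

    // Reads the ids back without moving the position of the caller's buffer.
    static long[] decode(Map<String, ByteBuffer> payload) {
        ByteBuffer buf = payload.get(KEY).duplicate();
        return new long[] { buf.getLong(), buf.getLong() };
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> payload = encode(0x463ac35c9f6413adL, 0xa2fb4a1d1a96d312L);
        long[] ids = decode(payload);
        System.out.println("trace id: " + Long.toHexString(ids[0]));
        System.out.println("span id : " + Long.toHexString(ids[1]));
        // With the driver on the classpath, the map would then be attached to a statement,
        // for example: statement.setOutgoingPayload(payload);
    }
}
```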




enable zipkin tracing and the custom payload handler

bin/cassandra
    -Dcassandra.custom_tracing_class=..ZipkinTracing
    -Dcassandra.custom_query_handler_class=..CustomPayloadMirroringQueryHandler


the patch?

src/java/org/apache/cassandra/net/MessageOut.java                    |  7 +------
src/java/org/apache/cassandra/net/OutboundTcpConnection.java         |  4 +++-
src/java/org/apache/cassandra/service/QueryState.java                | 12 ++++++++++--
src/java/org/apache/cassandra/transport/messages/ExecuteMessage.java |  2 +-
4 files changed, 15 insertions(+), 10 deletions(-)

what else?


  • anti-entropy repair
  • compaction

Thanks

  • Zipkin     – https://github.com/openzipkin/zipkin

  • Brave (Zipkin Java instrumentation)
                – https://github.com/openzipkin/brave

  • C* patch for pluggable tracing
                – https://issues.apache.org/jira/browse/CASSANDRA-10392

  • Zipkin Cassandra implementation
                – https://github.com/thelastpickle/cassandra-zipkin-tracing

  • Google's Dapper paper
                – http://research.google.com/pubs/pub36356.html

  • C* custom payloads
                – https://issues.apache.org/jira/browse/CASSANDRA-8553
                – https://datastax.github.io/java-driver/2.2.0-rc2/features/custom_payloads/