When people think about tuning Apache Cassandra for better performance, their first instinct is usually to look at hardware, the JVM or the Cassandra configuration. But this is only one side of things: the client application which connects to Cassandra can be tuned as well. Applications which write to and read from Cassandra use a driver to connect, and the DataStax driver has become the standard in the last few years. This blog post concentrates on the client side, and we will try to explain what can be done in the driver so that applications using Cassandra perform better.
This is the first post in a series. The first couple of posts cover a few settings which can have a large impact on performance, and a later post will explain a few tricks from our own use case which we applied to squeeze out additional performance. We will concentrate on application-side tuning only; if you want to find out more about tuning Cassandra itself, you can watch the Cassandra Tuning - Above and Beyond presentation from Cassandra Summit 2016.
The first thing you can look into is the connection pooling options, especially if you connect to a version of Cassandra which supports the V3 protocol (check which version supports which protocol). DataStax wanted the driver to generate the same load whether you connect to Cassandra with the V2 or the V3 protocol. With V2 the driver kept 2-8 connections, each running up to 128 simultaneous requests, which gives at most 8 x 128 = 1024 requests in flight. With V3 they lowered the core and maximum number of connections to 1, so they left a low value of 1024 as the default number of simultaneous requests per connection (check the documentation on simultaneous requests per connection). This is the prime reason why the V3 defaults are far from the optimal settings you can use to communicate with Cassandra. The reason for moving to a single connection was that dynamic resizing of the connection pool can be expensive, and with up to 32K simultaneous requests per connection in V3, having more than 1 connection is not optimal. Make sure to have proper monitoring of connection load in place before you raise the limit of 1024 requests per connection, as suggested in the documentation.
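One way to get that monitoring in place is to periodically log how many requests are in flight on each host compared to what the pool allows. The following is a minimal sketch, assuming the 3.x Java driver and an already connected Session; the class name ConnectionLoadMonitor and the reporting period are our own illustrative choices, and in a real application you would probably send these numbers to your metrics system instead of standard output.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.LoadBalancingPolicy;

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of the driver): logs connection load per host.
public class ConnectionLoadMonitor {

    // Starts a background task which prints, every periodSeconds, the number of
    // open connections, in-flight requests and theoretical capacity for each host.
    public static ScheduledExecutorService start(final Cluster cluster,
                                                 final Session session,
                                                 long periodSeconds) {
        final LoadBalancingPolicy loadBalancingPolicy =
                cluster.getConfiguration().getPolicies().getLoadBalancingPolicy();
        final PoolingOptions poolingOptions =
                cluster.getConfiguration().getPoolingOptions();

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                Session.State state = session.getState();
                for (Host host : state.getConnectedHosts()) {
                    HostDistance distance = loadBalancingPolicy.distance(host);
                    int connections = state.getOpenConnections(host);
                    int inFlight = state.getInFlightQueries(host);
                    // Capacity = open connections x allowed requests per connection.
                    int capacity = connections
                            * poolingOptions.getMaxRequestsPerConnection(distance);
                    System.out.printf("%s connections=%d inFlight=%d capacity=%d%n",
                            host, connections, inFlight, capacity);
                }
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

If the in-flight number regularly gets close to the capacity, that is a hint that the requests-per-connection limit is the bottleneck; if it stays far below, raising the limit is unlikely to help.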
There is one more interesting setting when configuring connection pools: poolTimeoutMillis. It tells the driver how long to wait for a connection to a host to become available before sending the request. The default is 5 seconds, but in some use cases waiting and blocking is not acceptable and it is preferable to fail fast and raise NoHostAvailableException. If you have a really latency-sensitive application, set this to 0 and handle NoHostAvailableException in the way most appropriate for your needs.
Our recommended settings for the V3 protocol can be like this:
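A minimal sketch of such a configuration with the 3.x Java driver follows. The concrete numbers are illustrative assumptions rather than measured recommendations: 32768 is simply the upper bound the V3 protocol allows per connection, the value for remote hosts is arbitrary, and the class name PoolingConfig is hypothetical. Pick the limits based on what your connection load monitoring shows.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;

// Hypothetical helper (not part of the driver): builds a Cluster with tuned pooling.
public class PoolingConfig {

    public static Cluster buildCluster(String contactPoint) {
        PoolingOptions poolingOptions = new PoolingOptions()
                // One connection per host (core = max = 1), so the pool never resizes.
                .setConnectionsPerHost(HostDistance.LOCAL, 1, 1)
                .setConnectionsPerHost(HostDistance.REMOTE, 1, 1)
                // Assumed values: raise the 1024 default towards the V3 limit.
                .setMaxRequestsPerConnection(HostDistance.LOCAL, 32768)
                .setMaxRequestsPerConnection(HostDistance.REMOTE, 2000)
                // Fail fast instead of waiting up to 5 seconds for a free connection.
                .setPoolTimeoutMillis(0);

        return Cluster.builder()
                .addContactPoint(contactPoint)
                .withPoolingOptions(poolingOptions)
                .build();
    }
}
```

With the pool timeout at 0, callers of session.execute should be prepared to catch NoHostAvailableException and decide whether to retry, fall back or report an error.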
Often you have a really latency-sensitive use case where being stuck waiting for a response is not a viable option. The DataStax driver has SocketOptions, which allows you to set your own read timeout based on the needs of your use case. The SocketOptions defaults are way too high for a low-latency use case; they are slightly higher than the Cassandra timeouts, which makes sense as a general default. The default read timeout is 12 seconds, and when you have a millisecond SLA this is not an option: it is far better to set it much lower and handle OperationTimedOutException in the application (or in a retry policy) than to wait for 12 seconds. We usually set SocketOptions.setReadTimeoutMillis slightly higher than the highest timeout in cassandra.yaml and add a retry policy which retries after a timeout happens.
Note that this is a cluster-wide setting. If you want to override the default for a single request, you can do so (from driver version 3.1.x onwards) directly on a SimpleStatement via setReadTimeoutMillis. It is usually more important for reads to be fast than for writes, so you can, for example, give selected read queries a timeout which is much lower than the one used for writes.
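As a sketch of the cluster-wide setup described above, assuming the 3.x Java driver: the 2500 ms value assumes that the highest timeout in your cassandra.yaml is 2000 ms, which is only an example, and DefaultRetryPolicy stands in for whatever retry policy fits your application. The class name TimeoutConfig is hypothetical.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.SocketOptions;
import com.datastax.driver.core.policies.DefaultRetryPolicy;

// Hypothetical helper (not part of the driver): builds a Cluster with a lower read timeout.
public class TimeoutConfig {

    public static Cluster buildCluster(String contactPoint) {
        SocketOptions socketOptions = new SocketOptions()
                // Assumption: highest server-side timeout in cassandra.yaml is 2000 ms,
                // so the client waits slightly longer than that instead of 12 seconds.
                .setReadTimeoutMillis(2500);

        return Cluster.builder()
                .addContactPoint(contactPoint)
                .withSocketOptions(socketOptions)
                // Placeholder: swap in a custom RetryPolicy if the default does not fit.
                .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
                .build();
    }
}
```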
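A per-query override might look like the sketch below, assuming driver 3.1 or newer. The keyspace, table, column names and the 200 ms timeout are made up for illustration, as is the class name PerQueryTimeout.

```java
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.exceptions.NoHostAvailableException;
import com.datastax.driver.core.exceptions.OperationTimedOutException;

import java.util.UUID;

// Hypothetical helper (not part of the driver): a latency-sensitive read.
public class PerQueryTimeout {

    // Returns the matching row, or null if the read did not complete in time.
    public static Row readUser(Session session, UUID id) {
        // Hypothetical table; only this statement gets the lower timeout.
        Statement select = new SimpleStatement(
                "SELECT name FROM app.users WHERE id = ?", id)
                .setReadTimeoutMillis(200)  // overrides SocketOptions for this query
                .setIdempotent(true);       // a plain read is safe to retry

        try {
            return session.execute(select).one();
        } catch (OperationTimedOutException | NoHostAvailableException e) {
            // Timed out on the client side (or on every host tried); the caller
            // decides whether to fall back, retry later or surface an error.
            return null;
        }
    }
}
```

Writes can keep using the cluster-wide timeout from SocketOptions, so only the queries which really need a fast answer pay the price of failing early.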
Still to come
In this part we covered the first round of settings which can give you quick wins. Stay tuned: next week we will cover speculative executions and the latency-aware load balancing policy, which can help you get even better performance for some use cases.