# Configuration

## Load balancing

Load balancing controls how queries are distributed to nodes in a Cassandra
cluster.

Without additional configuration the C/C++ driver defaults to using datacenter-aware
load balancing with token-aware routing. This means that the driver will only send
queries to nodes in the local datacenter (for local consistency levels) and that
it will use the primary key of queries to route them directly to the nodes where the
corresponding data is located.

### Round-robin Load Balancing

This load balancing policy equally distributes queries across the cluster without
consideration of datacenter locality. It should only be used with
Cassandra clusters where all nodes are located in the same datacenter.

### Datacenter-aware Load Balancing

This load balancing policy equally distributes queries to nodes in the local
datacenter. Nodes in remote datacenters are only used when all local nodes are
unavailable. Additionally, remote nodes are only considered when non-local
consistency levels are used or if the driver is configured to use remote nodes
with the [`allow_remote_dcs_for_local_cl`] setting.

```c
CassCluster* cluster = cass_cluster_new();

const char* local_dc = "dc1"; /* Local datacenter name */

/*
 * Use up to 2 remote datacenter nodes for remote consistency levels
 * or when `allow_remote_dcs_for_local_cl` is enabled.
 */
unsigned used_hosts_per_remote_dc = 2;

/* Don't use remote datacenter nodes for local consistency levels */
cass_bool_t allow_remote_dcs_for_local_cl = cass_false;

cass_cluster_set_load_balance_dc_aware(cluster,
                                       local_dc,
                                       used_hosts_per_remote_dc,
                                       allow_remote_dcs_for_local_cl);

/* ... */

cass_cluster_free(cluster);
```

### Token-aware Routing

Token-aware routing uses the primary key of queries to route requests directly to
the Cassandra nodes where the data is located.
Using this policy avoids having
to route requests through an extra coordinator node in the Cassandra cluster. This
can improve query latency and reduce load on the Cassandra nodes. It can be used
in conjunction with other load balancing and routing policies.

```c
CassCluster* cluster = cass_cluster_new();

/* Enable token-aware routing (this is the default setting) */
cass_cluster_set_token_aware_routing(cluster, cass_true);

/* Disable token-aware routing */
cass_cluster_set_token_aware_routing(cluster, cass_false);

/* ... */

cass_cluster_free(cluster);
```

### Latency-aware Routing

Latency-aware routing tracks the latency of queries to avoid sending new queries
to poorly performing Cassandra nodes. It can be used in conjunction with other
load balancing and routing policies.

```c
CassCluster* cluster = cass_cluster_new();

/* Disable latency-aware routing (this is the default setting) */
cass_cluster_set_latency_aware_routing(cluster, cass_false);

/* Enable latency-aware routing */
cass_cluster_set_latency_aware_routing(cluster, cass_true);

/*
 * Configure latency-aware routing settings
 */

/* Up to 2 times the best performing latency is okay */
cass_double_t exclusion_threshold = 2.0;

/* Use the default scale */
cass_uint64_t scale_ms = 100;

/* Retry a node after 10 seconds even if it was performing poorly before */
cass_uint64_t retry_period_ms = 10000;

/* Find the best performing latency every 100 milliseconds */
cass_uint64_t update_rate_ms = 100;

/* Only consider the average latency of a node after it's been queried 50 times */
cass_uint64_t min_measured = 50;

cass_cluster_set_latency_aware_routing_settings(cluster,
                                                exclusion_threshold,
                                                scale_ms,
                                                retry_period_ms,
                                                update_rate_ms,
                                                min_measured);

/* ...
*/

cass_cluster_free(cluster);
```

### Filtering policies

#### Whitelist

This policy ensures that only hosts from the provided whitelist filter will
ever be used. Any host that is not contained within the whitelist will be
ignored and a connection will not be established. It can be used in
conjunction with other load balancing and routing policies.

NOTE: Using this policy to limit the connections of the driver to a predefined
      set of hosts will defeat the auto-detection features of the driver. If
      the goal is to limit connections to hosts in a local datacenter, use the
      datacenter-aware policy in conjunction with the round-robin load
      balancing policy.

```c
CassCluster* cluster = cass_cluster_new();

/* Set the list of predefined hosts the driver is allowed to connect to */
cass_cluster_set_whitelist_filtering(cluster,
                                     "127.0.0.1, 127.0.0.3, 127.0.0.5");

/* The whitelist can be cleared (and disabled) by using an empty string */
cass_cluster_set_whitelist_filtering(cluster, "");

/* ... */

cass_cluster_free(cluster);
```

#### Blacklist

This policy is the inverse of the whitelist policy: hosts provided in the
blacklist filter will be ignored and a connection will not be established.

```c
CassCluster* cluster = cass_cluster_new();

/* Set the list of predefined hosts the driver is NOT allowed to connect to */
cass_cluster_set_blacklist_filtering(cluster,
                                     "127.0.0.1, 127.0.0.3, 127.0.0.5");

/* The blacklist can be cleared (and disabled) by using an empty string */
cass_cluster_set_blacklist_filtering(cluster, "");

/* ... */

cass_cluster_free(cluster);
```

#### Datacenter

Filtering can also be performed on all hosts in a datacenter, or in multiple
datacenters, when using the whitelist/blacklist datacenter filtering policies.
```c
CassCluster* cluster = cass_cluster_new();

/* Set the list of predefined datacenters the driver is allowed to connect to */
cass_cluster_set_whitelist_dc_filtering(cluster, "dc2, dc4");

/* The datacenter whitelist can be cleared/disabled by using an empty string */
cass_cluster_set_whitelist_dc_filtering(cluster, "");

/* ... */

cass_cluster_free(cluster);
```

```c
CassCluster* cluster = cass_cluster_new();

/* Set the list of predefined datacenters the driver is NOT allowed to connect to */
cass_cluster_set_blacklist_dc_filtering(cluster, "dc2, dc4");

/* The datacenter blacklist can be cleared/disabled by using an empty string */
cass_cluster_set_blacklist_dc_filtering(cluster, "");

/* ... */

cass_cluster_free(cluster);
```

## Speculative Execution

For certain applications it is of the utmost importance to minimize latency.
Speculative execution is a way to minimize latency by preemptively executing
several instances of the same query against different nodes. The fastest
response is then returned to the client application and the other requests are
cancelled. Speculative execution is <b>disabled</b> by default.

### Query Idempotence

Speculative execution will result in executing the same query several times.
Therefore, it is important that queries are idempotent, i.e. a query can be
applied multiple times without changing the result beyond the initial
application. <b>Queries that are not explicitly marked as idempotent will not be
scheduled for speculative executions.</b>

The following types of queries are <b>not</b> idempotent:

* Mutation of `counter` columns
* Prepending or appending to a `list` column
* Use of non-idempotent CQL functions, e.g.
`now()` or `uuid()`

The driver is unable to determine if a query is idempotent; therefore, it is up to
the application to explicitly mark a statement as being idempotent.

```c
CassStatement* statement = cass_statement_new("SELECT * FROM table1", 0);

/* Make the statement idempotent */
cass_statement_set_is_idempotent(statement, cass_true);

cass_statement_free(statement);
```

### Enabling speculative execution

Speculative execution is enabled by connecting a `CassSession` with a
`CassCluster` that has a speculative execution policy enabled. The driver
currently only supports a constant policy, but may support more in the future.

#### Constant speculative execution policy

The following will start up to 2 more executions after the initial execution,
with the subsequent executions being created 500 milliseconds apart.

```c
CassCluster* cluster = cass_cluster_new();

cass_int64_t constant_delay_ms = 500; /* Delay before a new execution is created */
int max_speculative_executions = 2;   /* Number of executions */

cass_cluster_set_constant_speculative_execution_policy(cluster,
                                                       constant_delay_ms,
                                                       max_speculative_executions);

/* ... */

cass_cluster_free(cluster);
```

### Connection Heartbeats

To prevent intermediate network devices (routers, switches, etc.) from
disconnecting pooled connections, the driver periodically sends a lightweight
heartbeat request (using an [`OPTIONS`] protocol request). By default the
driver sends a heartbeat every 30 seconds. This can be changed or disabled (0
second interval) using the following:

```c
CassCluster* cluster = cass_cluster_new();

/* Change the heartbeat interval to 1 minute */
cass_cluster_set_connection_heartbeat_interval(cluster, 60);

/* Disable heartbeat requests */
cass_cluster_set_connection_heartbeat_interval(cluster, 0);

/* ...
*/

cass_cluster_free(cluster);
```

Heartbeats are also used to detect unresponsive connections. An idle timeout
setting controls the amount of time a connection is allowed to go without a
successful heartbeat before being terminated and scheduled for reconnection. This
interval can be changed from the default of 60 seconds:

```c
CassCluster* cluster = cass_cluster_new();

/* Change the idle timeout to 2 minutes */
cass_cluster_set_connection_idle_timeout(cluster, 120);

/* ... */

cass_cluster_free(cluster);
```

It can be disabled by setting the value to a very long timeout or by disabling
heartbeats.

### Host State Changes

The status and membership of a node can change within the life-cycle of the
cluster. A host listener callback can be used to detect these changes.

**Important**: The driver runs the host listener callback on a thread that is
different from the application. Any data accessed in the callback must be
immutable or synchronized with a mutex, semaphore, etc.
```c
void on_host_listener(CassHostListenerEvent event, CassInet inet, void* data) {
  /* Get the string representation of the inet address */
  char address[CASS_INET_STRING_LENGTH];
  cass_inet_string(inet, address);

  /* Perform application logic for the host listener event */
  if (event == CASS_HOST_LISTENER_EVENT_ADD) {
    printf("Host %s has been ADDED\n", address);
  } else if (event == CASS_HOST_LISTENER_EVENT_REMOVE) {
    printf("Host %s has been REMOVED\n", address);
  } else if (event == CASS_HOST_LISTENER_EVENT_UP) {
    printf("Host %s is UP\n", address);
  } else if (event == CASS_HOST_LISTENER_EVENT_DOWN) {
    printf("Host %s is DOWN\n", address);
  }
}

int main() {
  CassCluster* cluster = cass_cluster_new();

  /* Register the host listener callback */
  cass_cluster_set_host_listener_callback(cluster, on_host_listener, NULL);

  /* ... */

  cass_cluster_free(cluster);
}
```

**Note**: Expensive (e.g. slow) operations should not be performed in host
listener callbacks. Performing expensive operations in a callback will block
or slow the driver's normal operation.

### Reconnection Policy

The reconnection policy controls the interval between reconnection attempts for
a given connection.

#### Exponential Reconnection Policy

The exponential reconnection policy is the default reconnection policy. It
starts with a base delay in milliseconds which is then exponentially increased
(doubled) on each reconnection attempt, up to the defined maximum delay.

**Note**: Once the connection is re-established, this policy restarts from the
base delay if another reconnection occurs.

#### Constant Reconnection Policy

The constant reconnection policy uses a fixed delay for each reconnection
attempt.
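As a minimal sketch, both policies can be configured on the cluster before
connecting. This assumes the `cass_cluster_set_exponential_reconnect()` and
`cass_cluster_set_constant_reconnect()` setters available in recent driver
versions (older versions may only expose `cass_cluster_set_reconnect_wait_time()`);
check your driver's headers for the exact API:

```c
CassCluster* cluster = cass_cluster_new();

/*
 * Exponential policy: start with a 2 second base delay that doubles on
 * each attempt, up to a 1 minute maximum delay. (Setter names assume a
 * recent driver version.)
 */
cass_cluster_set_exponential_reconnect(cluster, 2000, 60000);

/* Or: use a constant 5 second delay between reconnection attempts */
cass_cluster_set_constant_reconnect(cluster, 5000);

/* ... */

cass_cluster_free(cluster);
```

Only one reconnection policy is active at a time; the last setter called before
connecting takes effect.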
### Performance Tips

#### Use a single persistent session

Sessions are expensive objects to create in both time and resources because they
maintain a pool of connections to your Cassandra cluster. An application should
create a minimal number of sessions and maintain them for the lifetime of the
application.

#### Use token-aware and latency-aware policies

Token-aware load balancing can reduce the latency of requests by avoiding an
extra network hop through a coordinator node. When using the token-aware policy,
requests are sent directly to one of the nodes where the data will be retrieved
or stored instead of being routed through a proxy node (coordinator node).

The latency-aware load balancing policy can also reduce the latency of requests
by routing them to the nodes that have historically performed with the lowest
latency. This can prevent requests from being sent to nodes that are
underperforming.

Both [latency-aware] and [token-aware] policies can be used together to obtain
the benefits of both.

#### Use [paging] when retrieving large result sets

Using a large page size or a very high `LIMIT` clause can cause your application
to delay for each individual request. The driver's paging mechanism can be used
to decrease the latency of individual requests.

#### Choose a lower consistency level

Ultimately, choosing a consistency level is a trade-off between consistency and
availability. Performance should not be a large deciding factor when choosing a
consistency level. However, it can affect high-percentile latency numbers,
because requests with consistency levels greater than `ONE` require the
coordinator node to wait for one or more replicas to respond before the request
can complete. In multi-datacenter configurations, consistency levels such as
`EACH_QUORUM` can cause a request to wait for replication across a slower
cross-datacenter network link.
More information about setting the consistency level
can be found [here](http://datastax.github.io/cpp-driver/topics/basics/consistency/).

### Driver Tuning

Beyond the performance tips and best practices covered in the previous section,
your application might benefit from tuning the more fine-grained driver settings
in this section to achieve optimal performance for its specific workload.

#### Increasing core connections

In some workloads, throughput can be increased by increasing the number of core
connections. By default, the driver uses a single core connection per host. It's
recommended that you try increasing the core connections to two and slowly
increase this number while doing performance testing. Two core connections is
often a good setting, and increasing the core connections too high will decrease
performance, because having multiple connections to a single host inhibits the
driver's ability to coalesce multiple requests into a smaller number of system
calls.

#### Coalesce delay

The coalesce delay is an optimization to reduce the number of system calls
required to process requests. This setting controls how long the driver's I/O
threads wait for requests to accumulate before flushing them onto the wire.
Larger values for the coalesce delay are preferred for throughput-based
workloads, as they can significantly reduce the number of system calls required
to process requests.

In general, the coalesce delay should be increased for throughput-based
workloads and can be decreased for latency-based workloads. Most importantly,
the delay should consider the responsiveness guarantees of your application.

Note: Single, sporadic requests are generally not affected by this delay and
are processed immediately.
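Both tuning knobs above are set on the cluster before connecting. This sketch
assumes the `cass_cluster_set_core_connections_per_host()` and
`cass_cluster_set_coalesce_delay()` setters from recent driver versions, and
the chosen values are illustrative starting points, not recommendations:

```c
CassCluster* cluster = cass_cluster_new();

/* Try two core connections per host as a starting point for tuning */
cass_cluster_set_core_connections_per_host(cluster, 2);

/*
 * Increase the coalesce delay (in microseconds) for a throughput-based
 * workload; 400 here is an example value to benchmark against your own
 * workload, not a recommended default.
 */
cass_cluster_set_coalesce_delay(cluster, 400);

/* ... */

cass_cluster_free(cluster);
```

As with any tuning setting, change one value at a time and measure against a
representative workload before committing to it.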
#### New request ratio

The new request ratio controls how much time an I/O thread spends processing new
requests versus handling outstanding requests. This value is a percentage (with
a value from 1 to 100), where larger values dedicate more time to processing new
requests and less time to outstanding requests. The goal of this setting is to
balance the time spent processing new and outstanding requests and prevent
either from fully monopolizing the I/O thread's processing time. It's
recommended that your application decrease this value if computationally
expensive or long-running future callbacks are used (via
`cass_future_set_callback()`); otherwise it can be left unchanged.

[`allow_remote_dcs_for_local_cl`]: http://datastax.github.io/cpp-driver/api/struct.CassCluster/#1a46b9816129aaa5ab61a1363489dccfd0
[`OPTIONS`]: https://github.com/apache/cassandra/blob/cassandra-3.0/doc/native_protocol_v3.spec
[token-aware]: http://datastax.github.io/cpp-driver/topics/configuration/#token-aware-routing
[latency-aware]: http://datastax.github.io/cpp-driver/topics/configuration/#latency-aware-routing
[paging]: http://datastax.github.io/cpp-driver/topics/basics/handling_results/#paging
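As a sketch, the ratio can be lowered when callbacks are expensive. This assumes
the `cass_cluster_set_new_request_ratio()` setter available in recent driver
versions; the value 25 is an illustrative example, not a recommended setting:

```c
CassCluster* cluster = cass_cluster_new();

/*
 * Spend less I/O thread time on new requests (25%) when computationally
 * expensive future callbacks are registered. (Setter name assumes a recent
 * driver version.)
 */
cass_cluster_set_new_request_ratio(cluster, 25);

/* ... */

cass_cluster_free(cluster);
```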