Problem:

The client operates a six-node, multi-datacenter Cassandra cluster with 3 nodes in the PROD datacenter (East US2) and 3 nodes in the DR datacenter (West US2). They plan to make DDL changes: altering tables to adjust the default_time_to_live table option, and dropping and recreating a table with a new definition. They are unsure how these changes will be replicated across datacenters and whether any manual intervention will be required.

Process:

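  1. Step 1: Understand the limitations of TTL changes: In Cassandra, altering the default_time_to_live table option affects only data written after the change. TTL is stored with each written cell, so existing data keeps the TTL it was written with; applying the new TTL to existing data requires rewriting (reinserting) it.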
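  2. Step 2: Recognize the replication behavior: In Cassandra, the schema is cluster-wide. ALTER or DROP TABLE statements issued on a node in one datacenter (PROD) are propagated to all nodes, including those in other datacenters (DR). Manual intervention is needed only if nodes fail to reach schema agreement, for example because a node was down or unreachable when the change was made.
  3. Step 3: Execute DDL changes: Run each ALTER, DROP, or CREATE TABLE statement once, from a single node, and wait for schema agreement before issuing the next one. Avoid running the same DDL statement concurrently from several nodes or in each datacenter, as this can cause schema disagreement.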
  4. Step 4: Perform maintenance tasks: After schema changes or bulk UPDATE/DELETE operations, run nodetool compact, nodetool repair, and nodetool cleanup on all nodes in both datacenters, one node at a time.
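The TTL change in Step 1 and the rolling maintenance in Step 4 can be sketched as a small runbook generator. This is a minimal sketch, not the client's actual procedure: the node names, the keyspace app_ks, the table events, and both helper functions are hypothetical and only illustrate the ordering (one node at a time, across both datacenters).

```python
from typing import Dict, List

# Hypothetical node names; replace with the real hosts in each datacenter.
NODES_BY_DC: Dict[str, List[str]] = {
    "PROD": ["prod-node1", "prod-node2", "prod-node3"],
    "DR": ["dr-node1", "dr-node2", "dr-node3"],
}


def alter_default_ttl_cql(keyspace: str, table: str, ttl_seconds: int) -> str:
    """Build the CQL statement that sets a new default TTL on a table.

    The new default applies only to data written after the change.
    """
    if ttl_seconds < 0:
        raise ValueError("TTL must be 0 (disabled) or a positive number of seconds")
    return f"ALTER TABLE {keyspace}.{table} WITH default_time_to_live = {ttl_seconds};"


def rolling_maintenance_plan(nodes_by_dc: Dict[str, List[str]], keyspace: str) -> List[str]:
    """List the nodetool commands in execution order, one node at a time."""
    plan: List[str] = []
    for dc, nodes in nodes_by_dc.items():
        for node in nodes:
            # Finish all three tasks on a node before moving to the next one.
            for task in ("compact", "repair", "cleanup"):
                plan.append(f"[{dc}] {node}: nodetool {task} {keyspace}")
    return plan


if __name__ == "__main__":
    # Hypothetical keyspace/table names, used only for illustration.
    print(alter_default_ttl_cql("app_ks", "events", 86400))
    for step in rolling_maintenance_plan(NODES_BY_DC, "app_ks"):
        print(step)
```

The plan is only printed, not executed, so it can be reviewed before anyone runs the commands on a node.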

Solution:

The client observed that dropping a table on the PROD side automatically propagated to the DR side without manual intervention. Similarly, altering the default_time_to_live option on the PROD side replicated to the DR side without separate execution. Even so, it is recommended to confirm that all nodes report the same schema version (for example, with nodetool describecluster) rather than rely solely on automatic replication. No schema agreement propagation (SAP) configuration was found in the client's environment; the recommendation still stands to apply schema changes manually if nodes disagree and to monitor replication behavior closely.
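One way to check schema agreement after a DDL change is to inspect the "Schema versions" section printed by nodetool describecluster: when every node is listed under a single version and none is unreachable, the schema has converged. The sketch below parses that section from captured text. The sample output, the IP addresses, and the helper names are illustrative assumptions, and the exact output layout may differ between Cassandra versions.

```python
import re
from typing import Dict, List

# Matches lines such as "<uuid>: [10.0.0.1, 10.0.0.2]" or "UNREACHABLE: [10.0.0.4]".
_VERSION_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*)\]\s*$")

# Illustrative sample only; not captured from the client's cluster.
SAMPLE_OUTPUT = """Cluster Information:
	Name: Example Cluster
	Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
	Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
	Schema versions:
		86afa796-d883-3932-aa73-6b017cef0d19: [10.0.0.1, 10.0.0.2, 10.0.0.3, 10.1.0.1, 10.1.0.2, 10.1.0.3]
"""


def parse_schema_versions(describecluster_output: str) -> Dict[str, List[str]]:
    """Map each schema version (or 'UNREACHABLE') to the hosts listed under it."""
    versions: Dict[str, List[str]] = {}
    for line in describecluster_output.splitlines():
        match = _VERSION_LINE.match(line)
        if match:
            key, hosts = match.groups()
            versions[key] = [h.strip() for h in hosts.split(",") if h.strip()]
    return versions


def schema_in_agreement(versions: Dict[str, List[str]]) -> bool:
    """True when exactly one schema version is reported and no node is unreachable."""
    reachable_versions = [v for v in versions if v != "UNREACHABLE"]
    return len(reachable_versions) == 1 and not versions.get("UNREACHABLE")


if __name__ == "__main__":
    parsed = parse_schema_versions(SAMPLE_OUTPUT)
    print("Schema in agreement:", schema_in_agreement(parsed))
```

If the check fails, the listing shows which nodes hold a different version or are unreachable, which is where manual follow-up would start.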

Conclusion:

By understanding the replication behavior of Cassandra in a multi-DC setup and executing DDL changes with caution, the client successfully managed their schema changes. Although the schema changes propagated automatically in this case, it remains essential to verify schema agreement after each change and to intervene manually if nodes disagree, so that the schema stays consistent across datacenters.