Intermittent Table Update Issue in Cassandra DB

Problem:

Updates to a Cassandra table are not always visible immediately. Intermittently, the reference ID table is not updated and the application reads stale values.

Process:

Step 1: During the initial investigation and troubleshooting of the database, our experts asked the client the following questions:

Cluster Configuration
- How many nodes are there in the cluster?
- What are the READ and WRITE consistency levels configured?
Time Synchronization
- Are there any clock differences between the client-side servers and the Cassandra nodes?
- If so, have they been corrected so that the clients and the Cassandra nodes are synchronized?
Table Update Methodology
- How are you updating the table? Is it through a backend service, directly from the cqlsh editor, or another method?
- Are you specifying the consistency level in your queries when updating the table?
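
The questions about node count and READ/WRITE consistency levels matter because a read is only guaranteed to see the latest acknowledged write when the replicas contacted for the read and for the write overlap. The sketch below works through that arithmetic; the replication factor of 3 and the class and method names are assumptions for illustration, not values taken from the client's cluster.

```java
// Sketch: consistency-level arithmetic for a single-datacenter keyspace.
// The class and method names are hypothetical.
final class QuorumCheck {
    // QUORUM requires a majority of replicas: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Read and write replica sets are forced to overlap when R + W > RF,
    // so a read at that level sees the latest acknowledged write.
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor
        int q = quorum(rf);
        System.out.println("QUORUM for RF=3: " + q);
        System.out.println("ONE read + ONE write overlap: " + readsSeeLatestWrite(1, 1, rf));
        System.out.println("QUORUM read + QUORUM write overlap: " + readsSeeLatestWrite(q, q, rf));
    }
}
```

With these assumed values, reading and writing at ONE leaves room for stale reads, while QUORUM on both sides does not.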

Our expert provided example code for setting the consistency level with the Java driver:

a. Query level

PreparedStatement pstmt = session.prepare("INSERT INTO product (SKU, description) VALUES (?, ?)");

pstmt.setConsistencyLevel(ConsistencyLevel.QUORUM);

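b. To set a global (default) consistency level for both reads and writes using the Java driver, do something like:

QueryOptions qo = new QueryOptions().setConsistencyLevel(ConsistencyLevel.ALL);

// The options take effect once passed to the cluster builder (driver 3.x; the contact point is a placeholder).
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").withQueryOptions(qo).build();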

Step 2: During a live collaboration session between the experts and the client, the following topics were reviewed:

Inconsistent updates seen in the logs: our experts suggested performing updates at the QUORUM (consensus) or ALL consistency level.
Update queries failing: the experts recommended removing the IF clause from the UPDATE statement to avoid repeated transactions.
Timeouts during updates caused by simultaneous transactions: the experts advised checking the system time and providing comprehensive error logs.
Difficulty deleting records, possibly linked to incomplete replication across replicas.
Full logs were requested to facilitate issue tracing and query debugging.
The infrastructure team synchronized system time across nodes.
A rollback was requested for a parameter change that may have caused problems.
Frequent calls made it difficult to trace issues and pinpoint failures.
Node synchronization issues were causing consistency problems, potentially exacerbated by network faults.
Tombstones from deletions were causing timeouts; the experts suggested compaction to improve performance.
System logs were to be collected to investigate the issues and assess whether compaction (defragmentation) is needed.
Synchronous update queries were causing issues when multiple transactions ran simultaneously.
Reference ID table inconsistencies were possibly due to node synchronization or misconfiguration.
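
To illustrate why tombstones from deletions can lead to read timeouts, here is a minimal sketch of a read that must scan past deletion markers before it reaches live rows. It is a simplified model, not Cassandra's storage engine: the class, the row layout, and the threshold value are hypothetical, loosely modeled on the tombstone failure threshold that Cassandra applies to reads.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a read pays for every tombstone it scans, even though none are returned.
// All names and thresholds here are hypothetical.
final class TombstoneScanDemo {
    // A row is either live data or a deletion marker (tombstone).
    static final class Row {
        final String key;
        final boolean tombstone;

        Row(String key, boolean tombstone) {
            this.key = key;
            this.tombstone = tombstone;
        }
    }

    // Collect up to `limit` live rows; abort if more than `failureThreshold`
    // tombstones are scanned along the way.
    static List<String> readLive(List<Row> partition, int limit, int failureThreshold) {
        List<String> live = new ArrayList<>();
        int tombstones = 0;
        for (Row row : partition) {
            if (live.size() >= limit) {
                break;
            }
            if (row.tombstone) {
                tombstones++;
                if (tombstones > failureThreshold) {
                    throw new IllegalStateException("scanned " + tombstones + " tombstones; aborting read");
                }
            } else {
                live.add(row.key);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Row> partition = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            partition.add(new Row("deleted-" + i, true));
        }
        partition.add(new Row("live-0", false));

        System.out.println(readLive(partition, 1, 10)); // prints [live-0]
        try {
            readLive(partition, 1, 3);
        } catch (IllegalStateException e) {
            System.out.println("read aborted: " + e.getMessage());
        }
    }
}
```

Compaction helps in this model because it eventually removes expired tombstones, so fewer markers have to be scanned per read.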

Solution:

While working on the client's issue, our expert recommended the following:

Fix the clock differences between the servers to resolve the DB time issue;
Use QUORUM as the consistency level for both updates and reads, as it offers a good balance between consistency and performance;
Because conditional updates run as compare-and-set (CAS) operations, try the query without the (IF …) clause to minimize the impact on performance;
Increase the file_cache_size_in_mb configuration in cassandra.yaml, followed by a node restart, to improve read latency and reduce abnormal CPU usage.
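
The clock recommendation follows from how Cassandra resolves conflicting writes: the write with the highest timestamp wins, and that timestamp is generated by the client driver or by the coordinator node. The sketch below is a simplified, in-memory model of that rule with hypothetical class names and timestamps, showing how a clock that runs ahead can make an older value survive a newer update.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of timestamp-based conflict resolution (last write wins).
// Class and method names are hypothetical; this is not driver or server code.
final class LastWriteWinsDemo {
    // A stored value together with the timestamp attached to the write.
    static final class Cell {
        final String value;
        final long timestampMicros;

        Cell(String value, long timestampMicros) {
            this.value = value;
            this.timestampMicros = timestampMicros;
        }
    }

    private final Map<String, Cell> cells = new HashMap<>();

    // Apply a write only if its timestamp is newer than the stored one.
    // Tie-breaking on equal timestamps is left out of this sketch.
    void write(String key, String value, long timestampMicros) {
        Cell current = cells.get(key);
        if (current == null || timestampMicros > current.timestampMicros) {
            cells.put(key, new Cell(value, timestampMicros));
        }
    }

    String read(String key) {
        Cell cell = cells.get(key);
        return cell == null ? null : cell.value;
    }

    public static void main(String[] args) {
        LastWriteWinsDemo table = new LastWriteWinsDemo();
        // Client A's clock runs 300 microseconds ahead of client B's.
        table.write("ref-1", "old", 1_000_500L);
        // Client B writes afterwards in real time, but with an earlier timestamp.
        table.write("ref-1", "new", 1_000_200L);
        System.out.println(table.read("ref-1")); // prints "old"
    }
}
```

In this model the second update is silently discarded, which matches the reported symptom of the application reading old values after an update.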

Conclusion:

The Cassandra cluster intermittently delayed table updates, so the reference ID table returned stale values to the application.

The initial investigation involved testing with ConsistencyLevel.ALL during update queries. A subsequent meeting covered various topics including inconsistent updates, query failures, timeouts, record deletion issues, and system time synchronization.

Recommendations include fixing the clock differences, using the QUORUM consistency level for updates and reads, removing IF clauses from update queries, and increasing file_cache_size_in_mb to improve performance.