Problem: The client has a two-datacenter (DC1 and DR1) Cassandra cluster. They encountered a failure while running nodetool repair on a node in DC1, which was traced to data corruption on a node in DR1. The logs indicated a corruption error in a specific SSTable file. Solution: Step 1. Initial Diagnosis: Ran nodetool repair in […]
Database 30 Jun 2024 Native Transport Failure in Apache Cassandra Cluster
Problem: The client’s production Apache Cassandra cluster experienced a sudden native transport failure, leading to significant operational impact. Despite efforts to diagnose the problem using system logs and debug logs, the root cause remained unidentified. Native transport errors, particularly SSLPeerUnverifiedException, were prevalent in the debug logs, indicating authentication failures for multiple nodes in the cluster. Process: […]
Database 27 Jun 2024 Analyzing Automatic Restarts and IO Errors in Cassandra Database: Expert Insights and Recommendations
Problem: The client is experiencing daily automatic restarts of their Cassandra database, potentially causing issues with application connectivity. Additionally, they’re encountering errors related to JVM memory, degraded mode, connection resets by peer, and null pointer exceptions on indexes. Solution: After a thorough analysis of the provided logs, the following findings and recommendations were made: Identify […]
Database 20 Jun 2024 Cassandra Database Unavailability and Performance Issues
Problem: The client reported instances of Cassandra databases experiencing unavailability and performance degradation, leading to application connection errors. Specifically, nodes were found to be unresponsive, and memory and CPU utilization spiked unexpectedly. Additionally, there were concerns about data inconsistencies between data centers and excessive tombstones affecting performance. Process: Initial Investigation: Upon initial investigation, it was […]
Database 17 Jun 2024 Postgres – query performance degraded after bulk delete
Problem: After deleting 52 million rows from a table, the client performed a VACUUM FULL and REINDEX on the table. Following these maintenance operations, the performance of large joins involving this table has significantly deteriorated, with query execution times increasing from minutes to hours. The task is to identify the cause of this performance regression […]
Database 16 Jun 2024 Application crashes with “could not send data to client: Connection reset by peer”
Problem: The client ran 10 bill cycles per month, and on bill cycle days the billing process connected to the DB cluster in 20 streams. After the process had started the replication the above-mentioned billing process failed with the “could not send data to the client: Connection reset by peer” error. The client reduced […]
Database 14 Jun 2024 Troubleshooting Apache Cassandra Query Timeout Issue
Problem: The client experienced timeout errors when running ad hoc queries in an Apache Cassandra cluster. Specifically, a query with multiple conditions was timing out because the coordinator node did not receive responses from replica nodes. Process: Here are the steps taken by the expert in the process of resolving the problem: Assessment of Query Efficiency: […]
Database 13 Jun 2024 Resolving Connection Error in cqlsh with SSL
Problem: A client encountered connection errors while attempting to connect to a Cassandra database cluster using cqlsh. The error messages indicated issues related to protocol version compatibility (“This version of the driver does not support protocol version 21”) and SSL certificate verification failures. Process: Initial Diagnosis: The support team analyzed the error messages provided by […]
Database 7 Jun 2024 Application crashes with ‘could not send data to the client: Connection reset by peer’
Problem: The client’s billing system ran around 10 billing cycles each month. During each billing cycle, the billing process was initiated and connected to the database cluster with 20 concurrent streams. However, upon starting the process, both the replication and the billing process failed, displaying the following error: “Could not send data to the client: […]
Database