Problem: The client was using Apache Cassandra 4.1.5 installed via a tarball extraction on an AWS EC2 machine and wanted to upgrade both their Cassandra version and the operating system. The installation was done manually using the tarball method, and the client needed to understand the feasibility and potential challenges involved in upgrading the OS […]
Database 14 Mar 2025 Resolving Row Count Inconsistencies in Apache Cassandra
Problem: Repairs in the client's Apache Cassandra cluster were failing because of corrupted hint files. Additionally, a node in the cluster went down and could not be brought back up, raising concerns about data consistency and cluster stability.
Process: Step 1: Initial Investigation
The client observed errors related to corrupted hint files, […]
Database 7 Mar 2025 Optimizing PostgreSQL Query Performance and Resolving Locking Issues
Problem: Several queries in the client's PostgreSQL database were running slowly, and the application became unresponsive while the issue lasted. The client required assistance in diagnosing and optimizing the queries contributing to the performance issues.
Process: Step 1 – Initial Investigation
The expert reviewed the PGAWR reports for the […]
Database 28 Feb 2025 Apache Cassandra high availability issue
Problem: The client encountered a high availability issue in their Cassandra cluster, consisting of five nodes deployed on AWS EC2. After shutting down two servers (10.51.44.25 and 10.51.46.144), it became impossible to connect to the database, even though the other nodes remained online. The issue manifested as an authentication error when trying to connect to […]
Database 26 Feb 2025 Resolving PostgreSQL and ETCD failover issues in a Patroni cluster
Problem: The client faced intermittent downtimes in their PostgreSQL cluster, which is managed by Patroni for high availability. These downtimes were particularly prominent during failover events when the system failed to transition smoothly between nodes during leader elections. As a result, PostgreSQL was unable to maintain continuity of service, affecting the application performance. Logs from […]
Database 17 Feb 2025 Inconsistency in Search Results of Elasticsearch with Reserved Characters
Problem: The client observed inconsistent behavior in Elasticsearch search results when searching for strings containing reserved characters, such as colons, slashes, parentheses, and curly braces. These inconsistencies were most notable when the query string included special characters without proper escaping or when using quotes around the search values. This caused mismatches in expected results, with […]
Database 14 Feb 2025 Improving Cassandra Performance by Adjusting Consistency Levels and Resource Configuration
Problem: The customer experienced write failures and slow performance in their Cassandra database during nodetool repair operations. These issues affected the application’s ability to interact with the database, resulting in delays and failed writes. The Cassandra cluster, consisting of 3 nodes in each of two data centers (US East […]
Database 7 Feb 2025 Diagnosing and Resolving SSLPeerUnverifiedException in Apache Cassandra
Problem: The client reported an issue where Apache Cassandra nodes in their multi-datacenter cluster were logging frequent errors related to SSL certificate validation, with entries like:
DEBUG [Native-Transport-Requests-1] 2025-01-23 08:54:27,794 ServerConnection.java:140 - Failed to get peer certificates for peer /10.110.151.78:36376
javax.net.ssl.SSLPeerUnverifiedException: peer not verified
Despite these log entries, the cluster continued to function normally, but […]
Database 24 Jan 2025 Apache Cassandra: Resolving High Memory Usage issue
Problem: The client reported high memory usage on a production Apache Cassandra node, accompanied by frequent errors related to the ThreadPoolExecutor shutting down. This led to instability in the Cassandra service, including errors like java.util.concurrent.RejectedExecutionException, and resulted in a failure to execute repairs.
Process: Step 1: Initial Identification
The error logs provided by the client […]
Database