Problem: The client reported that their data storage space was being rapidly exhausted because incremental backups were stored in the same location as the primary data storage. This resulted in the storage becoming full, forcing them to delete incremental backups to make room for new data. The client requested guidance on configuring incremental backups to […]
Database 16 Aug 2024 Cassandra superuser password is getting reset after server restart
Problem: The client ran Cassandra v4.0.6 in non-production environments and noticed that the “cassandra” superuser’s password, which had been changed two months earlier, had reverted to the default password (“cassandra”). This happened after OS patches were applied and the server rebooted (a monthly activity). The client didn’t see any evidence of someone […]
Database 15 Aug 2024 Postgres Database Cluster Crash Investigation
Problem: Three instances of the production Postgres Database cluster experienced crashes with “segmentation fault” errors within 30 days. Despite no recent changes in the system, the issue persisted, prompting the need for investigation to identify the root cause. Process: Upon receiving the initial report of the issue, the experts engaged with the client to gather […]
Database 12 Aug 2024 Failover Investigation and Resolution for PostgreSQL Cluster
Problem: PG Prod Node 1 failed over to Node 2, accompanied by high WAL (Write-Ahead Logging) generation. The client requested an investigation into the cause of the failover on Node 1. Process: To conduct a thorough investigation, the following data and logs were requested: PostgreSQL logs, cluster logs, database logs, WAL logs, cluster configuration, monitoring […]
Database 12 Aug 2024 Troubleshooting Kubernetes Cluster Instability
Problem: The kube-api pod was frequently restarting with the error message: apiserver received an error that is not a metav1.Status: rpctypes.EtcdError{code:0xe, desc:"etcdserver: request timed out"}. This issue began following an etcd data migration from a single-node Kubernetes cluster to a multi-node cluster using an unconventional method. Despite multiple attempts to resolve the issue, the problem […]
Developer Tools 9 Aug 2024 Cassandra Nodetool Repair Not Completing
Problem: The client had a 10-node cluster across two data centers, with 5 nodes in each. They ran nodetool repair on one keyspace, but after 6 hours, it was stuck at 99% completion. They requested guidance on how to proceed with the issue, noting that their product version was ‘ReleaseVersion: 2.2.5.’ Process: Step 1 – […]
Database 8 Aug 2024 Cassandra Superuser Password Reset Issue
Problem: Cassandra v4.0.6 in non-production environments experienced an issue where the “cassandra” superuser password, which had been changed two months prior, reset to its default password (“cassandra”) after applying OS patches and rebooting the server. No manual password changes were evident in the audit logs. Solution: Initial Steps: The superuser “cassandra” initially had the default […]
Database 2 Aug 2024 Changing Passwords in a Cassandra Cluster
Problem: The client needed to change the passwords of all users in the Cassandra cluster. They specifically asked whether the default password for the “cassandra” superuser needed to be changed and requested a step-by-step guide, along with precautions to prevent any impact on the application. Process: The expert provided a detailed response with the […]
Database 1 Aug 2024 Intermittent Table Update Issue in Cassandra DB
Problem: A table in the Cassandra DB is not always updated immediately. The reference ID table is sometimes not updated, and the application reads old values. Process: Step 1: In the initial investigation and troubleshooting of the DB, our experts asked the client the following questions: Cluster Configuration: How many nodes are there in the cluster? What […]
Database