Problem: The client faced recurring Kafka sink connector failures (e.g., chf-cdr-sftp-sink-connector) in a Kubernetes environment (Kafka 3.2.0 with three brokers and ZooKeeper). The failures were caused by corrupt messages at specific offsets, leading to task crashes. Despite skipping corrupt offsets and restarting connectors, the issue persisted, requiring a more permanent solution. Process: Step 1: Environment […]
Data Analytics 19 Jan 2025 Apache Spark: Resolving Airflow Scheduler Heartbeat Issues in Production Environment
Problem: The client reported continuous heartbeat issues in the Airflow scheduler, which prevented controller DAGs from being generated in a production environment. This critical issue impacted job execution, especially when multiple jobs were triggered simultaneously, leading to timeouts and job failures. Process: Step 1: Initial Identification The error message displayed in the logs indicated that the […]
Data Analytics 17 Jan 2025 Resolving Special Character Search Issues in Elasticsearch
Problem: The client encountered an issue in their Elasticsearch setup where search results did not return exact matches when the search phrase included special characters, such as “:” (colon). This problem persisted despite using a custom indexing configuration with the `index_word_delimiter_graph_filter`. The client needed a solution to preserve special characters for exact matches while maintaining […]
Data Analytics 3 Jan 2025 Resolving Indexing Failures in OpenSearch During High Availability Testing
Problem: The client implemented a 4-node OpenSearch cluster to ensure high availability for their application. When all four nodes were operational, both indexing and searching worked seamlessly. However, during a high availability test where two nodes were intentionally turned off, the indexing process stalled, and no documents were processed. Indexing resumed only after the two […]
Data Analytics 20 Dec 2024 Resolving Airflow DAG Triggering Issues
Problem: The client’s operations team reported issues with triggering jobs via Apache Airflow, specifically through a custom solution, the dag_factory. While jobs triggered outside of the dag_factory worked without problems, those initiated through it were not being processed as expected. Attempts to gather logs in the Airflow UI yielded no entries, as the DAG triggering […]
Data Analytics 13 Dec 2024 Resolving Datastore Configuration Issues in CKAN for PostgreSQL Integration
Problem: The client encountered issues with the data-explorer view functionality in their CKAN environment. While resources could be downloaded manually, the data-explorer view was unable to load. The initial investigation found that although the “datastore” plugin was enabled in the ckan.ini file, the ckan.datastore.write_url and ckan.datastore.read_url settings were not configured. The client was […]
Data Analytics 22 Nov 2024 Managing Out-of-Memory (OOM) Errors and Optimizing Shard Configuration in OpenSearch Production Environment
Problem: In the production environment of a multi-node OpenSearch cluster, the nodes frequently crashed due to Out-of-Memory (OOM) errors. Initially, the heap size was increased from 16 GB to 30 GB based on IBM’s recommendations, but the problem persisted. IBM further suggested increasing the number of shards from 16 to 64 to mitigate memory overload. […]
Data Analytics 15 Nov 2024 Resolving HBase Region Transition and Hadoop File System Permission Issues in a PROD Environment
Problem: The client encountered a critical issue in their production environment involving HBase regions stuck in a transition state. This problem resulted in service disruptions within their Hadoop cluster. The issue was exacerbated by file system permission changes following a cold restart of the cluster, leading to difficulties in accessing data and managing HBase operations. […]
Data Analytics 8 Nov 2024 Upgrade of Elasticsearch from Version 7.15 to 7.17
Problem: The client requested assistance with upgrading their Elasticsearch installation from version 7.15 to 7.17. The client sought a detailed step-by-step guide and asked for a meeting to clarify the upgrade process. Process: Upon receiving the request, the expert requested additional details about the client’s current Elasticsearch setup, including information on the cluster, […]
Data Analytics