Problem:

The client, running a Kafka Streams application (version 3.3.1), had trouble managing its changelog topics. Although the application was configured to clean up records in these topics automatically, the topics kept growing unchecked, posing a risk to system performance and stability.

Process:

Initial Assessment: The client reported the issue, explaining that their Kafka Streams application was designed for aggregation tasks and that records were accumulating unexpectedly in its changelog topics.

Configuration Analysis: The experts reviewed the Kafka Streams configuration, examining the retention settings, the state cleanup delay, and application-level overrides. The discussion also covered the application’s use of window-based aggregations and the changelog cleanup policy, which was set to compact.
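To illustrate why the cleanup policy and the override levels matter in this kind of review, the sketch below models how a time-based retention value is resolved for a topic. It is a simplified illustration, not Kafka's actual code: the function names are hypothetical, and it relies only on documented behaviour (a topic-level retention.ms overrides the broker default, log.retention.ms takes precedence over log.retention.minutes and log.retention.hours, and age-based deletion applies only when the cleanup policy includes delete). Special values such as -1 (unlimited) are not handled.

```python
from typing import Optional

HOUR_MS = 60 * 60 * 1000
MINUTE_MS = 60 * 1000


def broker_retention_ms(broker: dict) -> int:
    """Resolve broker-level retention: ms beats minutes, minutes beat hours."""
    if "log.retention.ms" in broker:
        return int(broker["log.retention.ms"])
    if "log.retention.minutes" in broker:
        return int(broker["log.retention.minutes"]) * MINUTE_MS
    # Kafka's documented default for log.retention.hours is 168 (7 days).
    return int(broker.get("log.retention.hours", 168)) * HOUR_MS


def effective_retention_ms(topic: dict, broker: dict) -> Optional[int]:
    """Return the age-based retention that applies to a topic, or None when
    the topic is compact-only and therefore never deletes records by age."""
    policy = topic.get("cleanup.policy", broker.get("log.cleanup.policy", "delete"))
    policies = {p.strip() for p in policy.split(",")}
    if "delete" not in policies:
        return None
    if "retention.ms" in topic:
        return int(topic["retention.ms"])
    return broker_retention_ms(broker)


# Hypothetical inputs for illustration only.
broker = {"log.retention.hours": "48", "log.retention.ms": "172800000"}
print(effective_retention_ms({"cleanup.policy": "compact"}, broker))         # None
print(effective_retention_ms({"cleanup.policy": "compact,delete"}, broker))  # 172800000
```

Under this model, a compact-only changelog topic is not affected by the broker retention values at all, which is one reason the cleanup policy was part of the discussion.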

Solution:

Uniform Retention Configuration: After the analysis, the experts recommended setting all retention parameters to the same value so that cleanup intervals are consistent across the application. The proposed settings were:

  • log.retention.hours=48
  • log.retention.ms=172800000 (48 hours)
  • log.segment.bytes=1073741824 (1 GiB)
  • log.retention.check.interval.ms=300000 (5 minutes)
  • num.io.threads=48
  • num.network.threads=32
  • num.partitions=5
  • offsets.retention.minutes=2880
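The three retention values in the list are expressed in different units, so it is easy to misread them. The following sketch converts each one to a common duration and checks that they agree. The helper names are hypothetical, and the check covers only the three retention keys shown in the list.

```python
from datetime import timedelta

# Retention-related values from the list above, in their native units.
proposed = {
    "log.retention.hours": 48,
    "log.retention.ms": 172_800_000,
    "offsets.retention.minutes": 2880,
}


def retention_windows(cfg: dict) -> dict:
    """Convert each retention setting to a timedelta so units are comparable."""
    return {
        "log.retention.hours": timedelta(hours=cfg["log.retention.hours"]),
        "log.retention.ms": timedelta(milliseconds=cfg["log.retention.ms"]),
        "offsets.retention.minutes": timedelta(minutes=cfg["offsets.retention.minutes"]),
    }


def is_uniform(cfg: dict) -> bool:
    """True when every retention setting describes the same duration."""
    return len(set(retention_windows(cfg).values())) == 1


for key, window in retention_windows(proposed).items():
    print(f"{key}: {window // timedelta(hours=1)} hours")

print("uniform:", is_uniform(proposed))  # uniform: True
```

All three values work out to 48 hours: 48 × 60 = 2880 minutes and 48 × 3,600,000 = 172,800,000 ms.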

Optimized Cleanup Mechanisms: The experts also recommended fine-tuning the dirty ratio settings and adjusting how often the cleaner threads run, so that cleanup happens sooner and records do not accumulate unchecked in the changelog topics.
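As a rough illustration of what tuning the dirty ratio does, the sketch below models the eligibility check for log compaction: a log is considered for cleaning once the share of not-yet-compacted ("dirty") bytes exceeds a threshold. This is a simplified model with hypothetical function names and byte counts, not Kafka's implementation. It assumes the text refers to the topic-level min.cleanable.dirty.ratio setting (broker default log.cleaner.min.cleanable.ratio, 0.5 by default), and it ignores compaction lag settings and the fact that the active segment is not compacted.

```python
def dirty_ratio(clean_bytes: int, dirty_bytes: int) -> float:
    """Fraction of the log that has not been compacted yet."""
    total = clean_bytes + dirty_bytes
    return dirty_bytes / total if total else 0.0


def eligible_for_compaction(
    clean_bytes: int,
    dirty_bytes: int,
    min_cleanable_dirty_ratio: float = 0.5,  # Kafka's documented default
) -> bool:
    """Simplified check: clean once the dirty share exceeds the threshold."""
    return dirty_ratio(clean_bytes, dirty_bytes) > min_cleanable_dirty_ratio


MB = 1024 * 1024
clean, dirty = 800 * MB, 300 * MB  # hypothetical sizes for illustration

print(round(dirty_ratio(clean, dirty), 3))                                    # 0.273
print(eligible_for_compaction(clean, dirty))                                  # False
print(eligible_for_compaction(clean, dirty, min_cleanable_dirty_ratio=0.2))   # True
```

In this model, lowering the threshold makes the same log eligible for compaction earlier, at the cost of more frequent cleaner work. The cleaner's own capacity is governed separately by broker settings such as log.cleaner.threads and log.cleaner.backoff.ms.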

The client was also provided with reference links that explain Kafka’s retention mechanisms in detail, to help them understand and implement efficient management strategies for changelog topics.

Conclusion:

Efficient management of changelog topics is critical to the stability and performance of Kafka Streams applications. By applying the recommended uniform retention configuration and tuning the cleanup mechanisms, the client could mitigate the risks of unchecked record growth in changelog topics and keep their Kafka Streams application running smoothly and reliably.