Problem:

The client was operating a 5-node Apache Cassandra cluster (version 4.1.5) and needed to establish secure access to the database from both an SRE server and an EC2 server. While basic connectivity (e.g., telnet) between the source and Cassandra target nodes was verified, direct access to Cassandra using cqlsh was unsuccessful. The client sought guidance on configuring the Cassandra nodes and network to allow successful external access while maintaining the cluster’s stability and performance.

The situation was complicated by failing SSH connections to certain nodes, configuration files that were inconsistent across nodes, and misconfigurations in cassandra.yaml, including an excessive number of seed nodes and commented-out or malformed parameters such as broadcast_address and start_rpc.

Process:

Step 1: Verifying Cluster Health

The expert guided the client in running nodetool status and nodetool describecluster across the nodes. All five nodes were reported as up and in normal state, confirming that the internal Cassandra cluster was healthy.
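
The health check above can also be scripted. The following is a minimal sketch, assuming the text of nodetool status has already been captured (for example over SSH); the sample output, addresses, and function names are illustrative and not taken from the client's cluster.

```python
import re

# Illustrative output only: addresses, loads, and host IDs are made up.
SAMPLE_STATUS = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns (effective)  Host ID  Rack
UN  10.0.0.11  1.1 GiB  16      60.0%             host-a   rack1
UN  10.0.0.12  1.0 GiB  16      60.0%             host-b   rack1
UN  10.0.0.13  1.2 GiB  16      60.0%             host-c   rack1
UN  10.0.0.14  1.1 GiB  16      60.0%             host-d   rack1
UN  10.0.0.15  1.0 GiB  16      60.0%             host-e   rack1
"""

# A node row starts with a two-letter code: U/D (Up/Down) followed by
# N/L/J/M (Normal/Leaving/Joining/Moving), then the node address.
NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def node_states(status_output):
    """Map each node address to its status/state code, e.g. 'UN'."""
    states = {}
    for line in status_output.splitlines():
        match = NODE_ROW.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def all_up_normal(status_output, expected_nodes):
    """True when exactly expected_nodes rows exist and all are 'UN'."""
    states = node_states(status_output)
    return len(states) == expected_nodes and all(
        code == "UN" for code in states.values()
    )


if __name__ == "__main__":
    print(node_states(SAMPLE_STATUS))
    print(all_up_normal(SAMPLE_STATUS, expected_nodes=5))
```

Checking the node count as well as the codes guards against a node that has dropped out of the ring entirely and therefore has no row to inspect.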

Step 2: Configuration Review

The client shared outputs of critical configuration parameters from multiple nodes. The expert identified inconsistencies in the seed_provider and broadcast_address settings. Many nodes listed four or five of the cluster's nodes as seeds, a practice the Cassandra documentation discourages. In some cases, broadcast_address was missing or commented out.

The expert also pointed out that start_rpc is no longer supported: it controlled the Thrift RPC interface, which was removed in Cassandra 4.0. Leaving it in cassandra.yaml caused configuration errors at service startup, so it should be removed.
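
A review like this can be repeated on each node's file with a small script. The sketch below is a simple line-based scan, not a YAML parser, and it only looks for the three issues described in this step; the function name and the max_seeds default are assumptions made for illustration.

```python
import re

# Matches the seeds entry inside seed_provider, e.g. '- seeds: "a,b"'.
SEEDS_LINE = re.compile(r"-?\s*seeds:\s*(.*)$")


def review_cassandra_yaml(text, max_seeds=3):
    """Return a list of findings for one node's cassandra.yaml text."""
    findings = []
    seeds = None
    has_broadcast = False

    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank or commented-out lines do not count as set

        match = SEEDS_LINE.match(line)
        if match:
            value = match.group(1).strip().strip("\"'")
            seeds = [s.strip() for s in value.split(",") if s.strip()]
        elif line.startswith("broadcast_address:"):
            has_broadcast = bool(line.split(":", 1)[1].strip())
        elif line.startswith("start_rpc:"):
            findings.append("start_rpc is set; remove it")

    if seeds is None:
        findings.append("no seeds entry found")
    elif len(seeds) > max_seeds:
        findings.append(
            f"{len(seeds)} seeds listed; expected at most {max_seeds}"
        )
    if not has_broadcast:
        findings.append("broadcast_address is missing or commented out")
    return findings
```

Running the function against the file from every node and comparing the results gives a quick view of which nodes still differ from the others.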

Step 3: Strategic Recommendations

The expert recommended:

  • Designating only 2–3 seed nodes to improve resilience and avoid bootstrap issues.
  • Setting each node’s broadcast_address to its respective internal IP.
  • Ensuring start_native_transport: true and native_transport_port: 9042 were configured correctly.
  • Removing start_rpc entirely.
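
As an illustration of these recommendations, the sketch below renders the relevant cassandra.yaml settings for a single node. The IP addresses are hypothetical, render_node_settings is a helper invented for this example, and the fragment covers only the parameters discussed here rather than a complete configuration file.

```python
# Hypothetical addresses: the same short seed list is used on every node.
SEEDS = ["10.0.0.11", "10.0.0.12"]


def render_node_settings(node_ip, seeds):
    """Build the cassandra.yaml fragment for one node.

    start_rpc is deliberately absent from the output.
    """
    if not 2 <= len(seeds) <= 3:
        raise ValueError("expected 2-3 seed nodes")
    lines = [
        "seed_provider:",
        "  - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "    parameters:",
        f'      - seeds: "{",".join(seeds)}"',
        f"broadcast_address: {node_ip}",
        "start_native_transport: true",
        "native_transport_port: 9042",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Each node gets its own broadcast_address but the shared seed list.
    print(render_node_settings("10.0.0.13", SEEDS))
```

Generating the fragment from one shared seed list is one way to keep the seed_provider entry identical across nodes while broadcast_address still varies per node.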

Step 4: Network Access Configuration

The expert reviewed the AWS security group settings and advised adding inbound rules that allow the SRE and EC2 servers (10.109.96.219 and 10.51.193.6 respectively) to reach TCP port 9042 on the Cassandra nodes.
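
The inbound rules can be expressed as data before they are applied. The sketch below builds a rule in the IpPermissions shape accepted by the EC2 API, using the two source addresses given above; it assumes IPv4 /32 entries, and the function name and the security group ID mentioned in the comment are hypothetical. Nothing is sent to AWS by this code.

```python
import ipaddress

# Source addresses from the case; the labels are descriptive only.
CLIENT_SOURCES = {
    "SRE server": "10.109.96.219",
    "EC2 server": "10.51.193.6",
}


def native_transport_ingress(sources, port=9042):
    """Build an IpPermissions list allowing each source to reach `port`."""
    ranges = []
    for label, ip in sources.items():
        # ip_address() raises ValueError on a malformed address.
        cidr = f"{ipaddress.ip_address(ip)}/32"
        ranges.append({"CidrIp": cidr, "Description": f"cqlsh from {label}"})
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": ranges,
        }
    ]


if __name__ == "__main__":
    # With boto3 installed, the result could be passed as, for example:
    #   ec2.authorize_security_group_ingress(
    #       GroupId="sg-EXAMPLE", IpPermissions=permissions)
    # where "sg-EXAMPLE" is a placeholder for the Cassandra nodes' group.
    permissions = native_transport_ingress(CLIENT_SOURCES)
    print(permissions)
```

Restricting each entry to a single /32 address keeps port 9042 closed to every host other than the two intended clients.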

Step 5: Validation and Connection Testing

The client applied the suggested changes and verified cqlsh connectivity from multiple nodes, confirming that the Cassandra cluster was accessible and functioning properly. They then tested connections from the SRE and EC2 servers using cqlsh commands of the following form, where <node_ip> is the address of the target Cassandra node:

 cqlsh <node_ip> 9042 -u cassandra -p cassandra

Connections to all target nodes succeeded, and the output of the SHOW HOST and SHOW VERSION commands confirmed healthy responses from the cluster.
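
The same test can be repeated across all nodes with a short script. This is a minimal sketch, assuming cqlsh is installed on the source server; the function names and the timeout are choices made for this example, and the node addresses must be supplied by the caller. It uses cqlsh's -e option to run one statement and exit.

```python
import subprocess


def cqlsh_command(host, user, password, port=9042, statement="SHOW VERSION"):
    """Build the cqlsh argument list: cqlsh HOST PORT -u USER -p PASS -e STMT."""
    return [
        "cqlsh", host, str(port),
        "-u", user, "-p", password,
        "-e", statement,
    ]


def can_connect(host, user, password, timeout=15):
    """Return True if cqlsh connects and runs SHOW VERSION successfully."""
    try:
        result = subprocess.run(
            cqlsh_command(host, user, password),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # cqlsh is not installed, or the node did not answer in time.
        return False
    return result.returncode == 0


def check_nodes(hosts, user, password):
    """Map each node address to whether a cqlsh session could be opened."""
    return {host: can_connect(host, user, password) for host in hosts}
```

A call such as check_nodes(node_ips, "cassandra", "cassandra"), run from each source server, reports any node that still rejects connections. Unlike a telnet check, this exercises authentication and the native protocol as well as the open port.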

Solution:

The expert helped the client align all node configurations to best practices by:

  • Reducing the seed node list to two entries across all nodes.
  • Setting correct broadcast_address values for each node.
  • Removing unsupported or deprecated parameters such as start_rpc.
  • Verifying port 9042 accessibility in AWS security groups for both SRE and EC2 sources.

These changes enabled stable cqlsh connections from all intended sources and ensured that the Cassandra cluster remained healthy and consistent.

Conclusion:

Connecting external systems to a Cassandra cluster requires not only network access but also precise internal configuration. By resolving misconfigurations in cassandra.yaml and ensuring proper seed node designation, the expert helped the client achieve seamless connectivity from SRE and EC2 servers while maintaining cluster integrity. The client confirmed successful implementation and proceeded with their production deployment.