Problem:
The client planned to migrate a 5-node Cassandra cluster from an on-premises environment (version 3.11.8) to AWS (target version 4.1.5).
The client requested guidance on the best migration strategy that ensures no downtime. Additionally, the client requested information on
backup and restore procedures for the migration.
Solution:
The expert recommended a step-by-step approach that splits the work into two separate procedures:
- Migration to a cluster of the same version
- Upgrade the destination after a successful migration
1) Migration to a cluster of the same version
The simplest and least disruptive approach is to add a second data center (DC) in AWS and then decommission the source DC.
To do this, you will need a stable, high-bandwidth site-to-site connection between your on-premises and AWS networks.
For the available options, see: AWS VPC Connectivity Options.
After establishing the site-to-site connection, ensure there are no firewall and/or AWS security group restrictions on network
traffic between both endpoints. (All traffic should be allowed during the migration period.)
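A quick way to confirm that nothing is blocked is to probe the Cassandra ports from each side. The following is a minimal sketch, assuming the default ports (7000 for inter-node traffic, 9042 for CQL clients); the host addresses in the usage section are hypothetical placeholders, and the ports should be adjusted if your cassandra.yaml differs.

```python
import socket

# Default Cassandra ports (assumption: defaults are unchanged in cassandra.yaml).
# 7000 = inter-node, 9042 = CQL clients. Add 7001 if inter-node TLS is enabled.
CASSANDRA_PORTS = (7000, 9042)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_nodes(hosts, ports=CASSANDRA_PORTS, timeout=3.0):
    """Map every (host, port) pair to its reachability."""
    return {(h, p): port_is_open(h, p, timeout) for h in hosts for p in ports}


if __name__ == "__main__":
    # Hypothetical addresses of nodes on the remote side of the link.
    remote_nodes = ["10.0.1.11", "10.0.1.12"]
    for (host, port), ok in check_nodes(remote_nodes).items():
        print(f"{host}:{port} {'open' if ok else 'BLOCKED'}")
```

Run the check in both directions (on-premises to AWS and AWS to on-premises), since firewall rules and security groups are often asymmetric.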
When the network preparations are done and you are sure that the connection is stable with enough bandwidth to handle the data
migration, you can proceed to add the second (or another) DC to your setup.
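To judge whether the bandwidth is sufficient, a rough transfer-time estimate helps. The sketch below is simple arithmetic, not a measurement: the data volume, link speed, and the 70% link efficiency factor are all hypothetical assumptions to be replaced with your own figures.

```python
def estimate_transfer_hours(data_gb, link_mbps, efficiency=0.7):
    """Estimate hours needed to stream data_gb over a link of link_mbps.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable for streaming (protocol overhead, other traffic).
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    megabits = data_gb * 8000  # 1 GB = 8000 megabits (decimal units)
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical example: 2000 GB to stream over a 500 Mbps link.
print(f"{estimate_transfer_hours(2000, 500):.1f} hours")
```

The estimate covers only the raw data volume that has to be streamed to the new DC; live write traffic replicated across the link during the migration comes on top of it.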
The procedure for Cassandra 3.x is described in detail in this guide: Adding a Data Center.
After completing the steps in that guide, you will have a dual-data-center setup with consistent data.
You can then decommission the on-premises data center. The procedure is relatively simple and is described here: Decommissioning a Data Center.
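Both steps revolve around changing keyspace replication: the AWS DC is added to the replication settings before data is streamed to it, and the on-premises DC is removed from them before its nodes are decommissioned. The sketch below only builds the CQL statements; the keyspace name, DC names, and replication factors are hypothetical, and the DC names must match what your snitch reports (for example in nodetool status).

```python
def alter_replication_cql(keyspace, dc_replicas):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    dc_replicas maps data center name -> replication factor.
    DC names are case-sensitive and must match the names the snitch reports.
    """
    if not dc_replicas:
        raise ValueError("at least one data center is required")
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in dc_replicas.items():
        if int(rf) < 1:
            raise ValueError(f"replication factor for {dc} must be >= 1")
        parts.append(f"'{dc}': {int(rf)}")
    options = ", ".join(parts)
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (keyspace, options)


# Hypothetical names: keyspace "app_data", DCs "onprem" and "aws".
# Step 1: replicate to both DCs (then stream data to the new nodes,
# e.g. with `nodetool rebuild -- onprem` on each AWS node).
print(alter_replication_cql("app_data", {"onprem": 3, "aws": 3}))
# Step 2: before decommissioning, stop replicating to the old DC.
print(alter_replication_cql("app_data", {"aws": 3}))
```

The same change is needed for every keyspace that holds data in the old DC, not only for application keyspaces; the linked guides list the system keyspaces that are affected.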
2) Upgrade the destination after a successful migration
The next step is to upgrade the AWS cluster from 3.11.x to 4.1.x.
Cassandra 4.x nodes can read Cassandra 3.x SSTable files, so there is no need to convert the data files before the upgrade. To avoid
downtime, upgrade the nodes one at a time. The upgrade process for each node is:
- Flush memtables to disk.
- Shut down the 3.x node.
- Make sure the correct Java version is installed (Cassandra 4.1 supports Java 8 and Java 11).
- Install Cassandra 4.x.
- Configure the 4.x node to point to the old data files.
- Start the 4.x node.
- Check the logs for errors.
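The per-node steps above can be sketched as a small rolling-upgrade driver that processes nodes strictly one at a time and stops at the first failure. This is an illustration under stated assumptions, not a ready-made tool: the command strings (service name, install and configuration steps) are placeholders that depend on how Cassandra is installed, and run_step is a hypothetical callback that you would implement, for example over SSH.

```python
# Ordered per-node steps mirroring the list above. Command strings are
# illustrative assumptions; angle-bracket entries are manual placeholders.
UPGRADE_STEPS = [
    ("flush", "nodetool drain"),  # drain flushes memtables and stops accepting writes
    ("stop", "sudo systemctl stop cassandra"),
    ("check_java", "java -version"),
    ("install", "<install Cassandra 4.x with your package manager>"),
    ("configure", "<point the 4.x configuration at the existing data directories>"),
    ("start", "sudo systemctl start cassandra"),
    ("check_logs", "<inspect the Cassandra system log for errors>"),
]


def rolling_upgrade(nodes, run_step):
    """Upgrade nodes one at a time; abort on the first failed step.

    run_step(node, step_name, command) must return True on success.
    Returns the list of nodes that completed every step.
    """
    done = []
    for node in nodes:
        for name, command in UPGRADE_STEPS:
            if not run_step(node, name, command):
                raise RuntimeError(
                    f"step '{name}' failed on {node}; upgraded so far: {done}"
                )
        done.append(node)
    return done


if __name__ == "__main__":
    # Dry run: print the plan instead of executing anything.
    def dry_run(node, name, command):
        print(f"[{node}] {name}: {command}")
        return True

    rolling_upgrade(["aws-node-1", "aws-node-2"], dry_run)
```

Stopping at the first failure keeps the remaining nodes on 3.x, so the cluster stays available while the problem is investigated.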
After all nodes are running Cassandra 4.1, run nodetool upgradesstables on all nodes, one at a time.
This rewrites the SSTables in the new format. It takes a long time, uses a lot of CPU resources, and requires about 50% free disk space in the Cassandra data folders.
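Before starting the SSTable upgrade on a node, it is worth checking the free space on the data volume. The sketch below interprets the 50% guideline as "at least half of the volume is free", which is an assumption; the data directory path is also a hypothetical default and should match your configuration.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Return the free share of a volume as a value between 0 and 1."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def enough_headroom(path, required=0.5):
    """Return True if the volume holding path has at least `required` free."""
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= required


if __name__ == "__main__":
    data_dir = "/var/lib/cassandra/data"  # hypothetical default location
    print("OK to proceed" if enough_headroom(data_dir) else "Free up space first")
```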
Conclusion:
The proposed solution provides a robust, structured, and downtime-free approach to migrating and upgrading the Cassandra cluster.
Splitting the migration and upgrade processes minimizes risks, while the dual-DC strategy ensures data consistency during the migration.
The backup and restore steps add a layer of safety, protecting data integrity throughout the procedure. This method is effective because
each phase (migration first, then upgrade) is completed successfully before the next one begins.