Problem:

The client's Cassandra backup, run through Commvault on their QAT cluster, failed with an HTTP 500 error returned by the local Priam REST endpoint:

 HTTP ERROR 500
Problem accessing /REST/v1/cassadmin/info. Reason:

Commvault support traced the issue to the Priam service and advised the client to check its status. The failure blocked regular backup operations and put data protection and disaster recovery at risk.

Process:

Step 1 – Initial Error Analysis

The Commvault logs showed the backup failed while sending an HTTP request to Priam’s REST endpoint: http://localhost:49401/REST/v1/cassadmin/info. The error was:

 curl-err:[No error] http-resp:[500] javax.management.AttributeNotFoundException: No such attribute: RPCServerRunning 

This indicated that Priam was unable to retrieve the RPCServerRunning JMX attribute from Cassandra, likely due to a compatibility issue between Priam and the Cassandra version in use.
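
To see how often this signature occurs, the attribute name can be extracted from the exception text in a log. The following is a minimal sketch: the function name missing_jmx_attribute is hypothetical, and it only assumes the message format shown in the error above.

```shell
# Read log lines on stdin and print the attribute name from each
# "AttributeNotFoundException: No such attribute: <name>" message.
# The function name is hypothetical, chosen for this sketch.
missing_jmx_attribute() {
  sed -n 's/.*AttributeNotFoundException: No such attribute: \([A-Za-z0-9_]*\).*/\1/p'
}
```

It could be used as `missing_jmx_attribute < /path/to/commvault.log | sort | uniq -c`, where the log path is a placeholder; the report does not state where the Commvault log is stored.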

Step 2 – Expert Questions and Validation

To investigate further, the expert asked the client to validate:

  • Whether Priam was running on the failing node.
  • The versions of Cassandra and Priam used.
  • Any recent changes to the Priam or Cassandra configuration.
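
The Cassandra version can be read on the node with nodetool version, which prints a line beginning with "ReleaseVersion:". The sketch below parses that output; the helper names are hypothetical, and the report does not say how the Priam version was obtained, so that step is left out.

```shell
# Read `nodetool version` output on stdin (a line such as
# "ReleaseVersion: 4.0.11") and print the version string.
# Helper names are hypothetical, chosen for this sketch.
cassandra_version() {
  sed -n 's/^ReleaseVersion:[[:space:]]*\([0-9][0-9.]*\).*/\1/p'
}

# Read a version string such as "4.0.11" on stdin and print the major number.
cassandra_major() {
  cut -d. -f1
}
```

On a node, `nodetool version | cassandra_version | cassandra_major` would print the major release, which is the value to compare against the tool's support matrix.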

The client confirmed Priam was running and provided log files. The process and port status were checked using:

 ps -ef | grep -i priam
 netstat -tulnp | grep 60811
 ps -ef | grep -i commvault
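
The port check above can be made scriptable so that it returns an exit status instead of requiring someone to read the output. This is a sketch only: port_listening is a hypothetical helper, and the port number is passed as an argument because the report mentions both 60811 and 49401.

```shell
# Read `netstat -tln` or `ss -tln` style output on stdin and succeed
# (exit 0) if any LISTEN line has an address field ending in ":<port>".
# The function name is hypothetical, chosen for this sketch.
port_listening() {
  awk -v p="$1" '
    /LISTEN/ { for (i = 1; i <= NF; i++) if ($i ~ (":" p "$")) found = 1 }
    END { exit !found }
  '
}
```

For example, `netstat -tln | port_listening 49401 && echo "port open"`. For the process checks, note that `ps -ef | grep -i priam` also matches the grep command itself; `pgrep -f priam` avoids that.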

Step 3 – Root Cause Analysis

After analyzing the logs and environment, the expert identified the root cause:

  • The Priam service was issuing a JMX call to retrieve the RPCServerRunning attribute.
  • This attribute was not present in the Cassandra version used by the client, likely because the client ran Apache Cassandra while Priam was designed for DSE (DataStax Enterprise).

No faults were found in Cassandra’s core functionality. Logs confirmed normal operations (flush, compaction, GC). This further supported the hypothesis that the issue was specific to the interaction between Priam and Cassandra’s JMX interface.

Solution:

The expert provided a multi-tiered recommendation:

Short-Term Fixes:

  • Replace or patch cvPriam.jar to ensure compatibility with Apache Cassandra.
  • Request an updated version of Priam from Commvault that avoids querying deprecated/nonexistent JMX attributes.
  • Ensure Cassandra and Priam versions are aligned and officially supported together.

Long-Term Hardening:

  • Add a pre-backup health check for Priam’s REST API endpoint using: curl -s http://localhost:49401/REST/v1/cassadmin/info
  • Ensure proper JMX credentials are configured and exposed only over localhost.
  • Monitor the Priam process and port status continuously (port 49401).
  • Keep JMX configuration in sync across all Cassandra nodes.
  • Validate tool compatibility before performing Cassandra upgrades in production.
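
The pre-backup health check from the list above could be wrapped in a small script that fails fast when Priam does not answer with a 2xx status. This is a minimal sketch, not Commvault's own mechanism: the function names and the PRIAM_URL variable are hypothetical, and the default URL is the endpoint quoted in this report.

```shell
# Endpoint taken from the report; override PRIAM_URL if it differs.
PRIAM_URL="${PRIAM_URL:-http://localhost:49401/REST/v1/cassadmin/info}"

# Succeed (exit 0) for a 2xx HTTP status code, fail otherwise.
priam_status_ok() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *)           return 1 ;;
  esac
}

# Print only the HTTP status code of the endpoint.
# curl prints "000" when no HTTP response was received.
priam_http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$PRIAM_URL"
}

# Full check, intended to run before the backup job starts.
priam_precheck() {
  code="$(priam_http_status)"
  if priam_status_ok "$code"; then
    echo "Priam healthy (HTTP $code)"
  else
    echo "Priam unhealthy (HTTP $code), do not start backup" >&2
    return 1
  fi
}
```

Calling `priam_precheck || exit 1` at the top of a backup wrapper would stop the job before it reaches the failing REST call. In the incident described here, the check would have reported HTTP 500.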

Conclusion:

The failure was caused by an incompatibility between the Priam tool (used by Commvault) and the version of Apache Cassandra deployed by the client. The deprecated or missing RPCServerRunning attribute in Cassandra led to an HTTP 500 response from Priam, blocking the backup.

The expert strongly recommended discontinuing the use of Priam in production environments due to such reliability issues and instead relying on Cassandra’s built-in nodetool snapshot mechanism for backups. If Commvault must be used, the client should verify that the associated Priam version is explicitly compatible with the Cassandra version in use.
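
A snapshot-based backup along the lines the expert recommended might start from something like the sketch below. The helper names and the "backup_" tag prefix are hypothetical; `nodetool snapshot -t <tag>` is the standard way to name a snapshot.

```shell
# Build a timestamped snapshot tag such as "backup_<UTC date>T<UTC time>Z".
# The "backup_" prefix is an arbitrary choice for this sketch.
snapshot_tag() {
  printf 'backup_%s\n' "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Snapshot all keyspaces on the local node under a fresh tag.
# nodetool's own output goes to stderr so that stdout carries only the tag.
take_snapshot() {
  tag="$(snapshot_tag)"
  nodetool snapshot -t "$tag" >&2 && printf '%s\n' "$tag"
}
```

After `tag="$(take_snapshot)"`, `nodetool listsnapshots` shows the result and `nodetool clearsnapshot -t "$tag"` removes it. Snapshots are hard links on the node's own disks, so they still need to be copied off the node to serve as a disaster recovery backup.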