Problem:
Ceph storage almost full but should have space. The client reported that the Ceph storage is nearly full even though sufficient space should be available. The output of ceph osd status shows that some OSDs are low on available space. The most common cause identified is a lost+found directory that was never cleared after a crash or after deleting files.
Process:
Step 1: Manual Deletion in lost+found
The immediate workaround is to manually delete files inside the lost+found directory to free up space: identify files that are no longer needed and remove them.
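Step 1 can be sketched as a review-then-delete pass. This is a minimal sketch; the mount point /mnt/cephfs is a placeholder for the client's actual mount:

```shell
# Review the largest entries under lost+found before deleting anything.
# MOUNT is a placeholder for the affected mount point (assumption).
MOUNT="${MOUNT:-/mnt/cephfs}"

# Largest entries first, so unneeded files can be confirmed by eye.
du -ah "$MOUNT/lost+found" 2>/dev/null | sort -rh | head -n 20

# Once confirmed, delete interactively (-i prompts before each removal):
# rm -ri "$MOUNT/lost+found"/*
```

Reviewing before removing matters here: files in lost+found may still be recoverable data, so the rm step is left commented and interactive.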
Step 2: Debugging Information
Determine whether this is a CephFS issue or whether the client is using RADOS or RBD directly. Request the outputs of the following commands:
ceph osd tree
ceph df
rados lspools
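The requested outputs can be gathered in one pass. This is a sketch that assumes the ceph and rados CLIs are available on a cluster admin node:

```shell
# Gather the debugging outputs requested above into one directory.
collect_ceph_debug() {
  outdir="${1:-ceph-debug-$(date +%Y%m%d)}"
  if ! command -v ceph >/dev/null 2>&1; then
    echo "ceph CLI not found; run on a cluster admin node" >&2
    return 1
  fi
  mkdir -p "$outdir"
  ceph osd tree  > "$outdir/osd_tree.txt"
  ceph df        > "$outdir/df.txt"
  rados lspools  > "$outdir/pools.txt"
  echo "outputs written to $outdir"
}
```

Collecting the files in one directory makes it easy to attach them to the support ticket.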
Step 3: Additional Information
ceph osd tree to check the weights and statuses of OSDs.
ceph df to understand the overall storage utilization.
rados lspools to list the pools in use.
Step 4: Cluster and Pool Configuration
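Usable capacity depends on pool configuration: with a replicated pool of size 3, only about a third of the raw capacity is usable. A sketch for inspecting this, assuming the ceph CLI is available:

```shell
# Inspect the pool settings that determine usable capacity.
show_pool_config() {
  if ! command -v ceph >/dev/null 2>&1; then
    echo "ceph CLI not found" >&2
    return 1
  fi
  # Replication size, min_size, and PG counts for every pool.
  ceph osd pool ls detail
  # Replication factor of a single pool; <pool> is a placeholder name.
  # ceph osd pool get <pool> size
}
```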
Step 5: Cluster Expansion
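If the cluster is genuinely short on raw capacity, expansion means adding OSDs. A hypothetical sketch for a cephadm-managed cluster (Octopus or later); the host name node2 and device /dev/sdb are placeholders:

```shell
# Add an OSD on a new device and watch data rebalance onto it.
expand_cluster() {
  host="$1"
  device="$2"
  # Create one OSD backed by the given device on the given host.
  ceph orch daemon add osd "$host:$device"
  # Cluster status and per-OSD utilization while backfill runs.
  ceph -s
  ceph osd df tree
}
# Example (review before running): expand_cluster node2 /dev/sdb
```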
Step 6: Additional Recommendations
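Regular monitoring can be as simple as surfacing Ceph's own nearfull/full health warnings from a cron job. A minimal sketch, assuming the ceph CLI:

```shell
# Alert when Ceph reports nearfull/full OSDs or pools.
check_ceph_full() {
  if ! command -v ceph >/dev/null 2>&1; then
    echo "ceph CLI not found" >&2
    return 1
  fi
  if ceph health detail | grep -Eiq 'nearfull|full'; then
    echo "WARNING: cluster is approaching capacity"
    return 2
  fi
  echo "OK: no capacity warnings"
}
```

Relying on Ceph's built-in health checks avoids parsing utilization tables by hand, since the nearfull/full thresholds are already configured in the cluster.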
Solution:
The immediate solution involves manually clearing space by deleting unnecessary files. However, for a more sustainable solution, understanding the cluster and pool configurations is crucial. Debugging information, such as outputs from relevant commands, helps identify the root cause. The long-term approach may involve adjusting replication factors, adding OSDs, or changing disk configurations to meet storage requirements. Regular monitoring and maintenance are recommended to prevent similar issues in the future.
Conclusion:
This case study provides a structured approach to addressing the reported Ceph storage issue, combining immediate remedies with long-term strategies for better storage management.