For the last few days I’ve been busy hunting down an “error” in my CRUSH map.
I found that some of my OSDs were underused or not used at all… I didn’t know why, because I had built the CRUSH map from scratch with the common architecture based on datacenter, rack & cluster, and it was correct from Ceph’s point of view (it was running on the cluster).
I decided to replace it with a much simpler map.
Something like this:
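As a rough idea of the shape, a minimal decompiled CRUSH map with a single flat host bucket looks something like this (bucket names, IDs and weights here are illustrative, not the actual map from the cluster):

```
# devices
device 0 osd.0
device 1 osd.1

# types
type 0 osd
type 1 host
type 2 root

# buckets
host node1 {
	id -2
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 1.000
	item osd.1 weight 1.000
}
root default {
	id -1
	alg straw
	hash 0
	item node1 weight 2.000
}

# rules
rule replicated_ruleset {
	ruleset 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}
```

Stripping the datacenter and rack levels like this makes it much easier to see whether the weights and rules alone explain the uneven OSD usage.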
Continue reading “Ceph Troubleshooting: Debugging CRUSH maps”
In my ceph-to-production sprint I found that 4 monitors are not enough… It is an even number, and everybody knows what happens when a cluster has an even number of members and one of the quorum fails :-)
So I decided to add 2 more, for a total of 6. We have 2 CPD’s; if one of the CPD’s goes down, there will be 3 monitors online to elect the new master (I hope that works, and not what the official documentation says)
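The arithmetic behind the even-number problem is simple: Ceph monitors need a strict majority to form quorum, so a cluster of n monitors only survives floor((n − 1) / 2) failures. A quick sketch (not Ceph code, just the quorum math):

```python
# Quorum math for a monitor cluster: a strict majority of monitors
# must be alive, so n monitors tolerate n - (n // 2 + 1) failures.
def failures_tolerated(n_monitors: int) -> int:
    majority = n_monitors // 2 + 1
    return n_monitors - majority

for n in (3, 4, 5, 6):
    print(f"{n} monitors -> tolerates {failures_tolerated(n)} failures")
# 3 monitors -> tolerates 1 failures
# 4 monitors -> tolerates 1 failures
# 5 monitors -> tolerates 2 failures
# 6 monitors -> tolerates 2 failures
```

Note that going from 5 to 6 buys nothing, and 6 monitors split 3/3 across two sites means losing a whole site leaves only 3 of 6 alive, which is not a majority. That’s exactly the caveat in the official documentation.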
That was the easy part…
Continue reading “Ceph Troubleshooting: Adding new monitors to cluster”
Recently I wrote a basic architecture document (and training) for Ceph.
This document gives you the basics to understand the roles of the pieces of the Ceph architecture, such as:
- Monitor Nodes
- Disk (aka OSD) nodes
- Metadata Nodes
- Gateway Nodes
And what the terms “RADOS” and “CRUSH” mean (again, at a very basic level).
Enjoy the document/presentation!