Today I’ll briefly explain how to solve this warning reported by ceph status.
Maybe this case does not match your error exactly, but I think the commands I used give you a path to follow to solve it.
The whole process is written up on the wiki.
Continue reading “Troubleshooting CEPH: Degraded data redundancy” →
This is my second patch for the Nutanix provider for Terraform.
This time it has been merged into master, so it is publicly available.
As I wrote in the commit description:
Hot-plug Options (always):
Hot-plug Options (only when increasing the value):
These are the changes that Nutanix can perform online (while the VM is running).
I’m hitting the same problem as described in issue #69.
Depending on the Terraform version, the second apply after the first one (the one that creates the VMs) will either:
- Reboot the VM and remove that CD-ROM (Terraform v0.10).
- Ask to remove the CD-ROM (Terraform v > 0.11); if it is removed, the VM crashes (kernel panic/BSOD), because you’re hot-removing an IDE bus drive and IDE is usually not hot-pluggable.
I’ve “patched” the provider so it ignores CDROM && IDE (bus) changes.
Maybe it is not the best solution, but as we’re not using CD-ROM drives at all, this little patch is a good option for us.
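The idea behind the patch can be sketched in a few lines. This is only an illustration in Python, not the actual provider code (the provider is written in Go), and the field names `device_type` and `adapter_type` as well as the function names are assumptions made for the example:

```python
# Illustrative sketch only: field and function names are hypothetical,
# and the real provider (written in Go) may model disks differently.

def is_ide_cdrom(disk: dict) -> bool:
    """True for a CD-ROM drive attached to the IDE bus."""
    return disk.get("device_type") == "CDROM" and disk.get("adapter_type") == "IDE"


def comparable_disks(disks: list) -> list:
    """Drop IDE CD-ROM entries so they never take part in the diff."""
    return [d for d in disks if not is_ide_cdrom(d)]


def disks_changed(current: list, desired: list) -> bool:
    """Report a change only if something other than an IDE CD-ROM differs."""
    return comparable_disks(current) != comparable_disks(desired)
```

With this kind of filter, a VM that has an extra IDE CD-ROM compared to its configuration is treated as unchanged, so the second apply has nothing to remove.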
So feel free to clone, test, fork…
PS: I’ve moved the patch to a standalone branch, so you can apply just this patch.
For the last few days I’ve been busy trying to find an “error” in my CRUSH map.
I found that some of my OSDs were underused or not used at all… I didn’t know why, because I had built a CRUSH map from scratch with the common architecture based on datacenter, rack & cluster. And it was correct from Ceph’s point of view (it was running on the cluster).
I decided to reduce the map to a much simpler one.
Something like this:
Continue reading “Ceph Troubleshooting: Debugging CRUSH maps” →
In my ceph-to-production sprint I found that 4 monitors are not enough… It is an even number, and everybody knows what happens when there’s an even number of members in a cluster and some of the quorum members fail :-)
So I decided to add 2 more, for a total of 6. We have 2 data centers; if one of them goes down, there will be 3 monitors online to elect the new master (I hope that works, and not as the official documentation says).
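The worry about even numbers comes down to majority arithmetic. Here is a minimal sketch, assuming the strict-majority rule (more than half of the monitors in the monitor map must be reachable to form a quorum); the function names are just illustrative:

```python
def majority(total_monitors: int) -> int:
    """Smallest number of monitors that is more than half of the total."""
    return total_monitors // 2 + 1


def has_quorum(alive: int, total_monitors: int) -> bool:
    """True if the surviving monitors can still form a strict majority."""
    return alive >= majority(total_monitors)


def tolerated_failures(total_monitors: int) -> int:
    """How many monitors can fail while a majority is still reachable."""
    return total_monitors - majority(total_monitors)


if __name__ == "__main__":
    for total in (3, 4, 5, 6):
        print(total, "monitors: majority", majority(total),
              "- tolerated failures", tolerated_failures(total))
```

Under that rule, 4 monitors tolerate no more failures than 3 do, and with 6 monitors split 3/3 between two sites, `has_quorum(3, 6)` is false: the 3 survivors fall short of the 4 needed.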
That was the easy part…
Continue reading “Ceph Troubleshooting: Adding new monitors to cluster” →
A while ago (more than a year ago) I began investigating cloud-init. I saw a Red Hat paper talking about cloud-init, and it seemed really powerful for simplifying massive VM deployments.
Someone close to me told me: “Don’t waste your time, we’ll use terraform/docker/k8s/whatever.”
But the inception was already done: I read the documentation and started testing the technology.
What I’ve seen is that cloud-init is everywhere: I think that all Linux “cloud” VMs are using it. It’s really sturdy and simple, and it does what it is supposed to do… That is both its strength and its weakness.
The good part is well known: the cloud-init service starts when the VM starts and does what you tell it to do through a YAML script: it installs software, creates users, performs basic configuration…
Its weakness is that cloud-init is very simple software designed for the cloud: if your cloud architecture is not standard, you will need some tricks to work around its assumptions.
For example, I was not using DHCP for VM networking, and booting a VM with cloud-init without DHCP is really tricky… You can see a YAML script for a static network here:
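As a rough illustration of what a static configuration can look like (this is not the script linked above), here is a small Python helper that renders a cloud-init network configuration in the version 2 format. The interface name, the addresses and the helper name are all assumptions for the example, and how the file reaches the VM depends on the datasource you use:

```python
import ipaddress


def static_network_config(interface, address_cidr, gateway, dns_servers):
    """Render a cloud-init network config (version 2) with DHCP disabled.

    All values are validated with the standard ipaddress module first.
    """
    iface = ipaddress.ip_interface(address_cidr)
    gw = ipaddress.ip_address(gateway)
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is outside {iface.network}")
    dns = ", ".join(str(ipaddress.ip_address(s)) for s in dns_servers)

    lines = [
        "version: 2",
        "ethernets:",
        f"  {interface}:",
        "    dhcp4: false",
        f"    addresses: [{iface.with_prefixlen}]",
        f"    gateway4: {gw}",
        "    nameservers:",
        f"      addresses: [{dns}]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values taken from the documentation range 192.0.2.0/24.
    print(static_network_config("eth0", "192.0.2.10/24",
                                "192.0.2.1", ["192.0.2.53"]))
```

Generating the file from a helper like this makes it easier to produce one configuration per VM, which is the part that DHCP would otherwise handle.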
All of that has given me the background and the overall view I need to understand the underlying technology used on “cloud” platforms (any cloud platform)… It seems that time has proved me right :-)