ciberterminal – Welcome, connect your terminal

Troubleshooting CEPH: Degraded data redundancy

Today I’ll give a short explanation of how to solve this warning from ceph status.
Your case may not match this error exactly, but I think the commands I used give you a path to follow to solve it.

The whole process is written on the wiki

Continue reading “Troubleshooting CEPH: Degraded data redundancy” →

terraform-provider-nutanix: Patch to allow some changes to be hotplug

by dodger

This is my second patch for the nutanix provider for terraform.
This time it has been merged into master, so it is publicly available.

As I wrote in the commit description:

Hot-plug Options (always):

Name
Description

Hot-plug Options (only when the value is increased):

num_vcpus_per_socket
num_sockets
memory_size_mib

These are the changes that nutanix can perform online (while the VM is running).
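The rule above can be sketched as a small check. This is only an illustration of the decision logic, not the provider's actual Go code; the function name `needs_power_cycle` and the use of plain dictionaries are assumptions made for the example.

```python
# Attributes that can always be changed while the VM is running.
ALWAYS_HOTPLUG = {"name", "description"}

# Attributes that can be changed online only when the new value is larger.
INCREASE_ONLY_HOTPLUG = {"num_vcpus_per_socket", "num_sockets", "memory_size_mib"}


def needs_power_cycle(old: dict, new: dict) -> bool:
    """Return True if changing the VM from `old` to `new` cannot be done online.

    Hypothetical helper: it only mirrors the lists in the commit description.
    """
    for key in old.keys() | new.keys():
        before, after = old.get(key), new.get(key)
        if before == after:
            continue  # unchanged attribute
        if key in ALWAYS_HOTPLUG:
            continue  # name/description never need a power cycle
        if (
            key in INCREASE_ONLY_HOTPLUG
            and before is not None
            and after is not None
            and after > before
        ):
            continue  # growing CPU or memory is allowed online
        return True  # anything else (including shrinking) needs the VM off
    return False
```

For example, growing `memory_size_mib` from 2048 to 4096 returns False, while reducing `num_sockets` from 2 to 1 returns True.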

terraform-provider-nutanix Patch to ignore CDROM’s

by dodger

I’m hitting the same problem as described in issue #69.
Depending on the terraform version, the 2nd apply after the 1st one (which creates the VMs) will either:

Reboot the VM and remove that cdrom (terraform v0.10).
Ask to remove the cdrom and, if it is removed, crash the VM (kernel panic/BSOD), because you’re hot-removing a drive on the IDE bus and IDE is usually not hot-pluggable (terraform v > 0.11).

I’ve “patched” the provider so it ignores CDROM && IDE (bus) changes.
Maybe it is not the best solution, but as we’re not using cdrom drives at all, this little patch is a good option for us.
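The idea of ignoring CDROM and IDE changes can be sketched as a filter applied to the disk list before it is compared. This is not the patch itself (the provider is written in Go); the field names `device_properties`, `device_type`, `disk_address` and `adapter_type` are assumptions modeled on the shape of the VM's disk list, and both function names are hypothetical.

```python
def is_ignored_disk(disk: dict) -> bool:
    """Return True for CDROM devices and for anything attached to the IDE bus.

    Field names are assumed for illustration only.
    """
    props = disk.get("device_properties", {})
    device_type = props.get("device_type")
    adapter_type = props.get("disk_address", {}).get("adapter_type")
    return device_type == "CDROM" or adapter_type == "IDE"


def disks_for_diff(disk_list: list) -> list:
    """Drop ignored disks so they never show up as a pending change."""
    return [disk for disk in disk_list if not is_ignored_disk(disk)]
```

With the ignored entries filtered out on both the configured and the real side, a cdrom that exists on the VM but not in the configuration no longer produces a change on the second apply.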

So feel free to clone, test, fork…
https://github.com/Jorge-Holgado/terraform-provider-nutanix

PS: I’ve moved the patch to a standalone branch, so you can apply only this patch.

Ceph Troubleshooting: Debugging CRUSH maps

by dodger

For the last few days I’ve been busy trying to find an “error” in my CRUSH map.
I found that some of my OSDs were underused or not used at all… I didn’t know why, because I had built the CRUSH map from scratch with the common architecture based on datacenter, rack & cluster. And it was correct from the ceph point of view (it was running on the cluster).

I decided to reduce the map to a much simpler one.
Something like this:
Simple CRUSH map

Continue reading “Ceph Troubleshooting: Debugging CRUSH maps” →

Ceph Troubleshooting: Adding new monitors to cluster

by dodger

In my ceph-to-production sprint I found that 4 monitors are not enough… It is an even number, and everybody knows what happens when a cluster has an even number of members and some of the quorum members fail :-)
So I decided to add 2 more for a total of 6. We have 2 data centers (CPDs); if one of them goes down, there will be 3 monitors online to elect the new master (I hope that works, and not as the official documentation says)
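The arithmetic behind the documentation's warning can be checked with a small sketch. It assumes only that monitors need a strict majority of the configured members to form a quorum; the function names are hypothetical.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors` members."""
    return monitors // 2 + 1


def has_quorum(total: int, alive: int) -> bool:
    """Return True if `alive` monitors out of `total` can still form a quorum."""
    return alive >= quorum_size(total)


def tolerated_failures(total: int) -> int:
    """How many monitors can fail before quorum is lost."""
    return total - quorum_size(total)
```

Under that assumption, `quorum_size(6)` is 4, so `has_quorum(6, 3)` is False: with 6 monitors split evenly across 2 sites, the 3 survivors of a full site outage are one short of a majority. `tolerated_failures` gives 1 for 4 monitors and 2 for both 5 and 6, which is why odd counts are usually recommended.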

That was the easy part…

Continue reading “Ceph Troubleshooting: Adding new monitors to cluster” →