Migration options for SUSE Enterprise Storage (SES) users

A few weeks ago, I posted a blog about how SUSE pulled the plug on SUSE Enterprise Storage (SES). SES is a Linux-based computer data storage product developed by SUSE and built on Ceph technology. A quick recap on the matter is that in 2020, SUSE acquired Rancher Labs, a Kubernetes management software vendor, and […]

HAProxy in front of Ceph Manager dashboard

The Ceph Mgr dashboard plugin allows for an easy dashboard which can show you how your Ceph cluster is performing. In certain situations you can’t contact the Mgr daemons directly and you have to place a Proxy server between your computer and the Mgr daemons. This can be done easily with HAProxy and the following […]

VXLAN with VyOS and Ubuntu 18.04

VXLAN Virtual Extensible LAN uses encapsulation technique to encapsulate OSI layer 2 Ethernet frames within layer 4 UDP datagrams. More on this can be found on the link provided. For a Ceph and CloudStack environment I needed to set up a Proof-of-Concept using VXLAN and some refurbished hardware. The main purpose of this PoC is […]

Placement Groups with Ceph Luminous stay in activating state

Placement Groups stuck in activating When migrating from FileStore with BlueStore with Ceph Luminuous you might run into the problem that certain Placement Groups stay stuck in the activating state. 44 activating+undersized+degraded+remapped PG Overdose This is a side-effect of the new PG overdose protection in Ceph Luminous. Too many PGs on your OSDs can cause […]

Quick overview of Ceph version running on OSDs

When checking a Ceph cluster it’s useful to know which versions you OSDs in the cluster are running. There is a very simple on-line command to do this: ceph osd metadata|jq ‘.[].ceph_version’|sort|uniq -c Running this on a cluster which is currently being upgraded to Jewel to Luminous it shows: 10 “ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)” 1670 […]

The Ceph Trafficlight

At PCextreme we have a 700TB Ceph cluster which is used behind our public cloud Aurora Compute which runs Apache CloudStack. Ceph health One of the things we monitor of the Ceph cluster is it’s health. This can be OK, WARN or ERR. It speaks for itself that you always want to see OK, but […]

Ceph Monitors are laggy or clock might be skewed

This weekend I got to investigate a Ceph cluster which had issues where the Monitors were constantly performing new elections. After some investigation on of the three monitors was eating 100% CPU on a single core and kept printing this in the logs: mon.charlie@2(peon).paxos(paxos updating c 106399655..106400232) lease_expire from mon.0 [2a00:XXX:121:XXX::6789:1]:6789/0 is 2.380296 seconds in […]

Protecting your Ceph pools against removal or property changes

One of the dangers of Ceph was that by accident you could remove a multi TerraByte pool and loose all the data. Although the CLI tools asked you for conformation, librados and all it’s bindings did not. Imagine explaining that you just removed a 200TB pool from your storage system due to a typo in […]

Calculating RADOS objects for RBD images

Ceph’s RBD (RADOS Block Device) is just a thin wrapper on top of RADOS, the object store of Ceph. It stripes (by default) over 4MB objects in RADOS. It’s very simple to calculate which RADOS object corresponds with which sector on your RBD image/block device. First you have to find out the block device’s object […]

Safely backing up your Ceph monitors

So you might wonder: Why do I need to make a backup of my Ceph monitors? I have multiple monitors. That’s true, but would you run into the very unfortunate situation where you loose all you monitors, you loose all your data. The monitors contain very important metadata (pgmap, osdmap, crushmap) to run your cluster. […]