Best practices for cephadm and expanding your Ceph infrastructure

If you want to expand your Ceph cluster by adding a new one, it is good to know what the best practices are. So, here is some more information about cephadm and some tips and tricks for expanding your Ceph cluster. First a little bit about cephadm. Cephadm creates a new Ceph cluster by “bootstrapping” […]

What about our Ceph Fundamentals training?

Currently, we are planning a new Ceph Fundamentals training course for November or December. Because of this, we thought it would be nice to share some more information on the topics of this training course. The 42on Ceph Fundamentals Training is a 2-day online, instructor-led training course. The training is led by one of our […]

Migration options for SUSE Enterprise Storage (SES) users

A few weeks ago, I posted a blog about how SUSE pulled the plug on SUSE Enterprise Storage (SES). SES is a Linux-based computer data storage product developed by SUSE and built on Ceph technology. A quick recap on the matter is that in 2020, SUSE acquired Rancher Labs, a Kubernetes management software vendor, and […]

HAProxy in front of Ceph Manager dashboard

The Ceph Mgr dashboard plugin allows for an easy dashboard which can show you how your Ceph cluster is performing. In certain situations you can’t contact the Mgr daemons directly and you have to place a Proxy server between your computer and the Mgr daemons. This can be done easily with HAProxy and the following […]

VXLAN with VyOS and Ubuntu 18.04

VXLAN Virtual Extensible LAN uses encapsulation technique to encapsulate OSI layer 2 Ethernet frames within layer 4 UDP datagrams. More on this can be found on the link provided. For a Ceph and CloudStack environment I needed to set up a Proof-of-Concept using VXLAN and some refurbished hardware. The main purpose of this PoC is […]

Placement Groups with Ceph Luminous stay in activating state

Placement Groups stuck in activating When migrating from FileStore with BlueStore with Ceph Luminuous you might run into the problem that certain Placement Groups stay stuck in the activating state. 44 activating+undersized+degraded+remapped PG Overdose This is a side-effect of the new PG overdose protection in Ceph Luminous. Too many PGs on your OSDs can cause […]

Quick overview of Ceph version running on OSDs

When checking a Ceph cluster it’s useful to know which versions you OSDs in the cluster are running. There is a very simple on-line command to do this: ceph osd metadata|jq ‘.[].ceph_version’|sort|uniq -c Running this on a cluster which is currently being upgraded to Jewel to Luminous it shows: 10 “ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)” 1670 […]

The Ceph Trafficlight

At PCextreme we have a 700TB Ceph cluster which is used behind our public cloud Aurora Compute which runs Apache CloudStack. Ceph health One of the things we monitor of the Ceph cluster is it’s health. This can be OK, WARN or ERR. It speaks for itself that you always want to see OK, but […]

Ceph Monitors are laggy or clock might be skewed

This weekend I got to investigate a Ceph cluster which had issues where the Monitors were constantly performing new elections. After some investigation on of the three monitors was eating 100% CPU on a single core and kept printing this in the logs: mon.charlie@2(peon).paxos(paxos updating c 106399655..106400232) lease_expire from mon.0 [2a00:XXX:121:XXX::6789:1]:6789/0 is 2.380296 seconds in […]

Protecting your Ceph pools against removal or property changes

One of the dangers of Ceph was that by accident you could remove a multi TerraByte pool and loose all the data. Although the CLI tools asked you for conformation, librados and all it’s bindings did not. Imagine explaining that you just removed a 200TB pool from your storage system due to a typo in […]