Do not use SMR disks with Ceph

Many new disks, like the Seagate He8, use a technique called Shingled Magnetic Recording (SMR) to increase capacity. As these disks offer a very low price per gigabyte, they seem interesting to use in a Ceph cluster. Performance: Due to the nature of SMR these disks are very, very, very bad when it comes […]

Testing Ceph BlueStore with the Kraken release

Ceph version Kraken (11.2.0) has been released and the Release Notes tell us that the new BlueStore backend for the OSDs is now available. BlueStore: The current backend for the OSDs is the FileStore, which mainly uses the XFS filesystem to store its data. To overcome several limitations of XFS and POSIX in general, the […]
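
In Kraken, BlueStore is still flagged experimental, so the OSDs must be explicitly allowed to use it before ceph-disk will create one. A minimal sketch, assuming a ceph-disk based deployment and a spare disk /dev/sdb (both the device and the exact option value are assumptions to verify against your version):

```bash
# Allow the experimental BlueStore backend in ceph.conf.
cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
enable experimental unrecoverable data corrupting features = bluestore rocksdb
EOF

# Prepare and activate a new OSD on BlueStore instead of FileStore.
ceph-disk prepare --bluestore /dev/sdb
```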

Running headless VirtualBox inside Nested KVM

For the Ceph training at 42on I use VirtualBox to build Virtual Machines, because it works under macOS, Windows, and Linux. For the internal Git at 42on we use GitLab, and I wanted to use GitLab's CI to build my Virtual Machines automatically. As we don't have any physical hardware at 42on (everything […]
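
There are two moving parts here: nested virtualization on the physical KVM host, and starting VirtualBox without a GUI inside the guest that runs the CI job. A sketch for an Intel host; the VM name "ceph-node1" is a placeholder:

```bash
# On the physical host: enable nested virtualization for kvm_intel.
modprobe -r kvm_intel
modprobe kvm_intel nested=1
cat /sys/module/kvm_intel/parameters/nested   # should print Y

# Inside the KVM guest running the GitLab CI job: start the
# VirtualBox VM headless, without any GUI attached.
VBoxManage startvm "ceph-node1" --type headless
```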

Chown Ceph OSD data directory using GNU Parallel

Starting with Ceph version Jewel (10.2.X) all daemons (MON and OSD) run under the unprivileged user ceph. Prior to Jewel, daemons ran as root, which is a potential security issue. This means the data has to change ownership before a daemon running the Jewel code can start. Chown data: As the Release Notes state […]
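
The trick is to run one chown per OSD data directory concurrently instead of a single recursive chown over everything, so all disks are busy at the same time. A sketch, assuming the default /var/lib/ceph/osd layout and stopped OSDs:

```bash
# Each OSD data directory becomes one parallel job; -print0/-0 keeps
# unusual directory names safe.
find /var/lib/ceph/osd -maxdepth 1 -mindepth 1 -type d -print0 \
  | parallel -0 chown -R ceph:ceph {}
```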

Slow requests with Ceph: ‘waiting for rw locks’

Slow requests in Ceph: When an I/O operation inside Ceph takes more than X seconds, which is 30 by default, it is logged as a slow request. This is to show you, as an admin, that something is wrong inside the cluster and that you have to take action. Origin of slow requests: Slow […]
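
To find out which state a slow request is stuck in (for example 'waiting for rw locks'), you can query the admin socket of the OSD that reports it. A sketch, with osd.12 as a made-up example:

```bash
# Which OSDs are reporting slow requests right now?
ceph health detail

# On the node hosting the suspect OSD: dump current and recent
# operations; each op shows the state it is (or was) waiting in.
ceph daemon osd.12 dump_ops_in_flight
ceph daemon osd.12 dump_historic_ops
```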

Using TRIM/DISCARD with Ceph RBD and libvirt

TRIM/DISCARD: Using TRIM/DISCARD you can give free space back to a Ceph cluster. Normally, any thin-provisioned block device will keep on growing until its maximum size while being used. Using the DISCARD command, an underlying block device can be instructed to discard blocks which do not contain data. In the case of Ceph's RBD […]
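
For DISCARD to reach the Ceph cluster, the virtual disk has to be attached through a bus that passes it down, e.g. virtio-scsi with discard='unmap'. A sketch; the domain name and mount point are placeholders:

```bash
# In the libvirt domain XML (virsh edit <domain>) the RBD-backed disk
# needs roughly this, so QEMU forwards DISCARD to librbd:
#   <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
#   <target dev='sda' bus='scsi'/>
# together with a virtio-scsi controller.

# Inside the guest, hand freed blocks back to the cluster:
fstrim -v /
```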

The Ceph Trafficlight

At PCextreme we have a 700TB Ceph cluster which is used behind our public cloud Aurora Compute, which runs Apache CloudStack. Ceph health: One of the things we monitor on the Ceph cluster is its health. This can be OK, WARN or ERR. It goes without saying that you always want to see OK, but […]
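
The polling half of such a traffic light can be as small as mapping the health status to a color; driving the actual lamps is left out here. A sketch, assuming the OK/WARN/ERR strings mentioned above:

```bash
#!/bin/bash
# Map the cluster health to a traffic light color.
STATUS=$(ceph health | awk '{print $1}')
case "$STATUS" in
  HEALTH_OK)   COLOR=green ;;
  HEALTH_WARN) COLOR=yellow ;;
  *)           COLOR=red ;;   # HEALTH_ERR or anything unexpected
esac
echo "Ceph is $STATUS -> light: $COLOR"
```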

Ceph Monitors are laggy or clock might be skewed

This weekend I got to investigate a Ceph cluster which had issues where the Monitors were constantly performing new elections. After some investigation, one of the three monitors was eating 100% CPU on a single core and kept printing this in the logs: mon.charlie@2(peon).paxos(paxos updating c 106399655..106400232) lease_expire from mon.0 [2a00:XXX:121:XXX::6789:1]:6789/0 is 2.380296 seconds in […]
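
When monitors keep calling elections, verifying the clocks on the monitor hosts is the first step. A sketch of the checks involved; the host names are placeholders:

```bash
# Does Ceph itself complain about skewed clocks?
ceph health detail | grep -i 'clock skew'

# Compare NTP state on every monitor host; the offset column shows
# how far each local clock is off.
for host in mon1 mon2 mon3; do
  ssh "$host" ntpq -p
done
```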

Protecting your Ceph pools against removal or property changes

One of the dangers of Ceph was that by accident you could remove a multi-terabyte pool and lose all the data. Although the CLI tools ask you for confirmation, librados and all its bindings do not. Imagine explaining that you just removed a 200TB pool from your storage system due to a typo in […]
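
Ceph has per-pool flags that guard against exactly this. A short sketch using the nodelete, nopgchange and nosizechange flags on a pool named rbd:

```bash
# Refuse deletion of the pool, even through librados.
ceph osd pool set rbd nodelete true

# Refuse changes to pg_num/pgp_num and to size/min_size.
ceph osd pool set rbd nopgchange true
ceph osd pool set rbd nosizechange true
```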

Rebuilding libvirt under CentOS 7.1 with RBD storage pool support

If you want to use CentOS 7.1 for your hypervisors with Apache CloudStack and Ceph's RBD as Primary Storage, you need to rebuild libvirt. CloudStack requires libvirt to be built with RBD storage pool support; it uses libvirt to manage RBD volumes. By default, libvirt under CentOS is not built with this support. (On Ubuntu […]
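
The rebuild itself is a standard SRPM round trip; only the spec needs the RBD storage backend switched on. A sketch, with the caveat that the exact conditional in libvirt.spec differs per version:

```bash
# Fetch the source RPM plus build dependencies, including RBD headers.
yum install -y rpm-build yum-utils librbd1-devel librados2-devel
yumdownloader --source libvirt
yum-builddep -y libvirt-*.src.rpm

# Unpack it, enable the RBD storage backend in the spec (verify the
# macro name in your libvirt.spec first), and rebuild.
rpm -ivh libvirt-*.src.rpm
sed -i 's/%define with_storage_rbd .*/%define with_storage_rbd 1/' \
  ~/rpmbuild/SPECS/libvirt.spec
rpmbuild -ba ~/rpmbuild/SPECS/libvirt.spec
```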