Testing Ceph BlueStore with the Kraken release

Ceph version Kraken (11.2.0) has been released and the Release Notes tell us that the new BlueStore backend for the OSDs is now available. BlueStore The current backend for the OSDs is the FileStore which mainly uses the XFS filesystem to store it’s data. To overcome several limitations of XFS and POSIX in general the […]

Chown Ceph OSD data directory using GNU Parallel

Starting with Ceph version Jewel (10.2.X) all daemons (MON and OSD) will run under the privileged user ceph. Prior to Jewel daemons were running under root which is a potential security issue. This means data has to change ownership before a daemon running the Jewel code can run. Chown data As the Release Notes state […]

Slow requests with Ceph: ‘waiting for rw locks’

Slow requests in Ceph When a I/O operating inside Ceph is taking more than X seconds, which is 30 by default, it will be logged as a slow request. This is to show you as a admin that something is wrong inside the cluster and you have to take action. Origin of slow requests Slow […]

The Ceph Trafficlight

At PCextreme we have a 700TB Ceph cluster which is used behind our public cloud Aurora Compute which runs Apache CloudStack. Ceph health One of the things we monitor of the Ceph cluster is it’s health. This can be OK, WARN or ERR. It speaks for itself that you always want to see OK, but […]

Ceph Monitors are laggy or clock might be skewed

This weekend I got to investigate a Ceph cluster which had issues where the Monitors were constantly performing new elections. After some investigation on of the three monitors was eating 100% CPU on a single core and kept printing this in the logs: mon.charlie@2(peon).paxos(paxos updating c 106399655..106400232) lease_expire from mon.0 [2a00:XXX:121:XXX::6789:1]:6789/0 is 2.380296 seconds in […]

Protecting your Ceph pools against removal or property changes

One of the dangers of Ceph was that by accident you could remove a multi TerraByte pool and loose all the data. Although the CLI tools asked you for conformation, librados and all it’s bindings did not. Imagine explaining that you just removed a 200TB pool from your storage system due to a typo in […]

NFS-Ganesha with libcephfs on Ubuntu 14.04

This week I’m testing a lot with CephFS and one of the things I never tried was re-exporting CephFS using NFS-Ganesha and libcephfs. NFS-Ganesha is a NFS server which runs in userspace. It has multiple backends (FSALs) it can use and libcephfs is one of them. libcephfs is a userspace library which you can use […]

Ceph with a cluster and public network on IPv6

I’m a big fan of Ceph and IPv6, so I always try to deploy Ceph over IPv6 when possible. Ceph is the future, just like IPv6 is. Why implement legacy? Recently I did a deployment of Ceph with a public and cluster network running over IPv6. It has a small catch, so I let me […]

PowerDNS backend for a global RADOS Gateway namespace

At my hosting company PCextreme we are building a cloud offering based on Ceph and CloudStack. We call our cloud services Aurora. Our cloud services are composed out of two components: Compute and Objects. For our Aurora Objects service we use the RADOS Gateway from Ceph and we are using the Federated Config to create […]

Calculating RADOS objects for RBD images

Ceph’s RBD (RADOS Block Device) is just a thin wrapper on top of RADOS, the object store of Ceph. It stripes (by default) over 4MB objects in RADOS. It’s very simple to calculate which RADOS object corresponds with which sector on your RBD image/block device. First you have to find out the block device’s object […]