Ceph cluster

Quick overview of Ceph version running on OSDs

Wido den Hollander

When checking a Ceph cluster it’s useful to know which versions the OSDs in the cluster are running. There is a very simple one-line command to do this:

    ceph osd metadata|jq '.[].ceph_version'|sort|uniq -c

Running this on a cluster which is currently being upgraded from Jewel to Luminous shows:

    10 "ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)"
    1670…
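The counting step of the one-liner above can also be sketched in Python, which may be handy when the result feeds a script rather than a terminal. This is a minimal sketch, not the author's method: it assumes the output of `ceph osd metadata` is a JSON array with one object per OSD carrying a `ceph_version` field (the same field the jq filter reads). The function name and the sample data are hypothetical.

```python
import json
from collections import Counter


def count_osd_versions(metadata_json):
    """Count how many OSDs report each ceph_version string."""
    osds = json.loads(metadata_json)
    return Counter(osd["ceph_version"] for osd in osds)


# Hypothetical sample shaped like `ceph osd metadata` output; the version
# strings are placeholders, not real release identifiers.
sample = json.dumps([
    {"id": 0, "ceph_version": "ceph version A"},
    {"id": 1, "ceph_version": "ceph version B"},
    {"id": 2, "ceph_version": "ceph version B"},
])

version_counts = count_osd_versions(sample)

# Print in the same "count version" order as `sort | uniq -c`.
for version, count in sorted(version_counts.items()):
    print(count, version)
```

On a real cluster the JSON string would come from running the `ceph osd metadata` command and capturing its standard output.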

Do not use SMR disks with Ceph

Wido den Hollander

Many new disks, like the Seagate He8 disks, use a technique called Shingled Magnetic Recording (SMR) to increase capacity. As these disks offer a very low price per gigabyte, they seem interesting to use in a Ceph cluster.

Performance

Due to the nature of SMR these disks are very, very, very bad when it comes…

Testing Ceph BlueStore with the Kraken release

Wido den Hollander

Ceph version Kraken (11.2.0) has been released and the Release Notes tell us that the new BlueStore backend for the OSDs is now available.

BlueStore

The current backend for the OSDs is the FileStore, which mainly uses the XFS filesystem to store its data. To overcome several limitations of XFS and POSIX in general the…

Running headless VirtualBox inside Nested KVM

Wido den Hollander

For the Ceph training at 42on I use VirtualBox to build Virtual Machines, because they work under macOS, Windows and Linux. For the internal Git at 42on we use GitLab and I wanted to use GitLab’s CI to build my Virtual Machines automatically. As we don’t have any physical hardware at 42on (everything…

Chown Ceph OSD data directory using GNU Parallel

Wido den Hollander

Starting with Ceph version Jewel (10.2.X) all daemons (MON and OSD) run under the unprivileged user ceph. Prior to Jewel, daemons ran as root, which is a potential security issue. This means data has to change ownership before a daemon running the Jewel code can start.

Chown data

As the Release Notes state…
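The idea of changing ownership of several OSD data directories at the same time can be illustrated with a small Python sketch. It deliberately swaps GNU Parallel for a thread pool, so it is not the command from the post. The function names, the worker count and the temporary `osd-0`/`osd-1` directories are hypothetical, and the demo only changes ownership to the current user so that it runs without root.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def chown_tree(path, uid, gid):
    """Recursively chown path; return the number of entries changed."""
    os.chown(path, uid, gid, follow_symlinks=False)
    changed = 1
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            # Do not follow symlinks: change the link itself, not its target.
            os.chown(os.path.join(root, name), uid, gid, follow_symlinks=False)
            changed += 1
    return changed


def chown_osd_dirs(osd_dirs, uid, gid, workers=4):
    """Run chown_tree for each directory concurrently, one task per directory."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: chown_tree(d, uid, gid), osd_dirs))


# Demo on throwaway directories standing in for OSD data directories.
with tempfile.TemporaryDirectory() as base:
    demo_dirs = []
    for name in ("osd-0", "osd-1"):
        osd_dir = os.path.join(base, name)
        os.makedirs(osd_dir)
        with open(os.path.join(osd_dir, "fsid"), "w") as handle:
            handle.write("placeholder\n")
        demo_dirs.append(osd_dir)
    results = chown_osd_dirs(demo_dirs, os.getuid(), os.getgid())

print(results)
```

Working per directory keeps each task independent, which is the same reason a per-OSD split suits a tool like GNU Parallel.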

Slow requests with Ceph: ‘waiting for rw locks’

Wido den Hollander

Slow requests in Ceph

When an I/O operation inside Ceph takes more than X seconds, which is 30 by default, it is logged as a slow request. This shows you as an admin that something is wrong inside the cluster and that you have to take action.

Origin of slow requests

Slow…
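The threshold rule described above can be written down as a tiny Python sketch. It only illustrates the comparison against the 30 second default; it does not reflect how Ceph tracks operations internally. The `Op` type, the function name and the sample operations are hypothetical.

```python
from dataclasses import dataclass

# Default threshold in seconds, as stated in the text above.
COMPLAINT_TIME_SECONDS = 30.0


@dataclass
class Op:
    description: str
    age_seconds: float


def slow_requests(ops, threshold=COMPLAINT_TIME_SECONDS):
    """Return the operations that have been in flight longer than threshold."""
    return [op for op in ops if op.age_seconds > threshold]


# Hypothetical in-flight operations with their current age.
in_flight = [
    Op("write to object a", 0.4),
    Op("read from object b", 30.0),
    Op("write to object c", 61.2),
]

flagged = slow_requests(in_flight)
for op in flagged:
    print(f"slow request: {op.description} ({op.age_seconds:.1f}s)")
```

An operation that is exactly 30 seconds old is not flagged here, because the text says "more than" X seconds.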

Using TRIM/DISCARD with Ceph RBD and libvirt

Wido den Hollander

TRIM/DISCARD

Using TRIM/DISCARD you can give free space back to a Ceph cluster. Normally, any thin-provisioned block device keeps growing until it reaches its maximum size while being used. Using the DISCARD command, an underlying block device can be instructed to discard blocks which do not contain data. In the case of Ceph’s RBD…

The Ceph Trafficlight

Wido den Hollander

At PCextreme we have a 700TB Ceph cluster which is used behind our public cloud Aurora Compute, which runs Apache CloudStack.

Ceph health

One of the things we monitor on the Ceph cluster is its health. This can be OK, WARN or ERR. It speaks for itself that you always want to see OK, but…
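As an illustration of how the three health states could drive a traffic light, here is a minimal Python sketch. The colour mapping, the function name and the choice to show red for an unknown state are assumptions made for this example and are not taken from the post. The sketch accepts both the short names used above and the `HEALTH_`-prefixed strings that Ceph reports.

```python
# Assumed mapping from health state to light colour (not from the post).
LIGHTS = {"OK": "green", "WARN": "orange", "ERR": "red"}


def light_for(status):
    """Map a Ceph health status string to a traffic light colour."""
    key = status.strip().upper()
    if key.startswith("HEALTH_"):
        key = key[len("HEALTH_"):]
    # Assumption: an unrecognised status is treated as the worst case.
    return LIGHTS.get(key, "red")


for status in ("HEALTH_OK", "WARN", "HEALTH_ERR"):
    print(status, "->", light_for(status))
```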

Ceph Monitors are laggy or clock might be skewed

Wido den Hollander

This weekend I got to investigate a Ceph cluster where the Monitors were constantly performing new elections. After some investigation, one of the three monitors turned out to be eating 100% CPU on a single core and kept printing this in the logs:

    mon.charlie@2(peon).paxos(paxos updating c 106399655..106400232) lease_expire from mon.0 [2a00:XXX:121:XXX::6789:1]:6789/0 is 2.380296 seconds in…

Protecting your Ceph pools against removal or property changes

Wido den Hollander

One of the dangers of Ceph was that you could accidentally remove a multi-terabyte pool and lose all the data. Although the CLI tools asked you for confirmation, librados and all its bindings did not. Imagine explaining that you just removed a 200TB pool from your storage system due to a typo in…
