Ceph Monitors are laggy or clock might be skewed

Ceph Monitors are laggy or clock might be skewed

Ceph Monitors are laggy or clock might be skewed 150 150 Wido den Hollander

This weekend I got to investigate a Ceph cluster which had issues where the Monitors were constantly performing new elections.

After some investigation on of the three monitors was eating 100% CPU on a single core and kept printing this in the logs:

mon.charlie@2(peon).paxos(paxos updating c 106399655..106400232) lease_expire from mon.0 [2a00:XXX:121:XXX::6789:1]:6789/0 is 2.380296 seconds in the past; mons are probably laggy (or possibly clocks are too skewed)

Digging further I found that the LevelDB store in /var/lib/ceph/mon/X/store.db was 2.5GB in size.

Compact on Start

You can tell the monitor to compact the LevelDB database on start. Add the following to your ceph.conf:

[mon]
mon compact on start = true

Now restart the monitor and it will compact the LevelDB database.

The CPU usage now dropped and the monitors were happy again.

Get

in touch.

ConsultancyTrainingSupport
Privacy Preferences

When you visit our website, it may store information through your browser from specific services, usually in the form of cookies. Here you can change your Privacy Preferences. It is worth noting that blocking some types of cookies may impact your experience on our website and the services we are able to offer.

Click to enable/disable Google Analytics tracking code.
Click to enable/disable Google Fonts.
Visit privacy policy Visit terms and conditions
Our website uses cookies, mainly from 3rd party services. Define your Privacy Preferences and/or agree to our use of cookies.