How to handle large omap objects

Every once in a while a customer will ask me what to do with these messages:

1 large omap objects

First, let's see what this means:

  1. Ceph services are built on top of RADOS.
  2. Ceph stores all data as Ceph/RADOS objects.
  3. A Ceph/RADOS object can consist of three major parts:
    1. data: the bytestream
    2. key/value pairs: the omap data
    3. extended attributes: called xattrs
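
To make the three parts concrete, here is a minimal sketch in Python. The class `RadosObjectSketch` and the example key names are hypothetical and exist only for illustration; this is an in-memory model, not a librados type.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RadosObjectSketch:
    """Hypothetical in-memory model of one RADOS object (not a librados type)."""
    name: str
    data: bytes = b""                                        # the bytestream
    omap: Dict[str, bytes] = field(default_factory=dict)     # key/value pairs
    xattrs: Dict[str, bytes] = field(default_factory=dict)   # extended attributes


# An object that uses all three parts.
obj = RadosObjectSketch(name="example-object", data=b"hello")
obj.omap["some-key"] = b"some-value"
obj.xattrs["user.owner"] = b"alice"

# Size of the bytestream, number of omap keys, number of xattrs.
print(len(obj.data), len(obj.omap), len(obj.xattrs))
```

The large omap warning is only about the second part: the number of entries in the omap of a single object.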

Most of the time, the large omap warning is related to the RGW workload. The RGW, or RADOS Gateway, provides an S3 and/or Swift compatible object storage interface. The RGW uses several pools to store its data and metadata. The large omaps are mostly found in the .xxx.rgw.index pool. This is where, to no one's surprise, the index data is stored.

The RGW stores its bucket index data as omap key/value pairs attached to an object without a bytestream. This object is called a ‘bucket index marker’. For each object that exists within the bucket, an omap key/value pair is added to the bucket index marker. This means that if a bucket holds 25 objects, its index marker will have 25 omap keys. Most large omap health messages come down to “too many objects in a single bucket without resharding”.
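
The relation between objects in a bucket and omap keys can be sketched as follows. This is a simplified model, not RGW code: the function `keys_per_index_object` is hypothetical, and the CRC32 hash is only a stand-in for whatever placement RGW really uses when an index is split into shards.

```python
import zlib


def keys_per_index_object(object_names, num_shards=1):
    """Count the omap keys each index object would hold for one bucket.

    Assumption: every bucket object adds exactly one omap key, and a
    CRC32 hash (a stand-in, not RGW's real hash) picks the shard.
    """
    counts = {shard: 0 for shard in range(num_shards)}
    for name in object_names:
        shard = zlib.crc32(name.encode("utf-8")) % num_shards
        counts[shard] += 1
    return counts


names = [f"photo-{i}.jpg" for i in range(25)]

# Without sharding, one index object carries all 25 keys.
unsharded = keys_per_index_object(names)
print(unsharded)  # {0: 25}

# With 4 shards, the same 25 keys are spread over 4 index objects.
sharded = keys_per_index_object(names, num_shards=4)
print(sharded, sum(sharded.values()))
```

The total number of keys stays the same after sharding, but no single index object has to carry all of them.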

The value of “osd_deep_scrub_large_omap_object_key_threshold” determines when Ceph will consider an object to have a “large” number of omap keys. The current default value is 200000. It was lowered from 2000000, and that change was backported to 14.2.3, 13.2.7 and 12.2.13. The default was lowered because the general consensus is that a warning at 2000000 keys comes too late to preempt problems.
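
As a rough illustration of what the threshold means for a bucket index, consider the sketch below. The helper names are hypothetical, and it makes two assumptions: the warning fires when the key count exceeds the threshold, and keys are spread evenly over the index shards. Treat the shard count as a lower bound for avoiding the warning, not as a sizing recommendation.

```python
import math

DEFAULT_THRESHOLD = 200_000    # current default mentioned above
OLD_THRESHOLD = 2_000_000      # default before it was lowered


def is_large_omap(key_count, threshold=DEFAULT_THRESHOLD):
    """Assumption: an object is flagged once its key count exceeds the threshold."""
    return key_count > threshold


def min_shards_to_stay_below(num_objects, threshold=DEFAULT_THRESHOLD):
    """Fewest index shards that keep every shard at or below the threshold,
    assuming a perfectly even spread of keys (real spreads are not perfect)."""
    return max(1, math.ceil(num_objects / threshold))


bucket_objects = 500_000

print(is_large_omap(bucket_objects))                 # True
print(is_large_omap(bucket_objects, OLD_THRESHOLD))  # False
print(min_shards_to_stay_below(bucket_objects))      # 3
```

A bucket with 500000 objects in a single index object would go unnoticed under the old default, but is flagged under the current one.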

Read more in-depth about Ceph in our blog about CRUSH maps.