Best practices for CephFS

File storage is a data storage format in which data is stored and managed as files within folders. Why would you choose file storage? Its advantages include the following:

1. User-friendly interface: A simple file management and sharing model that is easy for human users to understand, which makes it straightforward to use within an organization.

2. Easily scalable archives: Each file has a unique address and can be saved manually or automatically in a scale-out system.

3. Data protection: File storage has been around for a long time, so there are many standardized technologies and protocols for data protection.

In the storage world, file storage is provided by direct-attached storage (DAS) and network-attached storage (NAS) systems. However, for cloud environments, Ceph provides an integrated file storage system called the Ceph File System, or CephFS. CephFS is a file system built on top of Ceph’s distributed object store, RADOS. It is specially designed to facilitate application portability, as it is POSIX-compliant (POSIX is the Portable Operating System Interface). So if you have a use case for file storage as described in the first part of this article, why would you choose CephFS specifically? The answer to this question is a straightforward one to us: CephFS provides a state-of-the-art, multi-use, highly available, and performant file store for a variety of applications, including traditional use cases like shared home directories, HPC scratch space, and distributed workflow shared storage.

For most deployments of Ceph, setting up a CephFS file system is as simple as running: ceph fs volume create <fs name>
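
As a minimal sketch, assuming the cluster is managed by an orchestrator such as cephadm or Rook and using "cephfs" purely as an example file system name, the whole setup can look like this:

ceph fs volume create cephfs     # creates the data/metadata pools and requests MDS daemons
ceph fs status cephfs            # verify the new file system and its MDS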

The Ceph Orchestrator will automatically create and configure MDS for your file system if the back-end deployment technology supports it (see Orchestrator deployment table). If not, you can easily deploy MDS manually.
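
For example, on a cephadm-managed cluster you could pin MDS daemons to specific hosts. This is only a sketch; the file system name "cephfs" and the host names host1 and host2 are hypothetical:

ceph orch apply mds cephfs --placement="2 host1 host2"   # run two MDS daemons for "cephfs"
ceph orch ps --daemon-type mds                           # confirm the daemons are up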

Finally, you need to mount CephFS on your client nodes (see the Mount CephFS: Prerequisites page). Alternatively, the cephfs-shell utility provides a command-line shell for interactive access or scripting. Use at least the Jewel (v10.2.0) release of Ceph; this is the first release to include stable CephFS code and fsck/repair tools. Make sure you are using the latest point release to get bug fixes.
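
As an illustration, a client mount could look like the sketch below. It assumes ceph-common (and, for the second variant, ceph-fuse) is installed on the client, that /etc/ceph/ceph.conf and a keyring for a hypothetical client.foo are already in place, and that "cephfs" is the default file system; older kernels may need explicit monitor addresses in the device string:

sudo mkdir -p /mnt/cephfs
sudo mount -t ceph :/ /mnt/cephfs -o name=foo        # kernel driver
sudo ceph-fuse --id foo /mnt/cephfs                  # or the FUSE client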

Using CephFS with a running Ceph Storage Cluster requires deploying at least one active Metadata Server (MDS) daemon, creating the file system, selecting a mount mechanism for clients (FUSE or the kernel driver), and configuring authentication credentials for those clients.
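
Client credentials are usually created with ceph fs authorize, which prints a keyring you then copy to the client. A sketch, again using the hypothetical client.foo and the example file system name "cephfs", granting read/write access from the root of the tree:

ceph fs authorize cephfs client.foo / rw | sudo tee /etc/ceph/ceph.client.foo.keyring
sudo chmod 600 /etc/ceph/ceph.client.foo.keyring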

Note that Ceph releases do not include a kernel; the kernel is versioned and released separately. For the best chance of a happy, healthy filesystem, use a single active MDS and do not use snapshots. Both of these are the default. Creating multiple MDS daemons is fine, as these will simply be used as standbys. However, for best stability, you should avoid adjusting max_mds upwards, as this would cause multiple MDS daemons to be active at once.
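
To check and enforce that configuration, something like the following should work (again with "cephfs" as the example file system name):

ceph fs get cephfs | grep max_mds    # should report max_mds 1
ceph fs set cephfs max_mds 1         # keep a single active MDS; extra daemons remain standbys
ceph fs status cephfs                # shows the active rank and the standby MDS daemons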

Please let us know if you have interesting use cases for CephFS or struggle to get it working flawlessly. If you would like to know what NOT to do with your Ceph cluster, you can read about it on our blog: https://42on.com/5-more-ways-to-break-your-ceph-cluster/
