If you want to expand your Ceph cluster by adding new nodes, it is good to know what the best practices are. Here is some background on cephadm, followed by some tips and tricks for expanding your Ceph cluster.
First, a little bit about cephadm. Cephadm creates a new Ceph cluster by “bootstrapping” on a single host, expanding the cluster to encompass any additional hosts, and then deploying the needed services. Cephadm deploys and manages the Ceph cluster by connecting the manager daemon to the hosts through SSH; over that connection the manager daemon can add, remove and update Ceph containers. Cephadm does not rely on external configuration tools such as Ansible, Rook and Salt.
Furthermore, cephadm manages the whole life cycle of the Ceph cluster. This life cycle starts with the bootstrap process, in which cephadm creates a small Ceph cluster on a single node, consisting of one monitor and one manager. From there the cluster is expanded: hosts are added and the Ceph daemons and services are configured. The life cycle can be managed through the Ceph command line interface (CLI) or through the dashboard (GUI).
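The bootstrap-then-expand flow described above can be sketched as a short list of commands. The Python snippet below only assembles the command strings and does not run anything. The IP addresses and hostnames are placeholders, the function name `expansion_commands` is made up for this illustration, and the exact flags you need may differ per Ceph release, so treat it as a rough outline rather than a tested procedure.

```python
def expansion_commands(mon_ip, new_hosts):
    """Return the shell commands for bootstrapping a cluster and adding hosts.

    new_hosts maps hostname -> IP address (placeholder values in this sketch).
    Nothing is executed; the commands are only returned as strings.
    """
    # Step 1: bootstrap a single-node cluster (one monitor, one manager).
    commands = [f"cephadm bootstrap --mon-ip {mon_ip}"]

    for name, ip in new_hosts.items():
        # Step 2: install the cluster's public SSH key on the new host,
        # so the manager daemon can reach it over SSH.
        commands.append(f"ssh-copy-id -f -i /etc/ceph/ceph.pub root@{name}")
        # Step 3: tell the orchestrator about the new host.
        commands.append(f"ceph orch host add {name} {ip}")

    # Step 4: let the orchestrator create OSDs on all unused disks.
    # Assumption: you want every available device to become an OSD.
    commands.append("ceph orch apply osd --all-available-devices")
    return commands


plan = expansion_commands("10.0.0.1", {"node2": "10.0.0.2", "node3": "10.0.0.3"})
for line in plan:
    print(line)
```

Keeping the plan as plain strings makes it easy to review before anything touches the cluster.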
The cluster’s speed is determined by its slowest member. The best practice for expansions is therefore to use hardware that is fairly similar to the original setup, or to plan future growth around the performance of the hardware you add later. Sometimes a few years pass before an expansion, which may force you to use newer hardware. That does not matter as long as the new nodes have the same network speed and configuration and the same amount of RAM. The good part (or the ‘cloud way’) is that you don’t need a huge investment to expand: the cluster can be expanded node by node. From a budget point of view, however, you should consider adding several nodes at once, because it is easier to organize the physical installation of multiple nodes in one go than to install them one at a time.
However, if your major focus is dollars per gigabyte of storage, you might gravitate toward the servers that can take the largest number of disks, and the biggest disks you can get. There are a few considerations with such ‘narrow and deep’ clusters. Each node holds a larger percentage of your cluster’s data. For example, in a five-node cluster each node holds 20% of your data; in a ten-node cluster, it is only 10% per node. Because of the redundant way Ceph is designed, the hardware loss of a single node in a small cluster is not a problem in itself, but it results in substantially more data migration, particularly as the cluster starts to fill, and potentially in an outage if you have not configured your cluster’s full and near-full ratios correctly. This is important to think about when designing or adjusting your cluster’s architecture.
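To make the percentages above concrete, here is a small Python sketch of the arithmetic. It assumes equally sized nodes, evenly distributed data, and that all data from the lost node is re-replicated onto the remaining nodes. The thresholds 0.85 and 0.95 are Ceph’s usual default near-full and full ratios, but check the values configured on your own cluster. The function names are made up for this illustration.

```python
NEARFULL_RATIO = 0.85  # assumed default near-full ratio
FULL_RATIO = 0.95      # assumed default full ratio


def share_per_node(nodes):
    """Fraction of the cluster's data held by each node (equal nodes assumed)."""
    return 1.0 / nodes


def utilisation_after_node_loss(nodes, utilisation):
    """Utilisation of the remaining nodes once one node's data has migrated."""
    if nodes < 2:
        raise ValueError("need at least two nodes to survive a node loss")
    # The same amount of data now has to fit on one node fewer.
    return utilisation * nodes / (nodes - 1)


def status_after_node_loss(nodes, utilisation):
    """Classify the post-recovery utilisation against the two ratios."""
    after = utilisation_after_node_loss(nodes, utilisation)
    if after >= FULL_RATIO:
        return "full"
    if after >= NEARFULL_RATIO:
        return "nearfull"
    return "ok"


for nodes in (5, 10):
    after = utilisation_after_node_loss(nodes, 0.70)
    print(nodes, share_per_node(nodes), round(after, 3),
          status_after_node_loss(nodes, 0.70))
```

With these assumptions, a five-node cluster that is 70% full ends up at 87.5% after losing a node, which is above the near-full threshold, while a ten-node cluster at the same fill level stays below it.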
Moreover, a narrow and deep cluster forces more network traffic across fewer nodes. Ceph is a network-based storage system, so with fewer nodes you are pushing a lot of client traffic over fewer NICs. This gets worse as the number of disks per node increases, because more OSDs compete for the same limited bandwidth. With high disk counts per node, the disk controller may also become a bottleneck if it does not have sufficient bandwidth to run all of your disks at full speed. It is therefore advisable to have a good mix of CPUs, RAM, storage, disk controllers and network controllers within your Ceph architecture. Do not focus only on the number of disks or NVMe drives per node.
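A rough way to spot which component limits a node is to compare the aggregate disk throughput with the NIC and disk controller bandwidth. The Python sketch below does just that. All numbers are illustrative rather than measurements, the function name `node_bottleneck` is made up, and the model deliberately ignores replication traffic, CPU and protocol overhead.

```python
def node_bottleneck(disks, disk_mb_s, nic_gbit_s, controller_mb_s):
    """Return (component, MB/s) for the lowest throughput limit of one node.

    disks           -- number of data disks (OSDs) in the node
    disk_mb_s       -- sustained throughput of a single disk in MB/s
    nic_gbit_s      -- network bandwidth of the node in Gbit/s
    controller_mb_s -- total bandwidth of the disk controller in MB/s
    """
    limits = {
        "disks": disks * disk_mb_s,
        "network": nic_gbit_s * 1000 / 8,  # Gbit/s -> MB/s
        "controller": controller_mb_s,
    }
    # The component with the smallest limit caps the whole node.
    component = min(limits, key=limits.get)
    return component, limits[component]


# Illustrative figures: 200 MB/s disks, 25 Gbit/s NIC, 4000 MB/s controller.
moderate = node_bottleneck(12, 200, 25, 4000)
dense = node_bottleneck(36, 200, 25, 4000)
print(moderate)
print(dense)
```

In this example the 12-disk node is limited by its disks, while the 36-disk node is limited by its network link, so the extra disks cannot be used at full speed.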
In summary, Ceph scales very well as the number of nodes increases. Having too few nodes in the cluster places more of your cluster’s data on each node, which increases recovery I/O and recovery times. You can start with as few as three nodes, but if you can afford to spread your disks across more nodes, it is better to do so.
At 42on we love open source and in particular Ceph! If you like these articles, please comment, like and share.