ZFS and Ceph, what a lovely couple they make!

By Michiel Manten

Stable, secure data storage that can scale fast is probably one of the most important things in today’s data-driven world. Combining two great storage solutions provides you with all of that in one. ZFS and Ceph are a couple that cannot easily be beaten!

Why is that? The short explanation is scalability. ZFS is a solution which ‘scales up’ like no other, while Ceph is built to ‘scale out’. The term ‘scaling up’ means extending the storage pool with additional disks which are fully available to the file systems that use the pool. This model is generally limited by the number of disks that can be added to a node. ‘Scaling out’ is a different way of growing the storage capacity: not by adding disks (or bigger disks) to a machine or pool, but by adding storage nodes (storage servers with network, compute and storage capacity) to the existing storage capacity. This model is mostly limited by the bandwidth between the different nodes.

That makes it far easier to grow your storage infrastructure, because you don’t have to change the current hardware architecture except for the capacity.

Easily scaling up with ZFS

ZFS is a combined file system and logical volume manager originally developed by Sun Microsystems. The ZFS name stands for nothing; briefly assigned the backronym “Zettabyte File System”, it is no longer considered an initialism. ZFS is very scalable and can be configured very precisely. It includes extensive protection against data corruption, support for high storage capacities, efficient data compression, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z and native NFSv4 ACLs.

Unlike most file systems, ZFS combines the features of a file system and a volume manager. This means that, as opposed to other file systems, ZFS can create a file system that spans a series of drives or a pool. Not only that: you can add storage to a pool by adding more drives. ZFS will handle partitioning and formatting.
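To make the pooled-storage idea a bit more concrete, here is a minimal Python sketch. It is a toy model and not ZFS code: the `Pool` class and its methods are hypothetical names invented for this illustration, and redundancy (mirrors, RAID-Z) is ignored entirely. It only shows that file systems draw on one shared pool, and that adding a drive raises the space available to all of them at once. On a real system the corresponding steps would be done with the `zpool` and `zfs` command line tools.

```python
class Pool:
    """Toy model of a storage pool shared by several file systems (not real ZFS)."""

    def __init__(self, drive_sizes_tb):
        self.drives = list(drive_sizes_tb)
        self.used = {}  # file system name -> TB written

    def capacity(self):
        # The pool is as large as all of its drives together.
        return sum(self.drives)

    def free(self):
        return self.capacity() - sum(self.used.values())

    def create_fs(self, name):
        # A file system needs no size up front; it draws on the pool.
        self.used[name] = 0

    def write(self, name, tb):
        if tb > self.free():
            raise ValueError("pool is full")
        self.used[name] += tb

    def add_drive(self, size_tb):
        # Scaling up: the new space is immediately available to every file system.
        self.drives.append(size_tb)


pool = Pool([4, 4])
pool.create_fs("backups")
pool.create_fs("archive")
pool.write("backups", 5)  # larger than one 4 TB drive, so it spans both
pool.add_drive(8)
print(pool.capacity(), pool.free())  # 16 11
```

The limit of this model is the one described earlier: at some point no more drives fit in the node.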

Easily scaling out with Ceph

Ceph is a storage solution that provides applications with object, block, and file system storage, all in a single unified storage cluster. It is flexible, exceptionally reliable, and easy to manage. Ceph decouples the storage software from the underlying hardware. This enables you to build much larger storage clusters with less effort. You can scale out storage clusters indefinitely, using economical commodity hardware, and you can replace hardware easily when it malfunctions or fails. I explained more about Ceph storage in two earlier articles.

The two combined

With that said, I often see organizations start using open source software defined storage with ZFS. The appeal is easy to see: it’s stable, highly redundant, cheap, and fast, and there is no limit to how fast companies’ data grows. Because of this, open source storage systems are ‘abused’ to the max. When this happens, the environment grows, and at some point the storage infrastructure is ten times larger than anyone imagined at the start.

At some point the data grows so fast that the ZFS storage controller node(s) reach the maximum capacity they can handle. At that moment you will need to migrate the data to a new ZFS system. It would then be very nice to have a way to scale the storage out (combine more units) instead of only up (grow units bigger). This is where Ceph storage complements ZFS. With Ceph you never have to carry out data migrations when you grow: you simply add new storage servers to grow capacity, or remove older storage servers, and Ceph redistributes the data to make optimal use of all capacity of the platform (storage, compute and networking). Where ZFS can start with little hardware investment, though, Ceph requires more hardware, because it refuses to compromise on data consistency and stores all data (at least) 3 times.

That’s why ZFS and Ceph make such a great storage couple, each with their own specific use cases within the organization. For example, ZFS is often used for creating backups or building data archives, while Ceph provides the S3-compatible cloud storage and virtual disk storage for virtual machines. In other cases, ZFS is used for file system storage while Ceph provides the block storage infrastructure.

And with both solutions being open source and software defined, as you can imagine we at Fairbanks and 42on love them both equally for their own merits, and even more as a complementary couple. And whoever said you had to choose your favorite from such a lovely couple? That makes me curious however: do you use both solutions or did you pick only one for your storage infrastructure?

Follow up on Ceph

By Michiel Manten

On April 3rd, 2019 I posted a LinkedIn article about the basics of Ceph, a cloud storage platform often used in combination with OpenStack. Since then, I have received a lot of requests to explain a bit more of the mechanics of Ceph. As Fairbanks has also incorporated Ceph service provider 42on, I started to dive a little deeper into the subject. In this article I share my understanding of the techniques to answer some of your questions. I hope it helps you understand more about Ceph and why it is such an interesting storage platform. If you have any questions, please feel free to comment or e-mail me. For a truly in-depth understanding of Ceph, I will put you in contact with our new colleagues at 42on.

What does “Ceph is scalable object, block and file system storage” mean?

As explained in my post in April, Ceph is a software defined storage solution that can scale both in performance and capacity. Ceph is used to build multi-petabyte storage clusters. The basic building blocks of a Ceph storage cluster are the storage nodes. These storage nodes are commodity servers containing commodity hard drives and/or flash storage. Ceph is ‘self-healing’, provides practically unlimited scalability and redundancy, and is able to grow in a linear way, both physically and financially. Financial scalability means that you invest in the amount of storage you need at this moment, not the amount you might need in, for example, five years.

Ceph is designed for scale, and you scale by adding storage nodes. You will need multiple servers to satisfy your capacity, performance and resiliency requirements. As you expand the cluster with extra storage nodes, the capacity, performance and resiliency (if needed) all increase at the same time.

You don’t need to start with petabytes of storage though. You can actually start small, with just a few storage nodes, and expand as your needs increase. Because Ceph manages redundancy in software, you don’t need a RAID controller, so a generic server is sufficient. The hardware is simple and all of the intelligence resides in software. These servers can come from different hardware brands and/or generations, so you can expand your Ceph environment at your own pace. Altogether this means that the risk of hardware vendor lock-in is mitigated. You are not tied to any particular proprietary storage hardware.

What makes Ceph so special?

At the heart of the Ceph storage cluster is the CRUSH algorithm, developed by Sage Weil, the co-creator of Ceph. The CRUSH algorithm allows storage clients to calculate which storage node needs to be contacted for retrieving or storing data. The storage client can determine by itself where to store data or where to get it.

Ceph is unique because there is no centralised ‘registry’ that keeps track of the location of data on the cluster (metadata). Such a centralised registry can become a performance bottleneck, preventing further expansion, or a single point of failure. This is why Ceph can scale in capacity and performance while assuring availability. At the core of the CRUSH algorithm is the CRUSH map. That map contains information about the storage nodes in the cluster and the rules for storing data. That map is the basis for the calculations the storage client needs to perform in order to decide which storage node to contact.
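The idea of calculating a location instead of looking it up can be illustrated with a small Python sketch. Note that this is not the real CRUSH algorithm: it uses a simpler, well-known technique called rendezvous (highest random weight) hashing as a stand-in, and the node names, the object name and the `locate` function are made up for the example. What it does show is the key property: any client that holds the same map computes the same answer, without asking a central registry.

```python
import hashlib


def score(object_name, node):
    """Deterministic pseudo-random score for an (object, node) pair."""
    digest = hashlib.sha256(f"{object_name}:{node}".encode()).hexdigest()
    return int(digest, 16)


def locate(object_name, cluster_map, replicas=3):
    """Return the nodes that should hold the object, highest score first."""
    ranked = sorted(cluster_map, key=lambda node: score(object_name, node), reverse=True)
    return ranked[:replicas]


# Hypothetical cluster map: in this sketch it is just a list of node names.
cluster_map = ["node-a", "node-b", "node-c", "node-d", "node-e"]

# Two independent clients holding the same map (even in a different order)
# arrive at the same storage nodes purely by calculation.
client_1 = locate("vm-disk-0042", cluster_map)
client_2 = locate("vm-disk-0042", list(reversed(cluster_map)))
print(client_1 == client_2)  # True
```

The real CRUSH map carries much more information (weights, hierarchy and placement rules), but the principle of computing placement from a shared map is the same.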

The CRUSH map is distributed across the cluster from special servers: the ‘monitor’ nodes. Those nodes are contacted by both the storage nodes and the storage clients.

It’s important to keep in mind that while the Ceph monitor nodes are an essential part of your Ceph cluster, they are not in the data path. They do not store or process client data. They only keep track of the cluster state for both clients and individual storage nodes. Data always flows directly from the storage node towards the client and vice versa.

So there is no central bottleneck

A storage client will contact the appropriate storage node directly to store or retrieve data. There are no components in between, except for the network, which you will need to size accordingly. Because there are no intermediate components or proxies that could potentially create a bottleneck, a Ceph cluster can really scale horizontally in both capacity and performance. And while scaling storage and performance, data is protected by redundancy.

How does Ceph provide data redundancy?

To offer the most redundant and safe storage infrastructure, Ceph provides both replication and erasure coding. For replication, Ceph distributes copies of the data and ensures that the copies are stored on different storage nodes.

You can configure as many replicas as you like. The only downside of storing more replicas is the cost of the extra hardware you need to set up to provide the extra raw storage capacity. You may decide that data durability and availability are so important that you are willing to sacrifice space and absorb the cost, but in general Ceph advises 3 replicas as a minimum replica count.

Does Ceph also support erasure coding?

So what if you think having 3 replicas is too costly? How does Ceph protect your data then?

To explain this technique, it is easiest to compare it to RAID technologies. In that case, I would say that RAID1 resembles the Ceph equivalent of ‘replication’: both offer the best overall performance, but neither is very space efficient, especially as you need more than one replica of the data to achieve the level of redundancy you need.

This is why we got RAID5 and RAID6 in the past as an alternative to RAID1 or RAID10. Parity RAID assures redundancy with much less storage overhead. As always in IT, this comes at a price though: in this case at the cost of storage performance (mostly write performance). Ceph and RAID 5 and 6 all use a type of ‘erasure coding’ to achieve comparable results. As an example, you can tell Ceph to chop up the data into 8 data segments and 4 parity segments.

With that scheme, only 33% of the raw capacity goes to redundancy (4 out of every 12 segments), instead of the 50% you face with two copies under replication, or even more if you want more copies. This example does assume that you have at least 8 + 4 = 12 storage nodes. But any scheme will do: you could use 6 data segments + 2 parity segments (comparable to RAID6) with only 8 hosts.
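The percentages above can be checked with a little arithmetic. The Python sketch below counts overhead as the share of raw capacity that is spent on redundancy, which is the reading that matches the 33% and 50% figures; the function names are only illustrative.

```python
def redundancy_share_replication(copies):
    """Share of raw capacity spent on redundancy when storing N full copies."""
    return (copies - 1) / copies


def redundancy_share_erasure(data_segments, parity_segments):
    """Share of raw capacity spent on parity segments."""
    return parity_segments / (data_segments + parity_segments)


def raw_needed_tb(usable_tb, redundancy_share):
    """Raw capacity required to store a given amount of usable data."""
    return usable_tb / (1 - redundancy_share)


schemes = {
    "replication, 2 copies": redundancy_share_replication(2),
    "replication, 3 copies": redundancy_share_replication(3),
    "erasure coding, 8 + 4": redundancy_share_erasure(8, 4),
    "erasure coding, 6 + 2": redundancy_share_erasure(6, 2),
}

for label, share in schemes.items():
    raw = raw_needed_tb(100, share)
    print(f"{label}: {share:.0%} of raw capacity, {raw:.0f} TB raw for 100 TB of data")
```

So 100 TB of data takes roughly 300 TB of raw capacity with 3 replicas, but only about 150 TB with the 8 + 4 scheme.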

What failure domains does Ceph protect against?

Ceph is datacenter aware: the CRUSH map can represent your physical datacenter topology, consisting of racks, rows, rooms, floors, datacenters and so on. You can customise your topology. This allows you to create very clear data storage policies that Ceph will use to ensure that the cluster can tolerate failures across certain boundaries.

If you want, you can be protected against the loss of a whole rack, or even a whole row of racks, and the cluster will still be fully operational, although performance and capacity are reduced. That much redundancy may cost so much storage that you may not want to employ it for all of your data. That’s no problem. You can create multiple storage pools that each have their own protection level and thus cost.
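A policy such as "never put two copies in the same rack" can be sketched in a few lines of Python. This is a simplified illustration and not how CRUSH rules are actually written or evaluated: the topology, the rack and host names and the `place` function are all assumptions made for the example, and hashing is used only to get a repeatable pseudo-random choice.

```python
import hashlib


def score(*parts):
    """Deterministic pseudo-random score for a combination of names."""
    return int(hashlib.sha256(":".join(parts).encode()).hexdigest(), 16)


def place(object_name, topology, replicas=3):
    """Pick one host in each of `replicas` different racks."""
    if replicas > len(topology):
        raise ValueError("not enough racks for the requested number of copies")
    racks = sorted(topology, key=lambda rack: score(object_name, rack), reverse=True)
    chosen = []
    for rack in racks[:replicas]:
        host = max(topology[rack], key=lambda h: score(object_name, rack, h))
        chosen.append((rack, host))
    return chosen


# Hypothetical topology: four racks with two storage hosts each.
topology = {
    "rack-1": ["host-11", "host-12"],
    "rack-2": ["host-21", "host-22"],
    "rack-3": ["host-31", "host-32"],
    "rack-4": ["host-41", "host-42"],
}

placement = place("invoice-2019-07.pdf", topology)
for rack, host in placement:
    print(rack, host)
# Every copy sits in a different rack, so losing one rack costs at most one copy.
```

A pool with a cheaper protection level could use the same idea with the host, rather than the rack, as the boundary.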

What is this Object Storage Daemon (OSD) I always read about?

If you read about Ceph, you read a lot about the OSD. This is a service that runs on the storage node. The OSD is the actual workhorse of Ceph: it serves the data from the drive, or ingests it and stores it on the drive. The OSD also assures storage redundancy by replicating data to other OSDs based on the CRUSH map. For every hard drive or solid state drive in the storage node, an OSD will be active. A Ceph environment with 24 hard drives runs 24 OSDs.

When a drive goes down, its OSD goes down too, and the monitor nodes distribute an updated CRUSH map so the clients are aware and know where to get the data. The OSDs also respond to this update: because 1 replica of some data is lost, they start to replicate the affected data to make it redundant again (across fewer nodes though). After this automatic process the cluster is fully healthy again. This is comparable to having a ‘hot-spare’ without the need for ‘hot-spares’.

When the drive is replaced, the cluster returns to its original state. This means that the replaced drive is filled with data once again, so that data is spread evenly across all drives within the cluster.
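The failure and recovery cycle described above can be imitated with a small simulation. As before, this is only a sketch under stated assumptions: rendezvous hashing stands in for CRUSH, the eight OSDs and a thousand object names are invented, and no real data is moved. It shows three things: only the objects that had a copy on the failed OSD get a new location, every object ends up with three copies again, and the original layout returns once the drive is back.

```python
import hashlib


def score(obj, osd):
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    return int(hashlib.sha256(f"{obj}:{osd}".encode()).hexdigest(), 16)


def placement(obj, osds, replicas=3):
    """The OSDs that should hold the object, given the current map."""
    return sorted(osds, key=lambda osd: score(obj, osd), reverse=True)[:replicas]


osds = [f"osd.{i}" for i in range(8)]
objects = [f"object-{i}" for i in range(1000)]

before = {obj: placement(obj, osds) for obj in objects}

# osd.3 fails: the map is updated and everyone recomputes placements.
surviving = [osd for osd in osds if osd != "osd.3"]
after = {obj: placement(obj, surviving) for obj in objects}

affected = [obj for obj in objects if "osd.3" in before[obj]]
moved = [obj for obj in objects if before[obj] != after[obj]]

print(affected == moved)                           # only objects on osd.3 change
print(all(len(after[obj]) == 3 for obj in objects))  # redundancy is restored

# The drive is replaced: the original layout comes back.
restored = {obj: placement(obj, osds) for obj in objects}
print(restored == before)
```

For the affected objects, the two surviving copies stay where they are and only one new copy has to be created on another OSD.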

Press release

By Michiel Manten


Fairbanks acquires 42on.

Fairbanks, leading provider in open source cloud solutions, announces the acquisition of 42on, leading provider of services for the Ceph storage platform.



Amersfoort, July 23 2019

Fairbanks, leading provider in open source cloud solutions, announces the acquisition of 42on, leading provider of services for the scalable, open source Ceph storage platform.

With this acquisition, Fairbanks expands its service portfolio in the open infrastructure realm. Ruud Harmsen, founder and CEO of Fairbanks: “We have been managing and supporting highly available private OpenStack clouds for our customers for over 7 years now. We’ve recognized that our customers have a growing need for specialized, highly reliable private cloud storage as well. We are already very familiar with the Ceph cloud storage solutions as part of OpenStack clouds and have successfully collaborated with Wido den Hollander and 42on at several mutual customers. We discovered we share a vision on servicing customers and the potentials of open infrastructures, combining our forces became an obvious next step”.

“We are delighted to become a part of Fairbanks,” said Wido den Hollander, world renowned Ceph expert, co-founder and CTO of 42on. “We believe this is the perfect combination of technology, strategy and culture. As 42on will remain a strong brand for Ceph services as it is, the cooperation with Fairbanks enables us to expand our service portfolio for our Ceph-only customers with fully managed services for example and other variants of support than we can provide now. Our customers that are interested in other open infrastructure technologies can of course benefit from Fairbanks’ years long expertise. I am looking forward to exciting times for our employees, both communities as well as our customers!”.

Fairbanks has supported organizations with the design, implementation and management of sustainable and innovative open infrastructures since 2012. Open infrastructures enable companies to flexibly follow the growth of the digital organization, replace investments (CAPEX) with operational costs (OPEX) and quickly embrace innovations. Fairbanks helps telecom companies, hosting/service providers, software developers, SaaS companies, universities, governments, banks and eCommerce companies to build and manage OpenStack Cloud environments. Fairbanks is also the ambassador for the OpenStack Foundation in the Benelux.

42on has provided high quality Ceph consultancy since 2012, assisting organizations in designing, deploying and supporting Ceph clusters. Ranging from terabytes to petabytes, spinning disks or all flash, 42on has implemented all kinds and types of Ceph clusters. 42on is also actively involved in the open source communities of Ceph, Apache CloudStack, libvirt and multiple other projects.

For more information about Fairbanks, see


Ceph: Object, block and file system storage in a single storage cluster

By Michiel Manten

Cloud storage in your datacenter

Storage growth for the coming years

The International Data Corporation (IDC) has released a report on the world’s ever-growing collective data, a.k.a. the datasphere. The numbers are staggering: the IDC predicts that the collective sum of the world’s data will grow from 33 zettabytes (ZB) this year to 175 ZB by 2025, growing at a yearly rate of 61%.

Some other remarkable stats for the year 2025 are:

  • The storage industry will ship 42ZB of capacity over the next 7 years.
  • 90ZB of data will be created on IoT devices by 2025.
  • By 2025, 49% of data will be stored in public clouds.
  • 30% of data generated will be consumed in real-time by 2025.

Cloud data management

To facilitate this growth, companies look at ways to store their data safely and effectively and, moreover, at ways to improve the management of ever-growing datasets. A great way to start is by adopting cloud storage solutions in your own infrastructure: integrating object, block and file storage in a single unified storage cluster while simultaneously delivering high performance and practically unlimited scalability.

Ceph storage

Ceph is a storage solution that provides applications with object, block, and file system storage, all in a single unified storage cluster. It is flexible, highly reliable and easy to manage. Ceph storage is scalable to thousands of client hosts accessing petabytes of data. Applications can use any of the system interfaces to the same cluster simultaneously, which means your Ceph storage system serves as a flexible foundation for all of your data storage needs.

Ceph storage at a glance

  • Use Ceph for free, and deploy it on commodity hardware, keeping hardware expenses low.
  • Ceph replicates data and makes it fault tolerant.
  • The system is both self-healing and self-managing by design.
  • Proven Enterprise grade technology since 2011.
  • Endlessly scalable and agile by design. The only limit to the size and speed of your storage cluster is the hardware.
  • Supported by major vendors, with several service models possible to guarantee uptime.
  • Integrates object, block and file storage in a single unified storage cluster.
  • Build as you grow: You can grow the cluster while migrating virtual machines, keeping the initial investment attractive.

Scale out on commodity hardware

Ceph decouples the storage software from the underlying hardware. This enables you to build much larger storage clusters with less effort. You can scale out storage clusters practically without limit, using economical commodity hardware, and you can replace hardware easily when it malfunctions or fails. If your cloud journey is going to be hybrid, Ceph integrates with private as well as public clouds like AWS and Azure.

Suitable for you?

So whether your current storage is ‘end-of-life’, your current data growth is not sustainable anymore, you want to easily combine object, block and file storage, you want to free yourself from the expensive lock-in of proprietary hardware-based storage solutions, you want a cloud native storage solution, you want to automate your replication and management, or you just want to find out for yourself what storage options are available at the moment: Ceph is most certainly worth taking a look at.

