Shared-Everything Architecture vs. Shared-Nothing Architecture
See a comparison of the two architectures and basic principles for choosing the right one.
In a previous post, we discussed the basics of ClickHouse and the challenges presented in two cases. Those challenges led us to consider the shared storage architecture, also called the disaggregated compute and storage architecture, shown in Figure 1.
As the figure shows, the computing resources are isolated from the data storage. Each of the many computing resources sees a single global shared pool, which is the data storage layer. This means all of the data in this pool can be shared across all computing resources.
This architecture has three major advantages.
Advantages of Shared-Everything Architecture
First, it has better elasticity. Because the computing resources and the storage are isolated, they can be scaled independently based on demand. If we want more computing resources, we can add more computing instances; if we need more data storage, we can expand the volume of the data storage layer.
Secondly, it has better scalability. Because the data is shared, we can theoretically scale out to utilise as many computing resources as possible.
Lastly, it is more user-friendly for cluster managers. They don't need to worry about data consistency, data replication, or data sharding, because all of these can be delegated to the data storage layer. For example, companies can use Amazon S3 to store the data.
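To make the elasticity point concrete, here is a minimal sizing sketch. It assumes a simplified model in which every shared-nothing node bundles a fixed number of cores with a fixed amount of local disk, while the disaggregated setup sizes compute nodes and storage separately. The function names and all numbers are hypothetical and chosen only for illustration.

```python
import math


def shared_nothing_nodes(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Nodes required when each node bundles CPU and local disk.

    The larger of the two requirements dictates the node count,
    so one resource is usually over-provisioned.
    """
    by_compute = math.ceil(cores_needed / cores_per_node)
    by_storage = math.ceil(tb_needed / tb_per_node)
    return max(by_compute, by_storage)


def disaggregated_sizing(cores_needed, tb_needed, cores_per_node):
    """Compute nodes and storage volume when the two scale independently."""
    compute_nodes = math.ceil(cores_needed / cores_per_node)
    return compute_nodes, tb_needed


# Hypothetical workload: modest compute demand, large data volume.
bundled = shared_nothing_nodes(cores_needed=64, tb_needed=400,
                               cores_per_node=16, tb_per_node=20)
compute_nodes, storage_tb = disaggregated_sizing(cores_needed=64, tb_needed=400,
                                                 cores_per_node=16)

print(f"shared-nothing: {bundled} nodes")
print(f"disaggregated:  {compute_nodes} compute nodes + {storage_tb} TB storage")
```

In this toy example the storage requirement forces the bundled cluster to 20 nodes even though 4 nodes would cover the compute demand; with disaggregation only the storage volume grows.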
Disadvantages of Shared-Everything Architecture
However, the disaggregated architecture does have its disadvantages. Because computing and data storage are separated, remote access to the data incurs extra latency.
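The size of this penalty depends on how often a read actually has to go to remote storage. The sketch below is a back-of-the-envelope model, not a measurement. It assumes that some fraction of reads is served locally on the compute node (for example from a local cache, which is an assumption of this sketch) and the rest goes to the remote storage layer. The function name and the latency figures are hypothetical.

```python
def expected_latency_ms(local_fraction, local_ms, remote_ms):
    """Average read latency when a fraction of reads is served locally.

    local_fraction: share of reads that never leave the compute node (0.0 to 1.0)
    local_ms:       assumed latency of a local read, in milliseconds
    remote_ms:      assumed latency of a remote storage read, in milliseconds
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ms + (1.0 - local_fraction) * remote_ms


# Illustrative numbers only: 1 ms local reads, 20 ms remote reads.
for fraction in (0.0, 0.5, 0.9):
    avg = expected_latency_ms(fraction, local_ms=1.0, remote_ms=20.0)
    print(f"{fraction:.0%} local reads -> {avg:.1f} ms on average")
```

Under these assumed figures, the average latency approaches the local figure as more reads stay on the compute node, and fully cold access pays the whole remote cost.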
Since each architecture has its pros and cons, how do you select the proper architecture for your company's unique requirements? Let’s compare the two along four dimensions, as shown in Table 1.
Environment. For the “shared-nothing” architecture, on-premises hardware resources are relatively stable, as you cannot increase or decrease your hardware very frequently. For the “shared-everything” architecture, cloud infrastructure allows a much higher degree of elasticity based on users’ demand.
Performance. The “shared-nothing” architecture has very good performance. The “shared-everything” architecture also has good performance, but users should be able to tolerate a minor slowdown for cold data access. We will discuss ways to improve this later in the article.
Workload. If the workload is evenly distributed over time, you can utilise your resources fully with no waste, and you can consider the shared-nothing architecture.
If your workload often spikes, with a clear peak and off-peak pattern, you should consider the shared-everything architecture. You need to scale up and down efficiently to make the best use of your resources, which can significantly reduce cost.
Experience. Using ClickHouse requires some expertise, with a strong background in the operation and maintenance of the cluster. In the shared-everything architecture, because you can delegate some of these challenges to the existing system, strong domain knowledge is not required.
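The workload dimension above can be illustrated with simple arithmetic. The sketch below compares two hypothetical provisioning strategies over one day: a fixed cluster sized for the peak hour, and an elastic cluster that pays only for the nodes each hour needs. It assumes a flat per-node-hour price, instant scaling, and no storage or data-transfer charges, none of which come from the article. The demand profile is invented.

```python
def fixed_cost(hourly_demand, price_per_node_hour):
    """Cost of a cluster sized for the peak and kept running every hour."""
    return max(hourly_demand) * len(hourly_demand) * price_per_node_hour


def elastic_cost(hourly_demand, price_per_node_hour):
    """Cost when the node count follows demand hour by hour."""
    return sum(hourly_demand) * price_per_node_hour


# Hypothetical day: 20 off-peak hours at 2 nodes, 4 peak hours at 10 nodes.
spiky = [2] * 20 + [10] * 4
# Hypothetical evenly distributed day: 5 nodes every hour.
flat = [5] * 24

price = 1.0  # assumed price per node-hour, arbitrary currency unit

print("spiky:", fixed_cost(spiky, price), "fixed vs", elastic_cost(spiky, price), "elastic")
print("flat: ", fixed_cost(flat, price), "fixed vs", elastic_cost(flat, price), "elastic")
```

With the spiky profile the fixed cluster pays for 240 node-hours while the elastic one pays for 80; with the flat profile both pay for 120, which matches the guidance that evenly distributed workloads leave little for elasticity to save.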
These are some basic principles for choosing between the two architectures. In the next post, we'll explore cloud-native ClickHouse and how we implement it at ByteDance.
About the author: Niu Zhaojie obtained his PhD from Nanyang Technological University in Singapore. He is currently a senior software engineer at ByteDance, building an analytics database in the cloud.