Traditional Data Warehouse vs. Cloud Data Warehouse
A modern cloud-based data warehouse can overcome the limitations of a traditional data warehouse
A modern cloud-based data warehouse design addresses the problems of conventional data warehouse design and gives an enterprise the opportunity to leverage the cloud’s benefits for data management.
The following section delves deeper into the differences between traditional and cloud data warehouse architectures.
Traditional on-premises data warehouse
A traditional data warehouse is typically defined by a three-tier architectural design approach.
Top tier
The top tier is the client-side semantic layer of the architecture. This is where the transformed and sorted information stored in the data warehouse is put to business use: business users and support teams run standard and ad hoc reporting, data exploration, data analytics, and data mining.
Middle tier
The middle tier consists of the OLAP (Online Analytical Processing) engine. Sitting between the other two tiers, it works on the information held in the bottom tier and passes the resulting insights to the top-tier tools, which process them further.
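To make the middle tier's role concrete, here is a minimal sketch of the kind of roll-up an OLAP engine performs before handing results to top-tier tools. The rows, the column names, and the `roll_up` helper are hypothetical and used only for illustration. A real OLAP engine does far more, such as indexing, caching, and pre-aggregation.

```python
from collections import defaultdict

# Hypothetical fact rows, standing in for data held in the bottom tier.
SALES = [
    {"region": "EMEA", "year": 2023, "amount": 120.0},
    {"region": "EMEA", "year": 2024, "amount": 150.0},
    {"region": "APAC", "year": 2023, "amount": 80.0},
    {"region": "APAC", "year": 2024, "amount": 95.0},
]


def roll_up(rows, dimensions, measure="amount"):
    """Sum a measure grouped by the given dimensions (a basic OLAP roll-up)."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Coarse view for a summary report, finer view for drill-down.
by_region = roll_up(SALES, ["region"])
by_region_year = roll_up(SALES, ["region", "year"])
```

A reporting tool in the top tier could show `by_region` as a summary and `by_region_year` when a user drills down into a single region.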
Bottom tier
The bottom tier mainly contains the data sources, the ETL tools, and the data warehouse itself.
1. Data sources
The data sources layer consists of the source data that is pulled or queried and provided to the staging and ETL tools for further processing.
2. Extract, Transform and Load (ETL) tools
Extract, Transform and Load (ETL) tools are vital because they combine logic, source data, and schema, and load the resulting information into the data warehouse or data mart.
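The three ETL stages can be sketched in a few lines. This is a minimal illustration rather than a production pipeline. The source rows, the `orders` table, and the cleaning rules are all assumptions made for the example, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source records; field names and values are illustrative only.
SOURCE_ROWS = [
    {"order_id": "1", "customer": " alice ", "total": "19.90"},
    {"order_id": "2", "customer": "BOB", "total": "5.00"},
    {"order_id": "3", "customer": "", "total": "7.50"},  # rejected: no customer
]


def extract():
    """Extract: pull raw rows from the source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: apply cleaning logic and cast values to the target schema."""
    cleaned = []
    for row in rows:
        customer = row["customer"].strip().lower()
        if not customer:
            continue  # skip rows that fail validation
        cleaned.append((int(row["order_id"]), customer, float(row["total"])))
    return cleaned


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, total REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three stages in order and return what ended up in the warehouse."""
    load(conn, transform(extract()))
    return conn.execute(
        "SELECT order_id, customer, total FROM orders ORDER BY order_id"
    ).fetchall()


warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
loaded = run_etl(warehouse)
```

Here the logic lives in `transform`, the source data in `SOURCE_ROWS`, and the schema in the `CREATE TABLE` statement, which mirrors the three things an ETL tool combines.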
In the bottom-up, or Kimball, approach, the ETL process loads data into the data marts first and then pushes it into the enterprise data warehouse.
In the top-down, or Inmon, approach, the ETL process loads data into the data warehouse first and then pushes it into the respective data marts.
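The only difference between the two approaches, as described here, is the order in which the targets are populated. The sketch below makes that order explicit. The function names, the `subject` field, and the in-memory lists and dictionaries are hypothetical stand-ins for real data marts and a real warehouse.

```python
def bottom_up_load(rows):
    """Bottom-up: fill the subject data marts first, then the warehouse."""
    marts = {}
    for row in rows:
        marts.setdefault(row["subject"], []).append(row)
    # The enterprise warehouse is assembled from the already-loaded marts.
    warehouse = [row for subject in sorted(marts) for row in marts[subject]]
    return marts, warehouse, ["data_marts", "warehouse"]


def top_down_load(rows):
    """Top-down: fill the warehouse first, then derive the data marts from it."""
    warehouse = list(rows)
    marts = {}
    for row in warehouse:
        marts.setdefault(row["subject"], []).append(row)
    return marts, warehouse, ["warehouse", "data_marts"]


# Hypothetical cleaned rows coming out of the ETL process.
ROWS = [
    {"subject": "sales", "value": 10},
    {"subject": "finance", "value": 7},
    {"subject": "sales", "value": 3},
]

bu_marts, bu_warehouse, bu_order = bottom_up_load(ROWS)
td_marts, td_warehouse, td_order = top_down_load(ROWS)
```

Both functions end with the same marts and the same set of warehouse rows. What changes is which store is the first landing point for the data.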
Cloud-based data warehouse
Cloud-based data warehouse architecture is relatively new when compared to traditional data warehouse approaches. Cloud-based architecture means that the actual data warehouses are accessed through the cloud.
There is a wide variety of options, each with a different architecture but offering the same benefits: integrating, analysing, and acting on data from different sources. The differences between a cloud-based data warehouse approach and a traditional approach include:
Upfront capital expenses: The different components required for traditional, on-premises data warehouses mandate high upfront capital expenses. Since the components of cloud architecture are accessed through the cloud and there is no need to purchase physical hardware, these expenses don’t apply.
Operational expenses: While businesses with on-premises data warehouses must deal with upgrade and maintenance costs, the cloud offers a low, pay-per-use model.
Time to market: A cloud-based data warehouse is substantially faster to set up and bring into use than an on-premises data warehouse.
Indirect costs: On-premises data warehouses also carry indirect costs, such as potential downtime and delays in time to market.
Scalability: The elastic nature of cloud resources makes them well suited to scaling for big datasets. Additionally, cloud-based data warehousing options allow you to scale up and down (resizing individual nodes) as well as out and in (adding or removing nodes).
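The capital and operational expense points above come down to simple arithmetic, sketched below. Every figure here is a made-up assumption for illustration and not real pricing from any vendor. The cost model is also deliberately simplified: a flat hourly rate, and no storage, egress, or staffing costs.

```python
def on_prem_cost(months, upfront, monthly_maintenance):
    """Cumulative on-premises cost: upfront capital plus ongoing maintenance."""
    return upfront + months * monthly_maintenance


def cloud_cost(months, hours_per_month, rate_per_hour):
    """Cumulative pay-per-use cost: only the hours actually used are billed."""
    return months * hours_per_month * rate_per_hour


def break_even_month(upfront, monthly_maintenance, hours_per_month,
                     rate_per_hour, horizon=120):
    """First month in which cumulative cloud spend reaches on-premises spend.

    Returns None if that never happens within the horizon.
    """
    for month in range(1, horizon + 1):
        cloud = cloud_cost(month, hours_per_month, rate_per_hour)
        if cloud >= on_prem_cost(month, upfront, monthly_maintenance):
            return month
    return None


# Hypothetical figures: 100,000 upfront and 2,000 per month on premises,
# versus 200 billed hours per month at 25 per hour in the cloud.
heavy_use = break_even_month(100_000, 2_000, 200, 25)
light_use = break_even_month(100_000, 2_000, 50, 25)
```

With these assumed numbers, heavy usage takes 34 months before cumulative cloud spend catches up with the on-premises total, while light usage never does within the ten-year horizon. Plugging in real quotes shows how sensitive the comparison is to actual usage.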
Some of the more notable cloud data warehouses on the market include ByteHouse Cloud Data Platform, Amazon Redshift, Google BigQuery, Snowflake, and Microsoft Azure SQL Data Warehouse (since rebranded as Azure Synapse Analytics).
Test drive ByteHouse cloud data warehouse, or read more about the architecture design considerations behind ByteHouse.