Architecture of ByteHouse

ByteHouse uses an MPP 2.0 architecture that pairs shared-everything storage with shared-nothing compute. This avoids the data re-sharding issues of traditional MPP architectures. With this design, ByteHouse can process queries with high performance while also scaling compute resources flexibly.
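The re-sharding issue mentioned above can be illustrated with a toy model. The sketch below is not ByteHouse code: the modulo-hash placement, the `/shared/...` paths, and all function names are assumptions made purely for illustration. It compares how many rows must be physically moved when a cluster grows from 4 to 5 nodes, first when data lives on the node that owns it, then when data stays in a shared store and nodes only receive a new read assignment.

```python
import hashlib


def node_for(key: str, num_nodes: int) -> int:
    """Stable hash placement: the node that owns `key` in a cluster of `num_nodes`."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose owning node changes when the cluster is resized."""
    moved = sum(1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes))
    return moved / len(keys)


def shared_location(key: str) -> str:
    """Hypothetical shared-storage path; it does not depend on the node count."""
    return f"/shared/{key}"


keys = [f"row-{i}" for i in range(10_000)]

# Coupled storage (traditional MPP model): data lives on the owning node,
# so every key that changes owner has to be copied during a resize.
coupled_moved = moved_fraction(keys, old_nodes=4, new_nodes=5)

# Shared storage (assumed model): data is written once to the shared store.
store = {k: shared_location(k) for k in keys}
store_before_resize = dict(store)

# Resizing only recomputes which node reads which key; the store is untouched.
read_assignment = {k: node_for(k, 5) for k in keys}
shared_moved = sum(1 for k in keys if store[k] != store_before_resize[k]) / len(keys)

print(f"coupled storage: {coupled_moved:.0%} of rows moved")
print(f"shared storage:  {shared_moved:.0%} of rows moved")
```

In this model most rows change owner under naive modulo placement, while the shared store moves nothing. Real systems use more refined placement schemes, so the numbers are only indicative.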

Overview of ByteHouse Architecture


Cloud Service

The cloud service layer combines a set of services that tie together the different units of ByteHouse, including metadata management, access control, and data security.

Compute Resources

In ByteHouse, a Virtual Warehouse represents an isolated set of compute resources. Queries are executed by Virtual Warehouses, each of which is an MPP compute cluster. Like most MPP architectures, a Virtual Warehouse is easy to scale and manage (e.g. by resizing Virtual Warehouses). Each Virtual Warehouse is independent and does not affect the others, so ByteHouse natively supports multi-tenancy.

Storage Resources

ByteHouse uses a shared-everything storage architecture and persists data in HDFS to achieve high availability. By leveraging a columnar storage format and various compression algorithms, ByteHouse can handle huge volumes of data with high performance.
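To make the benefit of a columnar format concrete, here is a minimal sketch in plain Python. It does not reflect ByteHouse's actual on-disk format or codecs: the sample table, the function names, and the use of run-length encoding are all assumptions chosen for illustration. It shows that an aggregate over one column touches fewer values in a column layout, and that a sorted low-cardinality column collapses into a few runs.

```python
from itertools import groupby

# Hypothetical sample table with three columns.
rows = [
    {"user_id": i, "country": "US" if i % 2 else "SG", "amount": i * 10}
    for i in range(1, 6)
]

# Row layout: the values of one record are stored together.
row_store = [list(r.values()) for r in rows]

# Column layout: the values of one column are stored together.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def sum_amount_row_store(store):
    """Sum `amount` from the row layout; every field of every row is read."""
    touched = sum(len(record) for record in store)
    total = sum(record[2] for record in store)
    return total, touched


def sum_amount_column_store(store):
    """Sum `amount` from the column layout; only that column is read."""
    column = store["amount"]
    return sum(column), len(column)


def run_length_encode(values):
    """Illustrative compression: collapse consecutive equal values into (value, count)."""
    return [(value, len(list(group))) for value, group in groupby(values)]


row_total, row_touched = sum_amount_row_store(row_store)
col_total, col_touched = sum_amount_column_store(column_store)
encoded_country = run_length_encode(sorted(column_store["country"]))

print(row_total, row_touched)
print(col_total, col_touched)
print(encoded_country)
```

Both layouts return the same sum, but the column layout reads one third of the values for this three-column table, and the five country values shrink to two runs.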