CNCHAggregatingMergeTree engine can be used for incremental data aggregation, even on aggregated materialized views. With this engine, you can drastically reduce the number of rows.
The engine is inherited from CNCHMergeTree by altering the logic for data parts merging. It replaces all rows with the same primary key (or sorting key) with a single row within one data part that stores a combination of states of aggregate functions.
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] ( name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1], name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2], ... ) ENGINE =CNCHAggregatingMergeTree() [PARTITION BY expr] [ORDER BY expr] [SAMPLE BY expr] [TTL expr] [SETTINGS name=value, ...]
CNCHAggregatingMergeTree materialized view
Creating a base table using CNCHMergeTree.
CREATE TABLE test.visits ( StartDate DateTime64 NOT NULL, CounterID UInt64, Sign Nullable(Int32), UserID Nullable(Int32) ) ENGINE = CNCHMergeTree ORDER BY (StartDate, CounterID);
Create a materialized view utilising CNCHAggregatingMergeTree.
CREATE MATERIALIZED VIEW test.mv_visits ( StartDate DateTime64 NOT NULL, CounterID UInt64, Visits AggregateFunction(sum, Nullable(Int32)), Users AggregateFunction(uniq, Nullable(Int32)) ) ENGINE = CNCHAggregatingMergeTree() ORDER BY (StartDate, CounterID) AS SELECT StartDate, CounterID, sumState(Sign) AS Visits, uniqState(UserID) AS Users FROM test.visits GROUP BY StartDate, CounterID;
Inserting data into the test.visits table.
INSERT INTO test.visits (StartDate, CounterID, Sign, UserID) VALUES (1667446031, 1, 3, 4) INSERT INTO test.visits (StartDate, CounterID, Sign, UserID) VALUES (1667446031, 1, 6, 3)
To get the aggregated data, we need to execute a query such as SELECT ... GROUP BY ... from the materialized view test.mv_visits:
SELECT StartDate, sumMerge(Visits) AS Visits, uniqMerge(Users) AS Users FROM test.mv_visits GROUP BY StartDate ORDER BY StartDate;
Updated about 2 months ago