Spread SELECTs from reporting apps among cluster nodes
Reporting apps usually generate various customer reports from SELECT query results. The load generated by such SELECTs on the ClickHouse cluster may vary depending on the number of online customers and on the generated report types. The load must therefore be limited in order to prevent cluster overload.
SELECTs may be routed to a distributed table on a single node. But this increases resource usage (RAM, CPU and network) on that node compared to the other nodes, since it must do the final aggregation, sorting and filtering for the data obtained from the cluster nodes (shards).
It would be better to create identical distributed tables on each shard and spread
SELECTs among all the available shards.
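The effect of spreading can be illustrated with a toy sketch. It assumes plain round-robin assignment, which is only an illustration and not necessarily how chproxy picks a node; the function name `spread_round_robin` and the query strings are hypothetical, and the node addresses are the ones used in this section's config.

```python
from collections import Counter
from itertools import cycle

# Node addresses as used in this section's config.
NODES = ["10.10.20.1:8123", "10.10.20.2:8123"]


def spread_round_robin(queries, nodes):
    """Assign each query to the next node in turn; return (query, node) pairs."""
    node_iter = cycle(nodes)
    return [(query, next(node_iter)) for query in queries]


# Six placeholder report queries (hypothetical; real ones would be full SELECTs).
queries = [f"SELECT ... /* report {i} */" for i in range(6)]
assignments = spread_round_robin(queries, NODES)

# Count how many queries each node would have to finalize.
per_node = Counter(node for _, node in assignments)
```

With every node exposing an identical distributed table, each node ends up doing the final aggregation for only its share of the queries instead of one node doing it for all of them.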
The following minimal
chproxy config may be used for this use case:
server:
  http:
    listen_addr: ":9090"

    # Networks with reporting servers.
    allowed_networks: ["10.10.2.0/24"]

users:
  - name: "report"
    to_cluster: "stats-aggregate"
    to_user: "readonly"
    max_concurrent_queries: 6
    max_execution_time: 1m

clusters:
  - name: "stats-aggregate"
    nodes: [
      "10.10.20.1:8123",
      "10.10.20.2:8123"
    ]
    users:
      - name: "readonly"
        password: "****"
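A reporting app could then send its SELECTs to chproxy instead of to a ClickHouse node directly. The following is a minimal sketch, not a definitive client: the host name `chproxy.example.internal`, the table `stats_distributed` and the helper `build_report_request` are hypothetical. It assumes the query is passed in the `query` URL parameter of the HTTP interface and that the `report` user from the config above authenticates via HTTP basic auth with an empty password.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical chproxy address; the port matches listen_addr in the config.
CHPROXY_URL = "http://chproxy.example.internal:9090/"


def build_report_request(sql, base_url=CHPROXY_URL, user="report", password=""):
    """Build (but do not send) an HTTP request carrying a SELECT for chproxy."""
    query_string = urllib.parse.urlencode({"query": sql})
    request = urllib.request.Request(f"{base_url}?{query_string}")
    # Basic auth with the chproxy user name; the password is assumed empty here.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    return request


# Assumed name of a distributed table that exists on every shard.
request = build_report_request("SELECT count() FROM stats_distributed")

# To actually run the query, the app would call something like:
#   body = urllib.request.urlopen(request, timeout=70).read()
```

The client-side timeout in the commented call is chosen slightly above `max_execution_time: 1m`, on the assumption that the proxy-side limit should take effect first.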