
ClickHouse Integration

Monitor ClickHouse with MergeTree merge metrics, query performance percentiles, replication lag, and part count visibility — via the native Prometheus endpoint.

Setup

How It Works

01

Enable Prometheus Endpoint

Add a <prometheus> section to config.xml (or a file in config.d/) to expose ClickHouse metrics on a Prometheus-compatible HTTP endpoint. No external exporter process is required — ClickHouse has this built in.

02

Configure Metrics Exposure

Set endpoint, port, and the metrics/asynchronous_metrics/events/errors booleans in the prometheus config block to control which metric families are exposed. Enable all four for full visibility.

03

Add to TigerOps Scrape Config

Point the TigerOps Collector or your Prometheus remote_write configuration at the ClickHouse Prometheus endpoint. For replicated clusters, add a scrape target for each replica node.

04

Queries, Merges & Parts Flow

Within minutes TigerOps dashboards show query throughput, merge queue depth, active part counts, replication queue lag, memory usage per query, and disk I/O for MergeTree tables.

Capabilities

What You Get Out of the Box

MergeTree Merge Metrics

Active merge count, merge queue depth, bytes merged per second, and part count per table, sourced from the system.merges and system.parts tables. Background merge storms that impact query latency are detected automatically.
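Merge-storm detection boils down to reading the merge-related gauges off the Prometheus endpoint and comparing them to a threshold. A minimal sketch, assuming a simplified payload: the metric names follow ClickHouse's Prometheus exposition, but the sample values, the threshold of 16 concurrent merges, and the label-free parser are illustrative, not TigerOps internals.

```python
# Sample of the text exposition format served on :9363.
# Values and the exact set of lines are made up for illustration.
SAMPLE = """\
ClickHouseMetrics_Merge 24
ClickHouseMetrics_BackgroundMergesAndMutationsPoolTask 31
ClickHouseAsyncMetrics_NumberOfPartsTotal 4821
"""

def parse_metrics(text):
    """Parse unlabeled Prometheus text lines into {name: value}.

    Simplified: skips # HELP/# TYPE comments and ignores labels,
    which real exposition output can include.
    """
    metrics = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        metrics[name] = float(value)
    return metrics

m = parse_metrics(SAMPLE)
# Assumed threshold: flag when more than 16 merges run concurrently.
merge_storm = m["ClickHouseMetrics_Merge"] > 16
```

In production the same comparison would run as a Prometheus alert expression rather than client-side Python, but the logic is identical.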

Query Performance

Queries per second, query duration percentiles (P50, P95, P99), failed query count, and memory used per query from system.query_log. Slow queries are identified and surfaced in the TigerOps query explorer.
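The percentile rollup described above can be sketched as follows. The nearest-rank method and the sample durations are illustrative assumptions, not the exact aggregation TigerOps runs over system.query_log.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the ceil(p/100 * N)-th smallest value."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Illustrative durations in ms, standing in for
# query_duration_ms values from system.query_log.
durations_ms = [12, 15, 18, 22, 25, 31, 44, 58, 120, 950]

p50 = percentile(durations_ms, 50)
p95 = percentile(durations_ms, 95)
p99 = percentile(durations_ms, 99)
```

Note how a single slow query (950 ms) dominates the tail percentiles while leaving the median untouched, which is why P95/P99 are the numbers worth alerting on.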

Replication Queue & Lag

Replication queue depth, replica lag in seconds, replication errors, and node-level replication health for ReplicatedMergeTree and ReplicatedReplacingMergeTree tables across all shards.
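The cluster-wide lag rollup amounts to taking the worst per-replica lag and comparing it to a threshold. A minimal sketch, where the host names, sample lag values, and the 30-second threshold are all illustrative assumptions:

```python
# Per-replica lag in seconds, as would be scraped from each node's
# Prometheus endpoint (sample values for illustration).
lag_seconds = {
    "clickhouse-shard1-replica1": 0.4,
    "clickhouse-shard1-replica2": 42.0,
    "clickhouse-shard2-replica1": 1.1,
}

# Cluster-level view: the most-lagged replica determines health.
worst_replica = max(lag_seconds, key=lag_seconds.get)

# Assumed alert threshold of 30 s of replication delay.
lagging = worst_replica if lag_seconds[worst_replica] > 30 else None
```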

Part Count & Storage

Active part count, inactive part count, total bytes on disk, and bytes in memory for each MergeTree table. Excessive part counts that degrade query performance are flagged with recommended merge actions.

Memory & Resource Utilization

ClickHouse process memory usage, jemalloc allocator stats, background thread counts, and file descriptor usage. Memory-intensive queries are correlated with memory metric spikes.

ZooKeeper / ClickHouse Keeper

ZooKeeper/Keeper session count, outstanding requests, latency, and node count metrics for replicated table coordination. High ZooKeeper latency events are correlated with replication delays.

Configuration

Enable Prometheus Endpoint

Add a short <prometheus> block to config.xml and ClickHouse starts exposing metrics immediately.

config.xml + otel-collector.yaml
<!-- config.xml — enable the Prometheus endpoint -->
<clickhouse>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>9363</port>
        <metrics>true</metrics>
        <asynchronous_metrics>true</asynchronous_metrics>
        <events>true</events>
        <errors>true</errors>
    </prometheus>
</clickhouse>

<!-- Verify metrics are exposed -->
<!-- curl http://localhost:9363/metrics | head -20 -->

# TigerOps Collector config (otel-collector.yaml)
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: clickhouse
          scrape_interval: 15s
          static_configs:
            - targets:
              - clickhouse-shard1-replica1:9363
              - clickhouse-shard1-replica2:9363
              - clickhouse-shard2-replica1:9363

exporters:
  otlphttp:
    endpoint: https://ingest.tigerops.io/v1/metrics
    headers:
      Authorization: "Bearer ${TIGEROPS_API_KEY}"

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [otlphttp]

# Key metrics to alert on
# ClickHouseMetrics_ReplicasMaxQueueSize    > 1000   → replication lag
# ClickHouseAsyncMetrics_NumberOfPartsTotal > 5000   → too many parts
# ClickHouseProfileEvents_QueryMemoryLimitExceeded   → OOM queries
# ClickHouseMetrics_BackgroundMergesAndMutationsPoolTask → merge queue
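
The thresholds above translate directly into Prometheus alerting rules. A sketch, assuming you run Prometheus-style rule evaluation alongside the collector; the rule names, `for` durations, and group name are illustrative, while the metric names and thresholds match the table above:

```yaml
# prometheus-rules.yaml — illustrative alerting rules for the
# thresholds listed above.
groups:
  - name: clickhouse
    rules:
      - alert: ClickHouseReplicationLag
        expr: ClickHouseMetrics_ReplicasMaxQueueSize > 1000
        for: 5m
        labels:
          severity: warning
      - alert: ClickHouseTooManyParts
        expr: ClickHouseAsyncMetrics_NumberOfPartsTotal > 5000
        for: 10m
        labels:
          severity: warning
```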

FAQ

Common Questions

Which ClickHouse versions support the built-in Prometheus endpoint?

The native Prometheus endpoint has been available since ClickHouse 20.1. For older versions, use the third-party clickhouse_exporter. ClickHouse 22.x and 23.x are fully supported with the native endpoint.

Does TigerOps support ClickHouse Cloud?

ClickHouse Cloud exposes metrics via its monitoring API. TigerOps supports ClickHouse Cloud via a cloud metrics integration that polls the ClickHouse Cloud metrics API using your service credentials.

How does TigerOps handle ClickHouse clusters with multiple shards?

Add a scrape target for each ClickHouse shard replica. TigerOps aggregates metrics across shards and provides both cluster-level aggregate views and per-shard breakdowns in dashboards.

Can TigerOps alert on slow ClickHouse queries?

Yes. Enable the query_log system table and configure TigerOps to scrape system.query_log via the HTTP interface. Queries exceeding a configurable duration threshold trigger alerts with the query hash, user, and execution plan.
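A probe against system.query_log over the HTTP interface (port 8123) can be sketched as below. Only the request URL is built here; the host name, the 500 ms threshold, and the exact column selection are illustrative assumptions, though the columns named do exist in system.query_log.

```python
from urllib.parse import urlencode

def slow_query_url(host, threshold_ms=1000):
    """Build a ClickHouse HTTP-interface URL listing the slowest
    finished queries above a duration threshold."""
    sql = (
        "SELECT normalized_query_hash, user, query_duration_ms "
        "FROM system.query_log "
        "WHERE type = 'QueryFinish' "
        f"AND query_duration_ms > {threshold_ms} "
        "ORDER BY query_duration_ms DESC LIMIT 10"
    )
    return f"http://{host}:8123/?{urlencode({'query': sql})}"

# Illustrative host from the scrape config above.
url = slow_query_url("clickhouse-shard1-replica1", threshold_ms=500)
```

Fetching `url` with any HTTP client returns tab-separated rows ready to be turned into alert payloads.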

What is the performance impact of enabling all metric families?

Enabling all four metric families (metrics, asynchronous_metrics, events, errors) adds minimal overhead — under 1% CPU at 15-second scrape intervals. The asynchronous_metrics are pre-calculated by a background thread, so serving them is essentially free.

Get Started

Full ClickHouse Observability via Native Prometheus

MergeTree metrics, query performance, replication lag, and part counts — one short config block to enable.