
TiDB Integration

Monitor TiKV storage metrics, PD scheduling operations, and SQL layer query performance across your TiDB clusters. Get AI-powered root cause analysis spanning the full distributed SQL stack.

Setup

How It Works

01

Scrape TiDB Component Endpoints

TiDB exposes native Prometheus metrics on each component's status endpoint: the TiDB SQL layer at /metrics on port 10080, TiKV on port 20180, and PD on port 2379. Configure the TigerOps agent to scrape all three for complete cluster visibility.

02

Configure Remote Write

Point the agent to your TigerOps remote-write endpoint. All TiKV region metrics, PD operator counts, and TiDB slow query digest data flow in within minutes of configuration.

03

Enable Slow Query Log Integration

Connect TigerOps to your TiDB slow query log or the information_schema.cluster_slow_query table. TigerOps normalizes query patterns and correlates slow queries with TiKV hotspot regions.
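As a sketch, a slow query integration might be declared alongside the scrape targets. Note that the tidb_slow_query integration type, its field names, and the DSN shown here are illustrative assumptions, not a documented TigerOps schema; the log path and the information_schema.cluster_slow_query table are standard TiDB.

```yaml
# Sketch only: the tidb_slow_query integration type and its fields
# are illustrative assumptions, not a documented TigerOps schema.
integrations:
  - name: tidb-slow-query
    type: tidb_slow_query
    config:
      # Option A: tail the slow query log on each TiDB node
      log_path: /var/log/tidb/tidb-slow.log
      # Option B: poll information_schema.cluster_slow_query over SQL
      dsn: "root:@tcp(tidb-0.tidb.svc.cluster.local:4000)/information_schema"
      poll_interval: 60s
      # Normalize literals so identical query shapes share one digest
      normalize_digests: true
```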

04

Set Region and Latency Alerts

Define SLOs for SQL layer p99 latency, TiKV region leader distribution skew, and PD scheduling queue depth. TigerOps AI correlates Raft heartbeat anomalies with query degradation.
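The p99 latency SLO above can also be expressed as a Prometheus-style alerting rule. The tidb_server_handle_query_duration_seconds histogram is a real TiDB metric; the surrounding rule format is a generic Prometheus sketch, since the exact rule schema TigerOps accepts is an assumption.

```yaml
# Prometheus-style rule sketch; the metric is TiDB's native histogram,
# but the rule wrapper format is assumed, not TigerOps-documented.
groups:
  - name: tidb-slo
    rules:
      - alert: TiDBSQLP99LatencyHigh
        expr: |
          histogram_quantile(0.99,
            sum(rate(tidb_server_handle_query_duration_seconds_bucket[5m])) by (le)
          ) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "TiDB SQL layer p99 latency above 100ms"
```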

Capabilities

What You Get Out of the Box

TiKV Region & Raft Metrics

Per-store region count, leader count, Raft log lag, apply queue size, and snapshot generation rates. Detect region hot spots and Raft leader imbalance across TiKV stores.
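Leader imbalance can be derived directly from TiKV's per-store leader counts. The tikv_raftstore_region_count metric (with type="leader") is native to TiKV; the Prometheus-style rule wrapper below is a sketch, not a confirmed TigerOps format.

```yaml
# Leader-imbalance rule sketch. tikv_raftstore_region_count is a real
# TiKV metric; the rule wrapper format is an assumption.
groups:
  - name: tikv-balance
    rules:
      - alert: TiKVLeaderImbalance
        expr: |
          (max(tikv_raftstore_region_count{type="leader"})
           - min(tikv_raftstore_region_count{type="leader"}))
          / avg(tikv_raftstore_region_count{type="leader"}) > 0.2
        for: 10m
        annotations:
          summary: "TiKV leader count skew exceeds 20% of the per-store average"
```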

PD Scheduling Visibility

Track PD operator counts by type (add-peer, remove-peer, move-leader), scheduling queue depth, and store balance scores. TigerOps alerts when scheduling storms indicate region instability.
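A scheduling-storm condition can be phrased against PD's operator counter. PD exposes pd_schedule_operators_count labelled by operator type and event; the rule format and the threshold below are illustrative assumptions.

```yaml
# Operator-storm rule sketch; pd_schedule_operators_count is PD's
# operator counter, the rule wrapper and threshold are assumed.
groups:
  - name: pd-scheduling
    rules:
      - alert: PDSchedulingStorm
        expr: sum(rate(pd_schedule_operators_count{event="create"}[5m])) by (type) > 1
        for: 15m
        annotations:
          summary: "PD is creating {{ $labels.type }} operators at an unusually high rate"
```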

SQL Layer Query Performance

TiDB SQL layer QPS by statement type, transaction duration histograms, cop-task latency, and connection pool utilization. Identify expensive execution plans with normalized query digest tracking.

Coprocessor Task Analysis

Monitor TiKV coprocessor request rates, handle duration, wait duration, and response size. Identify which SQL queries are generating the most coprocessor load per TiKV store.
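Per-store coprocessor latency can be alerted on via TiKV's request-duration histogram, broken down by request type. The tikv_coprocessor_request_duration_seconds metric is native to TiKV; as above, the rule wrapper is a generic Prometheus-style sketch.

```yaml
# Coprocessor latency rule sketch; tikv_coprocessor_request_duration_seconds
# is a real TiKV histogram, the rule wrapper is assumed.
groups:
  - name: tikv-coprocessor
    rules:
      - alert: TiKVCoprocessorSlow
        expr: |
          histogram_quantile(0.99,
            sum(rate(tikv_coprocessor_request_duration_seconds_bucket[5m]))
            by (le, instance, req)
          ) > 1
        for: 10m
        annotations:
          summary: "Coprocessor p99 for {{ $labels.req }} on {{ $labels.instance }} above 1s"
```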

TiFlash Columnar Metrics

For TiFlash nodes, track delta layer scan rates, stable layer cache hit ratio, and MVCC version counts. Correlate TiFlash query performance with data freshness from TiKV replication lag.

AI Root Cause Analysis

When TiDB query latency spikes, TigerOps AI traces the degradation across SQL layer, PD scheduling, and TiKV coprocessor metrics simultaneously to pinpoint whether the bottleneck is compute, storage, or scheduling.

Configuration

TigerOps Agent Config for TiDB

Scrape every TiDB component (SQL layer, TiKV, PD, and optionally TiFlash) with a single agent configuration.

tigerops-tidb.yaml
# TigerOps TiDB integration config
# Place at /etc/tigerops/conf.d/tidb.yaml

integrations:
  # TiDB SQL layer
  - name: tidb-sql
    type: prometheus_scrape
    config:
      targets:
        - tidb-0.tidb.svc.cluster.local:10080
        - tidb-1.tidb.svc.cluster.local:10080
      labels:
        component: tidb
    scrape_interval: 15s

  # TiKV stores
  - name: tidb-tikv
    type: prometheus_scrape
    config:
      targets:
        - tikv-0.tikv.svc.cluster.local:20180
        - tikv-1.tikv.svc.cluster.local:20180
        - tikv-2.tikv.svc.cluster.local:20180
      labels:
        component: tikv
    scrape_interval: 15s

  # Placement Driver (PD)
  - name: tidb-pd
    type: prometheus_scrape
    config:
      targets:
        - pd-0.pd.svc.cluster.local:2379
        - pd-1.pd.svc.cluster.local:2379
        - pd-2.pd.svc.cluster.local:2379
      labels:
        component: pd
    scrape_interval: 15s

  # TiFlash (if deployed)
  - name: tidb-tiflash
    type: prometheus_scrape
    config:
      targets:
        - tiflash-0.tiflash.svc.cluster.local:8234
      labels:
        component: tiflash
    scrape_interval: 30s

remote_write:
  endpoint: https://ingest.tigerops.io/api/v1/write
  bearer_token: "${TIGEROPS_API_KEY}"

alerts:
  sqlP99LatencyMs: 100
  tikvRegionLeaderImbalancePercent: 20
  pdPendingOperators: 50

FAQ

Common Questions

Which TiDB versions does TigerOps support?

TigerOps supports TiDB 5.x, 6.x, and 7.x, including TiDB Cloud (Serverless and Dedicated). The integration uses native Prometheus endpoints available in all these versions. TiDB on Kubernetes via TiDB Operator is also supported with automatic endpoint discovery.

How does TigerOps monitor TiKV hotspot regions?

TigerOps collects per-store read and write hot region counts from PD and correlates them with TiKV coprocessor queue depths. When a region becomes a hotspot, TigerOps identifies the SQL queries driving the traffic and the affected TiKV store simultaneously.

Can TigerOps monitor TiDB Cloud managed deployments?

Yes. For TiDB Cloud Dedicated clusters, TigerOps connects via the TiDB Cloud metrics API. For TiDB Cloud Serverless, TigerOps bridges CloudWatch-compatible metrics into your TigerOps workspace with automatic label normalization.

How are PD scheduling alerts handled?

TigerOps monitors pd_scheduler_region_heartbeat counts, operator step durations, and pending operator queues. Alerts fire when operator rates exceed baseline and are automatically correlated with ongoing store capacity events or TiKV leader elections.

Does TigerOps support TiCDC replication lag monitoring?

Yes. TigerOps collects TiCDC checkpoint lag, resolved timestamp lag, and sink write latency metrics. It alerts when replication lag exceeds your configured SLO and correlates lag spikes with upstream TiKV write amplification events.
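TiCDC exposes its own Prometheus endpoint, so it can be added to the agent config in the same style as the tigerops-tidb.yaml example above. 8300 is TiCDC's default listen port; the hostnames are placeholders.

```yaml
# TiCDC scrape sketch, mirroring the integration style used for
# TiDB/TiKV/PD. Hostnames are placeholders for your deployment.
integrations:
  - name: tidb-ticdc
    type: prometheus_scrape
    config:
      targets:
        - ticdc-0.ticdc.svc.cluster.local:8300
      labels:
        component: ticdc
    scrape_interval: 15s
```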

Get Started

Get Full-Stack Visibility Across Your TiDB Cluster

TiKV region metrics, PD scheduling visibility, and AI root cause analysis. Deploy in 5 minutes.