Vitess Integration

Monitor VTGate query routing, tablet health, and VReplication lag across your Vitess clusters. Get AI-powered resharding impact analysis and throttler event correlation.

Setup

How It Works

01

Scrape Vitess Component Endpoints

Vitess exposes native Prometheus metrics on VTGate (port 15001), VTTablet (port 15101), and VTOrc. Configure the TigerOps agent to scrape all component types for full cluster visibility.
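As a rough illustration of what a scrape involves, the sketch below parses a Prometheus text-format payload into (name, labels, value) samples. The metric and label names in the sample payload are made up for illustration and are not taken from Vitess or from the TigerOps agent; the parser itself covers only the common `name{labels} value` line shape.

```python
import re

# Matches sample lines such as: metric_name{label="value"} 12.5
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if not match:
            continue  # ignore anything this simple sketch does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload; real Vitess metric names will differ.
SAMPLE_PAYLOAD = """\
# HELP example_vtgate_queries_total Illustrative counter
# TYPE example_vtgate_queries_total counter
example_vtgate_queries_total{keyspace="commerce",tablet_type="primary"} 1200
example_vtgate_queries_total{keyspace="lookup",tablet_type="replica"} 300
"""

for name, labels, value in parse_metrics(SAMPLE_PAYLOAD):
    print(name, labels, value)
```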

02

Configure Component Discovery

For Kubernetes deployments, TigerOps auto-discovers VTGate and VTTablet pods via label selectors. For bare-metal deployments, specify the topology server address; the agent then reads the component list from etcd, ZooKeeper, or Consul.
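To make the label-selector idea concrete, here is a minimal sketch of equality-based selector matching against pod labels. The pod list is hypothetical, and the sketch handles only `key=value` terms joined by commas; it is not the agent's actual discovery code.

```python
def parse_selector(selector: str) -> dict[str, str]:
    """Split a selector like "a=b,c=d" into a dict of required labels."""
    required = {}
    for part in selector.split(","):
        key, _, value = part.strip().partition("=")
        required[key] = value
    return required


def matches(labels: dict[str, str], selector: str) -> bool:
    """True when every key=value term in the selector is present in labels."""
    return all(labels.get(k) == v for k, v in parse_selector(selector).items())


def discover(pods: list[dict], selector: str) -> list[str]:
    """Return the names of pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(pod["labels"], selector)]


# Hypothetical pods, shaped loosely like Kubernetes pod metadata.
pods = [
    {"name": "vtgate-0", "labels": {"vitess.io/component": "vtgate"}},
    {"name": "vttablet-0", "labels": {"vitess.io/component": "vttablet"}},
    {"name": "vttablet-1", "labels": {"vitess.io/component": "vttablet"}},
]

print(discover(pods, "vitess.io/component=vttablet"))
```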

03

Enable VReplication Monitoring

TigerOps monitors VReplication workflows — including resharding, MoveTables, and Materialize — by tracking VReplication lag, rows copied, and workflow health status per stream and tablet.

04

Set Routing and Replication Alerts

Define SLOs for VTGate query latency, tablet health check failures, and VReplication lag thresholds. TigerOps correlates resharding events with query routing latency changes automatically.
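One simple way to picture the correlation between a resharding event and routing latency is a before/after comparison around the event timestamp. The sketch below is an assumption made for illustration, not TigerOps's actual method; the function name, the window size, and the sample data are all hypothetical.

```python
from statistics import mean


def latency_shift(samples, event_ts, window):
    """Ratio of mean latency after an event to mean latency before it.

    samples is a list of (timestamp_seconds, latency_ms) pairs.
    Returns None when either side of the window has no data.
    """
    before = [v for t, v in samples if event_ts - window <= t < event_ts]
    after = [v for t, v in samples if event_ts <= t < event_ts + window]
    if not before or not after:
        return None
    return mean(after) / mean(before)


# Hypothetical latency samples; a reshard is assumed to start at t=30.
samples = [(0, 10.0), (10, 10.0), (20, 10.0), (30, 20.0), (40, 20.0), (50, 20.0)]
ratio = latency_shift(samples, event_ts=30, window=30)
print(f"latency changed by a factor of {ratio}")
```

A ratio well above 1 suggests latency rose after the event; what counts as "well above" is a threshold you would choose to match your own SLO.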

Capabilities

What You Get Out of the Box

VTGate Query Routing Metrics

Per-keyspace and per-tablet-type query rates, routing latency, and shard routing errors. TigerOps detects when VTGate is failing to route queries to the correct shard or experiencing plan cache misses.

Tablet Health & Replication

Per-tablet health check status, replication lag, seconds_behind_master, and tablet type transitions (primary, replica, rdonly). Alert on unhealthy tablets before VTGate removes them from serving.

VReplication Lag Monitoring

Per-stream VReplication lag, rows copied, and workflow phase tracking for MoveTables, Reshard, and Materialize operations. Alert when replication lag exceeds your cutover readiness threshold.
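A cutover readiness check can be sketched as a per-stream comparison against a lag threshold. The stream names and lag values below are hypothetical, and the 30-second default simply mirrors the lag_alert_seconds value in the sample config on this page.

```python
def cutover_ready(lag_by_stream: dict[str, float], threshold_seconds: float = 30.0):
    """Return (ready, lagging) where lagging maps each stream over the threshold to its lag."""
    lagging = {
        stream: lag
        for stream, lag in lag_by_stream.items()
        if lag > threshold_seconds
    }
    return (not lagging, lagging)


# Hypothetical per-stream lag readings, in seconds.
lag_by_stream = {"commerce/-80": 2.5, "commerce/80-": 41.0}

ready, lagging = cutover_ready(lag_by_stream)
print("ready for cutover:", ready)
print("streams over threshold:", lagging)
```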

Connection Pool & Throttler Metrics

VTTablet connection pool utilization, throttler check rates, and throttler lag metrics. TigerOps surfaces when the Vitess throttler is actively throttling clients due to replication lag.

Shard Query Distribution

Per-shard QPS and latency to identify hot shards and uneven query distribution. TigerOps detects keyspace ID range hotspots that indicate a suboptimal primary vindex or scatter query patterns.
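The sketch below shows one way to reason about key-range hotspots: map sampled keyspace IDs onto shard key ranges, then flag shards that receive more than their even share. It relies on the Vitess convention that a shard name such as "-80" or "80-" is a hex key range with an inclusive start and exclusive end. The sampled IDs, the 1.5 factor, and the function names are assumptions for illustration only.

```python
from collections import Counter


def parse_shard(name: str) -> tuple[bytes, bytes]:
    """Convert a shard name such as "-80" or "80-" into (start, end) bounds."""
    if "-" not in name:
        # Unsharded keyspaces (for example "0") cover the whole range.
        return b"", b""
    start, end = name.split("-", 1)
    return bytes.fromhex(start), bytes.fromhex(end)


def shard_for(keyspace_id: bytes, shards: list[str]) -> str:
    """Return the shard whose [start, end) range contains keyspace_id."""
    for name in shards:
        start, end = parse_shard(name)
        if keyspace_id >= start and (end == b"" or keyspace_id < end):
            return name
    raise ValueError(f"no shard covers {keyspace_id.hex()}")


def hot_shards(keyspace_ids, shards, factor=1.5):
    """Return shards whose share of sampled IDs exceeds factor times an even split."""
    counts = Counter(shard_for(kid, shards) for kid in keyspace_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    even_share = 1 / len(shards)
    return {
        s: counts[s] / total
        for s in shards
        if counts[s] / total > factor * even_share
    }


shards = ["-80", "80-"]
# Hypothetical sampled keyspace IDs, written as hex.
sampled = [bytes.fromhex(h) for h in ("10", "2f", "7fff", "05", "c0")]
print(hot_shards(sampled, shards))
```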

AI Resharding Impact Analysis

TigerOps AI tracks query routing changes before and during Vitess resharding operations, alerting on unexpected latency increases and identifying which query types are most affected by the ongoing shard split.

Configuration

TigerOps Agent Config for Vitess

Scrape all Vitess components — VTGate, VTTablet, and VTOrc — with automatic topology-aware labeling.

tigerops-vitess.yaml
# TigerOps Vitess integration config
# Place at /etc/tigerops/conf.d/vitess.yaml

integrations:
  - name: vitess
    type: vitess
    config:
      # Topology server for component discovery
      topology:
        server_type: etcd   # or zookeeper, consul
        address: etcd.vitess.svc.cluster.local:2379
        root: /vitess/global

      # Or specify components manually for bare-metal
      # vtgate:
      #   targets:
      #     - vtgate-0.vitess.svc:15001
      # vttablet:
      #   targets:
      #     - vttablet-0.vitess.svc:15101

      # Kubernetes-based discovery (alternative to the topology server above)
      kubernetes:
        enabled: true
        namespace: vitess
        vtgate_label_selector: "vitess.io/component=vtgate"
        vttablet_label_selector: "vitess.io/component=vttablet"
        vtorc_label_selector: "vitess.io/component=vtorc"

      # VReplication workflow monitoring
      vreplication:
        enabled: true
        lag_alert_seconds: 30
        track_workflow_phases: true

      # Keyspace topology context
      keyspaces:
        - name: commerce
          shards:
            - "-80"
            - "80-"
        - name: lookup
          shards:
            - "0"

    scrape_interval: 15s

remote_write:
  endpoint: https://ingest.atatus.net/api/v1/write
  bearer_token: "${TIGEROPS_API_KEY}"

FAQ

Common Questions

Which Vitess versions does TigerOps support?

TigerOps supports Vitess 15.x and later, including PlanetScale's Vitess distribution. Both Kubernetes operator (vitess-operator) deployments and bare-metal topologies are supported. The Prometheus metrics endpoints used by TigerOps are available in all supported versions.

How does TigerOps auto-discover VTTablet instances?

For Kubernetes, TigerOps uses the vitess.io/component=vttablet label selector to find all VTTablet pods and scrapes their metrics endpoints. Shard, keyspace, and cell labels from pod annotations are added automatically to all metrics.

Can TigerOps track VReplication workflow progress?

Yes. TigerOps monitors the vreplication_lag_seconds and vreplication_rows_copied metrics per workflow and stream. For MoveTables and Reshard operations, it tracks the current phase (copy, running, error) and estimates time to completion based on current copy rate.
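An estimate of time to completion from the copy rate can be sketched with two readings of a rows-copied counter. The total row count is assumed to be known from elsewhere (for example, a table row estimate); the function name and the numbers are hypothetical, and this is not necessarily how TigerOps computes its estimate.

```python
def estimate_eta_seconds(rows_then, rows_now, interval_seconds, total_rows):
    """Estimate remaining copy time from two rows-copied readings.

    Returns None when no progress was made during the interval,
    because a rate of zero gives no meaningful estimate.
    """
    rate = (rows_now - rows_then) / interval_seconds  # rows per second
    if rate <= 0:
        return None
    remaining = max(total_rows - rows_now, 0)
    return remaining / rate


# Hypothetical readings taken 60 seconds apart.
eta = estimate_eta_seconds(rows_then=1000, rows_now=4000, interval_seconds=60, total_rows=10000)
print(f"estimated seconds remaining: {eta}")
```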

How are Vitess throttler events surfaced in TigerOps?

TigerOps tracks throttler_tablet_lag and throttler_check_error_count metrics from VTTablet. When the throttler activates, TigerOps fires an alert indicating which tablet is throttling, the current replication lag driving the throttle, and the estimated impact on write throughput.

Does TigerOps integrate with VTOrc?

Yes. TigerOps scrapes VTOrc Prometheus metrics to track primary failover events, recovery attempts, and topology change operations. VTOrc-initiated failovers appear as events on your Vitess metrics timeline for incident correlation.

Get Started

Monitor Your Entire Vitess Topology in One Place

VTGate routing metrics, VReplication lag tracking, and AI resharding impact analysis. Deploy in 5 minutes.