ScyllaDB Integration

Monitor shard-per-core metrics, reactor scheduling latency, and compaction throughput across your ScyllaDB clusters. Get AI-powered root cause analysis before latency SLOs are breached.

Setup

How It Works

Enable Prometheus Metrics Endpoint

ScyllaDB exposes Prometheus metrics natively on port 9180, so no additional exporter is required. Verify that the endpoint is reachable from the agent host, then configure your TigerOps agent to scrape it.
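One way to check reachability before pointing the agent at a node is to fetch the endpoint and confirm that it returns scylla_* series. The sketch below is an illustration only and is not part of the TigerOps agent: the function names are hypothetical, and it assumes the endpoint is served over plain HTTP at the /metrics path.

```python
import urllib.request


def fetch_metrics(host: str, port: int = 9180, timeout: float = 5.0) -> str:
    """Fetch the raw Prometheus exposition text from one ScyllaDB node.

    Assumes plain HTTP and the conventional /metrics path.
    """
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def scylla_metric_names(exposition: str) -> set[str]:
    """Return the distinct scylla_* metric names found in exposition text."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith("scylla_"):
            names.add(name)
    return names
```

Calling scylla_metric_names(fetch_metrics("scylla-node-1.internal")) against a reachable node should return a non-empty set. An empty set or a connection error usually points to a firewall rule or bind-address problem rather than a missing exporter.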

Deploy TigerOps Agent

Install the TigerOps agent on each ScyllaDB node or as a Kubernetes DaemonSet. The agent scrapes per-shard metrics and aggregates them with topology-aware labels for node, datacenter, and rack.

Configure Keyspace Monitoring

Specify keyspace and table patterns to monitor. TigerOps collects per-table read/write latency, SSTable counts, bloom filter hit rates, and compaction statistics at the shard level.
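To make the idea of keyspace and table patterns concrete, the sketch below matches a qualified table name against a list of patterns. It assumes shell-style globs over "keyspace.table"; the pattern syntax TigerOps actually accepts is not specified here, and is_monitored is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def is_monitored(keyspace: str, table: str, patterns: list[str]) -> bool:
    """Return True if 'keyspace.table' matches any glob pattern.

    fnmatchcase is used so matching stays case-sensitive on every platform.
    """
    qualified = f"{keyspace}.{table}"
    return any(fnmatchcase(qualified, pattern) for pattern in patterns)


# Hypothetical patterns: every table in one keyspace, a subset in another.
patterns = ["my_app_keyspace.*", "analytics.events_*"]
print(is_monitored("my_app_keyspace", "users", patterns))
print(is_monitored("analytics", "sessions", patterns))
```

Narrow patterns matter in practice because per-table metrics multiply by shard count, so each extra table adds one series per shard per metric.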

Set Latency and Compaction Alerts

Define p99 read/write latency SLOs and compaction backlog thresholds. TigerOps correlates reactor stalls with scheduling violations and compaction pressure to point at the likely root cause when an alert fires.

Capabilities

What You Get Out of the Box

Shard-Per-Core Visibility

Per-shard CPU utilization, task queue depth, and reactor utilization across every ScyllaDB core. Identify hot shards and cross-shard task imbalances that degrade throughput.

Reactor Scheduling Latency

Track reactor task scheduling delays (task_quota violations), stall detector activations, and preemption latency. TigerOps alerts when scheduling stalls breach your latency SLO.

Compaction Throughput & Backlog

Monitor compaction input/output byte rates, pending compaction bytes, SSTables per level, and tombstone ratios. Detect compaction debt before it causes read amplification.

Read & Write Latency Histograms

p50, p95, p99, and p999 coordinator and replica read/write latency per keyspace and table. TigerOps baselines seasonal patterns and alerts on statistical anomalies.
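Percentiles such as p99 are typically estimated from histogram buckets rather than from raw samples. The sketch below shows the usual linear-interpolation estimate over cumulative buckets, in the same spirit as Prometheus's histogram_quantile. It assumes latency arrives as (upper bound, cumulative count) pairs; how TigerOps computes its percentiles internally is not documented here, and the function name is hypothetical.

```python
import math


def quantile_from_buckets(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) sorted by upper_bound;
    the last bucket may have an upper bound of +inf.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets:
        raise ValueError("buckets must not be empty")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # The +inf bucket has no width, so report its lower edge.
            if math.isinf(bound) or count == prev_count:
                return prev_bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

The estimate is only as precise as the bucket boundaries: a p99 that falls inside a wide bucket can move substantially without the underlying latency changing much, which is worth keeping in mind when setting tight SLO thresholds.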

Cache Hit Ratios

Row cache and key cache hit rates, eviction counts, and memory utilization per keyspace. Track how cache pressure correlates with disk read amplification and latency spikes.

AI Root Cause Analysis

When latency degrades, TigerOps AI examines reactor stalls, compaction pressure, cache evictions, and network I/O concurrently to identify whether the root cause is CPU, disk, or network saturation.

Configuration

TigerOps Agent Config for ScyllaDB

Configure the TigerOps agent to scrape ScyllaDB's native Prometheus endpoint on each node.

tigerops-scylladb.yaml

# TigerOps ScyllaDB integration config
# Place at /etc/tigerops/conf.d/scylladb.yaml

integrations:
  - name: scylladb
    type: prometheus_scrape
    config:
      # ScyllaDB exposes Prometheus metrics natively on port 9180
      targets:
        - scylla-node-1.internal:9180
        - scylla-node-2.internal:9180
        - scylla-node-3.internal:9180

      # Add topology labels from Scylla REST API
      topology_enrichment:
        enabled: true
        rest_api_port: 10000   # Scylla REST API port

      # Keyspace/table filtering
      metrics_filter:
        keyspaces:
          - my_app_keyspace
          - analytics
        # Per-table metrics (can be high cardinality)
        per_table_metrics: true

      # Metric families to collect
      collect:
        - scylla_reactor_*
        - scylla_scheduler_*
        - scylla_storage_proxy_*
        - scylla_compaction_manager_*
        - scylla_cache_*
        - scylla_commitlog_*

    scrape_interval: 15s

remote_write:
  endpoint: https://ingest.atatus.net/api/v1/write
  bearer_token: "${TIGEROPS_API_KEY}"

# Alert rules
alerts:
  reactorStallMs: 50          # fire if any stall exceeds 50ms
  compactionPendingGb: 10     # pending compaction backlog
  readLatencyP99Ms: 10        # p99 read latency SLO
  writeLatencyP99Ms: 5        # p99 write latency SLO
  cacheHitRatioMin: 0.80      # alert if cache hit rate drops below 80%
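
The threshold semantics in the alerts block can be illustrated with a short sketch. It reuses the key names and values from the config above and assumes, based on the inline comments, that keys ending in "Min" are lower bounds and all others are upper bounds. The breached function is hypothetical and does not describe how TigerOps evaluates alerts internally.

```python
# Threshold values copied from the alerts block of the config.
THRESHOLDS = {
    "reactorStallMs": 50,
    "compactionPendingGb": 10,
    "readLatencyP99Ms": 10,
    "writeLatencyP99Ms": 5,
    "cacheHitRatioMin": 0.80,
}


def breached(observed, thresholds=THRESHOLDS):
    """Return the names of thresholds violated by the observed values.

    observed: dict keyed by the same names as the thresholds; missing keys
    are skipped rather than treated as violations.
    """
    violations = []
    for name, limit in thresholds.items():
        if name not in observed:
            continue
        value = observed[name]
        if name.endswith("Min"):
            # Lower bound: alert when the value drops below the limit.
            if value < limit:
                violations.append(name)
        elif value > limit:
            # Upper bound: alert when the value exceeds the limit.
            violations.append(name)
    return violations
```

For example, an observation of a 72 ms stall, an 8 ms p99 read latency, and a 0.75 cache hit ratio violates reactorStallMs and cacheHitRatioMin but not readLatencyP99Ms.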

FAQ

Common Questions

Which ScyllaDB versions does TigerOps support?

TigerOps supports ScyllaDB Open Source 4.x, 5.x, and 6.x, as well as ScyllaDB Enterprise and ScyllaDB Cloud. The integration uses the native Prometheus endpoint (port 9180), which is available in all supported versions.

How are per-shard metrics aggregated?

TigerOps collects raw per-shard metrics and stores them with a shard label for drill-down analysis. It also computes node-level and cluster-level aggregates automatically so dashboards work at any level of granularity without custom PromQL.
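
As a rough illustration of what a node-level rollup of per-shard samples looks like, the sketch below groups samples by node and computes a few common aggregates. The input shape and the aggregate_by_node name are assumptions made for this example, not the TigerOps data model.

```python
from collections import defaultdict


def aggregate_by_node(samples):
    """Roll per-shard samples up to node level.

    samples: iterable of (node, shard, value) tuples for a single metric.
    Returns {node: {"shards": n, "sum": s, "max": m, "mean": avg}}.
    """
    per_node = defaultdict(list)
    for node, _shard, value in samples:
        per_node[node].append(value)
    return {
        node: {
            "shards": len(values),
            "sum": sum(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for node, values in per_node.items()
    }
```

Which aggregate is meaningful depends on the metric: sums suit counters such as bytes compacted, while mean and max suit gauges such as reactor utilization, where the max exposes a single hot shard that the mean would hide.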

Can TigerOps detect reactor stalls in real time?

Yes. TigerOps monitors scylla_reactor_stalls and scylla_scheduler_stalls counters and fires alerts within one scrape interval (default 15s) of a stall being recorded. The alert payload includes the affected shard ID and concurrent metric context.
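
Detecting a stall from a counter means comparing two consecutive scrapes, which is why the detection delay is bounded by the scrape interval. The sketch below shows that comparison for per-shard counter values, including the usual handling of a counter reset after a node restart. The function names and input shape are hypothetical.

```python
def counter_increase(previous, current):
    """Increase of a monotonic counter between two scrapes.

    A drop means the process restarted and the counter began again from
    zero, so the current value is taken as the increase.
    """
    if current < previous:
        return current
    return current - previous


def stalled_shards(prev, curr):
    """Return {shard_id: new_stalls} for shards whose counter increased.

    prev, curr: {shard_id: counter_value} from two consecutive scrapes.
    Shards without a previous sample are skipped, so the first scrape
    never reports historical stalls as new ones.
    """
    result = {}
    for shard, value in curr.items():
        if shard not in prev:
            continue
        increase = counter_increase(prev[shard], value)
        if increase > 0:
            result[shard] = increase
    return result
```

With a 15s scrape interval, a stall recorded just after one scrape is only visible at the next one, so the worst-case detection delay under this scheme is one full interval.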

How does TigerOps handle multi-datacenter ScyllaDB deployments?

The TigerOps agent tags all metrics with datacenter and rack labels taken from ScyllaDB gossip metadata. You can scope dashboards and alert rules to a specific datacenter, or compare cross-DC replication latency, without managing labels by hand.

Does TigerOps integrate with Scylla Manager?

Yes. TigerOps can ingest Scylla Manager repair and backup job metrics via its REST API bridge, giving you a unified view of repair progress, backup duration, and cluster health alerts alongside real-time performance metrics.

Get Started

Get Full Shard-Level Visibility into ScyllaDB

Reactor scheduling latency, compaction backlog monitoring, and AI root cause analysis. Deploy in 5 minutes.
