Cassandra Integration
Monitor Cassandra clusters with read/write latency percentiles, compaction metrics, GC pause durations, and node ring health — via the JMX Prometheus Exporter.
How It Works
Enable JMX Exporter
Add the Prometheus JMX Exporter Java agent to your Cassandra JVM options in cassandra-env.sh. This exposes all Cassandra MBean metrics as Prometheus-format metrics on a local HTTP endpoint.
Configure cassandra-env.sh
Append the agent flag to JVM_OPTS: -javaagent:/path/to/jmx_prometheus_javaagent.jar=7070:/etc/cassandra/tigerops-jmx.yaml. The value after the jar path sets the exporter's listen port (7070) and its config file. Restart each node for the change to take effect.
Add Prometheus Scrape Config
Add a scrape job to your Prometheus config pointing at each Cassandra node's JMX exporter port. Then configure Prometheus remote_write to push metrics to the TigerOps OTLP metrics endpoint.
Latency, GC & Compaction Flow
Within minutes TigerOps dashboards show read and write latency percentiles per keyspace, compaction pending tasks, GC pause durations, dropped mutations, and node-level health status.
What You Get Out of the Box
Read & Write Latency Percentiles
P50, P95, and P99 read and write latency per keyspace and table from the ClientRequest MBeans. TigerOps alerts when latency exceeds your SLO and correlates spikes with compaction activity.
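As a sketch, a Prometheus-style alerting rule over the latency metric produced by the exporter rules in this guide might look like the following. The group name, alert name, and 500 ms threshold are illustrative; note that Cassandra reports these percentiles in microseconds, so tune the threshold to your own SLO:

```yaml
groups:
  - name: cassandra-latency            # illustrative group name
    rules:
      - alert: CassandraReadLatencyP99High
        # 500000 us = 500 ms; replace with your own SLO threshold
        expr: cassandra_client_request_read_latency{quantile="99thPercentile"} > 500000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "P99 read latency above 500 ms on {{ $labels.instance }}"
```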
Compaction Metrics
Pending compaction tasks, compaction throughput (MB/s), bytes compacted, and compaction time per SSTable. Heavy compaction storms that compete with read/write I/O are automatically detected and surfaced.
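A backlog alert can be layered on the cassandra_compaction_pending_tasks metric from the exporter rules in this guide. This is a sketch; the threshold of 100 pending tasks is illustrative and heavily workload-dependent:

```yaml
groups:
  - name: cassandra-compaction         # illustrative group name
    rules:
      - alert: CassandraCompactionBacklog
        # a sustained backlog suggests compaction can't keep up with writes
        expr: cassandra_compaction_pending_tasks > 100
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "{{ $value }} pending compactions on {{ $labels.instance }}"
```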
GC Pause Monitoring
G1GC and ZGC pause durations, GC overhead percentage, and heap memory usage from JVM MBeans. Long GC pauses that cause Cassandra coordinator timeouts are correlated with latency spikes.
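If you want raw GC counters alongside the TigerOps views, the standard java.lang GarbageCollector MBeans can be mapped in tigerops-jmx.yaml with rules like these (a sketch; CollectionTime is a cumulative millisecond counter, so apply rate() to it in queries):

```yaml
rules:
  # Cumulative GC time in ms per collector (e.g. "G1 Young Generation")
  - pattern: 'java.lang<type=GarbageCollector, name=(.+)><>CollectionTime'
    name: jvm_gc_collection_time_ms
    labels:
      gc: "$1"
  # Cumulative number of collections per collector
  - pattern: 'java.lang<type=GarbageCollector, name=(.+)><>CollectionCount'
    name: jvm_gc_collection_count
    labels:
      gc: "$1"
```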
Node & Ring Health
Live, joining, leaving, and down node counts per datacenter. Token ring completeness and data ownership percentages. TigerOps alerts when the number of live nodes drops below your RF quorum threshold.
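One way to drive a quorum alert yourself is to export the failure-detector endpoint counts and compare them against your replication factor. The attribute names below (UpEndpointCount, DownEndpointCount on org.apache.cassandra.net:type=FailureDetector) are assumptions to verify against your Cassandra version, e.g. with jconsole:

```yaml
# tigerops-jmx.yaml additions — verify MBean attribute names first
rules:
  - pattern: 'org.apache.cassandra.net<type=FailureDetector><>UpEndpointCount'
    name: cassandra_up_endpoint_count
  - pattern: 'org.apache.cassandra.net<type=FailureDetector><>DownEndpointCount'
    name: cassandra_down_endpoint_count
```

An alert such as cassandra_down_endpoint_count > 0, or cassandra_up_endpoint_count falling below the quorum for your RF, can then be written in the same style as the latency rules.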
Dropped Messages & Timeouts
Dropped mutation, read, and counter mutation counts from the DroppedMessageMetrics MBeans. Timeout rates per operation type are tracked as metrics and trigger alerts when they exceed thresholds.
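Building on the cassandra_dropped_messages_rate metric defined in the exporter config in this guide, a dropped-mutation alert could be sketched as follows (the MUTATION scope value follows Cassandra's DroppedMessage MBean naming; any sustained non-zero rate is usually worth investigating):

```yaml
groups:
  - name: cassandra-dropped-messages   # illustrative group name
    rules:
      - alert: CassandraDroppedMutations
        # dropped mutations mean writes were shed under load
        expr: cassandra_dropped_messages_rate{message_type="MUTATION"} > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Mutations being dropped on {{ $labels.instance }}"
```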
Cross-DC Replication Lag
Hinted handoff delivery rates and pending hints per datacenter for multi-DC deployments. High hint counts indicate network partitions or node failures that could cause read inconsistency.
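Hint backlog can be exported with rules like the following in tigerops-jmx.yaml. This is a sketch: TotalHints and TotalHintsInProgress are Storage metrics in recent Cassandra versions, but verify the MBean names on yours:

```yaml
rules:
  # Cumulative hints written on this node (rate() this to watch growth)
  - pattern: 'org.apache.cassandra.metrics<type=Storage, name=TotalHints><>Count'
    name: cassandra_storage_total_hints
  # Hints currently being delivered to other replicas
  - pattern: 'org.apache.cassandra.metrics<type=Storage, name=TotalHintsInProgress><>Count'
    name: cassandra_storage_hints_in_progress
```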
Configure JMX Exporter
Add the JMX exporter agent and a scrape config to start collecting Cassandra metrics.
# Download the JMX Prometheus Exporter
wget https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.20.0/jmx_prometheus_javaagent-0.20.0.jar -O /etc/cassandra/jmx_prometheus_javaagent.jar
# cassandra-env.sh — add JVM agent flag
JVM_OPTS="$JVM_OPTS -javaagent:/etc/cassandra/jmx_prometheus_javaagent.jar=7070:/etc/cassandra/tigerops-jmx.yaml"
# /etc/cassandra/tigerops-jmx.yaml — JMX exporter config
startDelaySeconds: 0
ssl: false
lowercaseOutputName: true
lowercaseOutputLabelNames: true
rules:
  # Read/write latency (P50, P95, P99)
  - pattern: 'org.apache.cassandra.metrics<type=ClientRequest, scope=Read, name=Latency><>(50thPercentile|95thPercentile|99thPercentile)'
    name: cassandra_client_request_read_latency
    labels:
      quantile: "$1"
  - pattern: 'org.apache.cassandra.metrics<type=ClientRequest, scope=Write, name=Latency><>(50thPercentile|95thPercentile|99thPercentile)'
    name: cassandra_client_request_write_latency
    labels:
      quantile: "$1"
  # Compaction
  - pattern: 'org.apache.cassandra.metrics<type=Compaction, name=PendingTasks><>Value'
    name: cassandra_compaction_pending_tasks
  # Dropped messages
  - pattern: 'org.apache.cassandra.metrics<type=DroppedMessage, scope=(\w+), name=Dropped><>OneMinuteRate'
    name: cassandra_dropped_messages_rate
    labels:
      message_type: "$1"
# TigerOps Collector scrape config (otel-collector.yaml)
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: cassandra
          scrape_interval: 15s
          static_configs:
            - targets:
                - cassandra-node-1:7070
                - cassandra-node-2:7070
                - cassandra-node-3:7070

exporters:
  otlphttp:
    endpoint: https://ingest.tigerops.io/v1/metrics
    headers:
      Authorization: "Bearer ${TIGEROPS_API_KEY}"

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [otlphttp]

Common Questions
Which Cassandra versions are supported?
Apache Cassandra 3.11, 4.0, 4.1, and 5.0 are supported. DataStax Enterprise (DSE) 6.8+ is supported using the same JMX exporter approach. The JMX MBean names are consistent across these versions.
Do I need a Prometheus server, or can TigerOps scrape Cassandra directly?
The TigerOps Collector agent (based on OpenTelemetry Collector) can scrape the JMX exporter Prometheus endpoint directly without a separate Prometheus server. Install the collector on each Cassandra node or on a dedicated scrape host.
How does TigerOps monitor Cassandra without impacting performance?
JMX MBean reads are non-blocking and have negligible overhead. The JMX Exporter serves metrics on a separate port from Cassandra's native transport port. We recommend scraping at 15-second intervals for production clusters.
Can TigerOps detect when a Cassandra node is down?
Yes. TigerOps monitors the StorageService.LiveNodes, JoiningNodes, and UnreachableNodes MBeans. When a node transitions to unreachable, TigerOps fires an alert with the node IP, datacenter, and rack information.
Does TigerOps support ScyllaDB as well as Cassandra?
Yes. ScyllaDB exposes a Prometheus-compatible metrics endpoint natively at /metrics on port 9180. Point the TigerOps scrape config at this endpoint — no JMX exporter is required for ScyllaDB.
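For example, the ScyllaDB version of the collector scrape job in this guide is just a different port and no JMX agent (node hostnames are illustrative):

```yaml
# otel-collector.yaml — scrape ScyllaDB's native metrics endpoint
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: scylladb
          scrape_interval: 15s
          metrics_path: /metrics
          static_configs:
            - targets:
                - scylla-node-1:9180
                - scylla-node-2:9180
```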
Full Cassandra Observability via JMX Exporter
Read/write latency, compaction metrics, GC pause monitoring, and ring health — no application code changes.