Google Dataflow Integration
Pipeline throughput, worker utilization, and watermark lag monitoring for Dataflow. Catch streaming pipeline slowdowns early and track batch job progress with AI-powered insights.
How It Works
Create a GCP Service Account
Create a service account with the Monitoring Viewer and Dataflow Viewer roles. TigerOps uses these to collect pipeline metrics, job states, and worker performance data.
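For reference, the project's IAM policy should end up with bindings like the following (this is the YAML format used by gcloud projects get-iam-policy; the tigerops-collector account name is only an example):

# Example IAM policy bindings; substitute your own service account and project ID
bindings:
  - role: roles/monitoring.viewer
    members:
      - serviceAccount:tigerops-collector@your-gcp-project-id.iam.gserviceaccount.com
  - role: roles/dataflow.viewer
    members:
      - serviceAccount:tigerops-collector@your-gcp-project-id.iam.gserviceaccount.com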
Enable Required APIs
Enable the Cloud Monitoring API and Dataflow API in your GCP project. TigerOps will pull job status, worker metrics, and element throughput from Cloud Monitoring.
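If you manage project services declaratively, the two required APIs can be enabled with Config Connector Service resources like the sketch below; enabling them through the Cloud Console or gcloud works just as well:

# Optional: declarative API enablement via Config Connector
apiVersion: serviceusage.cnrm.cloud.google.com/v1beta1
kind: Service
metadata:
  name: monitoring.googleapis.com
---
apiVersion: serviceusage.cnrm.cloud.google.com/v1beta1
kind: Service
metadata:
  name: dataflow.googleapis.com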
Configure TigerOps Dataflow
Provide your GCP project ID and service account key file in TigerOps (see the configuration reference below). TigerOps auto-discovers running and recently completed Dataflow jobs and begins collecting metrics immediately.
Set Pipeline Health Alerts
Define watermark lag thresholds, worker CPU alerts, and job failure notifications. TigerOps fires alerts when streaming pipelines fall behind and predicts completion time for batch jobs.
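As a sketch, stricter thresholds could sit on top of the global defaults for a latency-sensitive pipeline. The overrides block here is hypothetical: it assumes TigerOps supports per-prefix alert scoping, which the configuration reference below does not document:

alerts:
  watermark_lag_seconds: 300             # global default
  overrides:                             # hypothetical per-pipeline scoping
    - job_name_prefix: prod-streaming-payments-
      watermark_lag_seconds: 60          # tighter SLO for a critical stream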
What You Get Out of the Box
Pipeline Throughput Monitoring
Track elements processed per second, bytes processed, and data freshness per pipeline stage. TigerOps alerts on throughput drops and correlates them with source lag, worker failures, or backpressure.
Watermark Lag Tracking
Monitor system lag and data freshness for streaming Dataflow pipelines. TigerOps fires an early alert when watermark lag grows beyond your configured SLO, before downstream consumers are affected.
Worker Utilization
Track CPU utilization, memory usage, and network throughput per Dataflow worker. TigerOps identifies workers approaching resource limits and correlates worker bottlenecks with pipeline slowdowns.
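Per-worker CPU and network figures come from the Compute Engine metrics of the worker VMs rather than from the Dataflow job metrics. A sketch of what the metrics list might include to cover them, assuming TigerOps can ingest Compute Engine metric paths alongside Dataflow ones:

metrics:
  # Standard Compute Engine metrics for the Dataflow worker VMs
  - compute.googleapis.com/instance/cpu/utilization
  - compute.googleapis.com/instance/network/received_bytes_count
  - compute.googleapis.com/instance/network/sent_bytes_count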
Job State & Failure Tracking
Monitor job state transitions (Running, Draining, Cancelling, Failed). TigerOps alerts on unexpected job failures, captures the error message, and correlates failures with upstream data source changes.
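Cloud Monitoring exposes a per-job failure indicator that fits this use case. A sketch of wiring it into the collection list, assuming TigerOps treats it like any other metric path:

metrics:
  # Reports 1 while the job is in a failed state, 0 otherwise
  - dataflow.googleapis.com/job/is_failed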
Autoscaling Visibility
Observe Dataflow horizontal autoscaling events — worker count changes, scale-up and scale-down triggers, and the pipeline conditions that drove scaling decisions.
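Scaling activity shows up in the job's current capacity metric. A sketch, again assuming the path can simply be added to the metrics list:

metrics:
  # Current vCPUs allocated to the job; steps up and down with autoscaling
  - dataflow.googleapis.com/job/current_num_vcpus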
Batch Job Progress Tracking
For batch Dataflow jobs, TigerOps tracks estimated completion percentage, remaining elements, and projected finish time. Alerts fire when batch jobs are predicted to exceed their expected duration SLO.
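A sketch of how a per-pipeline duration SLO might be declared. The batch_slos block is hypothetical: it assumes TigerOps accepts per-prefix expected durations on top of the global batch_job_overrun_percent threshold shown in the reference config:

batch_slos:                              # hypothetical block
  - job_name_prefix: prod-batch-etl-
    expected_duration_minutes: 90
    alert_at_projected_percent: 120      # alert when projected runtime hits 120% of the SLO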
Dataflow Integration Setup
Configure TigerOps to monitor your Dataflow jobs with GCP service account credentials.
# TigerOps Google Dataflow Integration
# Required IAM roles:
#   roles/monitoring.viewer
#   roles/dataflow.viewer
integrations:
  gcp_dataflow:
    project_id: "your-gcp-project-id"
    credentials_file: "./tigerops-sa-key.json"
    regions:
      - us-central1
      - us-east1
    # Job name prefixes to monitor (empty = all jobs)
    job_name_prefixes:
      - prod-streaming-
      - prod-batch-etl-
    scrape_interval: 60s
    metrics:
      - dataflow.googleapis.com/job/element_count
      - dataflow.googleapis.com/job/estimated_byte_count
      - dataflow.googleapis.com/job/data_watermark_age
      - dataflow.googleapis.com/job/system_lag
      - dataflow.googleapis.com/job/total_vcpu_time
      - dataflow.googleapis.com/job/total_memory_usage_time
    alerts:
      watermark_lag_seconds: 300
      system_lag_seconds: 60
      worker_cpu_utilization_percent: 85
      job_failure_count: 1
      batch_job_overrun_percent: 120

Common Questions
Does TigerOps support both Apache Beam streaming and batch Dataflow pipelines?
Yes. TigerOps monitors both streaming and batch Dataflow jobs. Streaming pipelines get watermark lag and system lag tracking, while batch jobs get progress percentage and estimated completion time monitoring.
What is watermark lag and why does TigerOps alert on it?
Watermark lag is the difference between the current processing time and the event-time watermark in a streaming pipeline. High watermark lag means your pipeline is falling behind the real-time data stream: if the watermark reads 12:00:00 while the wall clock reads 12:05:00, the pipeline is five minutes behind. TigerOps alerts when lag exceeds your SLO threshold, giving you time to scale workers or fix backpressure before downstream consumers notice stale data.
Can TigerOps correlate Dataflow pipeline performance with Pub/Sub input sources?
Yes. TigerOps automatically links Dataflow pipeline metrics with the Pub/Sub subscription metrics for pipelines reading from Pub/Sub. When a Pub/Sub backlog spike coincides with Dataflow watermark lag growth, TigerOps surfaces both signals in a single incident view.
How does TigerOps handle Dataflow Flex Templates and Classic Templates differently?
TigerOps monitors both Flex and Classic Template jobs through the same Cloud Monitoring API. Job-level metrics are collected uniformly regardless of template type. Flex Template-specific worker startup metrics are included in worker health monitoring.
Can TigerOps alert when a Dataflow job is not running when it should be?
Yes. TigerOps can monitor expected job schedules. If a regularly scheduled Dataflow job fails to start within a configurable window, TigerOps fires a missing-job alert. This is useful for batch ETL pipelines that run on a fixed schedule.
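A sketch of what such a schedule expectation could look like. The expected_schedules block is hypothetical, assuming TigerOps accepts cron expressions and a start window:

expected_schedules:                      # hypothetical block
  - job_name_prefix: prod-batch-etl-
    cron: "0 2 * * *"                    # expected to start daily at 02:00 UTC
    start_window_minutes: 30             # alert if no matching job starts in time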
Keep Your Data Pipelines Running On Time
Watermark lag alerts, worker health monitoring, and batch job SLO tracking for Dataflow. Connect in minutes.