GCP Service Account + Cloud Monitoring API

Google Dataflow Integration

Pipeline throughput, worker utilization, and watermark lag monitoring for Dataflow. Catch streaming pipeline slowdowns early and track batch job progress with AI-powered insights.

Setup

How It Works

01

Create a GCP Service Account

Create a service account with the Monitoring Viewer and Dataflow Viewer roles. TigerOps uses these to collect pipeline metrics, job states, and worker performance data.
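The service account and role bindings can be created with `gcloud`. The account name `tigerops-monitor` and the project ID are placeholders; substitute your own.

```shell
# Assumes gcloud is installed and authenticated.
# Replace your-gcp-project-id with your actual project ID.
gcloud iam service-accounts create tigerops-monitor \
    --project=your-gcp-project-id \
    --display-name="TigerOps Dataflow monitoring"

# Grant the two read-only roles TigerOps needs
for role in roles/monitoring.viewer roles/dataflow.viewer; do
  gcloud projects add-iam-policy-binding your-gcp-project-id \
      --member="serviceAccount:tigerops-monitor@your-gcp-project-id.iam.gserviceaccount.com" \
      --role="$role"
done
```

Both roles are read-only: TigerOps never needs permission to modify or cancel your Dataflow jobs.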

02

Enable Required APIs

Enable the Cloud Monitoring API and Dataflow API in your GCP project. TigerOps will pull job status, worker metrics, and element throughput from Cloud Monitoring.
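Both APIs can be enabled in one command. Enabling is idempotent, so this is safe to re-run.

```shell
# Enable the two APIs TigerOps reads from
gcloud services enable \
    monitoring.googleapis.com \
    dataflow.googleapis.com \
    --project=your-gcp-project-id
```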

03

Configure TigerOps Dataflow

Enter your GCP project credentials in TigerOps. TigerOps auto-discovers running and recently completed Dataflow jobs and begins collecting metrics immediately.
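The credentials file referenced in the configuration below is a JSON key for the service account. Assuming the `tigerops-monitor` account from step 01, you can generate it with:

```shell
# Create a JSON key and save it where the TigerOps config expects it
gcloud iam service-accounts keys create tigerops-sa-key.json \
    --iam-account=tigerops-monitor@your-gcp-project-id.iam.gserviceaccount.com
```

Treat this key like a password: store it outside version control and rotate it periodically.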

04

Set Pipeline Health Alerts

Define watermark lag thresholds, worker CPU alerts, and job failure notifications. TigerOps fires alerts when streaming pipelines fall behind and predicts completion time for batch jobs.

Capabilities

What You Get Out of the Box

Pipeline Throughput Monitoring

Track elements processed per second, bytes processed, and data freshness per pipeline stage. TigerOps alerts on throughput drops and correlates them with source lag, worker failures, or backpressure.
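To illustrate the kind of check a throughput-drop alert performs (this is a hypothetical sketch, not TigerOps's actual implementation), compare the recent mean throughput to a trailing baseline:

```python
def throughput_drop(samples, window=5, drop_threshold=0.5):
    """Return True if the mean of the last `window` throughput samples
    falls below `drop_threshold` x the mean of the earlier baseline.

    `samples` is a list of elements/sec readings, oldest first.
    """
    if len(samples) < 2 * window:
        return False  # not enough history to judge
    baseline = samples[:-window]
    recent = samples[-window:]
    baseline_mean = sum(baseline) / len(baseline)
    recent_mean = sum(recent) / len(recent)
    return recent_mean < drop_threshold * baseline_mean

# Steady pipeline, then throughput roughly halves after a worker failure
readings = [1000, 990, 1010, 1005, 995, 400, 380, 410, 390, 405]
print(throughput_drop(readings))  # True: recent mean ~397 < 50% of ~1000
```

A real detector would also smooth over expected dips (autoscaling events, checkpointing), which is why correlating the drop with worker and source signals matters.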

Watermark Lag Tracking

Monitor system lag and data freshness lag for streaming Dataflow pipelines. TigerOps fires early alerts when watermark lag grows beyond your configured SLO before downstream consumers are affected.
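The SLO comparison itself is simple; a minimal sketch (function names are illustrative, not a TigerOps API):

```python
from datetime import datetime, timedelta, timezone

def watermark_lag_seconds(event_time_watermark, now=None):
    """Watermark lag: how far the pipeline's event-time watermark
    trails wall-clock time, in seconds."""
    now = now or datetime.now(timezone.utc)
    return (now - event_time_watermark).total_seconds()

def breaches_slo(event_time_watermark, slo_seconds=300, now=None):
    """True when lag exceeds the configured SLO threshold."""
    return watermark_lag_seconds(event_time_watermark, now) > slo_seconds

now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
watermark = now - timedelta(minutes=7)  # pipeline is 7 minutes behind
print(watermark_lag_seconds(watermark, now))          # 420.0
print(breaches_slo(watermark, slo_seconds=300, now=now))  # True
```

The 300-second default here matches the `watermark_lag_seconds` threshold in the configuration example below.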

Worker Utilization

Track CPU utilization, memory usage, and network throughput per Dataflow worker. TigerOps identifies workers approaching resource limits and correlates worker bottlenecks with pipeline slowdowns.

Job State & Failure Tracking

Monitor job state transitions (Running, Draining, Cancelling, Failed). TigerOps alerts on unexpected job failures, captures the error message, and correlates failures with upstream data source changes.

Autoscaling Visibility

Observe Dataflow horizontal autoscaling events — worker count changes, scale-up and scale-down triggers, and the pipeline conditions that drove scaling decisions.

Batch Job Progress Tracking

For batch Dataflow jobs, TigerOps tracks estimated completion percentage, remaining elements, and projected finish time. Alerts fire when batch jobs are predicted to exceed their expected duration SLO.
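The simplest completion projection is a linear extrapolation from progress so far. A sketch of that arithmetic (illustrative only; TigerOps's actual model is not specified here):

```python
def projected_overrun_percent(elapsed_s, percent_complete, expected_duration_s):
    """Linearly extrapolate total runtime from progress so far and
    express it as a percentage of the expected duration SLO."""
    if percent_complete <= 0:
        raise ValueError("no progress yet; cannot extrapolate")
    projected_total_s = elapsed_s / (percent_complete / 100.0)
    return 100.0 * projected_total_s / expected_duration_s

# 40% done after 30 minutes on a job expected to take 1 hour:
# projected total = 75 min, i.e. 125% of the SLO
print(projected_overrun_percent(1800, 40, 3600))  # 125.0
```

With the `batch_job_overrun_percent: 120` threshold from the configuration below, this job would trigger an alert well before it actually overruns.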

Configuration

Dataflow Integration Setup

Configure TigerOps to monitor your Dataflow jobs with GCP service account credentials.

tigerops-dataflow.yaml
# TigerOps Google Dataflow Integration
# Required IAM roles:
#   roles/monitoring.viewer
#   roles/dataflow.viewer

integrations:
  gcp_dataflow:
    project_id: "your-gcp-project-id"
    credentials_file: "./tigerops-sa-key.json"
    regions:
      - us-central1
      - us-east1

    # Job name prefixes to monitor (empty = all jobs)
    job_name_prefixes:
      - prod-streaming-
      - prod-batch-etl-

    scrape_interval: 60s

    metrics:
      - dataflow.googleapis.com/job/element_count
      - dataflow.googleapis.com/job/estimated_byte_count
      - dataflow.googleapis.com/job/data_watermark_age
      - dataflow.googleapis.com/job/system_lag
      - dataflow.googleapis.com/job/total_vcpu_time
      - dataflow.googleapis.com/job/total_memory_usage_time

    alerts:
      watermark_lag_seconds: 300
      system_lag_seconds: 60
      worker_cpu_utilization_percent: 85
      job_failure_count: 1
      batch_job_overrun_percent: 120

FAQ

Common Questions

Does TigerOps support both Apache Beam streaming and batch Dataflow pipelines?

Yes. TigerOps monitors both streaming and batch Dataflow jobs. Streaming pipelines get watermark lag and system lag tracking, while batch jobs get progress percentage and estimated completion time monitoring.

What is watermark lag and why does TigerOps alert on it?

Watermark lag is the difference between the current processing time and the event-time watermark in a streaming pipeline. High watermark lag means your pipeline is falling behind the real-time data stream. TigerOps alerts when lag exceeds your SLO threshold, giving you time to scale workers or fix backpressure before downstream consumers notice stale data.

Can TigerOps correlate Dataflow pipeline performance with Pub/Sub input sources?

Yes. TigerOps automatically links Dataflow pipeline metrics with the Pub/Sub subscription metrics for pipelines reading from Pub/Sub. When a Pub/Sub backlog spike coincides with Dataflow watermark lag growth, TigerOps surfaces both signals in a single incident view.

How does TigerOps handle Dataflow Flex Templates and Classic Templates differently?

TigerOps monitors both Flex and Classic Template jobs through the same Cloud Monitoring API. Job-level metrics are collected uniformly regardless of template type. Flex Template-specific worker startup metrics are included in worker health monitoring.

Can TigerOps alert when a Dataflow job is not running when it should be?

Yes. TigerOps can monitor expected job schedules. If a regularly scheduled Dataflow job fails to start within a configurable window, TigerOps fires a missing-job alert. This is useful for batch ETL pipelines that run on a fixed schedule.
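The core of a missing-job check is a deadline test: did any job start inside the grace window after the expected start time? A minimal sketch (names are illustrative):

```python
from datetime import datetime, timedelta, timezone

def job_is_missing(expected_start, grace, started_jobs, now):
    """True if the grace window after `expected_start` has closed
    and no job in `started_jobs` began within that window."""
    deadline = expected_start + grace
    if now < deadline:
        return False  # still within the window; don't alert yet
    return not any(expected_start <= t <= deadline for t in started_jobs)

expected = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)  # nightly 02:00 ETL
grace = timedelta(minutes=15)
now = datetime(2024, 1, 1, 2, 30, tzinfo=timezone.utc)

print(job_is_missing(expected, grace, [], now))  # True: nothing started
on_time = [datetime(2024, 1, 1, 2, 5, tzinfo=timezone.utc)]
print(job_is_missing(expected, grace, on_time, now))  # False
```

Note the check deliberately stays quiet until the grace window closes, so a job that starts a few minutes late never fires a false alert.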

Get Started

Keep Your Data Pipelines Running On Time

Watermark lag alerts, worker health monitoring, and batch job SLO tracking for Dataflow. Connect in minutes.