Cloud / API integration + webhook

Railway Integration

Monitor service deployment metrics, resource utilization, and environment health across your Railway infrastructure. Catch deployment crashes and memory exhaustion before they impact your users.

Setup

How It Works

01

Connect via Railway API Token

Generate a Railway API token in your account settings and add it to TigerOps. The integration immediately discovers all projects, environments, and services across your Railway account via the Railway GraphQL API.
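
As a rough illustration of what token-based discovery involves, the sketch below builds (but does not send) a GraphQL request that lists projects. The endpoint URL and the query's field names are assumptions for illustration and should be checked against Railway's API documentation; `build_discovery_request` is a hypothetical helper, not part of TigerOps.

```python
import json
import urllib.request

# Assumed endpoint for Railway's public GraphQL API; verify before use.
RAILWAY_GRAPHQL_URL = "https://backboard.railway.app/graphql/v2"

# Illustrative query; the exact field names are an assumption.
DISCOVERY_QUERY = """
query {
  projects {
    edges {
      node { id name }
    }
  }
}
"""


def build_discovery_request(api_token: str) -> urllib.request.Request:
    """Build a POST request carrying the query and a bearer token."""
    body = json.dumps({"query": DISCOVERY_QUERY}).encode("utf-8")
    return urllib.request.Request(
        RAILWAY_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Nothing is sent here; the request object is only constructed.
request = build_discovery_request("example-token")
```

Passing `request` to `urllib.request.urlopen` would perform the call; keeping construction separate makes the token handling easy to inspect.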

02

Configure Deployment Webhooks

Add the TigerOps webhook URL to your Railway project settings. Deployment lifecycle events — queued, building, deploying, succeeded, and crashed — are captured and correlated with metric changes in real time.
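
To make the lifecycle events concrete, here is a minimal sketch of how a receiver might validate an incoming deployment event. The payload keys (`status`, `service`, `environment`, `timestamp`) are hypothetical; the real webhook body may be shaped differently.

```python
from dataclasses import dataclass

# Lifecycle states named in the step above.
KNOWN_STATES = {"queued", "building", "deploying", "succeeded", "crashed"}


@dataclass(frozen=True)
class DeployEvent:
    service: str
    environment: str
    state: str
    timestamp: str


def parse_deploy_event(payload: dict) -> DeployEvent:
    """Validate a webhook payload (hypothetical shape) and normalize its state."""
    state = str(payload.get("status", "")).lower()
    if state not in KNOWN_STATES:
        raise ValueError(f"unrecognized deployment state: {state!r}")
    return DeployEvent(
        service=payload["service"],
        environment=payload["environment"],
        state=state,
        timestamp=payload["timestamp"],
    )


event = parse_deploy_event(
    {
        "status": "CRASHED",
        "service": "api",
        "environment": "production",
        "timestamp": "2024-01-01T12:00:00Z",
    }
)
```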

03

Set Resource & Deployment SLOs

Define per-service CPU and memory thresholds. TigerOps alerts when Railway services approach their resource allocation limits and flags deployment crash patterns across your environments.
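
The threshold logic can be sketched as a simple two-level check. This is an illustration of the idea, not TigerOps' internal evaluation; `classify` is a hypothetical helper, and the numbers are example values.

```python
def classify(value: float, warning: float, critical: float) -> str:
    """Return 'critical', 'warning', or 'ok' for one metric reading."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


# Example readings checked against example thresholds.
cpu_status = classify(75.0, warning=70, critical=90)        # CPU percent
memory_status = classify(485.0, warning=400, critical=480)  # memory in MB
idle_status = classify(12.0, warning=70, critical=90)
```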

04

Correlate Across Services & Environments

TigerOps maps Railway service dependency graphs and correlates deployment failures with upstream service metrics — identifying whether a crash was caused by the service itself or a degraded dependency.
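
One way to picture this correlation is a walk over the dependency graph from the crashed service, collecting any upstream service that is currently degraded. The sketch below assumes a hypothetical `depends_on` mapping and a set of degraded service names; an empty result would point at the service itself.

```python
from collections import deque


def degraded_upstreams(
    service: str,
    depends_on: dict[str, list[str]],
    degraded: set[str],
) -> list[str]:
    """Breadth-first search for degraded services that `service` depends on."""
    seen = {service}
    queue = deque([service])
    found = []
    while queue:
        current = queue.popleft()
        for dep in depends_on.get(current, []):
            if dep in seen:
                continue
            seen.add(dep)
            if dep in degraded:
                found.append(dep)
            queue.append(dep)
    return found


# Hypothetical service graph for illustration.
graph = {"api": ["auth", "db"], "auth": ["db"], "worker": ["queue"]}
api_causes = degraded_upstreams("api", graph, degraded={"db"})
worker_causes = degraded_upstreams("worker", graph, degraded={"db"})
```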

Capabilities

What You Get Out of the Box

Deployment Success Rate Tracking

Track deployment success rates per service and per environment. A time series of deployment outcomes, paired with build log summaries, shows when a service starts failing builds or crashing on startup.
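
A per-service, per-environment success rate is a straightforward aggregation. The sketch below assumes deployments arrive as `(service, environment, outcome)` tuples, which is a hypothetical record shape chosen for illustration.

```python
from collections import defaultdict


def success_rates(deployments):
    """Map (service, environment) to the fraction of deployments that succeeded."""
    totals = defaultdict(lambda: [0, 0])  # key -> [succeeded, total]
    for service, environment, outcome in deployments:
        counts = totals[(service, environment)]
        counts[1] += 1
        if outcome == "succeeded":
            counts[0] += 1
    return {key: ok / total for key, (ok, total) in totals.items()}


rates = success_rates(
    [
        ("api", "production", "succeeded"),
        ("api", "production", "crashed"),
        ("api", "staging", "succeeded"),
        ("api", "production", "succeeded"),
    ]
)
```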

CPU & Memory Utilization

Real-time CPU and memory usage per Railway service replica. Alert when services approach their Railway plan memory limits and predict memory growth trends before OOM kills take down your service.
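
Memory growth prediction can be approximated with a least-squares line through recent samples. This is a minimal sketch of that idea, not necessarily the model TigerOps uses; `minutes_until_limit` and the sample format are assumptions.

```python
def minutes_until_limit(samples, limit_mb):
    """Estimate minutes from the last sample until memory reaches limit_mb.

    samples is a list of (minute, memory_mb) pairs. Returns None when
    there is too little data or memory is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_m - slope * mean_t
    fitted_now = slope * samples[-1][0] + intercept
    return max(0.0, (limit_mb - fitted_now) / slope)


# Memory rising by 10 MB per minute toward an example 512 MB limit.
eta = minutes_until_limit([(0, 300), (1, 310), (2, 320), (3, 330)], limit_mb=512)
```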

Environment Health Dashboard

Unified health view per Railway environment (production, staging, PR environments). Track the number of healthy vs. crashed services per environment and alert when production environment health degrades.

Network Egress Monitoring

Monitor outbound network traffic per service and across your Railway project. Alert on unexpected egress spikes that could indicate runaway data exports or misconfigured services sending unnecessary data.
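
A simple way to flag an egress spike is to compare each sample with the average of the samples just before it. The window size and multiplier below are illustrative defaults, and `egress_spikes` is a hypothetical helper.

```python
def egress_spikes(samples, window=5, factor=3.0):
    """Return indexes whose value exceeds factor times the trailing-window mean."""
    spikes = []
    for i in range(window, len(samples)):
        baseline = sum(samples[i - window:i]) / window
        if baseline > 0 and samples[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Outbound megabytes per interval; index 6 is far above its baseline.
spike_indexes = egress_spikes([10, 12, 11, 9, 10, 11, 95, 12])
```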

Volume & Persistent Storage Health

Track Railway volume storage utilization for services using persistent volumes. Alert when storage approaches Railway's volume limits before your service runs out of disk space and crashes.

AI Root Cause Analysis

When Railway services crash or degrade, TigerOps AI correlates the deployment timestamp, resource utilization spike, dependency service health, and build log error patterns to surface the root cause.

Configuration

Railway API & Webhook Setup

Connect your Railway project to TigerOps via API token and deployment webhooks.

tigerops-railway-config.yaml
# TigerOps Railway integration configuration

integrations:
  railway:
    # Railway API token (Account Settings > Tokens)
    apiToken: ${RAILWAY_API_TOKEN}
    # Poll interval for resource metrics
    pollInterval: 60s

    # Project and environment scope (empty = all)
    projects: []
    environments:
      - production
      - staging

    # Per-service resource SLOs
    serviceSLOs:
      "api":
        cpuWarningPct: 70
        cpuCriticalPct: 90
        memoryWarningMB: 400
        memoryCriticalMB: 480      # assumes a 512 MB memory limit for this service
        deployFailureAlertAfter: 2  # alert after 2 consecutive failures
      "worker":
        memoryWarningMB: 300
        deployFailureAlertAfter: 1

# Add deployment webhook in Railway dashboard:
# Project Settings > Webhooks > Add Webhook
# URL: https://ingest.atatus.net/webhooks/railway/deploy
# Events: DEPLOY_STARTED, DEPLOY_SUCCESS, DEPLOY_FAILED, CRASH_RESTART

# TigerOps annotates dashboards with Railway deployment events
# and alerts on crash restart loops automatically.
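
The `deployFailureAlertAfter` setting describes a consecutive-failure rule. As a rough sketch of that rule, assuming outcomes are recorded oldest first as plain strings (a hypothetical representation):

```python
def should_alert(outcomes, alert_after):
    """True when the most recent alert_after deployments all failed."""
    streak = 0
    for outcome in reversed(outcomes):
        if outcome != "failed":
            break
        streak += 1
    return streak >= alert_after


api_alert = should_alert(["succeeded", "failed", "failed"], alert_after=2)
api_quiet = should_alert(["failed", "succeeded", "failed"], alert_after=2)
worker_alert = should_alert(["failed", "succeeded", "failed"], alert_after=1)
```

Counting from the newest deployment backwards means a single success resets the streak, which matches the "consecutive failures" wording in the config comment.
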

FAQ

Common Questions

How does TigerOps collect resource metrics from Railway services?

TigerOps uses the Railway Metrics API (available via the Railway GraphQL API) to collect CPU and memory utilization per service replica. Deployment events are collected via webhooks for real-time correlation with metric anomalies.

Can TigerOps monitor Railway PR environments separately from production?

Yes. Railway PR environments are discovered automatically and labeled with their environment name. You can set separate (less strict) alert thresholds for PR environments while keeping tight SLOs on production, preventing PR deployments from generating unnecessary alerts.

Does TigerOps support Railway services with multiple replicas?

Yes. TigerOps aggregates metrics across all replicas of a Railway service and also tracks per-replica health. If one replica is consuming significantly more CPU than others, TigerOps surfaces the imbalance as a potential issue.
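
Replica imbalance can be illustrated by comparing each replica with the median across replicas. The 2x ratio and the replica names below are assumptions for the example.

```python
from statistics import median


def imbalanced_replicas(cpu_by_replica, ratio=2.0):
    """Return replicas whose CPU exceeds ratio times the median, sorted by name."""
    if not cpu_by_replica:
        return []
    typical = median(cpu_by_replica.values())
    if typical <= 0:
        return []
    return sorted(r for r, cpu in cpu_by_replica.items() if cpu > ratio * typical)


hot = imbalanced_replicas({"replica-1": 20.0, "replica-2": 22.0, "replica-3": 75.0})
```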

Can TigerOps alert me when a Railway cron service misses its schedule?

Yes. For Railway cron services, TigerOps monitors deployment execution frequency. If a cron service hasn't executed within the expected window of its configured schedule, TigerOps fires a missed execution alert with the expected vs. actual execution time.
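
A missed-execution check boils down to comparing the current time with the expected run time plus some tolerance. The sketch assumes a fixed interval between runs and a hypothetical five-minute grace period; real cron expressions would need a schedule parser.

```python
from datetime import datetime, timedelta


def missed_execution(last_run, interval, now, grace=timedelta(minutes=5)):
    """Return the expected run time if it was missed, otherwise None."""
    expected = last_run + interval
    if now > expected + grace:
        return expected
    return None


last_run = datetime(2024, 1, 1, 0, 0)
missed = missed_execution(last_run, timedelta(hours=1), now=datetime(2024, 1, 1, 1, 10))
on_time = missed_execution(last_run, timedelta(hours=1), now=datetime(2024, 1, 1, 1, 3))
```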

How do I correlate Railway deploys with application error rate changes?

TigerOps annotates all time series dashboards with Railway deployment markers. When you see an error rate spike, click the deployment marker to see which service was deployed, the commit message, and the author — enabling instant deploy correlation without switching tools.
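
The same correlation can be expressed in code: given a spike time, look up the most recent deployment that precedes it within a lookback window. The tuple layout and the 15-minute window are assumptions for illustration.

```python
from bisect import bisect_right


def deploy_before(spike_ts, deploys, lookback_s=900):
    """Return the latest deploy at or before spike_ts within lookback_s seconds.

    deploys is a list of (unix_ts, service, commit_message) sorted by unix_ts.
    """
    times = [d[0] for d in deploys]
    i = bisect_right(times, spike_ts)
    if i == 0:
        return None
    candidate = deploys[i - 1]
    if spike_ts - candidate[0] <= lookback_s:
        return candidate
    return None


deploys = [
    (1000, "worker", "bump deps"),
    (5000, "api", "add caching layer"),
]
suspect = deploy_before(5300, deploys)
unrelated = deploy_before(9000, deploys)
```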

Get Started

Stop Discovering Railway Crashes After Your Users Do

Deployment crash alerts, memory limit monitoring, and deploy correlation. Connect via API token in 2 minutes.