Containers | Helm chart

Karpenter Integration

Monitor node provisioning latency, consolidation events, and capacity allocation for Karpenter. Optimize scaling responsiveness and cloud spend with AI-powered analytics.

Setup

How It Works

01

Enable Karpenter Metrics Endpoint

Karpenter exposes Prometheus metrics on port 8000 by default. TigerOps deploys a ServiceMonitor that targets the namespace Karpenter runs in and automatically scrapes provisioning, disruption, and controller reconciliation metrics.
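To make the scrape step concrete, here is a minimal Python sketch of what reading such an endpoint involves: filtering Prometheus text-format lines down to Karpenter series. The sample payload and the metric names in it are illustrative assumptions, not output captured from Karpenter and not TigerOps code.

```python
def parse_karpenter_samples(text: str, prefix: str = "karpenter_") -> dict[str, float]:
    """Parse Prometheus text-format lines into {series: value} for one prefix.

    Simplifying assumption: sample lines carry no trailing timestamp.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")
        if series.startswith(prefix):
            samples[series] = float(value)
    return samples


# Hypothetical payload; real metric names vary by Karpenter version.
SAMPLE = """\
# HELP karpenter_nodeclaims_created Illustrative counter.
# TYPE karpenter_nodeclaims_created counter
karpenter_nodeclaims_created{nodepool="default"} 12
go_goroutines 87
"""

print(parse_karpenter_samples(SAMPLE))
```

Only the `karpenter_`-prefixed series survives the filter; the Go runtime metric is dropped.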

02

Deploy TigerOps via Helm

Add the TigerOps Helm repository, install the chart, and enable the Karpenter integration in your values file. The chart creates the RBAC needed to read NodePool, EC2NodeClass, and NodeClaim CRDs so that metrics can be enriched with NodePool context.

03

Configure NodePool & Cloud Provider Mapping

Map Karpenter NodePools to cost centers and team labels in TigerOps. This enables per-team node cost attribution and capacity utilization tracking across different instance families and spot/on-demand mixes.
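The mapping idea can be sketched in a few lines of Python. This is an illustration only: the NodePool names, label values, and hourly prices are hypothetical, and the grouping logic is an assumption about how such attribution could work, not TigerOps internals.

```python
from collections import defaultdict

# Hypothetical mapping of NodePool name -> labels used for attribution.
NODEPOOL_LABELS = {
    "general": {"team": "platform", "environment": "prod"},
    "batch-spot": {"team": "data", "environment": "prod"},
}


def cost_by_team(nodes: list[dict]) -> dict[str, float]:
    """Sum hourly node cost per team; unmapped NodePools go to 'unattributed'."""
    totals: dict[str, float] = defaultdict(float)
    for node in nodes:
        labels = NODEPOOL_LABELS.get(node["nodepool"], {})
        totals[labels.get("team", "unattributed")] += node["hourly_usd"]
    return dict(totals)


# Hypothetical node inventory with made-up prices.
nodes = [
    {"nodepool": "general", "hourly_usd": 0.5},
    {"nodepool": "general", "hourly_usd": 0.25},
    {"nodepool": "batch-spot", "hourly_usd": 0.125},
    {"nodepool": "legacy", "hourly_usd": 1.0},
]
print(cost_by_team(nodes))
```

Keeping an explicit "unattributed" bucket makes NodePools that lack a team label visible instead of silently dropping their cost.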

04

Set Provisioning & Cost Alerts

Define alert thresholds for provisioning latency, consolidation frequency, and over-provisioning ratios. TigerOps correlates Karpenter decisions with pod scheduling events to explain why nodes were provisioned or disrupted.
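As a rough illustration of threshold evaluation, the Python sketch below compares observed values against the threshold names used in the `alerts` block of the sample values file on this page. The comparison rule (breach when observed is greater than or equal to the limit) is an assumption; the chart's actual semantics are not documented here.

```python
# Names and values mirror the `alerts` block of the sample values file.
THRESHOLDS = {
    "provisioningLatencyP95Seconds": 120,
    "consolidationDisruptionRatePct": 20,
    "cloudProviderThrottlingPerMin": 5,
    "nodeClaimFailedCount": 1,
}


def breached(observed: dict[str, float],
             thresholds: dict[str, float] = THRESHOLDS) -> list[str]:
    """Return the sorted names whose observed value meets or exceeds the limit.

    Assumption: ">=" comparison. Metrics absent from `observed` never breach.
    """
    return sorted(
        name for name, limit in thresholds.items()
        if name in observed and observed[name] >= limit
    )


# Hypothetical observations for one evaluation window.
observed = {
    "provisioningLatencyP95Seconds": 95,
    "cloudProviderThrottlingPerMin": 5,
    "nodeClaimFailedCount": 2,
}
print(breached(observed))
```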

Capabilities

What You Get Out of the Box

Node Provisioning Latency Tracking

Track time-to-ready for each node from NodeClaim creation through the cloud provider API call, instance launch, kubelet registration, and pod scheduling. Provisioning latency is reported as P50/P95/P99 per NodePool and instance type.
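For readers who want to see what a per-NodePool percentile summary means in practice, here is a small Python sketch using the nearest-rank method. The choice of nearest-rank and the sample latencies are assumptions for illustration; TigerOps may compute percentiles differently (for example from histogram buckets).

```python
import math
from collections import defaultdict


def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_summary(samples: list[tuple[str, float]]) -> dict[str, dict[str, float]]:
    """Group (nodepool, seconds) samples and report p50/p95/p99 per NodePool."""
    by_pool: dict[str, list[float]] = defaultdict(list)
    for pool, seconds in samples:
        by_pool[pool].append(seconds)
    return {
        pool: {f"p{p}": percentile(vals, p) for p in (50, 95, 99)}
        for pool, vals in by_pool.items()
    }


# Hypothetical time-to-ready samples in seconds.
samples = [("default", 30), ("default", 45), ("default", 50),
           ("default", 60), ("default", 200)]
print(latency_summary(samples))
```

With only five samples, one slow launch (200 s) determines both P95 and P99, which is why tail percentiles need a reasonable sample count before alerting on them.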

Consolidation & Disruption Event Monitoring

Monitor Karpenter consolidation decisions: underutilized node deletions, node replacements, and disruption budget enforcement. Track disruption rates per NodePool and correlate them with workload pod disruption budgets.
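One plausible definition of a per-NodePool disruption rate is the share of a pool's nodes disrupted within an observation window. The Python sketch below uses that definition, which is an assumption rather than the documented TigerOps formula; the pool names and counts are hypothetical.

```python
def disruption_rate_pct(disrupted_nodes: int, total_nodes: int) -> float:
    """Percentage of a NodePool's nodes disrupted in the observation window."""
    if total_nodes == 0:
        return 0.0
    return 100.0 * disrupted_nodes / total_nodes


def pools_over_limit(stats: dict[str, tuple[int, int]],
                     limit_pct: float = 20.0) -> list[str]:
    """NodePools whose rate meets or exceeds limit_pct.

    The default of 20 mirrors consolidationDisruptionRatePct in the sample values.
    """
    return sorted(
        pool for pool, (disrupted, total) in stats.items()
        if disruption_rate_pct(disrupted, total) >= limit_pct
    )


# Hypothetical (disrupted, total) node counts per NodePool.
stats = {"general": (1, 10), "batch-spot": (3, 8), "empty": (0, 0)}
print(pools_over_limit(stats))
```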

Capacity Allocation & Bin Packing Metrics

Measure node allocatable capacity vs. requested capacity per NodePool. Track spot vs. on-demand ratios, instance family distribution, and availability zone spread as Karpenter makes placement decisions.
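The two ratios described above can be sketched as follows. The `capacity_type` field stands in for the value of Karpenter's `karpenter.sh/capacity-type` node label; the node records themselves and the CPU-only view are simplifying assumptions.

```python
from collections import Counter


def requested_ratio(requested_cpu: float, allocatable_cpu: float) -> float:
    """Fraction of allocatable CPU that pods have requested (0.0 if none allocatable)."""
    return requested_cpu / allocatable_cpu if allocatable_cpu else 0.0


def capacity_type_share(nodes: list[dict]) -> dict[str, float]:
    """Share of nodes per capacity type, e.g. spot vs. on-demand."""
    counts = Counter(node["capacity_type"] for node in nodes)
    total = sum(counts.values())
    return {kind: count / total for kind, count in counts.items()}


# Hypothetical nodes from one NodePool.
nodes = [
    {"capacity_type": "spot"},
    {"capacity_type": "spot"},
    {"capacity_type": "spot"},
    {"capacity_type": "on-demand"},
]
print(requested_ratio(6.0, 8.0), capacity_type_share(nodes))
```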

NodeClaim Lifecycle & Drift Detection

Track NodeClaim state transitions (pending, launched, registered, ready, terminating) and drift detection events where nodes deviate from their NodePool spec and are marked for replacement.
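A lifecycle tracker of this kind amounts to checking observed transitions against an allowed set. The Python sketch below uses the five states listed above; the transition rules (one step forward, or a jump to terminating from any earlier state) are an illustrative assumption, not Karpenter's actual state machine.

```python
from typing import Optional

# States as listed above; the ordering rules below are an assumption.
ORDER = ["pending", "launched", "registered", "ready", "terminating"]


def is_valid_transition(current: str, new: str) -> bool:
    """Allow one step forward, or a jump to 'terminating' from any earlier state."""
    i, j = ORDER.index(current), ORDER.index(new)
    return j == i + 1 or (new == "terminating" and j > i)


def first_invalid(history: list[str]) -> Optional[tuple[str, str]]:
    """Return the first (from, to) pair that breaks the rules, or None."""
    for current, new in zip(history, history[1:]):
        if not is_valid_transition(current, new):
            return (current, new)
    return None


print(first_invalid(["pending", "launched", "registered", "ready", "terminating"]))
print(first_invalid(["pending", "registered"]))
```

A NodeClaim that skips a state, as in the second call, is the kind of anomaly worth surfacing, since it usually means an event was missed rather than that the node really skipped launch.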

Cloud Provider API Call Metrics

Monitor EC2 (or other cloud provider) API call rates, throttling events, and error rates from Karpenter. Identify when cloud provider rate limits are causing provisioning delays during scaling surges.
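To show how throttling during a scaling surge can be detected, this Python sketch buckets throttling-event timestamps by minute and keeps the minutes that reach a limit. The event list is hypothetical, and the default limit of 5 mirrors `cloudProviderThrottlingPerMin` in the sample values file; the bucketing approach itself is an assumption.

```python
from collections import Counter


def throttled_minutes(event_epochs: list[int], limit_per_min: int = 5) -> dict[int, int]:
    """Bucket throttling events (epoch seconds) by minute.

    Returns {minute_start_epoch: count} for minutes with at least
    limit_per_min events, ordered by time.
    """
    per_minute = Counter(ts - ts % 60 for ts in event_epochs)
    return {
        minute: count
        for minute, count in sorted(per_minute.items())
        if count >= limit_per_min
    }


# Hypothetical throttling events: a burst in the first minute, then a trickle.
events = [0, 5, 10, 20, 30, 59, 60, 61, 125]
print(throttled_minutes(events))
```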

AI Scaling Pattern Analysis

TigerOps AI learns your workload scaling patterns and identifies suboptimal Karpenter configurations, such as overly aggressive consolidation, misconfigured disruption budgets, or NodePool constraints that cause avoidable provisioning delays.

Configuration

TigerOps ServiceMonitor for Karpenter

Configure metric scraping and NodePool cost attribution for Karpenter monitoring.

tigerops-karpenter-values.yaml
# TigerOps Helm values for Karpenter integration
# helm repo add tigerops https://charts.atatus.net
# helm install tigerops tigerops/tigerops -f tigerops-karpenter-values.yaml

global:
  apiKey: "${TIGEROPS_API_KEY}"
  remoteWriteEndpoint: https://ingest.atatus.net/api/v1/write

karpenter:
  enabled: true
  namespace: kube-system   # namespace where Karpenter runs (commonly kube-system or karpenter)
  metricsPort: 8000
  scrapeInterval: 15s

  # Enrich metrics with NodePool and NodeClaim CRD data
  crdEnrichment:
    enabled: true
    resources:
      - nodepools.karpenter.sh
      - nodeclaims.karpenter.sh
      - ec2nodeclasses.karpenter.k8s.aws

  # Cost attribution per NodePool
  costAttribution:
    enabled: true
    cloudProvider: aws
    region: us-east-1
    # Map NodePool to team/cost-center
    nodepoolLabels:
      - team
      - environment

  # Alert thresholds (provisioning latency, disruption, throttling, failures)
  alerts:
    provisioningLatencyP95Seconds: 120
    consolidationDisruptionRatePct: 20
    cloudProviderThrottlingPerMin: 5
    nodeClaimFailedCount: 1

  # Spot interruption signal integration (AWS)
  spotInterruption:
    enabled: true
    eventBridgeIntegration: true

FAQ

Common Questions

Which Karpenter versions does TigerOps support?

TigerOps supports Karpenter v0.32 and later, which covers both the v1beta1 and the stable v1 APIs (NodePool, EC2NodeClass, NodeClaim). Earlier versions that use the Provisioner and AWSNodeTemplate CRDs are also supported via the legacy metrics path.

How does TigerOps calculate node provisioning latency end-to-end?

TigerOps correlates the karpenter_nodeclaims_created timestamp with the time at which the Ready condition becomes true on the Kubernetes Node object. This gives the end-to-end latency from Karpenter's decision through the cloud API call, instance boot from the AMI, kubelet startup, and node registration.
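The correlation described here reduces to a timestamp difference. A minimal Python sketch, assuming both timestamps are available as ISO 8601 strings (the function name and example values are hypothetical):

```python
from datetime import datetime


def provisioning_latency_seconds(claim_created: str, node_ready: str) -> float:
    """Seconds from NodeClaim creation to the node's Ready transition.

    Inputs are ISO 8601 strings with an explicit offset. On Python versions
    before 3.11, replace a trailing "Z" with "+00:00" before calling this.
    """
    start = datetime.fromisoformat(claim_created)
    end = datetime.fromisoformat(node_ready)
    if end < start:
        raise ValueError("node_ready precedes claim_created")
    return (end - start).total_seconds()


# Hypothetical timestamps for one node.
print(provisioning_latency_seconds("2024-05-01T12:00:00+00:00",
                                   "2024-05-01T12:01:35+00:00"))
```

Rejecting negative intervals guards against clock skew or mismatched NodeClaim and Node pairs polluting the latency distribution.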

Can TigerOps help identify when Karpenter consolidation is disrupting too many pods?

Yes. TigerOps tracks pod disruption events correlated with Karpenter consolidation decisions. If your disruption budget is too permissive, TigerOps surfaces the consolidation-to-disruption ratio and recommends tighter budget configurations.

Does TigerOps support Karpenter on non-AWS cloud providers?

Yes. TigerOps collects the core Karpenter controller metrics that are cloud-provider-agnostic. For AWS, it adds EC2-specific metrics (Spot interruption signals, instance type costs). Azure AKS and GCP GKE Karpenter providers are also supported.

How does TigerOps integrate Karpenter metrics with cost monitoring?

TigerOps enriches NodeClaim metrics with cloud provider instance pricing data. This enables per-NodePool cost attribution, spot savings rate calculation, and idle capacity cost identification, all correlated with your actual workload resource requests.
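The two cost figures mentioned above can be expressed as simple formulas. The Python sketch below shows one reasonable formulation; the prices are made up, the idle cost considers CPU only, and neither function is taken from TigerOps.

```python
def spot_savings_rate(spot_hourly: float, on_demand_hourly: float) -> float:
    """Fraction saved versus the on-demand price of the same instance type."""
    return 1 - spot_hourly / on_demand_hourly


def idle_capacity_cost(node_hourly: float, requested_cpu: float,
                       allocatable_cpu: float) -> float:
    """Hourly cost of the CPU share that no pod has requested.

    Simplification (assumption): CPU only; memory is ignored.
    """
    idle_fraction = max(0.0, 1 - requested_cpu / allocatable_cpu)
    return node_hourly * idle_fraction


# Hypothetical prices: spot at 0.25 USD/h versus 1.00 USD/h on-demand.
print(spot_savings_rate(0.25, 1.0))
# Hypothetical node: 0.50 USD/h, 6 of 8 allocatable vCPUs requested.
print(idle_capacity_cost(0.5, 6, 8))
```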

Get Started

Understand Every Karpenter Provisioning Decision and Its Cost

Provisioning latency, consolidation rates, spot savings, and cloud API health, all correlated with your workloads in real time.