Karpenter Integration
Monitor node provisioning latency, consolidation events, and capacity allocation for Karpenter. Optimize scaling responsiveness and cloud spend with AI-powered analytics.
How It Works
Enable Karpenter Metrics Endpoint
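Karpenter exposes Prometheus metrics on port 8000 in the configuration shown here; the port is configurable, so confirm it in your Karpenter Helm values. TigerOps deploys a ServiceMonitor that targets the namespace where Karpenter runs (karpenter or kube-system) and automatically scrapes provisioning, disruption, and controller reconciliation metrics.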
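For readers who want to see what such a scrape target looks like, here is a minimal sketch of a Prometheus Operator ServiceMonitor. It is not the manifest TigerOps generates. The resource name, the karpenter namespace, the app.kubernetes.io/name label, and the http-metrics port name are all assumptions; check them against your own Karpenter Service (for example with kubectl get svc -n karpenter -o yaml) before relying on them.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tigerops-karpenter          # hypothetical name
  namespace: karpenter              # assumption: Karpenter runs in this namespace
spec:
  namespaceSelector:
    matchNames:
      - karpenter                   # use kube-system if Karpenter is installed there
  selector:
    matchLabels:
      app.kubernetes.io/name: karpenter   # assumption: label set on the Karpenter Service
  endpoints:
    - port: http-metrics            # assumption: named Service port that serves metrics
      path: /metrics
      interval: 15s                 # matches scrapeInterval in the Helm values
```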
Deploy TigerOps via Helm
Add the TigerOps Helm chart and configure the Karpenter integration. The chart creates the RBAC needed to read NodePool, EC2NodeClass, and NodeClaim resources, which TigerOps uses to enrich metrics with NodePool context.
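The read-only access described above can be pictured as a ClusterRole along the following lines. This is a sketch for reviewing what permissions to expect, not the chart's actual output. The role name is hypothetical, and the verbs are an assumption about what read access would need.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tigerops-karpenter-reader   # hypothetical name
rules:
  # Core Karpenter resources
  - apiGroups: ["karpenter.sh"]
    resources: ["nodepools", "nodeclaims"]
    verbs: ["get", "list", "watch"]  # read-only; no create, update, or delete
  # AWS-specific node class
  - apiGroups: ["karpenter.k8s.aws"]
    resources: ["ec2nodeclasses"]
    verbs: ["get", "list", "watch"]
```

A ClusterRoleBinding to the TigerOps service account would complete the picture; it is omitted here because the service account name is not given in this guide.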
Configure NodePool & Cloud Provider Mapping
Map Karpenter NodePools to cost centers and team labels in TigerOps. This enables per-team node cost attribution and capacity utilization tracking across different instance families and spot/on-demand mixes.
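As an illustration of the mapping, the sketch below shows a Karpenter v1 NodePool that carries team and environment labels. The NodePool name, label values, and EC2NodeClass name are hypothetical. Placing the labels under spec.template.metadata.labels (so that they propagate to the nodes) is an assumption about where TigerOps reads them.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: payments-general            # hypothetical NodePool name
spec:
  template:
    metadata:
      labels:
        team: payments              # hypothetical value for the "team" label
        environment: production     # hypothetical value for the "environment" label
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default               # hypothetical EC2NodeClass name
      requirements:
        # Allow both capacity types so the spot/on-demand mix can be tracked
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
```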
Set Provisioning & Cost Alerts
Define alert thresholds for provisioning latency, consolidation frequency, and over-provisioning ratios. TigerOps correlates Karpenter decisions with pod scheduling events to explain why nodes were provisioned or disrupted.
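For teams that also keep Prometheus-native rules, a threshold of this kind might be expressed as a PrometheusRule like the sketch below. It is not how TigerOps evaluates alerts internally. Karpenter metric names have changed between releases, so treat karpenter_cloudprovider_errors_total and karpenter_nodeclaims_disrupted_total as assumptions to verify against your controller's /metrics output. The disruption threshold is a placeholder.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: karpenter-sketch-alerts     # hypothetical name
  namespace: karpenter              # assumption: same namespace as Karpenter
spec:
  groups:
    - name: karpenter.sketch
      rules:
        # More than 5 cloud provider errors per minute, sustained for 10 minutes
        - alert: KarpenterCloudProviderErrors
          expr: sum(rate(karpenter_cloudprovider_errors_total[5m])) * 60 > 5
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Karpenter cloud provider calls are failing repeatedly
        # Placeholder threshold: more than 10 NodeClaims disrupted per NodePool in an hour
        - alert: KarpenterHighDisruptionRate
          expr: sum by (nodepool) (increase(karpenter_nodeclaims_disrupted_total[1h])) > 10
          labels:
            severity: info
          annotations:
            summary: NodePool {{ $labels.nodepool }} is seeing frequent disruptions
```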
What You Get Out of the Box
Node Provisioning Latency Tracking
Track time-to-ready for each node from NodeClaim creation through cloud provider API call, instance launch, kubelet registration, and pod scheduling. TigerOps reports P50/P95/P99 provisioning latency per NodePool and instance type.
Consolidation & Disruption Event Monitoring
Monitor Karpenter consolidation decisions — underutilized node deletions, node replacements, and disruption budget enforcement. Track disruption rates per NodePool and correlate with workload pod disruption budgets.
Capacity Allocation & Bin Packing Metrics
Measure node allocatable capacity vs. requested capacity per NodePool. Track spot vs. on-demand ratios, instance family distribution, and availability zone spread as Karpenter makes placement decisions.
NodeClaim Lifecycle & Drift Detection
Track NodeClaim state transitions (pending, launched, registered, ready, terminating) and drift detection events where nodes deviate from their NodePool spec and are marked for replacement.
Cloud Provider API Call Metrics
Monitor EC2 (or other cloud provider) API call rates, throttling events, and error rates from Karpenter. Identify when cloud provider rate limits are causing provisioning delays during scaling surges.
AI Scaling Pattern Analysis
TigerOps AI learns your workload scaling patterns and identifies suboptimal Karpenter configurations — overly aggressive consolidation, misconfigured disruption budgets, or NodePool constraints causing avoidable provisioning delays.
TigerOps ServiceMonitor for Karpenter
Configure metric scraping and NodePool cost attribution for Karpenter monitoring.
# TigerOps Helm values for Karpenter integration
# helm repo add tigerops https://charts.atatus.net
# helm install tigerops tigerops/tigerops -f values.yaml
global:
  apiKey: "${TIGEROPS_API_KEY}"
  remoteWriteEndpoint: https://ingest.atatus.net/api/v1/write
karpenter:
  enabled: true
  namespace: kube-system # or karpenter
  metricsPort: 8000
  scrapeInterval: 15s
  # Enrich metrics with NodePool and NodeClaim CRD data
  crdEnrichment:
    enabled: true
    resources:
      - nodepools.karpenter.sh
      - nodeclaims.karpenter.sh
      - ec2nodeclasses.karpenter.k8s.aws
  # Cost attribution per NodePool
  costAttribution:
    enabled: true
    cloudProvider: aws
    region: us-east-1
    # Map NodePool to team/cost-center
    nodepoolLabels:
      - team
      - environment
  # Provisioning latency alerting
  alerts:
    provisioningLatencyP95Seconds: 120
    consolidationDisruptionRatePct: 20
    cloudProviderThrottlingPerMin: 5
    nodeClaimFailedCount: 1
  # Spot interruption signal integration (AWS)
  spotInterruption:
    enabled: true
    eventBridgeIntegration: true
Common Questions
Which Karpenter versions does TigerOps support?
TigerOps supports Karpenter v0.32 and later, including the stable v1 API (NodePool, EC2NodeClass, NodeClaim). Earlier versions using Provisioner and AWSNodeTemplate CRDs are also supported via the legacy metrics path.
How does TigerOps calculate node provisioning latency end-to-end?
TigerOps correlates the time each NodeClaim is created (signaled by the karpenter_nodeclaims_created metric) with the time the corresponding Kubernetes node object reports the Ready condition. This yields end-to-end latency from Karpenter's decision through the cloud API call, instance boot, kubelet startup, and node registration.
Can TigerOps help identify when Karpenter consolidation is disrupting too many pods?
Yes. TigerOps tracks pod disruption events correlated with Karpenter consolidation decisions. If your disruption budget is too permissive, TigerOps surfaces the consolidation-to-disruption ratio and recommends tighter budget configurations.
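To make "tighter budget configurations" concrete, the fragment below sketches the disruption section of a Karpenter v1 NodePool. The NodePool name, the percentages, the schedule, and the consolidation delay are illustrative values, not recommendations produced by TigerOps. The required spec.template section is omitted for brevity.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: payments-general            # hypothetical NodePool name
spec:
  # spec.template omitted for brevity; a real NodePool requires it
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m            # illustrative: wait before consolidating a node
    budgets:
      # Illustrative: disrupt at most 10% of this NodePool's nodes at a time
      - nodes: "10%"
      # Illustrative: block voluntary disruption for 8 hours starting 09:00 UTC on weekdays
      - nodes: "0"
        schedule: "0 9 * * mon-fri"
        duration: 8h
```

When several budgets are active at once, Karpenter applies the most restrictive one, so the weekday window above overrides the 10% budget while it is in effect.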
Does TigerOps support Karpenter on non-AWS cloud providers?
Yes. TigerOps collects the core Karpenter controller metrics that are cloud-provider-agnostic. For AWS, it adds EC2-specific metrics (Spot interruption signals, instance type costs). Azure AKS and GCP GKE Karpenter providers are also supported.
How does TigerOps integrate Karpenter metrics with cost monitoring?
TigerOps enriches NodeClaim metrics with cloud provider instance pricing data. This enables per-NodePool cost attribution, spot savings rate calculation, and idle capacity cost identification — all correlated with your actual workload resource requests.
Understand Every Karpenter Provisioning Decision and Its Cost
Provisioning latency, consolidation rates, spot savings, and cloud API health — correlated with your workloads in real time.