Containerd Integration
Monitor low-level container runtime metrics, lifecycle events, and cgroup resource usage for containerd. Full visibility into image pulls, sandbox health, and shim stability.
How It Works
Install the TigerOps Agent
Deploy the TigerOps agent on each node running containerd. The agent auto-connects to the containerd gRPC socket at /run/containerd/containerd.sock and begins collecting runtime events and container state immediately.
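Before installing the agent, it can help to confirm that the socket is actually present on the node. The following is a minimal pre-flight sketch, not part of the TigerOps agent; the function name is hypothetical and the only path used is the one quoted above.

```python
import os
import stat

# Default socket path taken from the text above.
CONTAINERD_SOCKET = "/run/containerd/containerd.sock"


def is_unix_socket(path: str) -> bool:
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, or no permission to traverse a parent directory.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    found = is_unix_socket(CONTAINERD_SOCKET)
    status = "socket found" if found else "not a socket or not reachable"
    print(f"{CONTAINERD_SOCKET}: {status}")
```

A True result only shows that the socket file exists; the agent still needs permission to open it, which usually means running as root.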
Enable CRI Metrics Endpoint
Configure containerd to expose its metrics endpoint by setting address in the [metrics] section of /etc/containerd/config.toml (for example, address = "127.0.0.1:1338"), then restart containerd. TigerOps scrapes this Prometheus-compatible endpoint every 15 seconds.
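To check that the endpoint is live once the address is set, you can fetch and parse it yourself. This is a rough sketch under two assumptions: the address is 127.0.0.1:1338 as in the agent config on this page, and containerd serves the metrics under the /v1/metrics path. The parser is deliberately simple and assumes samples carry no trailing timestamp.

```python
import urllib.request

# Assumption: address matches the agent config on this page, and containerd
# serves its Prometheus metrics under /v1/metrics.
METRICS_URL = "http://127.0.0.1:1338/v1/metrics"


def parse_prometheus_text(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Simplification: assumes no trailing timestamp, so the value is always
    the last space-separated field on the line.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        if not series:
            continue
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return samples


def scrape(url: str = METRICS_URL, timeout: float = 5.0) -> dict[str, float]:
    """Fetch the endpoint once and return the parsed samples."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))
```

Calling scrape() on a node should return a non-empty dictionary when the endpoint is enabled; a connection error usually means the [metrics] address is unset or containerd was not restarted.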
Configure cgroup Scraping
Enable cgroup metrics collection in the TigerOps agent config (cgroup v1 and v2 are auto-detected) to capture per-container CPU throttling, memory pressure, and block I/O stats directly from the Linux cgroup hierarchy.
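As an illustration of what reading CPU throttling from the cgroup hierarchy involves, here is a small sketch for cgroup v2. It relies on the kernel's cpu.stat fields nr_periods and nr_throttled; the function names are hypothetical and this is not the agent's own implementation.

```python
from pathlib import Path


def parse_flat_keyed(text: str) -> dict[str, int]:
    """Parse a cgroup v2 flat-keyed file (one "key value" pair per line)."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_pct(cpu_stat: dict[str, int]) -> float:
    """Percent of enforcement periods in which the cgroup was throttled."""
    periods = cpu_stat.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no CPU limit set, or no period has elapsed yet
    return 100.0 * cpu_stat.get("nr_throttled", 0) / periods


def read_throttle_pct(cgroup_dir: str) -> float:
    """Read cpu.stat from a cgroup v2 directory and return throttle percent."""
    text = Path(cgroup_dir, "cpu.stat").read_text()
    return throttle_pct(parse_flat_keyed(text))
```

Both counters are cumulative since the cgroup was created, so a monitoring loop would normally compute the percentage from the difference between two consecutive reads rather than from the raw totals.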
Set Runtime Alerts
Define alert thresholds for container restart rates, image pull failures, and sandbox creation latency. TigerOps correlates runtime anomalies with Kubernetes pod events for full context.
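The threshold check itself is easy to reason about. The sketch below reuses the threshold names and values from the alerts block of the agent config on this page; treating a value equal to the limit as a breach is an assumption, as is the function name.

```python
# Names and values mirror the `alerts` block of the agent config on this
# page. How the agent evaluates them internally is an assumption.
THRESHOLDS = {
    "containerRestartRatePerMin": 5,
    "imagePullLatencySeconds": 30,
    "cgroupMemoryUsagePct": 90,
    "sandboxCreationFailures": 1,
}


def breached(observed: dict, thresholds: dict = THRESHOLDS) -> list[str]:
    """Return, sorted, the thresholds that the observed values meet or exceed."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if name in observed and observed[name] >= limit
    )


if __name__ == "__main__":
    sample = {"containerRestartRatePerMin": 7, "imagePullLatencySeconds": 12}
    print(breached(sample))
```

Metrics that are missing from the observed values are skipped rather than treated as zero, so a node that has not reported yet does not silently pass or fail a check.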
What You Get Out of the Box
Container Lifecycle Event Tracking
Track container create, start, stop, and delete events in real time. TigerOps records lifecycle event latency and surfaces containers with abnormal churn rates or repeated crash loops.
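One simple way to spot a crash loop from lifecycle events is to count task exits per container inside a sliding window. The class below is an illustrative sketch, not TigerOps's detection logic; the window length and exit limit are hypothetical defaults, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class ChurnTracker:
    """Count task exits per container inside a sliding time window."""

    def __init__(self, window_s: float = 60.0, max_exits: int = 5):
        self.window_s = window_s
        self.max_exits = max_exits
        self._exits: dict[str, deque] = defaultdict(deque)

    def record_exit(self, container_id: str, ts: float) -> bool:
        """Record an exit at time `ts`; return True if the container is looping."""
        exits = self._exits[container_id]
        exits.append(ts)
        # Drop exits that have fallen out of the window.
        while exits and ts - exits[0] > self.window_s:
            exits.popleft()
        return len(exits) >= self.max_exits
```

Feeding each exit event's container ID and timestamp into record_exit gives a per-container flag that clears on its own once the exits age out of the window.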
Image Pull Latency Monitoring
Monitor image pull duration, layer download throughput, and snapshot unpack time. Identify slow registry responses and large image layers that degrade pod startup performance.
cgroup Resource Metrics
Per-container CPU quota usage, throttle percentage, memory working set, cache, and swap from cgroup v2. Detect containers approaching resource limits before OOMKill events occur.
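To make the "approaching resource limits" check concrete, the sketch below computes memory usage as a percentage of the limit from the cgroup v2 files memory.current and memory.max. The function names are hypothetical; the file names and the literal "max" value for an unlimited cgroup come from the kernel's cgroup v2 interface.

```python
from pathlib import Path
from typing import Optional


def memory_usage_pct(current_text: str, max_text: str) -> Optional[float]:
    """Percent of the memory limit in use, or None when no limit is set.

    cgroup v2 writes the literal string "max" to memory.max when the
    cgroup is unlimited.
    """
    limit = max_text.strip()
    if limit == "max":
        return None
    limit_bytes = int(limit)
    if limit_bytes == 0:
        return None  # avoid dividing by a zero limit
    return 100.0 * int(current_text.strip()) / limit_bytes


def read_memory_usage_pct(cgroup_dir: str) -> Optional[float]:
    """Read memory.current and memory.max from a cgroup v2 directory."""
    base = Path(cgroup_dir)
    return memory_usage_pct(
        (base / "memory.current").read_text(),
        (base / "memory.max").read_text(),
    )
```

Note that memory.current includes page cache, so a working-set figure would typically subtract inactive_file from memory.stat before comparing against the limit.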
Sandbox & Shim Health
Monitor containerd-shim process counts, sandbox creation success rates, and pause container health. Alert on shim crashes that indicate runtime instability at the node level.
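A shim process count can be approximated by scanning /proc. This is a sketch of one possible approach rather than the agent's method; the function name and the proc_root parameter are hypothetical, and the parameter exists mainly so the function can be pointed at a test directory.

```python
import os


def count_shims(proc_root: str = "/proc") -> int:
    """Count processes whose command name starts with "containerd-shim".

    /proc/<pid>/comm is truncated to 15 characters, so a binary named
    "containerd-shim-runc-v2" appears as "containerd-shim"; a prefix
    match covers both forms.
    """
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are PIDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # process exited between listdir and open
        if comm.startswith("containerd-shim"):
            count += 1
    return count
```

A count that drops sharply between two samples while the number of running pods stays the same is the kind of signal that would point to shim crashes.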
Snapshot & Content Store Metrics
Track overlay snapshot creation latency, content store utilization, and garbage collection duration. Identify disk pressure caused by orphaned snapshots and un-GCed image layers.
AI Runtime Anomaly Detection
TigerOps AI baselines container startup times, image pull rates, and resource usage per workload. Automatic alerts fire when runtime behavior deviates from established patterns.
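The page does not describe the model TigerOps uses. As a stand-in that shows the general idea of baselining, here is a plain z-score check: a value is flagged when it lies more than k sample standard deviations from the mean of recent history. The function name and the default k are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], value: float, k: float = 3.0) -> bool:
    """Flag `value` if it is more than k standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu  # flat history: any change is a deviation
    return abs(value - mu) > k * sigma


if __name__ == "__main__":
    startup_seconds = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8]
    print(is_anomalous(startup_seconds, 4.0))
```

A per-workload baseline would keep a separate history for each workload, since a startup time that is normal for a large image can be an outlier for a small one.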
TigerOps Agent Config for Containerd
Configure the TigerOps agent to scrape containerd metrics and cgroup stats on each node.
# TigerOps Agent: containerd integration config
# Place at /etc/tigerops/agent.yaml on each node
containerd:
  enabled: true
  socket: /run/containerd/containerd.sock
  # Namespaces to monitor (empty = all namespaces)
  namespaces:
    - k8s.io
    - default
  # Scrape the built-in Prometheus metrics endpoint
  metricsEndpoint:
    address: "127.0.0.1:1338"
    scrapeInterval: 15s
  # Lifecycle event subscription
  events:
    enabled: true
    topics:
      - /tasks/start
      - /tasks/exit
      - /containers/create
      - /containers/delete
      - /images/pull
  # cgroup resource collection (v1 or v2 auto-detected)
  cgroups:
    enabled: true
    scrapeInterval: 15s
    # Collect per-container breakdown
    perContainer: true
remoteWrite:
  endpoint: https://ingest.atatus.net/api/v1/write
  bearerToken: "${TIGEROPS_API_KEY}"
# Alert thresholds
alerts:
  containerRestartRatePerMin: 5
  imagePullLatencySeconds: 30
  cgroupMemoryUsagePct: 90
  sandboxCreationFailures: 1
Common Questions
Does TigerOps support both cgroup v1 and cgroup v2 for containerd monitoring?
Yes. The TigerOps agent detects the cgroup version automatically. For cgroup v1, it reads the per-controller directories under /sys/fs/cgroup (such as cpu, memory, and blkio). For cgroup v2 (unified hierarchy), it reads from the single unified mount point, where block I/O is reported by the io controller. Both paths produce equivalent CPU, memory, and block I/O metric sets.
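A common way to tell the two versions apart is to look for the cgroup.controllers file, which exists at the root of a cgroup v2 hierarchy. The sketch below shows that check; it is not necessarily how the agent detects the version, and the function name is hypothetical.

```python
import os


def detect_cgroup_version(root: str = "/sys/fs/cgroup") -> int:
    """Return 2 if `root` is a cgroup v2 unified hierarchy, else 1.

    The root of a v2 hierarchy contains a cgroup.controllers file; a v1
    layout has per-controller subdirectories instead. Hybrid setups that
    mount v2 elsewhere are reported as 1 by this check.
    """
    marker = os.path.join(root, "cgroup.controllers")
    return 2 if os.path.isfile(marker) else 1


if __name__ == "__main__":
    print(f"cgroup v{detect_cgroup_version()}")
```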
How does TigerOps connect to the containerd gRPC socket?
When running in Kubernetes, the TigerOps agent mounts the containerd socket (/run/containerd/containerd.sock) via a hostPath volume; when installed on the host, it opens the socket directly. It uses the containerd gRPC API to subscribe to events and query namespace-scoped container state.
Can I monitor containerd on nodes not running Kubernetes?
Yes. TigerOps supports standalone containerd deployments. Install the agent directly on the host and point it at the containerd socket; it discovers all namespaces and containers with no Kubernetes context required.
How are containerd metrics correlated with Kubernetes pod metrics?
TigerOps joins containerd container IDs with Kubernetes pod metadata via the CRI API. This maps low-level runtime metrics (shim health, cgroup usage) to pod names, namespaces, and deployments so you can trace runtime issues to specific workloads.
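The join can be pictured as a lookup from container ID to the labels that the kubelet attaches to CRI-managed containers. The sketch below assumes those labels have already been fetched into a dictionary; the function name and the metric field names in the example are hypothetical, and this is not TigerOps's actual pipeline.

```python
# Label keys the kubelet sets on CRI-managed containers.
POD_NAME = "io.kubernetes.pod.name"
POD_NAMESPACE = "io.kubernetes.pod.namespace"
CONTAINER_NAME = "io.kubernetes.container.name"


def enrich(metrics: dict, labels_by_id: dict) -> list[dict]:
    """Attach pod metadata to per-container metrics, keyed by container ID."""
    rows = []
    for cid, values in sorted(metrics.items()):
        labels = labels_by_id.get(cid, {})
        rows.append({
            "container_id": cid,
            "pod": labels.get(POD_NAME),
            "namespace": labels.get(POD_NAMESPACE),
            "container": labels.get(CONTAINER_NAME),
            **values,  # runtime metrics for this container
        })
    return rows


if __name__ == "__main__":
    metrics = {"abc123": {"throttle_pct": 12.5}}
    labels = {"abc123": {POD_NAME: "web-0", POD_NAMESPACE: "prod", CONTAINER_NAME: "nginx"}}
    print(enrich(metrics, labels))
```

Containers without Kubernetes labels, such as those in a standalone containerd namespace, keep their metrics and get None for the pod fields.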
What containerd versions are supported?
TigerOps supports containerd 1.5 and later. Versions 1.6+ are fully supported through the built-in Prometheus metrics endpoint, which containerd serves at /v1/metrics. For older supported versions (1.5.x), the agent falls back to direct cgroup and event subscription collection.
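The version rule above can be expressed as a small decision function. This is an illustrative sketch only: the mode names are hypothetical labels, not agent settings, and the version string is assumed to look like the output of containerd --version (for example "v1.7.13").

```python
import re


def parse_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as "v1.7.13" or "1.6.0-rc.1"."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if not match:
        raise ValueError(f"unrecognized containerd version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def collection_mode(version: str) -> str:
    """Map a containerd version to the collection path described above."""
    major_minor = parse_version(version)
    if major_minor >= (1, 6):
        return "metrics-endpoint"  # hypothetical label: full support
    if major_minor >= (1, 5):
        return "cgroup-and-events"  # hypothetical label: fallback path
    return "unsupported"
```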
Get Full Visibility Into Your Container Runtime Layer
Lifecycle events, cgroup metrics, and image pull telemetry — all correlated with your Kubernetes workloads. Deploy in minutes.