Azure AKS Integration
Monitor Azure Kubernetes Service clusters with full node, workload, and network telemetry. Get predictive alerts and AI root cause analysis before incidents impact your workloads.
How It Works
Create a Service Principal
Register an Azure Service Principal with the Monitoring Reader role scoped to your AKS resource group. TigerOps uses this identity to pull metrics from Azure Monitor without cluster-level credentials.
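A minimal sketch of this step with the Azure CLI, assuming `az` is installed and you are logged in to the right tenant. The principal name `tigerops-monitoring` and the function names are placeholders chosen for illustration, not names TigerOps requires.

```shell
# Build the resource-group scope that the role assignment is limited to.
# $1 = subscription ID, $2 = resource group containing the AKS cluster
rg_scope() {
  printf '/subscriptions/%s/resourceGroups/%s' "$1" "$2"
}

# Create a Service Principal that holds only Monitoring Reader on that scope.
# "tigerops-monitoring" is a placeholder name used for this sketch.
create_monitoring_sp() {
  az ad sp create-for-rbac \
    --name "tigerops-monitoring" \
    --role "Monitoring Reader" \
    --scopes "$(rg_scope "$1" "$2")"
}

# Usage: create_monitoring_sp "$AZURE_SUBSCRIPTION_ID" "$AKS_RESOURCE_GROUP"
```

The command prints the new identity's appId, password, and tenant as JSON. Treat the password as a secret and store it accordingly.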
Enable Azure Monitor Diagnostics
Turn on Diagnostic Settings for your AKS cluster and route kube-apiserver, kube-controller-manager, and node metrics to a Log Analytics workspace or Event Hub.
Deploy the TigerOps Collector
Install the TigerOps Helm chart into your AKS cluster. The collector scrapes the kubelet, cAdvisor, and kube-state-metrics endpoints and forwards the data to the TigerOps remote-write endpoint.
Configure Alert Policies
Set node CPU, memory, and pod restart thresholds. TigerOps AI correlates pod eviction events with node resource pressure and surfaces root cause automatically.
What You Get Out of the Box
Node-Level Resource Metrics
Track CPU, memory, disk I/O, and network throughput per node and node pool. Spot resource imbalances across system and user pools before workloads are evicted.
Workload Health Tracking
Deployment, DaemonSet, StatefulSet, and Job status with pod restart counts, OOMKill events, and container crash loop detection across every namespace.
Network Telemetry
Ingress controller request rates, pod-to-pod latency via eBPF, and Azure CNI plugin metrics including IP allocation exhaustion warnings.
Control Plane Visibility
API server request latency, etcd health, scheduler queue depth, and controller-manager reconciliation lag pulled directly from Azure Monitor Diagnostics.
Node Pool Autoscaler Insights
Cluster Autoscaler decisions, scale-out trigger reasons, and node provisioning duration. Understand why your cluster scaled and whether the scale event resolved the pressure.
AI Root Cause Correlation
When a pod enters CrashLoopBackOff, TigerOps AI links the event to node memory pressure, upstream throttling, or config changes deployed in the same window.
ARM Template — Diagnostic Settings
Apply this template to enable Azure Monitor Diagnostic Settings on your AKS cluster. It streams control plane logs and platform metrics to the Log Analytics workspace passed in as workspaceId.
// ARM template: AKS Diagnostic Settings for TigerOps
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "aksClusterName": { "type": "string" },
    "workspaceId": { "type": "string" }
  },
  "resources": [
    {
      "type": "Microsoft.ContainerService/managedClusters/providers/diagnosticSettings",
      "apiVersion": "2021-05-01-preview",
      "name": "[concat(parameters('aksClusterName'), '/Microsoft.Insights/tigerops')]",
      "properties": {
        "workspaceId": "[parameters('workspaceId')]",
        "logs": [
          { "category": "kube-apiserver", "enabled": true },
          { "category": "kube-controller-manager", "enabled": true },
          { "category": "kube-scheduler", "enabled": true },
          { "category": "kube-audit", "enabled": true }
        ],
        "metrics": [
          { "category": "AllMetrics", "enabled": true, "retentionPolicy": { "days": 30, "enabled": true } }
        ]
      }
    }
  ]
}
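One way to apply the template above is a resource-group deployment with the Azure CLI. This is a sketch under assumptions: the function names and the template file path are placeholders, and the deployment targets the resource group that contains the AKS cluster. The `workspaceId` parameter takes the workspace's full ARM resource ID, not its GUID, so a small helper builds it.

```shell
# Build the full ARM resource ID of a Log Analytics workspace.
# $1 = subscription ID, $2 = workspace resource group, $3 = workspace name
workspace_resource_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.OperationalInsights/workspaces/%s' "$1" "$2" "$3"
}

# Deploy the template to the resource group that holds the AKS cluster.
# $1 = resource group, $2 = cluster name, $3 = workspace resource ID,
# $4 = path to the saved template file (the file name is up to you)
deploy_diagnostics() {
  az deployment group create \
    --resource-group "$1" \
    --template-file "$4" \
    --parameters aksClusterName="$2" workspaceId="$3"
}

# Usage:
#   ws_id=$(workspace_resource_id "$AZURE_SUBSCRIPTION_ID" "$WS_RG" "$WS_NAME")
#   deploy_diagnostics "$AKS_RESOURCE_GROUP" "$AKS_CLUSTER_NAME" "$ws_id" ./aks-diagnostics.json
```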
# TigerOps collector Helm install
helm repo add tigerops https://charts.atatus.net
helm install tigerops-collector tigerops/collector \
  --namespace tigerops --create-namespace \
  --set remoteWrite.endpoint=https://ingest.atatus.net/api/v1/write \
  --set remoteWrite.bearerToken="${TIGEROPS_API_KEY}" \
  --set cluster.name="${AKS_CLUSTER_NAME}" \
  --set azure.subscriptionId="${AZURE_SUBSCRIPTION_ID}"
Common Questions
Does TigerOps support AKS with Azure Active Directory integration?
Yes. TigerOps authenticates via a Service Principal with Monitoring Reader permissions. It does not need AAD pod identity or Workload Identity for metric collection, though Workload Identity is supported if you prefer it for the collector pod.
Can I monitor multiple AKS clusters in one TigerOps workspace?
Yes. Each cluster is identified by a cluster label injected at collection time. You can filter dashboards by cluster, region, or environment tag and set independent alert policies per cluster.
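As a rough sketch, onboarding several clusters can reuse the Helm install shown earlier, once per kubeconfig context, with a different `cluster.name` each time. The context names and label values below are hypothetical, and the helper function is not part of the TigerOps chart. The chart values are the ones used in the install command on this page.

```shell
# Install the collector into one cluster.
# $1 = kubeconfig context, $2 = value for the cluster label
install_collector() {
  helm install tigerops-collector tigerops/collector \
    --kube-context "$1" \
    --namespace tigerops --create-namespace \
    --set remoteWrite.endpoint=https://ingest.atatus.net/api/v1/write \
    --set remoteWrite.bearerToken="${TIGEROPS_API_KEY:-}" \
    --set cluster.name="$2" \
    --set azure.subscriptionId="${AZURE_SUBSCRIPTION_ID:-}"
}

# Example with two hypothetical contexts, each given its own label:
#   install_collector aks-prod-weu prod-weu
#   install_collector aks-stage-weu stage-weu
```

Keeping the label value unique per cluster is what makes per-cluster dashboard filters and alert policies possible.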
How does TigerOps handle AKS node pool upgrades?
The TigerOps collector uses a DaemonSet that reschedules automatically as nodes drain and reprovision during an upgrade. Metric continuity is maintained and upgrade-related pod disruptions are annotated on your dashboards.
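To watch the rescheduling yourself during an upgrade, a small check like the following can help. It is a sketch that assumes the collector runs in the `tigerops` namespace used by the Helm install on this page. It lists every DaemonSet in that namespace rather than assuming a specific DaemonSet name, and the function names are illustrative.

```shell
# List every DaemonSet in the tigerops namespace as "name desired ready".
daemonset_status() {
  kubectl -n tigerops get daemonsets \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.desiredNumberScheduled}{" "}{.status.numberReady}{"\n"}{end}'
}

# Keep only the DaemonSets that have fewer ready pods than desired.
lagging_daemonsets() {
  awk '$3 < $2 { print $1 ": " $3 "/" $2 " ready" }'
}

# Usage during an upgrade: daemonset_status | lagging_daemonsets
```

An empty result means every node that should run a collector pod currently has a ready one.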
What is the overhead of the TigerOps collector on AKS?
The collector DaemonSet requests 50m CPU and 64Mi memory per node by default. It scrapes metrics every 15 seconds and batches remote-write payloads to minimise egress cost from Azure.
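For capacity planning, the per-node defaults scale linearly with node count. The sketch below assumes one collector pod per node at the default requests quoted above. The function name is illustrative.

```shell
# Total default collector requests for a cluster of N nodes
# (50m CPU and 64Mi memory per node).
collector_overhead() {
  nodes=$1
  echo "$((nodes * 50))m CPU, $((nodes * 64))Mi memory"
}

collector_overhead 20   # prints: 1000m CPU, 1280Mi memory
```

A 20-node cluster therefore reserves about one CPU core and 1.25 GiB of memory in total for the collector.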
Does TigerOps support Windows node pools in AKS?
Yes. The TigerOps Windows agent collects CPU, memory, and disk metrics from Windows node pools. Container-level metrics are available for Windows containers running on those nodes via the Windows Container runtime metrics endpoint.
Full Visibility Into Your AKS Clusters
Node telemetry, workload health, and AI-powered root cause analysis. Deploy in 5 minutes.