Azure Cosmos DB Integration
Track request units, latency percentiles, and partition health for Azure Cosmos DB. Detect hot partitions and RU throttling before they impact your globally distributed application.
How It Works
Assign Monitoring Reader Role
Grant the TigerOps Service Principal the Monitoring Reader role on your Cosmos DB account. No database-level credentials are required — TigerOps reads only from Azure Monitor.
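If the service principal already exists, the role assignment described above could be sketched as follows. The helper names `cosmos_account_scope` and `grant_monitoring_reader` are illustrative, not part of any TigerOps tooling, and the account-level scope is an assumption based on the step above.

```shell
#!/bin/bash
# Build the ARM resource ID of a Cosmos DB account (hypothetical helper).
cosmos_account_scope() {
  local subscription_id="$1" resource_group="$2" account="$3"
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.DocumentDB/databaseAccounts/%s\n' \
    "${subscription_id}" "${resource_group}" "${account}"
}

# Assign Monitoring Reader to an existing service principal.
# Requires an authenticated az session; nothing is called until you invoke it.
grant_monitoring_reader() {
  local app_id="$1" scope="$2"
  az role assignment create \
    --assignee "${app_id}" \
    --role "Monitoring Reader" \
    --scope "${scope}"
}

# Example (placeholder values):
# grant_monitoring_reader "<client-id>" "$(cosmos_account_scope "<subscription-id>" my-resource-group my-cosmos-account)"
```

Scoping the assignment to the account keeps access narrow. If the Log Analytics workspace sits outside that scope, it would presumably need its own assignment.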
Enable Diagnostic Settings
Configure Diagnostic Settings on your Cosmos DB account to stream DataPlaneRequests, PartitionKeyStatistics, and QueryRuntimeStatistics to a Log Analytics workspace.
Connect TigerOps Workspace
Enter your Azure Tenant ID, Client ID, Client Secret, and Subscription ID in TigerOps. The integration validates access and begins ingesting RU consumption and latency metrics within minutes.
Set RU Budget Alerts
Define per-container RU/s thresholds. TigerOps alerts when normalized RU consumption approaches 100% of your provisioned throughput and forecasts when you will need to scale.
What You Get Out of the Box
Request Unit Consumption
Per-database, per-container, and per-operation RU/s consumption versus provisioned throughput. Identify expensive queries, over-provisioned containers, and autoscale trigger patterns.
Latency Percentile Tracking
End-to-end read and write latency at p50, p95, and p99 per region and per partition key range. Detect hot partitions causing tail latency spikes before users complain.
Partition Health Analysis
Partition key distribution statistics, storage per physical partition, and throughput imbalance ratios. TigerOps flags partition skew before it causes throttling.
Throttling & 429 Rate Tracking
HTTP 429 TooManyRequests rate per container and operation type. Correlate throttle events with application retry storms and downstream latency increases.
Multi-Region Replication Lag
Replication latency between write and read regions, consistency level compliance, and conflict resolution statistics for Cosmos DB accounts with multi-region writes (multi-master).
Query Performance Insights
Top expensive queries by RU cost and execution count surfaced from QueryRuntimeStatistics. TigerOps AI recommends indexing changes to reduce RU spend.
Diagnostic Settings for Cosmos DB
Enable all relevant diagnostic categories to get full RU, latency, and partition visibility in TigerOps.
#!/bin/bash
# TigerOps — Cosmos DB Diagnostic Settings setup
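COSMOS_ACCOUNT="my-cosmos-account"
RESOURCE_GROUP="my-resource-group"
# Use the subscription of the active az login unless SUBSCRIPTION_ID is already set
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-$(az account show --query id --output tsv)}"
WORKSPACE_ID="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.OperationalInsights/workspaces/tigerops-workspace"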
# Enable all Cosmos DB diagnostic categories
az monitor diagnostic-settings create \
  --name tigerops-cosmos-diagnostics \
  --resource "/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.DocumentDB/databaseAccounts/${COSMOS_ACCOUNT}" \
  --workspace "${WORKSPACE_ID}" \
  --metrics '[{"category":"Requests","enabled":true}]' \
  --logs '[
    {"category":"DataPlaneRequests", "enabled":true},
    {"category":"QueryRuntimeStatistics", "enabled":true},
    {"category":"PartitionKeyStatistics", "enabled":true},
    {"category":"PartitionKeyRUConsumption", "enabled":true},
    {"category":"ControlPlaneRequests", "enabled":true}
  ]'
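# Create Service Principal for TigerOps and capture its credentials (requires jq)
SP_JSON="$(az ad sp create-for-rbac \
  --name tigerops-cosmos-reader \
  --role "Monitoring Reader" \
  --scopes "/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}" \
  --output json)"
TENANT_ID="$(echo "${SP_JSON}" | jq -r '.tenant')"
CLIENT_ID="$(echo "${SP_JSON}" | jq -r '.appId')"
CLIENT_SECRET="$(echo "${SP_JSON}" | jq -r '.password')"
echo "Add these values to TigerOps Azure integration settings:"
echo " Tenant ID: ${TENANT_ID}"
echo " Client ID: ${CLIENT_ID}"
echo " Client Secret: ${CLIENT_SECRET}"
echo " Subscription ID: ${SUBSCRIPTION_ID}"
Common Questions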
Which Cosmos DB APIs does TigerOps support?
TigerOps supports Core (SQL), MongoDB, Cassandra, Gremlin, and Table APIs. Metrics are collected at the Azure Monitor level and are API-agnostic. Query-level insights are available for the Core (SQL) and MongoDB APIs via diagnostic logs.
How does TigerOps detect hot partitions?
TigerOps reads PartitionKeyStatistics from Cosmos DB diagnostic logs and calculates the coefficient of variation across physical partitions. When one partition handles over 20% more traffic than the average, an alert is raised with the hot partition key prefix.
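To make the skew calculation concrete, here is a minimal sketch of the idea, not TigerOps's actual implementation. It assumes a hypothetical input of one `<partition_id> <request_count>` pair per line; the function name `detect_hot_partitions` and the sample numbers are made up for illustration.

```shell
#!/bin/bash
# Print the coefficient of variation (population std dev / mean) of the load
# column, then one "hot" line for every partition above mean * threshold.
# The default threshold of 1.2 corresponds to "20% above average".
detect_hot_partitions() {
  awk -v threshold="${1:-1.2}" '
    { id[NR] = $1; load[NR] = $2; sum += $2; sumsq += $2 * $2 }
    END {
      if (NR == 0) exit
      mean = sum / NR
      var = sumsq / NR - mean * mean
      if (var < 0) var = 0            # guard against floating-point noise
      cv = (mean > 0) ? sqrt(var) / mean : 0
      printf "cv=%.2f\n", cv
      for (i = 1; i <= NR; i++)
        if (load[i] > mean * threshold)
          printf "hot %s %.0f%%\n", id[i], (load[i] / mean - 1) * 100
    }'
}

# Example with made-up request counts for four partitions
printf 'p0 100\np1 100\np2 100\np3 300\n' | detect_hot_partitions
# prints:
# cv=0.58
# hot p3 100%
```

A coefficient of variation near zero suggests an even spread. The per-partition lines show how far above the mean each flagged partition sits.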
Can TigerOps alert me before RU throttling starts?
Yes. TigerOps tracks normalized RU consumption (NormalizedRUConsumption metric) and alerts when it approaches 80% of provisioned throughput, giving you time to scale or optimize queries before 429 errors reach your application.
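As a rough illustration of that early-warning check, the sketch below pulls the metric with the Azure CLI and classifies a reading against a warning threshold. The function names, the 5-minute interval, and the `throttling-risk` label are assumptions for the example. Only the 80% default comes from the answer above.

```shell
#!/bin/bash
# Fetch recent peak NormalizedRUConsumption values (percentages) for an account.
# Requires an authenticated az session; nothing is called until you invoke it.
fetch_normalized_ru() {
  local resource_id="$1"
  az monitor metrics list \
    --resource "${resource_id}" \
    --metric NormalizedRUConsumption \
    --aggregation Maximum \
    --interval PT5M \
    --query "value[0].timeseries[0].data[].maximum" \
    --output tsv
}

# Classify one percentage reading. The second argument is the warning
# threshold and defaults to 80.
classify_ru() {
  local pct="$1" warn="${2:-80}"
  awk -v p="${pct}" -v w="${warn}" 'BEGIN {
    if (p >= 100)    print "throttling-risk"
    else if (p >= w) print "warn"
    else             print "ok"
  }'
}

classify_ru 85     # prints: warn
classify_ru 42.5   # prints: ok
```

Using the maximum rather than the average helps here, because an average can hide a single saturated partition key range.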
Does TigerOps support serverless Cosmos DB accounts?
Yes. For serverless accounts, TigerOps tracks total RU consumed per hour, peak operation rates, and storage growth. Throughput-based alerts are adapted for serverless capacity limits rather than provisioned RU/s.
How are Cosmos DB alerts correlated with application traces?
TigerOps links 429 throttle events and latency spikes to application traces that were active during the same window. You can see exactly which API endpoints, microservices, or batch jobs caused the RU spike.
Stop Chasing Cosmos DB Throttles After the Fact
RU consumption, partition health, and AI-powered query insights. Connect in 5 minutes.