Milvus Integration
Monitor vector search latency, collection metrics, and index build performance across your Milvus clusters. Get per-collection SLO tracking and AI root cause analysis for the vector search layer of your AI applications.
How It Works
Enable Milvus Prometheus Metrics
Enable the built-in Prometheus metrics endpoint in your milvus.yaml configuration. Set metricConfig.enableSystem: true and the TigerOps agent auto-discovers all queryNodes, dataNodes, indexNodes, and the rootCoord.
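Once metrics are enabled, it is worth sanity-checking a scrape before wiring up the agent. A minimal sketch of parsing Prometheus text-format output (the metric name and sample payload are illustrative, not verified against a specific Milvus version):

```python
def parse_prometheus_text(text):
    """Parse Prometheus exposition format into {metric_name: [(labels, value), ...]}.

    Simplified: assumes no spaces or commas inside label values.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        name_and_labels, value = line.rsplit(" ", 1)
        if "{" in name_and_labels:
            name, raw = name_and_labels.split("{", 1)
            pairs = (p.split("=", 1) for p in raw.rstrip("}").split(",") if p)
            labels = {k: v.strip('"') for k, v in pairs}
        else:
            name, labels = name_and_labels, {}
        metrics.setdefault(name, []).append((labels, float(value)))
    return metrics

# Sample scrape output (illustrative):
sample = """\
# HELP milvus_querynode_search_latency_seconds search latency
milvus_querynode_search_latency_seconds_count{collection_name="embeddings_v2"} 1024
milvus_querynode_search_latency_seconds_sum{collection_name="embeddings_v2"} 12.5
"""
parsed = parse_prometheus_text(sample)
```

If the parsed dict contains the expected per-component series with a collection_name label, the endpoint is ready for the agent to scrape.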
Deploy TigerOps Agent via Helm
Install the TigerOps agent into your Milvus namespace via the TigerOps Helm chart. The agent scrapes each Milvus metrics endpoint and forwards all vector search, index build, and data compaction metrics to your TigerOps workspace.
Configure Collection-Level Dashboards
TigerOps auto-discovers all Milvus collections and creates per-collection dashboards for search latency, entity count, and index build status. Set collection-specific p99 latency SLOs and entity growth alerts.
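Per-collection p99 is typically derived from the cumulative histogram buckets the latency metric exposes, using the standard linear interpolation within the bucket containing the quantile rank. A sketch of that calculation (bucket boundaries here are made up for illustration):

```python
def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative Prometheus histogram buckets.

    buckets: sorted list of (upper_bound_seconds, cumulative_count);
    the last bound may be float('inf'). Linearly interpolates inside
    the bucket that contains the quantile rank.
    """
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                return prev_bound  # quantile falls in the open-ended bucket
            fraction = (rank - prev_count) / max(count - prev_count, 1)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound

# Illustrative buckets: 900 searches under 50 ms, 990 under 100 ms, 1000 total
buckets = [(0.05, 900), (0.10, 990), (0.25, 1000), (float("inf"), 1000)]
p99_ms = histogram_quantile(0.99, buckets) * 1000
```

An SLO alert then reduces to comparing this estimate against the configured per-collection threshold.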
Enable Cross-Service Correlation
Link Milvus search latency spikes to upstream embedding generation delays or downstream application error rates. TigerOps correlates your vector database with the full AI inference pipeline.
What You Get Out of the Box
Vector Search Latency Tracking
Per-collection search latency at p50, p95, and p99 with nq (number of query vectors), topK, and segment count dimensions. Identify which collections or query parameters are causing latency regressions.
Collection & Segment Metrics
Per-collection entity counts, segment count, growing vs. sealed segment ratios, flush operation frequency, and compaction task rates for complete collection lifecycle visibility.
Index Build Performance
Index build task queue depth, indexNode CPU utilization during build, HNSW/IVF/DiskANN build time per segment, and index build failure rates across all collections.
QueryNode Resource Utilization
Per-queryNode memory usage for loaded collections, segment cache hit rates, disk I/O for DiskANN indexes, and CPU utilization during vector distance computation.
Data Ingestion & Compaction
Insert throughput rates, WAL queue depth, dataNode flush latency, compaction task progress, and growing segment size to keep your collection segments healthy and search performance optimal.
AI Search Performance Analysis
When search latency spikes, TigerOps AI identifies whether the cause is a large nq batch, an unloaded collection segment, an index build contention event, or a queryNode memory pressure issue.
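The triage described above can be pictured as a rule cascade over metrics the agent already collects. A simplified sketch (the field names and thresholds are illustrative, not TigerOps internals):

```python
def classify_latency_spike(m):
    """Return a probable cause for a search latency spike from a metrics
    snapshot dict. Rules are checked in rough order of specificity;
    all thresholds are illustrative.
    """
    if m["querynode_memory_pct"] > 85:
        return "querynode memory pressure"
    if m["unloaded_segment_count"] > 0:
        return "collection segments not fully loaded"
    if m["index_build_queue_depth"] > 50 and m["indexnode_cpu_pct"] > 80:
        return "index build contention"
    if m["avg_nq"] > 10 * m["baseline_nq"]:
        return "unusually large nq batch"
    return "no single dominant cause; inspect correlated services"

# Example snapshot: a deep index build queue with hot indexNodes
snapshot = {
    "querynode_memory_pct": 62,
    "unloaded_segment_count": 0,
    "index_build_queue_depth": 75,
    "indexnode_cpu_pct": 91,
    "avg_nq": 8,
    "baseline_nq": 4,
}
cause = classify_latency_spike(snapshot)
```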
TigerOps Agent for Milvus
Configure the TigerOps agent to scrape Milvus Prometheus metrics and forward them to your workspace.
# TigerOps Milvus Agent Configuration
# First, enable metrics in milvus.yaml:
#   metricConfig:
#     enableSystem: true
#     port: 9091
receivers:
  milvus_prometheus:
    # Scrape all Milvus component metrics endpoints
    endpoints:
      - url: "http://milvus-rootcoord:9091/metrics"
        component: rootcoord
      - url: "http://milvus-querynode:9091/metrics"
        component: querynode
      - url: "http://milvus-datanode:9091/metrics"
        component: datanode
      - url: "http://milvus-indexnode:9091/metrics"
        component: indexnode
    collection_interval: 15s
  milvus_sdk:
    # SDK-level health checks
    endpoint: "milvus-proxy:19530"
    collections:
      - name: "embeddings_v2"
        search_latency_p99_ms: 200
        min_loaded_replicas: 2
      - name: "user_profiles"
        search_latency_p99_ms: 100
    health_check_interval: 30s
exporters:
  tigerops:
    endpoint: "https://ingest.atatus.net/api/v1/write"
    bearer_token: "${TIGEROPS_API_KEY}"
    send_interval: 15s
alerts:
  index_build_queue_depth: 50
  growing_segment_count_warning: 100
  querynode_memory_pct: 85

Common Questions
Which Milvus versions and deployment modes does TigerOps support?
TigerOps supports Milvus 2.2 and later in standalone, cluster, and Zilliz Cloud managed modes. The Prometheus metrics endpoint is scraped directly in standalone and cluster modes; for Zilliz Cloud, TigerOps collects the same metrics through the Zilliz Cloud metrics API instead.
How does TigerOps track Milvus search latency per collection?
TigerOps collects the milvus_querynode_search_latency_seconds histogram metric labeled by collection_name. This gives per-collection p50, p95, and p99 search latency without any application-side instrumentation changes.
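Because the histogram carries a collection_name label, per-collection aggregates fall out of grouping the _sum and _count series by that label. A sketch over already-parsed series (the data shapes are illustrative):

```python
def mean_latency_by_collection(sums, counts):
    """Compute mean search latency in ms per collection from
    histogram _sum (seconds) and _count series, each given as a
    dict {collection_name: value}. Skips collections with no samples.
    """
    return {
        name: 1000.0 * sums[name] / counts[name]
        for name in sums
        if counts.get(name)
    }

# Illustrative values pulled from one scrape
sums = {"embeddings_v2": 12.5, "user_profiles": 3.0}   # seconds
counts = {"embeddings_v2": 1000, "user_profiles": 60}
means = mean_latency_by_collection(sums, counts)
```

The same grouping applied to the histogram buckets yields the per-collection p50/p95/p99 series shown on the dashboards.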
Can TigerOps alert when a Milvus collection is not fully loaded into memory?
Yes. TigerOps monitors the collection load state via the Milvus SDK health check and the queryNode segment loading metrics. If a collection is partially loaded or a load operation stalls, TigerOps fires an alert with the affected collection name and load progress.
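The load-state check reduces to comparing reported load progress and replica counts against the configured minimum. A sketch of that decision (the snapshot shape is an assumption for illustration, not the actual SDK payload):

```python
def load_alerts(collections, min_replicas=2):
    """Return alert strings for collections that are not fully loaded.

    collections: iterable of dicts with name, load_progress_pct, and
    loaded_replicas (an assumed shape, not the real SDK response).
    """
    alerts = []
    for c in collections:
        if c["load_progress_pct"] < 100:
            alerts.append(f"{c['name']}: load stalled at {c['load_progress_pct']}%")
        elif c["loaded_replicas"] < min_replicas:
            alerts.append(
                f"{c['name']}: only {c['loaded_replicas']} of "
                f"{min_replicas} replicas loaded"
            )
    return alerts

# Illustrative health-check snapshot
status = [
    {"name": "embeddings_v2", "load_progress_pct": 100, "loaded_replicas": 1},
    {"name": "user_profiles", "load_progress_pct": 40, "loaded_replicas": 2},
]
alerts = load_alerts(status)
```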
How does TigerOps monitor Milvus index build progress?
TigerOps tracks the milvus_indexnode_build_index_latency metric and the index task queue depth from the rootCoord. You receive alerts when index builds fall behind insertion rate (growing segment accumulation) or when an index build fails.
Does TigerOps support monitoring Milvus with multiple replicas per collection?
Yes. TigerOps tracks per-replica search latency and load balancing effectiveness when collections are loaded with multiple replicas. You can see if search traffic is evenly distributed across replicas and identify which replica is the latency bottleneck.
Stop Discovering Milvus Search Latency Regressions After Deployment
Per-collection SLO tracking, index build monitoring, and AI root cause analysis. Deploy in 5 minutes.