# VictoriaMetrics: Historical Metrics Store

VictoriaMetrics instance for querying historical vLLM + DCGM metrics (March 13, 2026 onward) that couldn't be backfilled into M3DB.

## Why VictoriaMetrics Instead of M3DB?

M3DB doesn't support backfill. See the [main README](../README.md#why-backfill-doesnt-work) for the full story.

VictoriaMetrics has a first-class `/api/v1/import` endpoint that accepts data with any timestamp: no `bufferPast` gates, no block size hacks, no special namespaces. You send the data and it is stored.

## Architecture

```
                ┌─────────────────────────────────────────────────┐
                │                Vultr VKE Cluster                │
                │                                                 │
Mimir ──import──▶ VictoriaMetrics (1 pod, 200Gi NVMe)             │
                │        ↓ PromQL queries                         │
                │ Traefik (TLS + basic auth)                      │
                │        ↓                                        │
                │ vm.vultrlabs.dev                                │
                └─────────────────────────────────────────────────┘

Grafana queries both:
  - M3DB (m3db.vultrlabs.dev)          → real-time data (1h blocks, going forward)
  - VictoriaMetrics (vm.vultrlabs.dev) → historical data (Mar 13–present)
```

## Quick Start

### 1. Deploy VictoriaMetrics

```bash
# Apply manifests
kubectl apply -k .

# Wait for pod to be running
kubectl -n victoriametrics get pods -w

# Verify it's healthy
kubectl -n victoriametrics port-forward svc/victoriametrics 8428:8428 &
curl http://localhost:8428/health
```

### 2. Configure DNS

Get the Traefik LoadBalancer IP and point `vm.vultrlabs.dev` at it:

```bash
kubectl -n traefik get svc traefik
```

### 3. Set Up Basic Auth

Generate an htpasswd entry and update the secret in `04-basic-auth-middleware.yaml`:

```bash
# YOUR_VM_PASSWORD is a placeholder; substitute the real password
htpasswd -nb vultr_vm 'YOUR_VM_PASSWORD'

# Copy the output and base64 encode it (HTPASSWD_OUTPUT is a placeholder):
echo -n 'HTPASSWD_OUTPUT' | base64

# Update the secret and apply
kubectl apply -f 04-basic-auth-middleware.yaml
```

### 4. Run Backfill

```bash
# Create the secret with Mimir credentials
kubectl create secret generic backfill-credentials \
  --from-literal=mimir-password='YOUR_MIMIR_PASSWORD' -n victoriametrics

# Upload the backfill script as a configmap
kubectl create configmap backfill-script \
  --from-file=backfill.py=backfill.py -n victoriametrics

# Run the backfill pod
kubectl apply -f backfill-pod.yaml

# Watch progress
kubectl logs -f backfill -n victoriametrics

# Cleanup when done
kubectl delete pod backfill -n victoriametrics
kubectl delete configmap backfill-script -n victoriametrics
kubectl delete secret backfill-credentials -n victoriametrics
```

### 5. Verify

```bash
# In-cluster
kubectl -n victoriametrics exec deploy/victoriametrics -- \
  curl -s 'http://localhost:8428/api/v1/query?query=vllm:prompt_tokens_total' | python3 -m json.tool

# External (with auth; YOUR_VM_PASSWORD is a placeholder)
curl -u 'vultr_vm:YOUR_VM_PASSWORD' "https://vm.vultrlabs.dev/api/v1/query?query=up"
```

## Grafana Configuration

Add VictoriaMetrics as a **Prometheus** datasource:

- **URL:** `https://vm.vultrlabs.dev` (with basic auth)
- **In-cluster URL:** `http://victoriametrics.victoriametrics.svc.cluster.local:8428`

### Mixed Queries (M3DB + VictoriaMetrics)

Use Grafana's built-in **Mixed** datasource to show both in one panel:

1. Create two Prometheus datasources:
   - `M3DB` → `https://m3db.vultrlabs.dev`
   - `VictoriaMetrics` → `https://vm.vultrlabs.dev`
2. In the panel editor, select **-- Mixed --** as the panel datasource
3. Add one query per backend and pick its datasource on each query; Grafana runs each query against its own backend and draws the results together in the panel

Alternatively, use dashboard variables to let users toggle between datasources for different time ranges.

## Metrics Stored

| Metric | Description |
|--------|-------------|
| `vllm:prompt_tokens_total` | vLLM prompt token count |
| `vllm:generation_tokens_total` | vLLM generation token count |
| `DCGM_FI_DEV_GPU_UTIL` | GPU utilization (DCGM) |

All metrics are tagged with `tenant=serverless-inference-cluster` and `cluster=serverless-inference-cluster`.
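
For reference, `/api/v1/import` consumes newline-delimited JSON, one series per line, with timestamps in milliseconds. The sketch below builds one such line for a metric carrying the two tags listed above. The helper name `to_import_line` and the sample values are hypothetical, and nothing here is taken from `backfill.py`, which may build its payload differently.

```python
import json

# Labels this README says are attached to every stored series.
COMMON_LABELS = {
    "tenant": "serverless-inference-cluster",
    "cluster": "serverless-inference-cluster",
}


def to_import_line(name, samples, extra_labels=None):
    """Build one JSON line for VictoriaMetrics' /api/v1/import.

    `samples` is a list of (unix_seconds, value) pairs. The import format
    expects millisecond timestamps, so seconds are converted here.
    """
    metric = {"__name__": name, **COMMON_LABELS, **(extra_labels or {})}
    return json.dumps({
        "metric": metric,
        "values": [value for _, value in samples],
        "timestamps": [int(ts * 1000) for ts, _ in samples],
    })


# Made-up sample values, one minute apart, starting 2026-03-13 00:00:00 UTC.
line = to_import_line(
    "vllm:prompt_tokens_total",
    [(1773360000, 1200.0), (1773360060, 1260.0)],
)
print(line)
```

Lines produced this way could be written to a file and POSTed to `/api/v1/import` through the port-forward from step 1, for example with `curl --data-binary @file`.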

## VictoriaMetrics API Reference

| Endpoint | Purpose |
|----------|---------|
| `/api/v1/import` | Import data (JSON line format, as produced by `/api/v1/export`) |
| `/api/v1/export` | Export data |
| `/api/v1/query` | PromQL instant query |
| `/api/v1/query_range` | PromQL range query |
| `/health` | Health check |
| `/metrics` | Internal metrics |

## Storage

- **Size:** 200Gi NVMe (Vultr Block Storage)
- **StorageClass:** `vultr-block-storage-vm` (Retain policy: data survives PVC deletion)
- **Retention:** 2 years
- **Volume expansion:** `kubectl edit pvc victoriametrics-data -n victoriametrics`, then raise `spec.resources.requests.storage`

## Useful Commands

```bash
# Check VM health
kubectl -n victoriametrics exec deploy/victoriametrics -- curl -s http://localhost:8428/health

# Check storage stats
kubectl -n victoriametrics exec deploy/victoriametrics -- \
  curl -s 'http://localhost:8428/api/v1/query?query=vm_rows' | python3 -m json.tool

# Query historical data (YOUR_VM_PASSWORD is a placeholder)
# start = 2026-03-13 00:00:00 UTC, end = 2026-03-14 00:00:00 UTC; end must be later than start
curl -u 'vultr_vm:YOUR_VM_PASSWORD' \
  "https://vm.vultrlabs.dev/api/v1/query_range?query=vllm:prompt_tokens_total&start=1773360000&end=1773446400&step=60"

# Restart VM (if needed)
kubectl rollout restart deployment/victoriametrics -n victoriametrics

# Scale to 0 (preserve data, stop the pod)
kubectl scale deployment/victoriametrics --replicas=0 -n victoriametrics
```

## Re-running Backfill

If you need to import additional time ranges or new metrics:

1. Edit `backfill.py` and update `START_TS`, `END_TS`, or `METRICS`
2. Recreate the configmap and pod (see step 4 above)
3. VictoriaMetrics is idempotent for imports: duplicate data points are merged, not duplicated

To convert timestamps (GNU `date` syntax):

```bash
# Date → Unix timestamp
date -u -d '2026-03-13 00:00:00' +%s
# 1773360000

# Unix timestamp → date
date -u -d @1773360000
```
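
Where GNU `date` is not available, the same conversion can be sketched in Python, together with a `query_range` URL builder that rejects an `end` earlier than `start`. The helper names `to_unix` and `query_range_url` are hypothetical and are not part of `backfill.py`; the base URL and metric name come from this README.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_unix(date_str):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return Unix seconds."""
    parsed = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def query_range_url(base_url, query, start, end, step=60):
    """Build a /api/v1/query_range URL, refusing an inverted time window."""
    if end <= start:
        raise ValueError(f"end ({end}) must be later than start ({start})")
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"


start = to_unix("2026-03-13 00:00:00")
end = to_unix("2026-03-14 00:00:00")
url = query_range_url(
    "https://vm.vultrlabs.dev", "vllm:prompt_tokens_total", start, end
)
print(start, end)
print(url)
```

The resulting URL can be passed to `curl -u` as in the commands above; `urlencode` percent-encodes the colon in the metric name, which the query API accepts.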