# VictoriaMetrics — Historical Metrics Store

VictoriaMetrics instance for querying historical vLLM and DCGM metrics (March 13, 2026 onward) that could not be backfilled into M3DB.

## Why VictoriaMetrics Instead of M3DB?

M3DB doesn't support backfill. Period. See the [main README](../README.md#why-backfill-doesnt-work) for the full story.

VictoriaMetrics has a first-class `/api/v1/import` endpoint that accepts data with any timestamp inside the retention window: no `bufferPast` gates, no block size hacks, no special namespaces. You just send the data and it works.

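As a rough illustration of what "just send the data" looks like, the sketch below serializes one series into the JSON line format that `/api/v1/import` reads, with timestamps in milliseconds. The helper name `to_import_line` and the sample values are made up for this example, and `backfill.py` may build its payload differently.

```python
import json


def to_import_line(name, labels, samples):
    """Serialize one series as a single JSON line for /api/v1/import.

    samples is an iterable of (unix_seconds, value) pairs.
    """
    samples = sorted(samples)
    series = {
        "metric": {"__name__": name, **labels},
        "values": [float(v) for _, v in samples],
        # The JSON line format expects timestamps in milliseconds.
        "timestamps": [int(ts * 1000) for ts, _ in samples],
    }
    return json.dumps(series)


# Hypothetical sample values, for illustration only.
line = to_import_line(
    "vllm:prompt_tokens_total",
    {"cluster": "serverless-inference-cluster"},
    [(1773360060, 250.0), (1773360000, 100.0)],
)
print(line)
```

Several such lines, separated by newlines, can go into the body of a single POST to `/api/v1/import`.
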
## Architecture

```
                 ┌──────────────────────────────────────────┐
                 │  Vultr VKE Cluster                       │
                 │                                          │
Mimir ──import──▶│  VictoriaMetrics (1 pod, 200Gi NVMe)     │
                 │       ↓ PromQL queries                   │
                 │  Traefik (TLS + basic auth)              │
                 │       ↓                                  │
                 │  victoriametrics.vultrlabs.dev           │
                 └──────────────────────────────────────────┘

Grafana queries both:
- M3DB (m3db.vultrlabs.dev) → real-time data (1h blocks, going forward)
- VictoriaMetrics (victoriametrics.vultrlabs.dev) → historical data (Mar 13–present)
```

## Quick Start

### 1. Deploy VictoriaMetrics

```bash
# Apply manifests (run from this directory)
kubectl apply -k .

# Wait for pod to be running
kubectl -n victoriametrics get pods -w

# Verify it's healthy (the port-forward runs in the background, so give it a moment to bind)
kubectl -n victoriametrics port-forward svc/victoriametrics 8428:8428 &
PF_PID=$!
sleep 2
curl http://localhost:8428/health
kill "$PF_PID"  # stop the port-forward
```

### 2. Configure DNS

Get the Traefik LoadBalancer IP (the `EXTERNAL-IP` column) and create a DNS record pointing `victoriametrics.vultrlabs.dev` at it:

```bash
kubectl -n traefik get svc traefik
```

### 3. Set Up Basic Auth

Generate an htpasswd entry and update the secret in `04-basic-auth-middleware.yaml`:

```bash
# Quote the placeholder so the shell does not treat < and > as redirections
htpasswd -nb vultr_vm '<your-password>'
# Copy output, base64 encode it:
echo -n '<htpasswd-output>' | base64
# Update the secret and apply
kubectl apply -f 04-basic-auth-middleware.yaml
```

### 4. Run Backfill

```bash
# Create the secret with Mimir credentials
kubectl create secret generic backfill-credentials \
  --from-literal=mimir-password='YOUR_MIMIR_PASSWORD' -n victoriametrics

# Upload the backfill script as a configmap
kubectl create configmap backfill-script \
  --from-file=backfill.py=backfill.py -n victoriametrics

# Run the backfill pod
kubectl apply -f backfill-pod.yaml

# Watch progress
kubectl logs -f backfill -n victoriametrics

# Cleanup when done
kubectl delete pod backfill -n victoriametrics
kubectl delete configmap backfill-script -n victoriametrics
kubectl delete secret backfill-credentials -n victoriametrics
```

### 5. Verify

```bash
# In-cluster
kubectl -n victoriametrics exec deploy/victoriametrics -- \
  curl -s 'http://localhost:8428/api/v1/query?query=vllm:prompt_tokens_total' | python3 -m json.tool

# External (with auth)
curl -u 'vultr_vm:<password>' "https://victoriametrics.vultrlabs.dev/api/v1/query?query=up"
```

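To check the verification output programmatically rather than by eye, a small parser for the Prometheus-style response of `/api/v1/query` can help. This is a minimal sketch: `summarize_instant_query` is a hypothetical helper, and the sample body is hand-written in the standard response shape, not captured from this cluster.

```python
import json


def summarize_instant_query(body):
    """Return (series_count, [(labels, value), ...]) from a /api/v1/query response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    rows = []
    for series in payload["data"]["result"]:
        _ts, raw_value = series["value"]  # value is [unix_seconds, "string"]
        rows.append((series["metric"], float(raw_value)))
    return len(rows), rows


# Hand-written response in the Prometheus API shape, not real cluster output.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "example"},
                "value": [1773360000, "1"],
            }
        ],
    },
})

count, rows = summarize_instant_query(sample_body)
print(count, rows)
```

A `success` status with an empty `result` list means the query ran but matched no series at the evaluation time. An instant query evaluates at the current time unless a `time` parameter is passed, so older backfilled data may need `time=<unix-timestamp>` to show up.
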
## Grafana Configuration

Add VictoriaMetrics as a **Prometheus** datasource:

- **URL:** `https://victoriametrics.vultrlabs.dev` (with basic auth)
- **In-cluster URL:** `http://victoriametrics.victoriametrics.svc.cluster.local:8428`

### Mixed Queries (M3DB + VictoriaMetrics)

Use Grafana's built-in **Mixed** datasource to query both from one panel:

1. Create two Prometheus datasources:
   - `M3DB` → `https://m3db.vultrlabs.dev`
   - `VictoriaMetrics` → `https://victoriametrics.vultrlabs.dev`

2. In the panel's query editor, select the **Mixed** datasource

3. Add one query per backend, picking `M3DB` for one and `VictoriaMetrics` for the other. Grafana runs each query against its own backend and draws the results together in the panel

Alternatively, use dashboard variables to let users toggle between datasources for different time ranges.

## Metrics Stored

| Metric | Description |
|--------|-------------|
| `vllm:prompt_tokens_total` | vLLM prompt token count |
| `vllm:generation_tokens_total` | vLLM generation token count |
| `DCGM_FI_DEV_GPU_UTIL` | GPU utilization (DCGM) |

All metrics are tagged with `tenant=serverless-inference-cluster` and `cluster=serverless-inference-cluster`.

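Because every series carries these labels, queries can filter on them directly. The sketch below builds a URL-encoded instant query with an exact-match label selector; `instant_query_url` is a hypothetical helper, and the base URL is the external hostname used elsewhere in this README.

```python
from urllib.parse import urlencode

BASE_URL = "https://victoriametrics.vultrlabs.dev"  # from this README


def instant_query_url(metric, labels, base_url=BASE_URL):
    """Build an /api/v1/query URL for `metric` filtered by exact-match labels."""
    selector = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    promql = f"{metric}{{{selector}}}"
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


url = instant_query_url(
    "DCGM_FI_DEV_GPU_UTIL",
    {"cluster": "serverless-inference-cluster"},
)
print(url)
```

Encoding the query matters here because braces and double quotes are not safe to paste into a URL unescaped.
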
## VictoriaMetrics API Reference

| Endpoint | Purpose |
|----------|---------|
| `/api/v1/import` | Import data (JSON line format; Prometheus text format goes to `/api/v1/import/prometheus`) |
| `/api/v1/export` | Export data |
| `/api/v1/query` | PromQL instant query |
| `/api/v1/query_range` | PromQL range query |
| `/health` | Health check |
| `/metrics` | Internal metrics |

## Storage

- **Size:** 200Gi NVMe (Vultr Block Storage)
- **StorageClass:** `vultr-block-storage-vm` (Retain policy: data survives PVC deletion)
- **Retention:** 2 years
- **Volume expansion:** `kubectl edit pvc victoriametrics-data -n victoriametrics` and raise `spec.resources.requests.storage` (the StorageClass must allow volume expansion)

## Useful Commands

```bash
# Check VM health
kubectl -n victoriametrics exec deploy/victoriametrics -- curl -s http://localhost:8428/health

# Check storage stats
kubectl -n victoriametrics exec deploy/victoriametrics -- \
  curl -s 'http://localhost:8428/api/v1/query?query=vm_rows' | python3 -m json.tool

# Query historical data (start = 2026-03-13 00:00 UTC, end = 2026-03-14 00:00 UTC)
curl -u 'vultr_vm:<password>' \
  "https://victoriametrics.vultrlabs.dev/api/v1/query_range?query=vllm:prompt_tokens_total&start=1773360000&end=1773446400&step=60"

# Restart VM (if needed)
kubectl rollout restart deployment/victoriametrics -n victoriametrics

# Scale to 0 (preserve data, stop the pod)
kubectl scale deployment/victoriametrics --replicas=0 -n victoriametrics
```

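A range query only returns data when `end` is later than `start`, so it can be safer to derive both from dates than to type epoch values by hand. The following is a minimal sketch; `query_range_params` is a hypothetical helper, not part of any tooling in this repository.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def query_range_params(query, start, end, step_seconds=60):
    """Return URL-encoded /api/v1/query_range parameters and the point count.

    start and end are timezone-aware datetimes; raises if the window is inverted.
    """
    start_ts = int(start.timestamp())
    end_ts = int(end.timestamp())
    if end_ts <= start_ts:
        raise ValueError(f"end ({end_ts}) must be after start ({start_ts})")
    # Number of evaluation points per series, both endpoints included.
    points = (end_ts - start_ts) // step_seconds + 1
    params = urlencode(
        {"query": query, "start": start_ts, "end": end_ts, "step": step_seconds}
    )
    return params, points


params, points = query_range_params(
    "vllm:prompt_tokens_total",
    datetime(2026, 3, 13, tzinfo=timezone.utc),
    datetime(2026, 3, 14, tzinfo=timezone.utc),
)
print(params, points)
```

The point count gives a quick sense of response size before running a query over a long window.
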
## Re-running Backfill

If you need to import additional time ranges or new metrics:

1. Edit `backfill.py` and update `START_TS`, `END_TS`, or `METRICS`
2. Recreate the configmap and pod (see step 4 above)
3. VictoriaMetrics is idempotent for imports: duplicate data points are merged, not duplicated

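When picking a new `START_TS` and `END_TS`, it can help to see how a range breaks into fixed windows, since a long backfill is typically walked in chunks. This sketch assumes that pattern; `split_range` and the 6-hour chunk size are hypothetical and may not match what `backfill.py` does.

```python
def split_range(start_ts, end_ts, chunk_seconds=6 * 3600):
    """Yield (chunk_start, chunk_end) pairs covering [start_ts, end_ts) in order."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    cursor = start_ts
    while cursor < end_ts:
        chunk_end = min(cursor + chunk_seconds, end_ts)
        yield cursor, chunk_end
        cursor = chunk_end


START_TS = 1773360000  # 2026-03-13 00:00:00 UTC
END_TS = START_TS + 24 * 3600  # hypothetical one-day window

chunks = list(split_range(START_TS, END_TS))
print(len(chunks), chunks[0], chunks[-1])
```

Each chunk ends exactly where the next begins, so the windows cover the range without gaps or overlap.
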
To convert timestamps (GNU `date` syntax):

```bash
# Date → Unix timestamp
date -u -d '2026-03-13 00:00:00' +%s  # 1773360000

# Unix timestamp → date
date -u -d @1773360000
```
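
The `-d` flag is specific to GNU `date`; BSD and macOS `date` use different options. A portable alternative is a few lines of Python, sketched here with the hypothetical helpers `to_unix` and `from_unix`:

```python
from datetime import datetime, timezone


def to_unix(date_string):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return the Unix timestamp in seconds."""
    parsed = datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def from_unix(timestamp):
    """Return the UTC time for a Unix timestamp as an ISO 8601 string."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


print(to_unix("2026-03-13 00:00:00"))  # 1773360000
print(from_unix(1773360000))           # 2026-03-13T00:00:00+00:00
```
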