2026-04-09 19:29:18 +00:00
# VictoriaMetrics — Historical Metrics Store
VictoriaMetrics instance for querying historical vLLM + DCGM metrics (March 13, 2026 onward) that couldn't be backfilled into M3DB.
## Why VictoriaMetrics Instead of M3DB?
M3DB doesn't support backfill. Period. See the [main README](../README.md#why-backfill-doesnt-work) for the full story.
VictoriaMetrics has a first-class `/api/v1/import` endpoint that accepts data with any timestamp — no `bufferPast` gates, no block size hacks, no special namespaces. You just send the data and it works.
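As a sketch of what that looks like in practice, the JSON-line payload `/api/v1/import` expects can be assembled in a few lines of Python (the `to_import_line` helper and the sample values are illustrative, not part of this repo):

```python
import json

def to_import_line(name, labels, samples):
    """Build one JSON line for VictoriaMetrics /api/v1/import.

    samples is a list of (unix_seconds, value) pairs; the endpoint
    expects timestamps in milliseconds.
    """
    return json.dumps({
        "metric": {"__name__": name, **labels},
        "values": [value for _, value in samples],
        "timestamps": [int(ts * 1000) for ts, _ in samples],
    })

# A sample far in the past (March 13, 2026) imports just as well as fresh data.
line = to_import_line(
    "vllm:prompt_tokens_total",
    {"tenant": "serverless-inference-cluster"},
    [(1773360000, 42.0)],
)
```

Each line describes one series; POST them newline-separated to the import endpoint (e.g. `curl --data-binary @lines.jsonl http://localhost:8428/api/v1/import`).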
## Architecture
```
┌─────────────────────────────────────────────────┐
│ Vultr VKE Cluster                               │
│                                                 │
│  Mimir ──import──▶ VictoriaMetrics              │
│                    (1 pod, 200Gi NVMe)          │
│                      ↓ PromQL queries           │
│                    Traefik (TLS + basic auth)   │
│                      ↓                          │
│                    victoriametrics.vultrlabs.dev│
└─────────────────────────────────────────────────┘

Grafana queries both:
- M3DB (m3db.vultrlabs.dev) → real-time data (1h blocks, going forward)
- VictoriaMetrics (victoriametrics.vultrlabs.dev) → historical data (Mar 13 – present)
```
## Quick Start
### 1. Deploy VictoriaMetrics
```bash
# Apply manifests
kubectl apply -k .
# Wait for pod to be running
kubectl -n victoriametrics get pods -w
# Verify it's healthy
kubectl -n victoriametrics port-forward svc/victoriametrics 8428:8428 &
curl http://localhost:8428/health
```
### 2. Configure DNS
Get the Traefik LoadBalancer IP and point `victoriametrics.vultrlabs.dev` at it:
```bash
kubectl -n traefik get svc traefik
```
### 3. Set Up Basic Auth
Generate htpasswd and update the secret in `04-basic-auth-middleware.yaml`:
```bash
htpasswd -nb vultr_vm <your-password>
# Copy output, base64 encode it:
echo -n '<htpasswd-output>' | base64
# Update the secret and apply
kubectl apply -f 04-basic-auth-middleware.yaml
```
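The base64 step mirrors what Kubernetes expects in a Secret's `data:` field; a quick Python equivalent (the hash below is a placeholder, not real `htpasswd` output):

```python
import base64

# Placeholder htpasswd line; substitute the real `htpasswd -nb` output.
htpasswd_line = "vultr_vm:$apr1$examplehash"

# Kubernetes Secret data values are base64-encoded, same as `echo -n ... | base64`.
encoded = base64.b64encode(htpasswd_line.encode()).decode()
```

Note the `-n` in the shell version matters for the same reason `.encode()` gets no trailing newline here: a stray newline in the encoded value silently breaks auth.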
### 4. Run Backfill
```bash
# Create the secret with Mimir credentials
kubectl create secret generic backfill-credentials \
--from-literal=mimir-password='YOUR_MIMIR_PASSWORD' -n victoriametrics
# Upload the backfill script as a configmap
kubectl create configmap backfill-script \
--from-file=backfill.py=backfill.py -n victoriametrics
# Run the backfill pod
kubectl apply -f backfill-pod.yaml
# Watch progress
kubectl logs -f backfill -n victoriametrics
# Cleanup when done
kubectl delete pod backfill -n victoriametrics
kubectl delete configmap backfill-script -n victoriametrics
kubectl delete secret backfill-credentials -n victoriametrics
```
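The internals of `backfill.py` aren't shown here, but the core of any such script is a windowing loop: split the historical range into chunks small enough for one export/import round trip each. A minimal sketch with illustrative constants (the real script defines its own `START_TS`/`END_TS`):

```python
START_TS = 1773360000          # 2026-03-13 00:00:00 UTC
END_TS = START_TS + 3 * 86400  # a 3-day example range
WINDOW = 86400                 # one day per request keeps payloads manageable

def windows(start, end, step):
    """Yield (start, end) pairs covering [start, end) in step-sized chunks."""
    t = start
    while t < end:
        yield t, min(t + step, end)
        t += step

# Each chunk would be queried from Mimir and POSTed to /api/v1/import.
chunks = list(windows(START_TS, END_TS, WINDOW))
```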
### 5. Verify
```bash
# In-cluster
kubectl -n victoriametrics exec deploy/victoriametrics -- \
curl -s 'http://localhost:8428/api/v1/query?query=vllm:prompt_tokens_total' | python3 -m json.tool
# External (with auth)
curl -u vultr_vm:<password> "https://victoriametrics.vultrlabs.dev/api/v1/query?query=up"
```
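The external check can also be scripted. Here is a minimal urllib equivalent that builds the same authenticated request (`<password>` is the placeholder from above; uncomment the last line to actually send it):

```python
import base64
import urllib.parse
import urllib.request

base = "https://victoriametrics.vultrlabs.dev/api/v1/query"
params = urllib.parse.urlencode({"query": "up"})
req = urllib.request.Request(f"{base}?{params}")

# Same credentials as curl's -u flag, expressed as an HTTP Basic auth header.
token = base64.b64encode(b"vultr_vm:<password>").decode()
req.add_header("Authorization", f"Basic {token}")

# response = urllib.request.urlopen(req)  # returns Prometheus-compatible JSON
```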
## Grafana Configuration
Add VictoriaMetrics as a **Prometheus** datasource:
- **URL:** `https://victoriametrics.vultrlabs.dev` (with basic auth)
- **In-cluster URL:** `http://victoriametrics.victoriametrics.svc.cluster.local:8428`
### Mixed Queries (M3DB + VictoriaMetrics)
Use Grafana's built-in **Mixed** datasource to query both:
1. Create two Prometheus datasources:
   - `M3DB` → `https://m3db.vultrlabs.dev`
   - `VictoriaMetrics` → `https://victoriametrics.vultrlabs.dev`
2. In a dashboard panel, select the built-in **Mixed** datasource and add one query per backend
3. Grafana sends each query to its backend and merges the results in the panel
Alternatively, use dashboard variables to let users toggle between datasources for different time ranges.
## Metrics Stored
| Metric | Description |
|--------|-------------|
| `vllm:prompt_tokens_total` | vLLM prompt token count |
| `vllm:generation_tokens_total` | vLLM generation token count |
| `DCGM_FI_DEV_GPU_UTIL` | GPU utilization (DCGM) |
All metrics are tagged with `tenant=serverless-inference-cluster` and `cluster=serverless-inference-cluster`.
## VictoriaMetrics API Reference
| Endpoint | Purpose |
|----------|---------|
| `/api/v1/import` | Import data (Prometheus format) |
| `/api/v1/export` | Export data |
| `/api/v1/query` | PromQL instant query |
| `/api/v1/query_range` | PromQL range query |
| `/health` | Health check |
| `/metrics` | Internal metrics |
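For example, `/api/v1/export` takes a `match[]` series selector plus `start`/`end` in unix seconds. A sketch that builds such a URL against the in-cluster service (the timestamps are illustrative):

```python
import urllib.parse

base = "http://victoriametrics.victoriametrics.svc.cluster.local:8428/api/v1/export"
params = urllib.parse.urlencode({
    # match[] accepts any Prometheus series selector
    "match[]": 'vllm:prompt_tokens_total{tenant="serverless-inference-cluster"}',
    "start": 1773360000,  # 2026-03-13 00:00:00 UTC
    "end": 1773446400,    # one day later
})
url = f"{base}?{params}"
```

The export endpoint streams the same JSON-line format that `/api/v1/import` accepts, which makes export → import copies between instances straightforward.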
## Storage
- **Size:** 200Gi NVMe (Vultr Block Storage)
- **StorageClass:** `vultr-block-storage-vm` (Retain policy — data survives PVC deletion)
- **Retention:** 2 years
- **Volume expansion:** `kubectl edit pvc victoriametrics-data -n victoriametrics`
## Useful Commands
```bash
# Check VM health
kubectl -n victoriametrics exec deploy/victoriametrics -- curl -s http://localhost:8428/health
# Check storage stats
kubectl -n victoriametrics exec deploy/victoriametrics -- \
curl -s 'http://localhost:8428/api/v1/query?query=vm_rows' | python3 -m json.tool
# Query historical data
curl -u vultr_vm:<password> \
  "https://victoriametrics.vultrlabs.dev/api/v1/query_range?query=vllm:prompt_tokens_total&start=1773360000&end=1773446400&step=60"
# Restart VM (if needed)
kubectl rollout restart deployment/victoriametrics -n victoriametrics
# Scale to 0 (preserve data, stop the pod)
kubectl scale deployment/victoriametrics --replicas=0 -n victoriametrics
```
## Re-running Backfill
If you need to import additional time ranges or new metrics:
1. Edit `backfill.py` — update `START_TS`, `END_TS`, or `METRICS`
2. Recreate the configmap and pod (see step 4 above)
3. VictoriaMetrics is idempotent for imports — duplicate data points are merged, not duplicated
To convert timestamps:
```bash
# Date → Unix timestamp
date -u -d '2026-03-13 00:00:00' +%s # 1773360000
# Unix timestamp → date
date -u -d @1773360000
```
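The same conversions in Python, handy if you're editing `backfill.py` anyway:

```python
from datetime import datetime, timezone

# Date → Unix timestamp
start_ts = int(datetime(2026, 3, 13, tzinfo=timezone.utc).timestamp())  # 1773360000

# Unix timestamp → date
start_date = datetime.fromtimestamp(1773360000, tz=timezone.utc)
```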