VictoriaMetrics — Historical Metrics Store

VictoriaMetrics instance for querying historical vLLM + DCGM metrics (March 13, 2026 onward) that couldn't be backfilled into M3DB.

Why VictoriaMetrics Instead of M3DB?

M3DB doesn't support backfill. Period. See the main README for the full story.

VictoriaMetrics has a first-class /api/v1/import endpoint that accepts data with any timestamp — no bufferPast gates, no block size hacks, no special namespaces. You just send the data and it works.
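For illustration, here is a minimal sketch of building a payload for that endpoint in VictoriaMetrics' JSON-line import format (one JSON object per series, timestamps in milliseconds). The metric name and label values below are examples, not data taken from this cluster:

```python
import json

def build_import_line(name, labels, samples):
    """Build one line of VictoriaMetrics' /api/v1/import JSON-line format.

    samples is a list of (unix_ts_seconds, value) pairs; the import API
    expects millisecond timestamps.
    """
    metric = {"__name__": name, **labels}
    return json.dumps({
        "metric": metric,
        "values": [v for _, v in samples],
        "timestamps": [int(ts * 1000) for ts, _ in samples],
    })

# A sample dated March 13, 2026 -- any timestamp is accepted on import.
line = build_import_line(
    "vllm:prompt_tokens_total",
    {"cluster": "serverless-inference-cluster"},
    [(1773360000, 12345.0)],
)
# POST it with, e.g.:
#   curl -X POST --data-binary "$line" http://localhost:8428/api/v1/import
```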

Architecture

                   ┌─────────────────────────────────────────────────┐
                   │               Vultr VKE Cluster                 │
                   │                                                 │
Mimir ──import───▶ │ VictoriaMetrics (1 pod, 200Gi NVMe)             │
                   │   ↓ PromQL queries                              │
                   │   Traefik (TLS + basic auth)                    │
                   │   ↓                                             │
                   │   victoriametrics.vultrlabs.dev                 │
                   └─────────────────────────────────────────────────┘

Grafana queries both:
  - M3DB (m3db.vultrlabs.dev) → real-time data (1h blocks, going forward)
  - VictoriaMetrics (victoriametrics.vultrlabs.dev) → historical data (Mar 13 → present)

Quick Start

1. Deploy VictoriaMetrics

# Apply manifests
kubectl apply -k .

# Wait for pod to be running
kubectl -n victoriametrics get pods -w

# Verify it's healthy
kubectl -n victoriametrics port-forward svc/victoriametrics 8428:8428 &
curl http://localhost:8428/health

2. Configure DNS

Get the Traefik LoadBalancer IP and point victoriametrics.vultrlabs.dev at it:

kubectl -n traefik get svc traefik

3. Set Up Basic Auth

Generate htpasswd and update the secret in 04-basic-auth-middleware.yaml:

htpasswd -nb vultr_vm <your-password>
# Copy output, base64 encode it:
echo -n '<htpasswd-output>' | base64
# Update the secret and apply
kubectl apply -f 04-basic-auth-middleware.yaml
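For reference, the usual Traefik basic-auth pattern pairs a Secret (with the htpasswd output under a `users` key) with a Middleware that references it. A sketch, assuming 04-basic-auth-middleware.yaml follows this shape; the resource names here are illustrative:

```yaml
# Sketch only -- match names to what 04-basic-auth-middleware.yaml defines.
apiVersion: v1
kind: Secret
metadata:
  name: victoriametrics-basic-auth
  namespace: victoriametrics
data:
  users: <base64-of-htpasswd-output>
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: basic-auth
  namespace: victoriametrics
spec:
  basicAuth:
    secret: victoriametrics-basic-auth
```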

4. Run Backfill

# Create the secret with Mimir credentials
kubectl create secret generic backfill-credentials \
  --from-literal=mimir-password='YOUR_MIMIR_PASSWORD' -n victoriametrics

# Upload the backfill script as a configmap
kubectl create configmap backfill-script \
  --from-file=backfill.py=backfill.py -n victoriametrics

# Run the backfill pod
kubectl apply -f backfill-pod.yaml

# Watch progress
kubectl logs -f backfill -n victoriametrics

# Cleanup when done
kubectl delete pod backfill -n victoriametrics
kubectl delete configmap backfill-script -n victoriametrics
kubectl delete secret backfill-credentials -n victoriametrics

5. Verify

# In-cluster
kubectl -n victoriametrics exec deploy/victoriametrics -- \
  curl -s 'http://localhost:8428/api/v1/query?query=vllm:prompt_tokens_total' | python3 -m json.tool

# External (with auth)
curl -u vultr_vm:<password> "https://victoriametrics.vultrlabs.dev/api/v1/query?query=up"

Grafana Configuration

Add VictoriaMetrics as a Prometheus datasource:

  • URL: https://victoriametrics.vultrlabs.dev (with basic auth)
  • In-cluster URL: http://victoriametrics.victoriametrics.svc.cluster.local:8428
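If your Grafana datasources are provisioned from files rather than the UI, the equivalent entry looks roughly like this (datasource name and credential handling are illustrative, adjust to your Grafana setup):

```yaml
# Illustrative Grafana datasource provisioning entry.
apiVersion: 1
datasources:
  - name: VictoriaMetrics (historical)
    type: prometheus
    url: https://victoriametrics.vultrlabs.dev
    access: proxy
    basicAuth: true
    basicAuthUser: vultr_vm
    secureJsonData:
      basicAuthPassword: <password>
```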

Mixed Queries (M3DB + VictoriaMetrics)

Use a Mixed datasource in Grafana to query both:

  1. Create two Prometheus datasources:

    • M3DB → https://m3db.vultrlabs.dev
    • VictoriaMetrics → https://victoriametrics.vultrlabs.dev
  2. Create a Mixed datasource that includes both

  3. In dashboards, use the mixed datasource — Grafana sends the query to both backends and merges results

Alternatively, use dashboard variables to let users toggle between datasources for different time ranges.

Metrics Stored

Metric                         Description
vllm:prompt_tokens_total       vLLM prompt token count
vllm:generation_tokens_total   vLLM generation token count
DCGM_FI_DEV_GPU_UTIL           GPU utilization (DCGM)

All metrics are tagged with tenant=serverless-inference-cluster and cluster=serverless-inference-cluster.
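Those tags can be used as PromQL label selectors when querying. A small sketch of building such a query URL with the standard library (the in-cluster base URL matches the verify step above; labels are as documented here):

```python
from urllib.parse import urlencode

BASE = "http://localhost:8428/api/v1/query"

def promql_url(metric, **labels):
    """Build an instant-query URL with a PromQL label selector."""
    selector = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    query = f"{metric}{{{selector}}}" if selector else metric
    return f"{BASE}?{urlencode({'query': query})}"

url = promql_url("vllm:prompt_tokens_total",
                 tenant="serverless-inference-cluster")
# Fetch with curl or urllib.request.urlopen(url).
```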

VictoriaMetrics API Reference

Endpoint             Purpose
/api/v1/import       Import data (Prometheus format)
/api/v1/export       Export data
/api/v1/query        PromQL instant query
/api/v1/query_range  PromQL range query
/health              Health check
/metrics             Internal metrics

Storage

  • Size: 200Gi NVMe (Vultr Block Storage)
  • StorageClass: vultr-block-storage-vm (Retain policy — data survives PVC deletion)
  • Retention: 2 years
  • Volume expansion: kubectl edit pvc victoriametrics-data -n victoriametrics

Useful Commands

# Check VM health
kubectl -n victoriametrics exec deploy/victoriametrics -- curl -s http://localhost:8428/health

# Check storage stats
kubectl -n victoriametrics exec deploy/victoriametrics -- \
  curl -s 'http://localhost:8428/api/v1/query?query=vm_rows' | python3 -m json.tool

# Query historical data (end must be later than start; here, now)
curl -u vultr_vm:<password> \
  "https://victoriametrics.vultrlabs.dev/api/v1/query_range?query=vllm:prompt_tokens_total&start=1773360000&end=$(date -u +%s)&step=60"

# Restart VM (if needed)
kubectl rollout restart deployment/victoriametrics -n victoriametrics

# Scale to 0 (preserve data, stop the pod)
kubectl scale deployment/victoriametrics --replicas=0 -n victoriametrics

Re-running Backfill

If you need to import additional time ranges or new metrics:

  1. Edit backfill.py — update START_TS, END_TS, or METRICS
  2. Recreate the configmap and pod (see step 4 above)
  3. Imports are effectively idempotent — re-sending the same samples merges them rather than storing duplicates
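The overall shape of such a backfill is a loop over bounded time windows: export a chunk from the source, convert it, import it, advance. A sketch of the chunking logic (backfill.py's actual variable names, chunk size, and HTTP calls may differ):

```python
from datetime import timedelta

CHUNK = int(timedelta(hours=6).total_seconds())  # one request per 6h window

def time_chunks(start_ts, end_ts, chunk=CHUNK):
    """Yield (start, end) unix-timestamp windows covering [start_ts, end_ts)."""
    t = start_ts
    while t < end_ts:
        yield (t, min(t + chunk, end_ts))
        t += chunk

# For each window: query the source (e.g. Mimir's query_range API),
# convert to VictoriaMetrics' import format, POST to /api/v1/import.
windows = list(time_chunks(1773360000, 1773360000 + 86400))
```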

To convert timestamps:

# Date → Unix timestamp
date -u -d '2026-03-13 00:00:00' +%s    # 1773360000

# Unix timestamp → date
date -u -d @1773360000
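The same conversions in Python, which can be handy inside backfill.py itself:

```python
from datetime import datetime, timezone

# Date -> Unix timestamp
ts = int(datetime(2026, 3, 13, tzinfo=timezone.utc).timestamp())  # 1773360000

# Unix timestamp -> date
dt = datetime.fromtimestamp(1773360000, tz=timezone.utc)  # 2026-03-13 00:00 UTC
```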