youkaichao
|
ebf778061d
|
monitor metrics of tokens per step using cudagraph batchsizes (#11031)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-09 22:35:36 -08:00 |
|
tomeras91
|
7c32b6861e
|
[Frontend] correctly record prefill and decode time metrics (#10853)
Signed-off-by: Tomer Asida <tomera@ai21.com>
|
2024-12-03 19:13:31 +00:00 |
|
cduk
|
b7954776fd
|
[core] Avoid metrics log noise when idle - include speculative decodi… (#10809)
|
2024-12-02 01:49:48 +00:00 |
|
Russell Bryant
|
efa9084628
|
[Core] Avoid metrics log noise when idle (#8868)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-19 21:05:25 +00:00 |
|
Travis Johnson
|
272e31c0bd
|
[Bugfix] Guard for negative counter metrics to prevent crash (#10430)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
|
2024-11-19 04:57:10 +00:00 |
|
harrywu
|
874f551b36
|
[Metrics] add more metrics (#4464)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-12 00:17:38 +08:00 |
|
tomeras91
|
ac04a97a9f
|
[Frontend] Add max_tokens prometheus metric (#9881)
Signed-off-by: Tomer Asida <tomera@ai21.com>
|
2024-11-04 22:53:24 +00:00 |
|
Kunjan
|
0ad216f575
|
[MISC] Set label value to timestamp over 0, to keep track of recent history (#9777)
Signed-off-by: Kunjan Patel <kunjanp@google.com>
|
2024-10-29 19:52:19 +00:00 |
|
科英
|
74fc2d77ae
|
[Misc] Add metrics for request queue time, forward time, and execute time (#9659)
|
2024-10-29 10:32:56 -07:00 |
|
Kunjan
|
9bb10a7d27
|
[MISC] Add lora requests to metrics (#9477)
Co-authored-by: Kunjan Patel <kunjanp_google_com@vllm.us-central1-a.c.kunjanp-gke-dev-2.internal>
|
2024-10-18 20:50:18 +00:00 |
|
Russell Bryant
|
776dbd74f1
|
[CI/Build] mypy: Resolve some errors from checking vllm/engine (#9267)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-10-16 22:55:59 +00:00 |
|
youkaichao
|
cbc2ef5529
|
[misc] hide best_of from engine (#9261)
Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>
|
2024-10-10 21:30:44 -07:00 |
|
Cody Yu
|
3ac50b47d0
|
[MISC] Add prefix cache hit rate to metrics (#7606)
|
2024-08-19 11:52:07 -07:00 |
|
Robert Shaw
|
e3b318216d
|
[ Bugfix ] Fix Prometheus Metrics With zeromq Frontend (#7279)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2024-08-18 20:19:48 +00:00 |
|
Thomas Parnell
|
2f808e69ab
|
[Bugfix] StatLoggers: cache spec decode metrics when they get collected. (#6645)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2024-07-23 23:05:05 +00:00 |
|
Antoni Baum
|
5f0b9933e6
|
[Bugfix] Fix Ray Metrics API usage (#6354)
|
2024-07-17 19:40:10 +00:00 |
|
Cody Yu
|
160e1d8c99
|
[Misc] Log spec decode metrics (#6454)
|
2024-07-16 20:37:10 +00:00 |
|
Thomas Parnell
|
7508a3dc34
|
[Misc] Fix typos in spec. decode metrics logging. (#6470)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2024-07-16 13:55:15 +00:00 |
|
sroy745
|
80ca1e6a3a
|
[Speculative Decoding 2/2 ] Integrate typical acceptance sampler into Spec Decode Worker (#5348)
|
2024-07-01 00:33:05 -07:00 |
|
William Lin
|
906a19cdb0
|
[Misc] Extend vLLM Metrics logging API (#5925)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
|
2024-06-29 10:36:06 +08:00 |
|
Cyrus Leung
|
0e9164b40a
|
[mypy] Enable type checking for test directory (#5017)
|
2024-06-15 04:45:31 +00:00 |
|
SangBin Cho
|
e7c46b9527
|
[Scheduler] Warning upon preemption and Swapping (#4647)
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
|
2024-05-13 23:50:44 +09:00 |
|
Robert Shaw
|
4dc8026d86
|
[Bugfix] Fix 307 Redirect for /metrics (#4523)
|
2024-05-01 09:14:13 -07:00 |
|
Ronen Schaffer
|
bf480c5302
|
Add more Prometheus metrics (#2764)
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
|
2024-04-28 15:59:33 -07:00 |
|
Roy
|
87f545ba6f
|
[Misc] Fix logger format typo (#4396)
|
2024-04-27 13:45:02 +08:00 |
|
SangBin Cho
|
a88081bf76
|
[CI] Disable non-lazy string operation on logging (#4326)
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
|
2024-04-26 00:16:58 -07:00 |
|
Cade Daniel
|
62b8aebc6f
|
[Speculative decoding 7/9] Speculative decoding end-to-end correctness tests. (#3951)
|
2024-04-23 08:02:36 +00:00 |
|
zspo
|
ec8e3c695f
|
[Bugfix] fix_log_time_in_metrics (#4050)
|
2024-04-13 07:52:36 -07:00 |
|
Michael Feil
|
c2b4a1bce9
|
[Doc] Add typing hints / mypy types cleanup (#3816)
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
|
2024-04-11 17:17:21 -07:00 |
|
SangBin Cho
|
01bfb22b41
|
[CI] Try introducing isort. (#3495)
|
2024-03-25 07:59:47 -07:00 |
|
Zhuohan Li
|
2f8844ba08
|
Re-enable the 80 char line width limit (#3305)
|
2024-03-10 19:49:14 -07:00 |
|
Allen.Dou
|
29e70e3e88
|
allow user chose log level by --log-level instead of fixed 'info'. (#3109)
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2024-03-01 23:28:41 +00:00 |
|
Allen.Dou
|
9289e577ec
|
add cache_config's info to prometheus metrics. (#3100)
|
2024-02-29 06:15:18 +00:00 |
|
Harry Mellor
|
ef978fe411
|
Port metrics from aioprometheus to prometheus_client (#2730)
|
2024-02-25 11:54:00 -08:00 |
|
Robert Shaw
|
93b38bea5d
|
Refactor Prometheus and Add Request Level Metrics (#2316)
|
2024-01-31 14:58:07 -08:00 |
|
Simon Mo
|
5313c2cb8b
|
Add Production Metrics in Prometheus format (#1890)
|
2023-12-02 16:37:44 -08:00 |
|