From 81eee050181eb7bd1dad8f41d2b06af231ed29db Mon Sep 17 00:00:00 2001
From: biondizzle
Date: Fri, 22 May 2026 17:09:53 +0000
Subject: [PATCH] README: add test harness instructions

---
 README.md | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/README.md b/README.md
index c5552d9f..ec68e7e5 100644
--- a/README.md
+++ b/README.md
@@ -156,6 +156,51 @@ dsv4/
 
 ---
 
+## Test Harness
+
+Scripts in `tests/` for running tests on the B200 (`root@45.76.247.107`):
+
+### `run_test.sh` — Run a test in a screen session
+
+```bash
+# On the B200:
+cd /root/dsv4-nvfp4-workspace/kernel
+bash tests/run_test.sh tests/unit/test_fmha_v3.py
+```
+
+What it does:
+1. Kills any existing `kernel-test` screen and **SIGKILLs all child processes** (handles deadlocked GPU procs that ignore SIGHUP)
+2. Deletes the old log file
+3. Starts a new `screen -dmS kernel-test` running the test
+4. Logs output to `/tmp/kernel-test.log`
+5. Verifies the screen started
+
+### `check_log.sh` — Check test progress
+
+```bash
+bash tests/check_log.sh
+```
+
+Shows the log contents and whether the screen is still running.
+
+### Local → B200 workflow
+
+```bash
+# 1. Edit locally, commit, push
+cd ~/dev/nvfp4-megamoe-kernel
+git add -A && git commit -m "my change" && git push
+
+# 2. SSH to B200, pull, run
+ssh root@45.76.247.107
+cd /root/dsv4-nvfp4-workspace/kernel && git pull
+bash tests/run_test.sh tests/unit/test_fmha_v3_stage_c_full.py
+
+# 3. Check results
+bash tests/check_log.sh
+```
+
+---
+
 ## Stage C: Online Softmax — SINGLE-TILE ONLY
 
 ### What We Have