[Doc] Fix typo in documentation (#14783)
Signed-off-by: yasu52 <tsuguro4649@gmail.com>
@@ -1,6 +1,6 @@
 # Reinforcement Learning from Human Feedback
 
-Reinforcement Learning from Human Feedback (RLHF) is a technique that fine-tunes language models using human-generated preference data to align model outputs with desired behaviours.
+Reinforcement Learning from Human Feedback (RLHF) is a technique that fine-tunes language models using human-generated preference data to align model outputs with desired behaviors.
 
 vLLM can be used to generate the completions for RLHF. The best way to do this is with libraries like [TRL](https://github.com/huggingface/trl), [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) and [verl](https://github.com/volcengine/verl).
 
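As a minimal sketch of the generation step the patched doc refers to, the snippet below uses vLLM's offline `LLM` API to sample several candidate completions per prompt, which an RLHF library such as TRL, OpenRLHF, or verl could then score or rank. The model name, prompts, and sampling settings are illustrative assumptions, not part of this commit.

```python
# Sketch: generate candidate completions with vLLM for later preference
# labeling in an RLHF pipeline. Model, prompts, and sampling settings
# are illustrative placeholders.
from vllm import LLM, SamplingParams

prompts = ["Summarize the benefits of RLHF in one sentence."]

# n=4 draws several completions per prompt so they can be compared
# by human annotators or a reward model.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, n=4, max_tokens=128)

llm = LLM(model="facebook/opt-125m")  # any HF model vLLM supports
outputs = llm.generate(prompts, sampling_params)

for request_output in outputs:
    for candidate in request_output.outputs:
        print(candidate.text)
```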