[Performance] Add prefetch for checkpoints to OS page cache (#36012)
Signed-off-by: Artem Perevedentsev <aperevedents@nvidia.com>
Committed by GitHub. Parent: 9b005edc48. Commit: f5e59ee7a6
@@ -62,6 +62,9 @@ class LoadConfig:
     This is recommended for models on network filesystems (e.g., Lustre, NFS)
     as it avoids inefficient random reads, significantly speeding up model
     initialization. However, it uses more CPU RAM.
+    - "prefetch": Checkpoint files are read into the OS page cache before
+      workers load them, speeding up the model loading phase. Useful on
+      network or high-latency storage.
     - "torchao": Weights are loaded in upfront and then reconstructed
       into torchao tensor subclasses. This is used when the checkpoint
       was quantized using torchao and saved using safetensors.