# SmolLM3-3B LoRA — Tool Calling Fine-Tune

LoRA adapter training to make SmolLM3-3B a tool-calling savant.
## Quick Start

```bash
# Build
docker build -t smollora .

# Run full pipeline (prepare data + train)
docker run --gpus all \
  -v /path/on/host/output:/data/lora-output \
  smollora

# Skip data prep if you already have processed data
docker run --gpus all \
  -e SKIP_PREP=1 \
  -v /path/on/host/processed:/data/processed \
  -v /path/on/host/output:/data/lora-output \
  smollora
```
## Environment Variables

| Var | Default | Description |
|---|---|---|
| `MODEL` | `HuggingFaceTB/SmolLM3-3B` | Base model (HF repo or local path) |
| `DATA_DIR` | `/data/processed` | Processed data directory |
| `OUTPUT_DIR` | `/data/lora-output` | Training output directory |
| `EPOCHS` | `3` | Training epochs |
| `BATCH_SIZE` | `4` | Per-device batch size |
| `LR` | `2e-4` | Learning rate |
| `LORA_R` | `16` | LoRA rank |
| `MAX_LENGTH` | `4096` | Max sequence length |
| `SKIP_PREP` | `0` | Set to `1` to skip data preparation |
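The training entrypoint reads these variables at startup. A minimal sketch of how that parsing might look (the `env` helper is illustrative, not the actual script; variable names and defaults come from the table above):

```python
import os

def env(name, default, cast=str):
    """Read one hyperparameter from the environment, falling back to the table's default."""
    return cast(os.environ.get(name, default))

epochs = env("EPOCHS", "3", int)          # e.g. docker run -e EPOCHS=5 ...
batch_size = env("BATCH_SIZE", "4", int)
lr = env("LR", "2e-4", float)
lora_r = env("LORA_R", "16", int)
max_length = env("MAX_LENGTH", "4096", int)
skip_prep = env("SKIP_PREP", "0") == "1"  # flag-style: anything but "1" means run prep
```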
## Datasets

Three datasets are combined and converted to SmolLM3's native token format:

- interstellarninja/tool-calls-multiturn — Multi-turn tool-calling conversations
- NousResearch/Hermes-Function-Calling-V1 — Hermes-format function calling
- Salesforce/xLAM-function-calling-60k — Large-scale function calling (60k samples)

Only conversations containing tool calls are kept. All are normalized to SmolLM3's special tokens:
- Tool calls → `startPos`/`endPos` (token IDs 128002/128016)
- Tool responses → `eni`/`eni_result` (token IDs 128013/128014)
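The "keep only conversations containing tool calls" step can be sketched as below. The `messages`/`role`/`tool_calls` schema is an assumption about the normalized intermediate format, not the raw fields of the three datasets:

```python
def keep_tool_call_conversations(conversations):
    """Drop any conversation that never issues a tool call.

    Assumes each conversation has been normalized to
    {"messages": [{"role": ..., "content": ...}, ...]}, with a tool call
    appearing as an assistant turn carrying a "tool_calls" field
    (hypothetical schema for illustration).
    """
    def has_tool_call(conv):
        return any(m.get("tool_calls") for m in conv["messages"])

    return [c for c in conversations if has_tool_call(c)]
```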
## LoRA Configuration
- Rank: 16
- Alpha: 32
- Target modules: q/k/v/o projections + gate/up/down MLP
- Dropout: 0.05
- Scheduler: Cosine with 3% warmup
- Optimizer: AdamW (fused)
- Gradient checkpointing: Enabled
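In code, the settings above map onto a PEFT-style LoRA config. This plain dict mirrors the listed hyperparameters; the keyword names match `peft.LoraConfig`, and the module names assume SmolLM3 uses Llama-style projection naming (an assumption, not confirmed by this README):

```python
# Mirrors the hyperparameters listed above; in the training script these
# would be passed to peft.LoraConfig (assumed).
lora_hparams = {
    "r": 16,             # LORA_R
    "lora_alpha": 32,    # effective scaling = alpha / r = 2.0
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP
    ],
}
```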
## Output

The trained adapter is saved to `$OUTPUT_DIR/final/`. To use with vLLM:
```python
# Merge adapter into base model (recommended for vLLM)
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM3-3B")
merged = PeftModel.from_pretrained(base, "/data/lora-output/final").merge_and_unload()
merged.save_pretrained("/data/lora-output/merged")
```

Alternatively, pass the adapter path directly to vLLM with `--enable-lora`.
## SSH Deployment

```bash
# On GPU box, after SSH-ing in:
docker run --gpus all -v ~/smol-data:/data smollora

# Or with local model cache:
docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -v ~/smol-data:/data \
  smollora
```