When the model runs out of tokens while still reasoning (no think-end
token emitted), all generated text lands in the reasoning field and
content stays empty, so the model appears silent to the client.
Streaming fix: yield an extra content delta with the extracted reasoning
text before the finish chunk, so the client can see the output.
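A minimal sketch of the streaming fix, using simplified chunk/delta
shapes (plain dicts); the function name and signature are assumptions,
not the actual serving.py API:

```python
from typing import Iterator

def finalize_stream(
    reasoning_buffer: str,
    content_emitted: bool,
    finish_reason: str,
) -> Iterator[dict]:
    """Yield the trailing chunks when generation stops.

    If the model hit the token limit mid-reasoning and never emitted any
    content, surface the buffered reasoning text as one extra content
    delta before the finish chunk, so the client still sees output.
    """
    if finish_reason == "length" and not content_emitted and reasoning_buffer:
        # Extra content delta carrying the extracted reasoning text.
        yield {"delta": {"content": reasoning_buffer}, "finish_reason": None}
    # Final chunk signalling termination.
    yield {"delta": {}, "finish_reason": finish_reason}
```

Emitting the extra delta before the finish chunk keeps the stream valid
for clients that stop reading once they see a finish_reason.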
Non-streaming fix: move reasoning to content when finish_reason=length
and content is None.
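The non-streaming fix can be sketched the same way; the ChatMessage
fields below (content, reasoning_content) are assumed names modeled on
an OpenAI-style response, not the actual serving.py types:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChatMessage:
    content: Optional[str]
    reasoning_content: Optional[str]

def promote_reasoning_on_truncation(
    message: ChatMessage, finish_reason: str
) -> ChatMessage:
    """Move reasoning text into content when generation was truncated
    (finish_reason == "length") and no visible content was produced."""
    if (
        finish_reason == "length"
        and message.content is None
        and message.reasoning_content
    ):
        message.content = message.reasoning_content
        message.reasoning_content = None
    return message
```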
Also copies the patched serving.py into the image via the Dockerfile.