Reasoning Traces
Reasoning Traces
Handling Reasoning Traces in Multi-Turn Conversations
How reasoning flows through a conversation
Turn 1: User sends message
↓
Model generates: <think>reasoning</think> content + tool_calls
↓
vLLM parses into: { reasoning_content, content, tool_calls }
↓
Turn 2: Client appends full assistant message (reasoning_content + content + tool_calls)
Client appends tool result
Client sends updated history
↓
Chat template re-wraps reasoning_content in <think>...</think> during tokenization
↓
Model sees prior chain-of-thought → generates next step correctlyQuick reference: assistant message shape
Field
Required
Notes
Field name: reasoning vs reasoning_content
reasoning vs reasoning_contentLayer
Field name
Direction
Python implementation
Installation
Complete agentic loop
TypeScript implementation
Installation
Complete agentic loop
OpenRouter integration
Debugging upstream requests
Common pitfalls
1. xml_in_reasoning — tool call XML inside reasoning field
xml_in_reasoning — tool call XML inside reasoning field2. reasoning_content ignored on self-hosted vLLM
reasoning_content ignored on self-hosted vLLM3. content: null on assistant tool-call turns
content: null on assistant tool-call turns4. Missing vLLM serving flags
Flag
Purpose
vLLM serving reference
Minimal serving command
Production serving command (example only)
Context length guidance
Last updated

