2025-05: Multi-Turn RL
A new paper, "Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment", joint work with Siliang, Quan, William (Prime Intellect), Oana (Morgan Stanley), and Yuriy Nevmyvaka (Morgan Stanley), is available here. In this work, we show that turn-level credit assignment is critical when training LLMs for multi-turn agent applications. The code is available here.
