

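How to Improve Reinforcement Learning Training Efficiency | The Core Logic of Throughput Matching | Scaling Laws | Rollout | GRPO | PipelineRL | Sandbox | Generator | Trainer | RL Environment | Policy Staleness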

AI Information and Applications

小A


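How to Improve Reinforcement Learning Training Efficiency | The Core Logic of Throughput Matching | Scaling Laws | Rollout | GRPO | PipelineRL | Sandbox | Generator | Trainer | RL Environment | Policy Staleness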