
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Qiying Yu, Zheng Zhang, Ruofei Zhu, Yufeng Yuan, Xiaochen Zuo, Yu Yue, Weinan Dai, Tiantian Fan, Gaohong Liu, Lingjun Liu, Xin Liu, Haibin Lin, Zhiqi Lin, Bole Ma, Guangming Sheng, Yuxuan Tong, Chi Zhang, Mofan Zhang, Wang Zhang, Hang Zhu, Jinhua Zhu, Jiaze Chen, Jiangjie Chen, Chengyi Wang, Hongli Yu, Yuxuan Song, Xiangpeng Wei, Hao Zhou, Jingjing Liu, Wei-Ying Ma, Ya-Qin Zhang, Lin Yan, Mu Qiao, Yonghui Wu, Mingxuan Wang

arXiv (2025)

Citations: 0 | Views: 52