Enhancing LLM Reasoning Via Critique Models with Test-Time and Training-Time Supervision
Zhiheng Xi, Dingwen Yang, Jixuan Huang, Jiafu Tang, Guanyu Li, Yiwen Ding, Wei He, Boyang Hong, Shihan Do, Wenyu Zhan, Xiao Wang, Rui Zheng, Tao Ji, Xiaowei Shi, Yitao Zhai, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Zuxuan Wu, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Yu-Gang Jiang. CoRR (2024)
