Enhancing Reinforcement Learning with Dense Rewards from Language Model Critic
Meng Cao, Lei Shu, Lei Yu, Yun Zhu, Nevan Wichers, Yinxiao Liu, Lei Meng
EMNLP 2024 (2024)