Muon is Scalable for LLM Training
Jingyuan Liu, Jianlin Su, Xingcheng Yao, Zhejun Jiang, Guokun Lai, Yulun Du, Yidao Qin, Weixin Xu, Enzhe Lu, Junjie Yan, Yanru Chen, Huabin Zheng, Yibo Liu, Shaowei Liu, Bohong Yin, Weiran He, Han Zhu, Yuzhi Wang, Jianzhou Wang, Mengnan Dong, Zheng Zhang, Yongsheng Kang, Hao Zhang, Xinran Xu, Yutao Zhang, Yuxin Wu, Xinyu Zhou, Zhilin Yang
CoRR (2025)
