DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie, Hieu Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc V Le, Tengyu Ma, Adams Wei Yu
NeurIPS 2023
Keywords: language models, pretraining, domain reweighting, data curation