Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Zhenghao Lin, Zihao Tang, Xiao Liu, Yeyun Gong, Yi Cheng, Qi Chen, Hang Li, Ying Xin, Ziyue Yang, Kailai Yang, Yu Yan, Xiao Liang, Shuai Lu, Yiming Huang, Zheheng Luo, Lei Qu, Xuan Feng, Yaoxiang Wang, Yuqing Xia, Feiyang Chen, Yuting Jiang, Yasen Hu, Hao Ni, Binyang Li, Guoshuai Zhao, Jui-Hao Chiang, Zhongxin Guo, Chen Lin, Kun Kuang, Wenjie Li, Yelong Shen, Jian Jiao, Peng Cheng, Mao Yang

CoRR (2025)
