ChatPaper: AI Reads Science

57,300,455 Researchers · 310,374,719 Publications · 8,934,850 Concepts · 2,220,206,101 Citations
Enter keywords and let AI filter and summarize the latest papers.
The following are popular recommendations; they become more accurate once you add subscriptions.
Topic
Hardware-Aligned and Natively Trainable Sparse Attention
The latest paper from DeepSeek introduces a new attention mechanism, NSA, a natively trainable sparse attention mechanism for ultra-fast long-context training and inference (a toy sketch of the block-selection idea follows below).
YiFan Zhang, Shanglin Lei, Runqi Qiao, Zhuoma GongQue, Xiaoshuai Song, Guanting Dong, Qiuna Tan, Zhe Wei, Peiqing Yang, Ye Tian, Yadong Xue, Xiaofei Wang, et al.
CoRR (2024)
Cited 0 · Views 9797 · 3.5 Star
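As a rough illustration of the selection idea named in the title, here is a minimal NumPy sketch of blockwise top-k sparse attention. The mean-pooled block gate, block size, and top-k values are illustrative assumptions; the paper's learned compression and hardware-aligned kernels do not appear here.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def blockwise_topk_attention(q, K, V, block_size=4, top_k=2):
    """One query attends only within its top-k key/value blocks."""
    n, d = K.shape
    n_blocks = n // block_size
    Kb = K[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    Vb = V[: n_blocks * block_size].reshape(n_blocks, block_size, d)

    # Gate: score each block by its mean-pooled key (a stand-in for
    # NSA's learned block compression).
    block_scores = Kb.mean(axis=1) @ q          # (n_blocks,)
    chosen = np.argsort(block_scores)[-top_k:]  # top-k block indices

    # Exact attention restricted to the selected blocks.
    Ks = Kb[chosen].reshape(-1, d)
    Vs = Vb[chosen].reshape(-1, d)
    return softmax(Ks @ q / np.sqrt(d)) @ Vs

# Toy usage: 16 keys/values of dim 8, one query.
rng = np.random.default_rng(0)
q, K, V = rng.standard_normal(8), rng.standard_normal((16, 8)), rng.standard_normal((16, 8))
print(blockwise_topk_attention(q, K, V).shape)  # (8,)
```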
Computing Research Repository (2024)
Cited 7 · Views 1720 · 3.5 Star
Topic
Mixture of Block Attention for Long-Context LLMs
Kimi proposed a new attention mechanism, MoBA, which applies Mixture-of-Experts (MoE) principles to attention and improves the efficiency of LLMs in long-context scenarios without sacrificing performance (a toy sketch of the routing idea follows below).
Minghao Xu, Lichuan Xiang, Xu Cai, Hongkai Wen
CoRR (2024)
Cited 2 · Views 1818 · 3.5 Star
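A minimal sketch of the block-routing idea, assuming a simple mean-pooled gate in place of MoBA's actual router: each query is dispatched, MoE-style, to its top-k key/value blocks and attends only there. Causal masking and the fused kernels of the real implementation are omitted.

```python
import numpy as np

def moba_attention(Q, K, V, block_size=4, top_k=2):
    """Route each query, MoE-style, to its top-k key/value blocks."""
    m, d = Q.shape
    n_blocks = K.shape[0] // block_size
    Kb = K[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    Vb = V[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    reps = Kb.mean(axis=1)                         # (n_blocks, d) gate keys (assumed)

    gate = Q @ reps.T                              # (m, n_blocks) router scores
    chosen = np.argsort(gate, axis=1)[:, -top_k:]  # top-k blocks per query

    out = np.empty_like(Q)
    for i in range(m):                             # attend only inside chosen blocks
        Ks = Kb[chosen[i]].reshape(-1, d)
        Vs = Vb[chosen[i]].reshape(-1, d)
        s = Ks @ Q[i] / np.sqrt(d)
        w = np.exp(s - s.max()); w /= w.sum()
        out[i] = w @ Vs
    return out

# Toy usage: 3 queries over 16 keys/values of dim 8.
rng = np.random.default_rng(1)
Q, K, V = rng.standard_normal((3, 8)), rng.standard_normal((16, 8)), rng.standard_normal((16, 8))
print(moba_attention(Q, K, V).shape)               # (3, 8)
```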
Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen, Nathan Cooper, Griffin Adams, et al.
CoRR (2024)
Cited 64 · Views 1306 · 3.5 Star
Frank F. Xu, Yufan Song, Boxuan Li, Yuxuan Tang, Kritanjali Jain, Mengxue Bao, Zora Z. Wang, Xuhui Zhou, Zhitong Guo, Murong Cao, Mingyang Yang, Hao Yang Lu, et al.
Computing Research Repository (2024)
Cited 19 · Views 1126 · 3.5 Star
Popular Recommendations
Most Viewed Papers & Topics
This paper introduces SparQ Attention, a technique that significantly reduces the memory-bandwidth requirements of generative large language models during inference, thereby improving LLM inference throughput (a toy sketch of the two-pass idea follows below).
Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
CoRR (2023)
Cited 0 · Views 10351 · 3.5 Star
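A minimal sketch of the bandwidth-saving idea, with assumed names r (query coordinates read in the first pass) and top_k (keys fetched in full): approximate the attention scores from the largest-magnitude query components, then run exact attention over only the top-scoring keys. The paper's mean-value correction for the dropped keys is omitted here.

```python
import numpy as np

def sparq_attention(q, K, V, r=4, top_k=8):
    """Two-pass attention: approximate scores cheaply, fetch few keys fully."""
    d = q.shape[0]
    # Pass 1: read only r columns of K, chosen by the largest |q| components.
    idx = np.argsort(np.abs(q))[-r:]
    approx = K[:, idx] @ q[idx] / np.sqrt(d)   # cheap score estimate
    keep = np.argsort(approx)[-top_k:]         # keys worth fetching in full

    # Pass 2: exact attention over the small fetched subset.
    s = K[keep] @ q / np.sqrt(d)
    w = np.exp(s - s.max()); w /= w.sum()
    return w @ V[keep]

# Toy usage: one query over 64 keys/values of dim 32.
rng = np.random.default_rng(2)
q, K, V = rng.standard_normal(32), rng.standard_normal((64, 32)), rng.standard_normal((64, 32))
print(sparq_attention(q, K, V).shape)          # (32,)
```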
Scaling up vision models has become a practical way to obtain more powerful visual representations. But is "bigger" always "better"? This paper examines the respects in which larger vision models may not be necessary.
Baifeng Shi, Ziyang Wu, Maolin Mao, Xin Wang, Trevor Darrell
arXiv (2024)
Cited 0 · Views 8983 · 5.0 Star
Ziyin Zhang, Chaoyu Chen, Bingchang Liu, Cong Liao, Zi Gong, Hang Yu, Jianguo Li, Rui Wang
CoRR (2023)
Cited 4 · Views 18950 · 4.5 Star
Minghua Liu, Ruoxi Shi, Linghao Chen, Zhuoyang Zhang, Chao Xu, Xinyue Wei, Hansheng Chen, Chong Zeng, Jiayuan Gu, Hao Su
CVPR 2024 (2023)
Cited 41 · Views 7323 · 4.3 Star
CoRR (2023)
Cited 15 · Views 5057 · 4.0 Star
Hongxuan Zhang, Zhining Liu, Jiaqi Zheng, Chenyi Zhuang, Jinjie Gu, Guihai Chen
CoRR (2023)
Cited 0 · Views 3692 · 3.5 Star
