SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models

Zixiang Xu, Yanbo Wang, Yue Huang, Jiayi Ye, Haomin Zhuang, Zirui Song, Lang Gao, Chenxi Wang, Zhaorun Chen, Yujun Zhou, Sixian Li, Wang Pan, Yue Zhao, Jieyu Zhao, Xiangliang Zhang, Xiuying Chen

arXiv (2025)
