
STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals

Weihang Su, Yiran Hu, Anzhe Xie, Qingyao Ai, Zibing Que, Ning Zheng, Yun Liu, Weixing Shen, Yiqun Liu

CoRR (2024)

Abstract
Statute retrieval aims to find relevant statutory articles for specific queries. This process is the basis of a wide range of legal applications such as legal advice, automated judicial decisions, legal document drafting, etc. Existing statute retrieval benchmarks focus on formal and professional queries from sources like bar exams and legal case documents, thereby neglecting non-professional queries from the general public, which often lack precise legal terminology and references. To address this gap, we introduce the STAtute Retrieval Dataset (STARD), a Chinese dataset comprising 1,543 query cases collected from real-world legal consultations and 55,348 candidate statutory articles. Unlike existing statute retrieval datasets, which primarily focus on professional legal queries, STARD captures the complexity and diversity of real queries from the general public. Through a comprehensive evaluation of various retrieval baselines, we reveal that existing retrieval approaches all fall short of these real queries issued by non-professional users. The best method only achieves a Recall@100 of 0.907, suggesting the necessity for further exploration and additional research in this area. All the codes and datasets are available at: https://github.com/oneal2000/STARD/tree/main
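
For readers unfamiliar with the Recall@100 figure quoted in the abstract, the following is a minimal sketch of how Recall@k is commonly computed for a retrieval run. It assumes the usual definition (the share of a query's relevant articles found in the top k results, averaged over queries); the abstract does not state the exact averaging protocol used in the paper, and the function names and article IDs below are hypothetical, not taken from the STARD repository.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant articles that appear in the top-k ranked results."""
    if not relevant_ids:
        raise ValueError("each query needs at least one relevant article")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def mean_recall_at_k(runs: List[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Average Recall@k over (ranked list, relevant set) pairs, one pair per query."""
    if not runs:
        raise ValueError("need at least one query")
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Hypothetical toy data: two queries with made-up article IDs.
toy_runs = [
    (["art_1", "art_2", "art_3"], {"art_1", "art_4"}),  # one of two relevant found in top 2
    (["art_7", "art_8"], {"art_8"}),                     # the single relevant one is found
]
toy_score = mean_recall_at_k(toy_runs, k=2)
```

Under this definition, a score of 0.907 at k=100 would mean that, on average, roughly 9% of the relevant statutory articles are still missing from the top 100 candidates returned for a query.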