Majorana Demonstrator Data Release for AI/ML Applications
Computing Research Repository (CoRR), 2023
Pacific Northwest National Laboratory | Oak Ridge National Laboratory Department of Physics and Astronomy | National Research Center "Kurchatov Institute" Kurchatov Complex of Theoretical and Experimental Physics | University of South Dakota Department of Physics | Triangle Universities Nuclear Laboratory Department of Physics and Astronomy | North Carolina State University Department of Physics | Duke University Department of Physics | University of Washington Department of Physics | Lawrence Berkeley National Laboratory Nuclear Science Division | South Dakota Mines | Los Alamos National Laboratory | Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas | University of Tennessee Department of Physics and Astronomy | Osaka University Research Center for Nuclear Physics | Indiana University IU Center for Exploration of Energy and Matter | Williams College Physics Department | Oak Ridge National Laboratory | Tennessee Tech University | Queen's University Department of Physics | Technische Universität Physik Department and Excellence Cluster Universe | University of South Carolina Department of Physics and Astronomy | Joint Institute for Nuclear Research
- Pretraining has recently driven rapid progress in natural language processing (NLP).
- We show that M6 outperforms the baselines on multimodal downstream tasks, and that the large M6 with 10 billion parameters reaches even better performance.
- We propose a method called M6 that can process information from multiple modalities and perform both single-modal and cross-modal understanding and generation (see the sketch after this list).
- The model is scaled up to 10 billion parameters with sophisticated deployment, and the 10-billion-parameter M6-large is the largest pretrained model in Chinese.
- Experimental results show that the proposed M6 outperforms the baseline on a number of downstream tasks involving both single and multiple modalities. We will continue pretraining extremely large models on more data to explore the limits of their performance.
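The unified-backbone idea in the bullets above can be illustrated with a minimal, purely hypothetical sketch. Nothing below comes from M6 or from the Majorana Demonstrator data release; the class and parameter names are invented, and the snippet only shows one way a single transformer encoder could accept text tokens, image-patch features, or both as one shared sequence.

```python
# Hypothetical sketch of a unified multimodal encoder (not the M6 implementation).
import torch
import torch.nn as nn

class UnifiedMultimodalEncoder(nn.Module):
    def __init__(self, vocab_size=30000, d_model=256, n_heads=4, n_layers=2, patch_dim=768):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, d_model)   # text token ids -> vectors
        self.image_proj = nn.Linear(patch_dim, d_model)       # image-patch features -> same space
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, text_ids=None, image_patches=None):
        parts = []
        if image_patches is not None:                          # cross-modal: prepend visual tokens
            parts.append(self.image_proj(image_patches))
        if text_ids is not None:                               # single-modal: text tokens only
            parts.append(self.text_embed(text_ids))
        x = torch.cat(parts, dim=1)                            # one shared sequence for all modalities
        return self.encoder(x)                                 # contextualized features for any task head

# Text-only (single-modal) and image+text (cross-modal) inputs share the same backbone.
enc = UnifiedMultimodalEncoder()
text = torch.randint(0, 30000, (1, 16))                        # 16 token ids
patches = torch.randn(1, 49, 768)                              # 49 patch features (e.g., a 7x7 grid)
print(enc(text_ids=text).shape)                                # torch.Size([1, 16, 256])
print(enc(text_ids=text, image_patches=patches).shape)         # torch.Size([1, 65, 256])
```

Because text-only and image-plus-text inputs pass through the same encoder weights, one pretrained backbone can in principle serve both single-modal and cross-modal understanding tasks; generation would add a decoder head on top of these features.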
