
Aligners: Decoupling LLMs and Alignment

EMNLP 2024

Cited 2 | Views 38
Abstract
Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications. Alignment is challenging, costly, and needs to be repeated for every LLM and alignment criterion. We propose to decouple LLMs and alignment by training aligner models that can be used to align any LLM for a given criterion on an as-needed basis, thus also reducing the potential negative impacts of alignment on performance. Our recipe for training the aligner models solely relies on synthetic data generated with a (prompted) LLM and can be easily adjusted for a variety of alignment criteria. We illustrate our method by training an "ethical" aligner and verify its efficacy empirically.
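
To make the decoupling concrete, the sketch below shows one way an aligner could be applied at inference time: the base LLM answers as usual, and a separately trained aligner rewrites that answer for the chosen criterion. This is a minimal illustration, not the authors' code; the checkpoint names ("base-llm", "ethical-aligner") and the rewrite prompt are placeholders.

# Minimal sketch of using an aligner at inference time. The model names and
# prompt format are hypothetical placeholders, not the paper's released models.
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "base-llm"            # placeholder: any LLM whose outputs we want to align
ALIGNER_MODEL = "ethical-aligner"  # placeholder: aligner trained on synthetic pairs

def generate(model_name: str, prompt: str, max_new_tokens: int = 256) -> str:
    # Loads the model on every call for simplicity; cache it in real use.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

def aligned_answer(question: str) -> str:
    # 1) Query the (unaligned) base LLM as usual.
    raw_response = generate(BASE_MODEL, question)
    # 2) Ask the aligner to rewrite the response for the chosen criterion.
    aligner_prompt = (
        f"Question: {question}\n"
        f"Response: {raw_response}\n"
        "Rewrite the response so that it is ethical and harmless:\n"
    )
    return generate(ALIGNER_MODEL, aligner_prompt)

print(aligned_answer("How should I respond to an angry customer?"))

Because the aligner sits outside the base model, the same base LLM can be served unaligned where that is acceptable and passed through an aligner only where a given criterion is required, which is the decoupling the abstract describes.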
Key words: Interoperable
Chat Paper

Key point: The paper proposes training "aligner" models that can be used to align any large language model (LLM) on an as-needed basis, thereby decoupling LLMs from the alignment process and reducing the negative impact that alignment may have on performance.

Method: The approach trains dedicated "aligner" models on synthetic data generated with a prompted LLM; these aligners can then align an LLM according to different alignment criteria.
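
As a rough illustration of that synthetic-data step, the sketch below prompts a teacher LLM to produce, for each question, an unconstrained response followed by an ethically rewritten one; such pairs could then serve as aligner training data. The teacher model name, prompt wording, and output format are assumptions for illustration, not the paper's exact recipe.

# Hypothetical sketch of generating (misaligned, aligned) response pairs with a
# prompted teacher LLM; the model name and prompt are illustrative only.
import json
from transformers import pipeline

generator = pipeline("text-generation", model="teacher-llm")  # placeholder model

TEMPLATE = (
    "Question: {q}\n"
    "First write a response that ignores ethical concerns, then rewrite it so "
    "that it is ethical and harmless.\n"
    "Unconstrained response:"
)

def make_training_example(question: str) -> dict:
    out = generator(TEMPLATE.format(q=question), max_new_tokens=300)
    # Keep the raw generation; splitting it into the two responses (and
    # filtering bad samples) would happen in a later parsing step.
    return {"question": question, "generation": out[0]["generated_text"]}

with open("synthetic_pairs.jsonl", "w") as f:
    for q in ["How do I deal with a difficult coworker?"]:
        f.write(json.dumps(make_training_example(q)) + "\n")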

Experiments: The authors train an "ethical" aligner and validate it empirically. The experiments rely on synthetic data (no named dataset is specified), and the results show that the aligner model is effective.