FLIRT: Feedback Loop In-context Red Teaming
Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
ICLR 2024
Keywords: Safety, Red-teaming, Generative AI, Adversarial Machine Learning