Open Problems in Mechanistic Interpretability
Lee Sharkey, Bilal Chughtai, Joshua Batson, Jack Lindsey, Jeff Wu, Lucius Bushnaq, Nicholas Goldowsky-Dill, Stefan Heimersheim, Alejandro Ortega, Joseph Bloom, Stella Biderman, Adria Garriga-Alonso, Arthur Conmy, Neel Nanda, Jessica Rumbelow, Martin Wattenberg, Nandi Schoots, Joseph Miller, Eric J. Michaud, Stephen Casper, Max Tegmark, William Saunders, David Bau, Eric Todd, Atticus Geiger, Mor Geva, Jesse Hoogland, Daniel Murfet, Tom McGrath. CoRR (2025)
