David Bau - Publications

Jaden Fiotto-Kaufman, Alexander R Loftus, Eric Todd, Jannik Brinkmann, Caden Juang, Koyena Pal, Can Rager, Aaron Mueller, Samuel Marks, Arnab Sen Sharma, Francesca Lucchetti, Michael Ripa, Adam Belfki, Nikhil Prakash, Sumeet Multani, Carla Brodley, Arjun Guha, Jonathan Bell, Byron Wallace, David Bau. NNsight and NDIF: Democratizing Access to Foundation Model Internals. Proceedings of the 2025 International Conference on Learning Representations. (ICLR 2025)

Samuel Marks, Can Rager, Eric J Michaud, Yonatan Belinkov, David Bau, Aaron Mueller. Sparse feature circuits: Discovering and editing interpretable causal graphs in language models. Proceedings of the 2025 International Conference on Learning Representations. (ICLR 2025 oral)

Koyena Pal, David Bau, Renée J Miller. Model Lakes. Proceedings of the 28th International Conference on Extending Database Technology. (EDBT 2025)

Adam Karvonen, Benjamin Wright, Can Rager, Rico Angell, Jannik Brinkmann, Logan Smith, Claudio Mayrink Verdun, David Bau, Samuel Marks. Measuring progress in dictionary learning for language model interpretability with board game models. Advances in Neural Information Processing Systems 37. (NeurIPS 2024).

Sheridan Feucht, David Atkinson, Byron Wallace, David Bau. Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs. Findings of the Association for Computational Linguistics. (EMNLP 2024)

Arnab Sen Sharma, David Atkinson, David Bau. Locating and Editing Factual Associations in Mamba. Proceedings of the 2024 Conference on Language Modeling. (COLM 2024)

Kenneth Li, Tianle Liu, Naomi Bashkansky, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg. Measuring and Controlling Instruction Instability in Language Model Dialogs. Proceedings of the 2024 Conference on Language Modeling. (COLM 2024)

Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jeremy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell. Black-box Access is Insufficient for Rigorous AI Audits. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. (FAccT 2024)

Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, David Bau. Linearity of Relation Decoding in Transformer Language Models. Proceedings of the 2024 International Conference on Learning Representations. (ICLR 2024 spotlight)

Eric Todd, Millicent Li, Arnab Sen Sharma, Aaron Mueller, Byron C Wallace, David Bau. Function Vectors in Large Language Models. Proceedings of the 2024 International Conference on Learning Representations. (ICLR 2024)

Nikhil Prakash, Tamar Rott Shaham, Tal Haklay, Yonatan Belinkov, David Bau. Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking. Proceedings of the 2024 International Conference on Learning Representations. (ICLR 2024)

Rohit Gandikota, Joanna Materzyńska, Tingrui Zhou, Antonio Torralba, David Bau. Concept Sliders: LoRA adaptors for precise control in diffusion models. Proceedings of the European Conference on Computer Vision. (ECCV 2024)

Maxwell Jones, Sheng-Yu Wang, Nupur Kumari, David Bau, Jun-Yan Zhu. Customizing Text-to-Image Models with a Single Image Pair. SIGGRAPH Asia 2024 Conference Papers. (SIGGRAPH Asia 2024)

Rohit Gandikota, Hadas Orgad, Yonatan Belinkov, Joanna Materzyńska, David Bau. Unified Concept Editing in Diffusion Models. Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision. (WACV 2024)

Koyena Pal, Jiuding Sun, Andrew Yuan, Byron C. Wallace, and David Bau. Future Lens: Anticipating Subsequent Tokens from a Single Hidden State. SIGNLL Conference on Computational Natural Language Learning. (CoNLL 2023)

Sarah Schwettmann, Tamar Rott Shaham, Joanna Materzyńska, Neil Chowdhury, Shuang Li, Jacob Andreas, David Bau, and Antonio Torralba. A Function Interpretation Benchmark for Evaluating Interpretability Methods. Advances in Neural Information Processing Systems 36. (NeurIPS 2023).

Rohit Gandikota, Joanna Materzyńska, Jaden Fiotto-Kaufman, David Bau. Erasing Concepts from Diffusion Models. Proceedings of the 2023 IEEE International Conference on Computer Vision (ICCV 2023).

Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau. Mass-Editing Memory in a Transformer. Eleventh International Conference on Learning Representations. (ICLR 2023 spotlight).

Kenneth Li, Aspen K Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg. Emergent world representations: Exploring a sequence model trained on a synthetic task. Eleventh International Conference on Learning Representations. (ICLR 2023 oral).

Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov. Locating and Editing Factual Associations in GPT. Advances in Neural Information Processing Systems 35. (NeurIPS 2022).

Sheng-Yu Wang, David Bau, Jun-Yan Zhu. Rewriting Geometric Rules of a GAN. ACM Transactions on Graphics (TOG). (SIGGRAPH 2022)

Joanna Materzyńska, Antonio Torralba, David Bau. Disentangling Visual and Written Concepts in CLIP. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (CVPR 2022 oral)

Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili, Antonio Torralba, Jacob Andreas. Natural Language Descriptions of Deep Visual Features. Proceedings of the International Conference on Learning Representations. (ICLR 2022)

Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, and Aleksander Madry. Editing a classifier by rewriting its prediction rules. Advances in Neural Information Processing Systems 34. (NeurIPS 2021)

Emma Andrews, David Bau, and Jeremiah Blanchard. From Droplet to Lilypad: Present and Future of Dual-Modality Environments. 2021 IEEE Symposium on Visual Languages and Human-Centric Computing. (VL/HCC 2021)

Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba. Toward a Visual Concept Vocabulary for GAN Latent Space. Proceedings of the IEEE/CVF International Conference on Computer Vision. (ICCV 2021)

Sheng-Yu Wang, David Bau, and Jun-Yan Zhu. Sketch Your Own GAN. Proceedings of the IEEE/CVF International Conference on Computer Vision. (ICCV 2021)

David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, and Antonio Torralba. Rewriting a Deep Generative Model. Proceedings of the European Conference on Computer Vision. (ECCV 2020 oral)

Lucy Chai, David Bau, Ser-Nam Lim, and Phillip Isola. What makes fake images detectable? Understanding properties that generalize. Proceedings of the European Conference on Computer Vision. (ECCV 2020)

Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, and Antonio Torralba. Diverse Image Generation via Self-Conditioned GANs. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (CVPR 2020)

David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, and Antonio Torralba. Seeing What a GAN Cannot Generate. Proceedings of the IEEE International Conference on Computer Vision, pp. 4502-4511. (ICCV 2019 oral presentation)

David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, and Antonio Torralba. Semantic Photo Manipulation with a Generative Image Prior. ACM Transactions on Graphics (TOG) 38, no. 4. (SIGGRAPH 2019)

Didac Suris, Adria Recasens, David Bau, David Harwath, James Glass, and Antonio Torralba. Learning words by drawing images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (CVPR 2019)

David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, and Antonio Torralba. GAN Dissection: Visualizing and Understanding Generative Adversarial Networks. Proceedings of the Seventh International Conference on Learning Representations. (ICLR 2019)

David Weintrop, David Bau, and Uri Wilensky. The cloud is the limit: A case study of programming on the web, with the web. International Journal of Child-Computer Interaction 20. (IJCCI 2019)

Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, Lalana Kagal. Explaining Explanations: An Overview of Interpretability of Machine Learning. Proceedings of the IEEE 5th International Conference on Data Science and Advanced Analytics. (DSAA 2018)

Bolei Zhou, Yiyou Sun, David Bau, and Antonio Torralba. Interpretable Basis Decomposition for Visual Explanation. Proceedings of the European Conference on Computer Vision. (ECCV 2018)

David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, Antonio Torralba. Network Dissection: Quantifying Interpretability of Deep Visual Representations. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017 oral presentation)

David Bau, Matt Dawson, Anthony Bau, C.S. Pickens. Pencil Code: Block Code for a Text World. Proceedings of the 14th International Conference on Interaction Design and Children, pp. 445-448. (IDC 2015)

Ming Zhao, Jay Yagnik, Hartwig Adam, David Bau. Large Scale Learning and Recognition of Faces in Web Videos. 8th IEEE International Conference on Automatic Face and Gesture Recognition. (FG 2008)

David Bau, Induprakas Kodukula, Vladimir Kotlyar, Keshav Pingali, Paul Stodghill. Solving Alignment Using Elementary Linear Algebra. Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science Volume 892, pp 46-60. (LCPC 1994)

Journal Articles

Grace W. Lindsay and David Bau. Testing methods of neural systems understanding. Cognitive Systems Research (2023): 101156.

David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, and Antonio Torralba. Understanding the role of individual units in a deep neural network. Proceedings of the National Academy of Sciences (PNAS), Volume 117, no. 48, December 1 2020, pp. 30071-30078.

David Bau, Bolei Zhou, Aude Oliva, Antonio Torralba. Interpreting Deep Visual Representations via Network Dissection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) Volume 41 Issue 9, September 2019, pp. 2131-2145.

David Bau, Jeff Gray, Caitlin Kelleher, Josh Sheldon, Franklyn Turbak. Learnable Programming: Blocks and Beyond. Communications of the ACM (CACM) Volume 60 Issue 6, June 2017. pp. 72-80.

Workshop Papers

Sheridan Feucht, Byron C Wallace, David Bau. Inducing Induction in Llama via Linear Probe Interventions. The 7th BlackboxNLP Workshop at EMNLP (BlackboxNLP 2024).

Nicholas Vincent, David Bau, Sarah Schwettmann, Joshua Tan. An Alternative to Regulation: The Case for Public AI. Regulatable ML workshop at NeurIPS (RegML 2023, NeurIPS 2023 Workshop).

Silen Naihin, David Atkinson, Marc Green, Merwane Hamadi, Craig Swift, Douglas Schonholtz, Adam Tauman Kalai, David Bau. Testing Language Model Agents Safely in the Wild. Socially Responsible Language Modelling Research workshop at NeurIPS (SoLaR 2023, NeurIPS 2023 Workshop).

Xander Davies, Max Nadeau, Nikhil Prakash, Tamar Rott Shaham, David Bau. Discovering Variable Binding Circuitry with Desiderata. Workshop on Challenges in Deployable Generative AI (ICML 2023 Workshop)

David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba. Horses With Blue Jeans - Creating New Worlds by Rewriting a GAN. 4th Workshop on Machine Learning for Creativity and Design. (NeurIPS 2020 Workshop)

David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, and Antonio Torralba. Inverting Layers of a Large Generator. ICLR Debugging Machine Learning Models Workshop. (ICLR 2019 workshop)

Jonathan Frankle, David Bau. Dissecting Pruned Neural Networks. ICLR Debugging Machine Learning Models Workshop. (ICLR 2019 workshop)

Saksham Aggarwal, David Anthony Bau, David Bau. A blocks-based editor for HTML code. IEEE Blocks and Beyond Workshop, pp. 83-85. (VL/HCC 2015 workshop)

David Bau, Anthony Bau. A Preview of Pencil Code: A Tool for Developing Mastery of Programming. Proceedings of the 2nd Workshop on Programming for Mobile & Touch. (PROMOTO 2014)

Book

Lloyd N. Trefethen, David Bau. Numerical Linear Algebra. (373pp.) Society for Industrial and Applied Mathematics. (1997)