Eugene Wu

Bio

Eugene Wu is an associate professor of computer science at Columbia University. He is broadly interested in technologies that help users play with their data. His goal is for users at all technical levels to effectively and quickly make sense of their information. He is interested in solutions that ultimately improve the interface between users and data, and uses techniques borrowed from fields such as data management, systems, crowdsourcing, visualization, and HCI. Eugene Wu received his Ph.D. from MIT, B.S. from Cal, and was a postdoc in the AMPLab. A profile, an obit.

Eugene Wu has received the VLDB 2018 10-year test of time award, best-of-conference citations at ICDE and VLDB, the SIGMOD 2016 best demo award, the NSF CAREER, and the Google, Adobe, and Amazon faculty awards.

Info

ewu@cs.columbia.edu
421 Mudd, 500 W 120th St
Twitter: @sirrice
BSky: @eugenewu.net
Github: sirrice, cudbg
OH: Tues 10-11AM EST 421 Mudd
CV
Blog

Co-Chair: Data, Media & Society
Co-Director: DAPLab
Advisor: CS+Journalism MS Dual Degree
Member: Columbia DB, CS, DSI


Joining The Lab

I am broadly interested in problems where academic research has a competitive advantage. Currently, I’m working on three classes of projects:

I strongly suggest PhD (and Postdoc) applicants read The PhD Application from a Faculty’s Perspective.

  • PhDs: read the lab’s work, provide evidence you can conduct research in the lab, and include “bananas” in the subject line.
  • Postdocs: how can you best make use of my expertise? What’s a project we could work on? Include “satsuma” in the subject line.
  • Interns + UGrad + Masters: Fill out this form

All Publications

  1. Suna: Scalable Causal Confounder Discovery over Relational Data
    Jiaxiang Liu, Eugene Wu
    In Review 2025
  2. Jade: Design Independence Via Physical Visualization Design
    Yiru Chen, Xupeng Li, Jeff Tao, Lana Ramjit, Ravi Netravali, Subrata Mitra, Aditya Parameswaran, Javad Ghaderi, Dan Rubenstein, Eugene Wu
    In Review 2025
  3. Database Theory + X: Database Visualization
    Eugene Wu
    ICDT Database Theory + X 2025
  4. The Fast and the Private: Task-based Dataset Search
    Zezhou Huang, Jiaxiang Liu, Haonan Wang, Eugene Wu
    CIDR 2024 Slides
  5. DIG: The Data Interface Grammar
    Yiru Chen, Jeffrey Tao, Eugene Wu
    HILDA at SIGMOD 2023
  6. PI2: Generating Visual Analysis Interfaces From Queries
    Yiru Chen, Eugene Wu
    SIGMOD 2022
  7. View Composition Algebra for Ad Hoc Comparisons
    Eugene Wu
    TVCG 2022
  8. Continuous Prefetch for Interactive Data Applications
    Haneen Mohammed, Ziyun Wei, Ravi Netravali, Eugene Wu
    VLDB 2020 Talk Video Blogpost
  9. Complaint-driven Training Data Debugging for Query 2.0
    Young Wu, Lampros Flokas, Jiannan Wang, Eugene Wu
    SIGMOD 2020 Talk Video Blogpost
  10. DeepBase: Deep Inspection of Neural Networks
    Thibault Sellam, Kevin Lin, Ian Yiran Huang, Michelle Yang, Carl Vondrick, Eugene Wu
    SIGMOD 2019
  11. Ten Years of Web Tables
    Michael Cafarella, Alon Halevy, Daisy Zhe Wang, Hongrae Lee, Jayant Madhavan, Cong Yu, Eugene Wu
    PVLDB 2018 Invited Paper
  12. A "Probabilistic" Model of Research
    Eugene Wu
    Blog Post 2018
  13. Smoke: Fine-grained Lineage at Interactive Speeds
    Fotis Psallidas, Eugene Wu
    VLDB 2018
  14. ActiveClean: Interactive Data Cleaning While Learning Convex Loss Models
    Sanjay Krishnan, Jiannan Wang, Eugene Wu, Michael J. Franklin, Ken Goldberg
    Arxiv 2016
  15. Explaining Data in Visual Analytic Systems
    Eugene Wu
    Doctoral Thesis 2015
  16. The Case for Data Visualization Management Systems
    Eugene Wu, Leilani Battle, Samuel Madden
    VLDB 2014
  17. Scorpion: Explaining Away Outliers in Aggregate Queries
    Eugene Wu, Samuel Madden
    VLDB 2013 (Best-of) Slides
  18. Human-powered Sorts and Joins
    Adam Marcus, Eugene Wu, David Karger, Samuel Madden, Robert Miller
    VLDB 2012
  19. High-performance complex event processing over streams
    Eugene Wu, Yanlei Diao, Shariq Rizvi
    SIGMOD 2006