-
Suna: Scalable Causal Confounder Discovery over Relational Data
Jiaxiang Liu, Eugene Wu
In Review 2025
-
Jade: Design Independence Via Physical Visualization Design
Yiru Chen, Xupeng Li, Jeff Tao, Lana Ramjit, Ravi Netravali, Subrata Mitra, Aditya Parameswaran, Javad Ghaderi, Dan Rubenstein, Eugene Wu
In Review 2025
-
DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
Shreya Shankar, Aditya Parameswaran, Eugene Wu
arXiv 2024
-
Where Does Database Research Go From Here?
Eugene Wu
SIGMOD Blog 2024
-
Kitana: A Data-as-a-Service Platform
Zachary Huang, Pranav Subramaniam, Raul Fernandez, Eugene Wu
In Review 2023
-
Database Theory + X: Database Visualization
Eugene Wu
ICDT Database Theory + X 2025
-
Design-Specific Transformations in Visualization
Eugene Wu, Remco Chang
arXiv 2024
-
Data-Centric Text-to-SQL with Large Language Models
Zezhou Huang, Shuo Zhang, Kechen Liu, Eugene Wu
Table Representation Learning at NeurIPS 2024
-
DynoClass: A Dynamic Table-Class Detection System Without the Need for Predefined Ontologies
Haonan Wang, Kechen Liu, Jiaxiang Liu, Eugene Wu
Table Representation Learning at NeurIPS 2024
-
Transform Table to Database Using Large Language Models
Zezhou Huang, Jia Guo, Eugene Wu
Tabular Data Analysis (TaDA) Workshop at VLDB 2024
-
SPADE: Synthesizing Data Quality Assertions for Large Language Model Pipelines
Shreya Shankar, Haotian Li, Parth Asawa, Madelon Hulsebos, Yiming Lin, J.D. Zamfirescu-Pereira, Harrison Chase, Will Fu-Hinthorn, Aditya G. Parameswaran, Eugene Wu
VLDB 2025
-
Cocoon: Semantic Table Profiling Using Large Language Models
Zachary Huang, Eugene Wu
HILDA Workshop at SIGMOD 2024
-
SET: Searching Effective Supervised Learning Augmentations in Large Tabular Data Repositories
Jerry Liu, Zachary Huang, Eugene Wu
GUIDEAI Workshop at SIGMOD 2024
-
Lightweight Materialization for Fast Dashboards Over Joins
Zachary Huang, Eugene Wu
SIGMOD 2024
-
Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL
Zezhou Huang, Pavan Kalyan, Eugene Wu
Table Representation Learning Workshop at NeurIPS 2023
-
The Fast and the Private: Task-based Dataset Search
Zezhou Huang, Jiaxiang Liu, Haonan Wang, Eugene Wu
CIDR 2024
-
JoinBoost: Grow Trees Over Normalized Data Using Only SQL
Zezhou Huang, Rathijit Sen, Jiaxiang Liu, Eugene Wu
VLDB 2023
-
Saibot: A Differentially Private Data Search Platform
Zezhou Huang, Jiaxiang Liu, Daniel Alabi, Raul Castro Fernandez, Eugene Wu
VLDB 2023
-
Pollock: A Data Loading Benchmark
Gerardo Vitagliano, Mazhar Hameed, Lan Jiang, Lucas Reisener, Eugene Wu, Felix Naumann
VLDB 2023
-
ConnectorX: Accelerating Data Loading From Databases to Dataframes
Xiaoying Wang, Weiyuan Wu, Jinze Wu, Yizhou Chen, Nick Zrymiak, Changbo Qu, Lampros Flokas, George Chow, Jiannan Wang, Tianzheng Wang, Eugene Wu, Qingqing Zhou
VLDB 2023
-
SmokedDuck Demonstration: SQLStepper
Haneen Mohammed, Charlie Summers, Sughosh Kaushik, Eugene Wu
SIGMOD Demo 2023
-
Teaching Data Science by Visualizing Data Table Transformations: Pandas Tutor for Python, Tidy Data Tutor for R, and SQL Tutor
Sam Lau, Sean Kross, Eugene Wu, Philip Guo
DataEd at SIGMOD 2023
-
Analysis Errors Over Semantic Layers and How To Avoid Them
Zezhou Huang, Pavan Kalyan, Eugene Wu
HILDA at SIGMOD 2023
-
DIG: The Data Interface Grammar
Yiru Chen, Jeffrey Tao, Eugene Wu
HILDA at SIGMOD 2023
-
Random Forests over Normalized Data in CPU-GPU DBMSes
Zezhou Huang, Pavan Kalyan, Rathijit Sen, Eugene Wu
DaMoN at SIGMOD 2023
-
OM3: An Ordered Multi-level Min-Max Representation for Interactive Progressive Visualization of Time Series
Yunhai Wang, Yuchun Wang, Xin Chen, Yue Zhao, Fan Zhang, Eugene Wu, Chi-Wing Fu, Xiaohui Yu
SIGMOD 2023
-
NL2INTERFACE: Interactive Visualization Interface Generation from Natural Language Queries
Yiru Chen, Ryan Li, Austin Mac, Tianbao Xie, Tao Yu, Eugene Wu
VIS nlviz workshop 2022
-
How Do Captions Affect Visualization Reading?
Shelly Cheng, Hazel Zhu, Eugene Wu
VIS Viscomm 2022
-
Extending the View Composition Algebra to Hierarchical Data
Eugene Wu
arXiv 2022
-
A Grammar for Hypothesis-Driven Visual Analysis
Ashley Suh, Yilan Jiang, Ab Mosca, Eugene Wu, Remco Chang
arXiv 2022
-
A Sensorless Drone-based System for Mapping Indoor 3D Airflow Gradients
Yanchen Liu, Minghui Zhao, Stephen Xia, Eugene Wu, Xiaofan Jiang
MobiSys 2022 Demo
-
How I Stopped Worrying About Training Data Bugs and Started Complaining
Lampros Flokas, Weiyuan Wu, Jiannan Wang, Nakul Verma, Eugene Wu
DEEM Workshop 2022
-
Interactive Interface Generation in Notebooks
Jeffrey Tao, Yiru Chen, Eugene Wu
SIGMOD 2022 demo
-
PI2: Generating Visual Analysis Interfaces From Queries
Yiru Chen, Eugene Wu
SIGMOD 2022
-
View Composition Algebra for Ad Hoc Comparisons
Eugene Wu
TVCG 2022
-
Reptile: Aggregation-level Explanations for Hierarchical Data
Zachary Huang, Eugene Wu
SIGMOD 2022
-
A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More
Iddo Drori, Sunny Tran, Roman Wang, Newman Cheng, Kevin Liu, Leonard Tang, Elizabeth Ke, Nikhil Singh, Taylor L. Patti, Jayson Lynch, Avi Shporer, Nakul Verma, Eugene Wu, Gilbert Strang
PNAS 2022
-
Complaint-Driven Training Data Debugging at Interactive Speeds
Lampros Flokas, Young Wu, Jiannan Wang, Nakul Verma, Eugene Wu
SIGMOD 2022
-
Dynamic Breakpoints for Y-axis Scales
Jacob Fisher, Remco Chang, Eugene Wu
InfoVIS 2021 (short paper)
-
Enabling SQL-based Training Data Debugging for Federated Learning
Young Wu, Yejia Liu, Lampros Flokas, Jiannan Wang, Eugene Wu
VLDB 2022
-
Explaining SQL-ML Queries with Bayesian Optimization
Brandon Lockhart, Jiannan Wang, Eugene Wu
VLDB 2021
-
DIEL: Interactive Visualization Beyond the Here and Now
Yifan Wu, Remco Chang, Joseph Hellerstein, Arvind Satyanarayan, Eugene Wu
VIS 2021
-
PopFactor: Live-Streamer Behavior and Popularity
Robert Netzorg, Lauren Arnett, Augustin Chaintreau, Eugene Wu
ICWSM 2021
-
Impact of Cognitive Biases on Progressive Visualization
Marianne Procopio, Ab Mosca, Carlos Scheidegger, Eugene Wu, Remco Chang
TVCG 2021
-
From Cleaning Before ML to Cleaning For ML
Felix Neutatz, Binger Chen, Ziawasch Abedjan, Eugene Wu
Invited, IEEE Data Engineering Bulletin 2021
-
Facilitating Exploration with Interaction Snapshots under High Latency
Yifan Wu, Remco Chang, Joe Hellerstein, Eugene Wu
InfoVIS (short paper) 2020
-
ActiveDeeper: A Model-based Active Data Enrichment system
Liang Zhao, Qingcan Li, Pei Wang, Jiannan Wang, Eugene Wu
VLDB 2020 demo
-
Continuous Prefetch for Interactive Data Applications
Haneen Mohammed, Ziyun Wei, Ravi Netravali, Eugene Wu
VLDB 2020
-
Complaint-driven Training Data Debugging for Query 2.0
Young Wu, Lampros Flokas, Jiannan Wang, Eugene Wu
SIGMOD 2020
-
Physical Visualization Design
Lana Ramjit, Zhaoning Kong, Ravi Netravali, Eugene Wu
SIGMOD (demo) 2020
-
Towards Complaint-driven ML Workflow Debugging
Lampros Flokas, Young Wu, Jiannan Wang, Eugene Wu
MLOps 2020
-
Monte Carlo Tree Search for Generating Interactive Data Analysis Interfaces
Yiru Chen, Eugene Wu
Intelligent Process Automation (IPA) 2020
-
Acorn: Aggressive Result Caching in Spark SQL
Lana Ramjit, Matteo Interlandi, Eugene Wu, Ravi Netravali
SOCC 2019
-
AlphaClean: Automatic Generation of Data Cleaning Pipelines
Sanjay Krishnan, Eugene Wu
arXiv 2019
-
Towards Democratizing Relational Data Visualization
Nan Tang, Eugene Wu, Guoliang Li
SIGMOD 2019 Tutorial
-
Precision Interfaces
Qianrui Zhang, Haoci Zhang, Viraj Rai, Thibault Sellam, Eugene Wu
SIGMOD 2019
-
Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment
Pei Wang, Jiannan Wang, Ryan Shea, Eugene Wu
SIGMOD 2019
-
Cross-platform Interactions and Popularity in the Live-streaming Community
Lauren Arnett, Robert Netzorg, Augustin Chaintreau, Eugene Wu
CHI Latebreaking 2019
-
DeepBase: Deep Inspection of Neural Networks
Thibault Sellam, Kevin Lin, Ian Yiran Huang, Michelle Yang, Carl Vondrick, Eugene Wu
SIGMOD 2019
-
Deep Neural Inspection Using DeepBase
Yiru Chen, Yiliang Shi, Boyuan Chen, Thibault Sellam, Carl Vondrick, Eugene Wu
LearnSys 2018 Workshop at NIPS
-
CIDR2: Crazier Innovations in Databases JOIN Reinforcement-learning Research
Eugene Wu
CIDR 2019 Abstract
-
Ten Years of Web Tables
Michael Cafarella, Alon Halevy, Daisy Zhe Wang, Hongrae Lee, Jayant Madhavan, Cong Yu, Eugene Wu
PVLDB 2018 Invited Paper
-
At a Glance: Approximate Entropy as a Measure of Line Chart Visualization Complexity
Gabriel Ryan, Abigail Mosca, Remco Chang, Eugene Wu
InfoVIS 2018
-
Provenance in Interactive Visualizations
Fotis Psallidas, Eugene Wu
HILDA 2018
-
Leveraging Quality Prediction Models for Automatic Writing Feedback
Hamed Nilforoshan, Eugene Wu
ICWSM 2018
-
Precision Interfaces for Different Modalities
Haoci Zhang, Viraj Rai, Thibault Sellam, Eugene Wu
SIGMOD (demo) 2018
-
Demonstration of Smoke: A Deep Breath of Data-Intensive Lineage Applications
Fotis Psallidas, Eugene Wu
SIGMOD (demo) 2018
-
Deeper: A Data Enrichment System Powered by Deep Web
Pei Wang, Yongjun He, Ryan Shea, Jiannan Wang, Eugene Wu
SIGMOD (demo) 2018
-
"I Like the Way You Think!" Inspecting the Internal Logic of Recurrent Neural Networks
Thibault Sellam, Kevin Lin, Ian Yiran Huang, Carl Vondrick, Eugene Wu
SysML 2018
-
A "Probabilistic" Model of Research
Eugene Wu
Blog Post 2018
-
Smoke: Fine-grained Lineage at Interactive Speeds
Fotis Psallidas, Eugene Wu
VLDB 2018
-
Mining Precision Interfaces From Query Logs
Haoci Zhang, Thibault Sellam, Eugene Wu
Tech Report 2017
-
BoostClean: Automated Error Detection and Repair for Machine Learning
Sanjay Krishnan, Michael J. Franklin, Ken Goldberg, Eugene Wu
Tech Report 2017
-
Load-n-Go: Fast Approximate Join Visualizations That Improve Over Time
Marianne Procopio, Carlos Scheidegger, Eugene Wu, Remco Chang
DSIA 2017
-
Approximate Entropy as a Measure of Line Chart Complexity
Gabriel Ryan, Abigail Mosca, Eugene Wu, Remco Chang
InfoVIS Poster 2017
-
Towards a Bayesian Model of Data Visualization Cognition
Yifan Wu, Larry Xu, Remco Chang, Eugene Wu
DECISIVE 2017
-
PreCog: Improving Crowdsourced Data Quality Before Acquisition
Hamed Nilforoshan, Jiannan Wang, Eugene Wu
arXiv 2017
-
Precision Interfaces
Haoci Zhang, Thibault Sellam, Eugene Wu
HILDA 2017
-
PALM: Machine Learning Explanations For Iterative Debugging
Sanjay Krishnan, Eugene Wu
HILDA 2017
-
Segment-Predict-Explain for Automatic Writing Feedback
Hamed Nilforoshan, James Sands, Kevin Lin, Rahul Khanna, Eugene Wu
Collective Intelligence 2017
-
Dialectic: Enhancing Text Input Fields with Automatic Feedback to Improve Social Content Writing Quality
Hamed Nilforoshan, James Sands, Kevin Lin, Rahul Khanna, Eugene Wu
arXiv 2017
-
Skipping-oriented Partitioning for Columnar Layouts
Liwen Sun, Michael J. Franklin, Jiannan Wang, Eugene Wu
VLDB 2017
-
Combining Design and Performance in a Data Visualization Management System
Eugene Wu, Fotis Psallidas, Zhengjie Miao, Haoci Zhang, Laura Rettig, Yifan Wu, Thibault Sellam
CIDR 2017
-
CIDR: Chat-oriented Innovations in Database Research
Eugene Wu
CIDR 2017 Abstract
-
QFix: Diagnosing Errors Through Query Histories
Xiaolan Wang, Alexandra Meliou, Eugene Wu
SIGMOD 2017
-
A DeVIL-ish Approach to Inconsistency in Interactive Visualizations
Yifan Wu, Joe Hellerstein, Eugene Wu
HILDA 2016
-
PFunk-H: Approximate Query Processing using Perceptual Models
Daniel Alabi, Eugene Wu
HILDA 2016
-
Towards Reliable Interactive Data Cleaning: A User Survey and Recommendations
Sanjay Krishnan, Daniel Haas, Michael J. Franklin, Eugene Wu
HILDA 2016
-
TrendQuery: A System for Interactive Exploration of Trends
Niranjan Kamat, Eugene Wu, Arnab Nandi
HILDA 2016
-
ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning
Sanjay Krishnan, Michael Franklin, Ken Goldberg, Jiannan Wang, Eugene Wu
SIGMOD 2016 Demo
(Demo Award Winner!)
-
Graphical Perception in Animated Bar Charts
Eugene Wu, Lilong Jiang, Larry Xu, Arnab Nandi
arXiv 2016
-
QFix: Demonstrating Error Diagnosis in Query Histories
Xiaolan Wang, Alexandra Meliou, Eugene Wu
SIGMOD 2016 Demo
-
QFix: Diagnosing Errors Through Query Histories
Xiaolan Wang, Alexandra Meliou, Eugene Wu
arXiv 2016
-
ActiveClean: Interactive Data Cleaning While Learning Convex Loss Models
Sanjay Krishnan, Jiannan Wang, Eugene Wu, Michael J. Franklin, Ken Goldberg
arXiv 2016
-
Towards Perception-aware Interactive Data Visualization Systems
Eugene Wu, Arnab Nandi
DSIA 2015
-
SampleClean: Fast and Reliable Analytics on Dirty Data (overview paper)
Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, Tim Kraska, Tova Milo, Eugene Wu
IEEE Data Eng. Bulletin 2015
-
CLAMShell: Speeding up Crowds for Low-latency Data Labeling
Daniel Haas, Jiannan Wang, Eugene Wu, Michael J. Franklin
VLDB 2016
-
Automated Metadata Construction to Support Portable Building Applications
Arka A. Bhattacharya, Dezhi Hong, David Culler, Jorge Ortiz, Kamin Whitehouse, Eugene Wu
BuildSys 2015
-
Wisteria: Nurturing Scalable Data Cleaning Infrastructure
Daniel Haas, Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Eugene Wu
VLDB 2015 demo
-
Collaborative Data Analytics with Datahub
Anant Bhardwaj, Amol Deshpande, Aaron Elmore, David Karger, Sam Madden, Aditya Parameswaran, Harihar Subramanyam, Eugene Wu, Rebecca Zhang
VLDB 2015 demo
-
Indexing Cost Sensitive Prediction
Leilani Battle, Edward Benson, Aditya Parameswaran, Eugene Wu
Technical Report 2016
-
Explaining Data in Visual Analytic Systems
Eugene Wu
Doctoral Thesis 2015
-
The Case for Data Visualization Management Systems
Eugene Wu, Leilani Battle, Samuel Madden
VLDB 2014
-
Vertexica: Your Relational Friend for Graph Analytics!
Alekh Jindal, Praynaa Rawlani, Eugene Wu, Samuel Madden, Amol Deshpande, Mike Stonebraker
SIGMOD 2014 demo
-
Data In Context: Aiding News Consumers while Taming Dataspaces
Eugene Wu, Adam Marcus, Sam Madden
DBCrowd 2013
-
Mobile Applications Need Targeted Micro-updates
Alvin Cheung, Lenin Ravindranath, Eugene Wu, Samuel Madden, Hari Balakrishnan
APSYS 2013
-
Scorpion: Explaining Away Outliers in Aggregate Queries
Eugene Wu, Samuel Madden
VLDB 2013 (Best-of)
-
SubZero: a Fine-Grained Lineage System for Scientific Databases
Eugene Wu, Samuel Madden, Michael Stonebraker
ICDE 2013 (Best-of)
-
A Demonstration of DBWipes: Clean as You Query
Eugene Wu, Samuel Madden, Michael Stonebraker
VLDB 2012
-
Human-powered Sorts and Joins
Adam Marcus, Eugene Wu, David Karger, Samuel Madden, Robert Miller
VLDB 2012
-
Partitioning Techniques for Fine-Grained Indexing
Eugene Wu, Sam Madden
ICDE 2011
-
Demonstration of Qurk: A Query Processor for Human Operators
Adam Marcus, Eugene Wu, David Karger, Samuel Madden, Robert Miller
SIGMOD 2011
-
No Bits Left Behind
Eugene Wu, Carlo Curino, Sam Madden
CIDR 2011
-
Crowdsourced Databases: Query Processing with People
Adam Marcus, Eugene Wu, Sam Madden, Robert Miller
CIDR 2011
-
Relational Cloud: A Database-as-a-Service for the Cloud
Carlo Curino, Evan Jones, Raluca Popa, Nirmesh Malviya, Eugene Wu, Sam Madden, Hari Balakrishnan, Nickolai Zeldovich
CIDR 2011
-
Relational Cloud: The Case for a Database Service
Carlo Curino, Evan Jones, Yang Zhang, Eugene Wu, Sam Madden
MIT Tech Report 2010
-
TrajStore: An Adaptive Storage System for Very Large Trajectory Data Sets
Philippe Cudre-Mauroux, Eugene Wu, Sam Madden
ICDE 2010
-
WebTables: Exploring the Power of Tables on the Web
Michael Cafarella, Alon Halevy, Daisy Wang, Eugene Wu, Yang Zhang
VLDB 2008
-
SASE: Complex Event Processing over Streams (Demo)
Daniel Gyllstrom, Eugene Wu, Hee-Jin Chae, Yanlei Diao, Patrick Stahlberg, Gordon Anderson
CIDR 2007
-
High-performance Complex Event Processing over Streams
Eugene Wu, Yanlei Diao, Shariq Rizvi
SIGMOD 2006
-
SASE: Complex Event Processing over Streams
Daniel Gyllstrom, Eugene Wu, Hee-Jin Chae, Yanlei Diao, Patrick Stahlberg, Gordon Anderson
CoRR 2006
-
Probabilistic Data Management for Pervasive Computing: The Data Furnace Project
Minos N. Garofalakis, Kurt P. Brown, Michael J. Franklin, Joseph M. Hellerstein, Daisy Zhe Wang, Eirinaios Michelakis, Liviu Tancau, Eugene Wu, Shawn R. Jeffery, Ryan Aipperspach
IEEE Data Eng. Bulletin 2006
-
Design Considerations for High Fan-In Systems: The HiFi Approach
Michael J. Franklin, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu, Owen Cooper, Anil Edakkunni, Wei Hong
CIDR 2005
-
HiFi: A Unified Architecture for High Fan-in Systems
Owen Cooper, Anil Edakkunni, Michael J. Franklin, Wei Hong, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu
VLDB 2004 Demo