| Data source selection for information integration in big data era Y Lin, H Wang, J Li, H Gao Information Sciences 479, 197-213, 2019 | 50 | 2019 |
| SPADE: Synthesizing data quality assertions for large language model pipelines S Shankar, H Li, P Asawa, M Hulsebos, Y Lin, JD Zamfirescu-Pereira, ... arXiv preprint arXiv:2401.03038, 2024 | 48 | 2024 |
| Querying templatized document collections with large language models Y Lin, M Hulsebos, R Ma, S Shankar, S Zeighami, AG Parameswaran, ... 2025 IEEE 41st International Conference on Data Engineering (ICDE), 2422-2435, 2025 | 43* | 2025 |
| LOCATER: cleaning WiFi connectivity datasets for semantic localization Y Lin, D Jiang, R Yus, G Bouloukakis, A Chio, S Mehrotra, ... arXiv preprint arXiv:2004.09676, 2020 | 39 | 2020 |
| Efficient quality-driven source selection from massive data sources Y Lin, H Wang, S Zhang, J Li, H Gao Journal of Systems and Software 118, 221-233, 2016 | 21 | 2016 |
| Efficient entity resolution on heterogeneous records Y Lin, H Wang, J Li, H Gao IEEE Transactions on Knowledge and Data Engineering 32 (5), 912-926, 2019 | 14 | 2019 |
| Auto-BI: Automatically build BI-models leveraging local join prediction and global schema graph Y Lin, Y He, S Chaudhuri arXiv preprint arXiv:2306.12515, 2023 | 12 | 2023 |
| TWIX: Automatically reconstructing structured data from templatized documents Y Lin, M Hasan, R Kosalge, A Cheung, AG Parameswaran arXiv preprint arXiv:2501.06659, 2025 | 8 | 2025 |
| LLM-Powered Proactive Data Systems S Zeighami, Y Lin, S Shankar, A Parameswaran arXiv preprint arXiv:2502.13016, 2025 | 6 | 2025 |
| T-COVE: an exposure tracing system based on cleaning Wi-Fi events on organizational premises Y Lin, P Khargonekar, S Mehrotra, N Venkatasubramanian Proceedings of the VLDB Endowment 14 (12), 2783-2786, 2021 | 5 | 2021 |
| ZIP: Lazy Imputation during Query Processing Y Lin, S Mehrotra Proceedings of the VLDB Endowment 17 (1), 28-40, 2023 | 4 | 2023 |
| PLAQUE: Automated Predicate Learning at Query Time Y Lin, S Mehrotra Proceedings of the ACM on Management of Data 2 (1), 1-25, 2024 | 3 | 2024 |
| SemIoTic: Bridging the semantic gap in IoT spaces S Almanee, G Bouloukakis, D Jiang, S Ghayyur, D Ghosh, P Gupta, Y Lin, ... Proceedings of the 6th ACM International Conference on Systems for Energy …, 2019 | 3 | 2019 |
| Privacy cognizant IoT environment for the Brandeis program S Mehrotra, N Venkatasubramanian, A Kobas, C Davison, S Sharma, ... | 1 | 2022 |
| Water-COLOR: Water-COnservation using a Learning-based Optimized Recommender GX Zhang, DL Feldman, Y Lin, S Mehrotra, N Venkatasubramanian, ... 2024 IEEE International Conference on Smart Computing (SMARTCOMP), 93-100, 2024 | | 2024 |
| Optimizing Query Processing for Data-Intensive Computation Y Lin University of California, Irvine, 2023 | | 2023 |
| QUIP: Query-driven Missing Value Imputation Y Lin, S Mehrotra arXiv preprint arXiv:2204.00108, 2022 | | 2022 |