Bio Notes
Paolo Papotti received his Ph.D. from the University of Roma Tre (Italy) in 2007 and has been an associate professor in the Data Science department at EURECOM (France) since 2017. Before joining EURECOM, he was a scientist in the data analytics group at QCRI (Qatar) and an assistant professor at Arizona State University (USA). His research is in the broad areas of scalable data management and information quality, with a focus on data integration and computational claim verification.
News
- 10/2023 - Talk on 'SQL and Large Language Models: A Marriage Made in Heaven?' at the SAP Predictive Summit and the TUM database group.
- 9/2023 - Our vision paper 'Querying Large Language Models with SQL' has been accepted at EDBT 2024 (pdf) (code) (blog post).
- 9/2023 - 'QATCH: Benchmarking Table Representation Learning Models on Your Data' has been accepted at NeurIPS 2023 in the Datasets and Benchmarks Track (code).
- 8/2023 - Distinguished reviewer award at VLDB 2023.
- 8/2023 - We present 'Maximizing Neutrality in News Ordering' at KDD 2023 (video).
- 7/2023 - Talk on 'Querying Large Language Models with SQL' at the Systems group in TU Darmstadt.
- 7/2023 - Course on 'Computational Methods to Counter Online Misinformation' at the 4th ACM Europe Summer School on Data Science.
- 7/2023 - Talk on 'Explainable Fact Checking with Structured Data' at the Athena Research Center.
- 6/2023 - Presented the research paper 'Exploratory Training: When Annotators Learn About Data' and the tutorial 'Models and Practice of Neural Table Representations' (slides) (video) at SIGMOD 2023.
Recent Activities
- PC Co-Chair: Integrity 2023
- Demo Co-Chair: SIGMOD 2023
- Associate Editor: SIGMOD (2025), VLDBJ (since 2023)
- PC Member: SIGMOD (2024, 2023), VLDB (2024, 2023), EDBT (2024), ACL (2023), QDB (2023), SEBD (2023), BDA (2023), TaDA@VLDB (2023), TRL@NeurIPS (2023)
Selected Publications
Data Cleaning
- R. Cappuzzo, P. Papotti, S. Thirumuruganathan
Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks.
In SIGMOD, 2020. (.pdf) (code) (video)
- S. Ortona, V. Meduri, P. Papotti
Robust Discovery of Positive and Negative Rules in Knowledge-Bases.
In ICDE, 2018. (Tech. Report) (code) (.pdf)
- R. Singh, V. Meduri, A. Elmagarmid, S. Madden, P. Papotti, J. Quiane, N. Tang, A. Solar
Synthesizing Entity Matching Rules by Examples.
PVLDB, 2016. (.pdf)
- E. Veltri, D. Santoro, G. Mecca, P. Papotti, J. He, G. Li, N. Tang
Interactive and Deterministic Data Cleaning.
In SIGMOD, 2016. (.pdf)
- Z. Abedjan, X. Chu, D. Deng, R. Fernandez, I. Ilyas, M. Ouzzani, P. Papotti, M. Stonebraker, N. Tang
Detecting Data Errors: Where are we and what needs to be done?
PVLDB, 2016. (.pdf)
- F. Geerts, G. Mecca, P. Papotti, D. Santoro.
The LLUNATIC Data-Cleaning Framework.
PVLDB, 2013. (.pdf) (code)
- X. Chu, I. Ilyas, P. Papotti
Discovering Denial Constraints.
PVLDB, 2013. (.pdf)
Computational Fact Checking
- M. Saeed et al.
Crowdsourced Fact-Checking at Twitter: How Does the Crowd Compare With Experts?
CIKM, 2022. (.pdf)
- M. Mori et al.
Neural Machine Translation for Fact-Checking Temporal Claims.
FEVER, 2022. (.pdf)
- M. Saeed et al.
RuleBERT: Teaching Soft Rules to Pre-Trained Language Models.
EMNLP, 2021. (.pdf) (code)
- P. Nakov et al.
Automated Fact-Checking for Assisting Human Fact-Checkers.
IJCAI, 2021. (.pdf)
- G. Karagiannis, M. Saeed, P. Papotti, I. Trummer.
Scrutinizer: a mixed-initiative approach to large-scale, data-driven claim verification.
PVLDB, 2020. (.pdf) (code) (video)
- P. Huynh, P. Papotti.
A Benchmark for Fact Checking Algorithms Built on Knowledge Bases.
CIKM, 2019. (.pdf) (code)
- N. Ahmadi, J. Lee, P. Papotti, M. Saeed.
Explainable Fact Checking with Probabilistic Answer Set Programming.
Conference for Truth and Trust Online (TTO), 2019. (.pdf) (code)
Table Representation Learning
- G. Badaro, M. Saeed, P. Papotti
Transformers for Tabular Data Representation: A Survey of Models and Applications.
In Transactions of the ACL (TACL), 2023. (.pdf)
- M. Saeed, P. Papotti
You are my type! Type embeddings for pre-trained language models.
In EMNLP (Findings), 2022. (.pdf) (code)
- E. Veltri, G. Badaro, M. Saeed, P. Papotti
Data Ambiguity Profiling for the Generation of Training Examples.
In ICDE, 2023. (.pdf) (code)
- G. Badaro, P. Papotti.
Transformers for Tabular Data Representation: Models and Applications.
VLDB (Tutorial), 2022. (.pdf) (slides)
- E. Veltri, D. Santoro, G. Badaro, M. Saeed, P. Papotti
Pythia: Unsupervised Generation of Ambiguous Textual Claims from Relational Data.
In SIGMOD (demo), 2022. (.pdf) (code)
- N. Ahmadi, A. Sand, P. Papotti.
Unsupervised Matching of Data and Text.
ICDE, 2022. (.pdf) (code)
Data Exchange
- P. Atzeni, L. Bellomarini, P. Papotti, R. Torlone.
Meta-Mappings for Schema Mapping Reuse.
PVLDB, 2019. (.pdf)
- B. Marnette, G. Mecca, P. Papotti.
Scalable Data Exchange with Functional Dependencies.
PVLDB, 2010. (.pdf) (.ppt) (code)
- G. Mecca, P. Papotti, S. Raunich.
Core Schema Mappings.
In SIGMOD Conference, 2009. (.pdf) (.ppt) (tech. report) (code)
- M.A. Hernandez, P. Papotti, W.C. Tan.
Data Exchange with Data-Metadata Translations.
In VLDB Conference, 2008. (.pdf) (.ppt)
- A. Raffio, D. Braga, S. Ceri, P. Papotti, M.A. Hernandez.
Clip: a Visual Language for Explicit Schema Mappings.
In ICDE Conference, 2008. (.pdf)
- A. Fuxman, M.A. Hernandez, H. Ho, R.J. Miller, P. Papotti, L. Popa.
Nested Mappings: Schema Mapping Reloaded.
In VLDB Conference, 2006. (.pdf) (.ppt)
Web Data Extraction and Integration
- M. Bronzi, V. Crescenzi, P. Merialdo, P. Papotti.
Extraction and Integration of Partially Overlapping Web Sources.
PVLDB, 2013. (.pdf)
- L. Blanco, V. Crescenzi, P. Merialdo, P. Papotti.
Probabilistic Models to Reconcile Complex Data from Inaccurate Data Sources.
In CAiSE Conference, 2010. (.pdf)
Schema Exchange
- P. Papotti and R. Torlone.
Schema exchange: Generic mappings for transforming data and metadata.
In Data & Knowledge Engineering, 2009. (.pdf)
- P. Papotti and R. Torlone.
Automatic Generation of Model Translations.
In CAiSE Conference, 2007. (.pdf)