DATA Talk: Thursday, June 27th 2019 - 11:00am

Renée J. Miller - University Distinguished Professor of Computer Science at Northeastern University
Data Science

Date: June 27th 2019
Location: Eurecom

Title: "Data Discovery in Data Lakes"

Abstract: The ubiquity of data lakes has created fascinating new challenges for data management research. In data lakes, the main challenge is not in integrating known data; rather, it is in finding the right data to solve a given problem. Traditionally, data integration is done using a framework called query discovery, where the main task is to discover a query (or transformation) that translates data from one form into another. The goal is to find the right operators to join, nest, group, link, and twist data into a desired form. We introduce a new paradigm for thinking about integration where the focus is on data discovery: finding the right data for a task. We consider highly efficient, massively scalable discovery that is driven by data analysis needs. We describe a research agenda and recent progress in developing scalable data-aware discovery algorithms that provide high recall and accuracy over massive data lakes.

Biography: Renée J. Miller is a University Distinguished Professor of Computer Science at Northeastern University. She is a Fellow of the Royal Society of Canada, Canada’s National Academy of Science, Engineering and the Humanities. She received the US Presidential Early Career Award for Scientists and Engineers (PECASE), the highest honor bestowed by the United States government on outstanding scientists and engineers beginning their careers. She received an NSF CAREER Award, the Ontario Premier’s Research Excellence Award, and an IBM Faculty Award. She formerly held the Bell Canada Chair of Information Systems at the University of Toronto and is a Fellow of the ACM. Her work has focused on the long-standing open problem of data integration and has achieved the goal of building practical data integration systems. She and her co-authors (Fagin, Kolaitis and Popa) received the (10 Year) ICDT Test-of-Time Award for their influential 2003 paper establishing the foundations of data exchange.
Professor Miller has received the VLDB Women In Database Research Award and was elected president of the non-profit Very Large Data Base Foundation. She received her PhD in Computer Science from the University of Wisconsin, Madison and bachelor’s degrees in Mathematics and Cognitive Science from MIT.

