Long and Multi-Document Abstractive Summarization in Low-Resource Regimes

Luca Ragazzi
Data Science

Date: -
Location: Eurecom

Abstract: Automatic text summarization (ATS) is one of the most important tasks in natural language processing, especially in domains where documents are lengthy, numerous, and complex to analyze (e.g., legal cases and scientific articles). Thanks to recent advances in neural architectures, new transformer-based solutions have been proposed for this task. Nevertheless, the shortage of labeled data and limited computational memory (a real-world low-resource regime) degrade model performance. We provide background on ATS and show how generative language models can effectively and efficiently address two critical ATS tasks: long document summarization, which condenses a single long text, and multi-document summarization, which condenses many topic-related documents.

Bio: Luca Ragazzi received his B.S. and M.S. degrees (with honors) in computer science and engineering. He is a third-year Ph.D. student at the Department of Computer Science and Engineering, University of Bologna, Italy. His expertise is in natural language processing (NLP) and natural language understanding (NLU), with a focus on single- and multi-document summarization in low-resource regimes using state-of-the-art language models. He has presented several original papers at top-tier international conferences, such as AAAI and ACL.