Conference
Corallo, Giulio; Papotti, Paolo
FINCH: Prompt-guided key-value cache compression for large language models
EMNLP 2024, Conference on Empirical Methods in Natural Language Processing, 12-16 November 2024, Miami, Florida, USA / Also published in Transactions of the Association for Computational Linguistics, Vol. 12, 2024