Digital preservation with synthetic DNA

Marinelli, Eugenio; Ghabach, Eddy; Bolbroe, Thomas; Sella, Omer; Heinis, Thomas; Appuswamy, Raja
Chapter 5 of "Transactions on Large-Scale Data- and Knowledge-Centered Systems LI", Part of the Lecture Notes in Computer Science book series, TLDKS, Vol. 13410, 2022

The growing adoption of AI and data analytics across sectors has made digital preservation a cross-sectoral problem that affects everyone from data-driven enterprises to memory institutions. As all contemporary storage media suffer from fundamental density and durability limitations, researchers have started investigating new media that can offer high-density, long-term preservation of digital data. Synthetic deoxyribonucleic acid (DNA) is one such medium that has received a lot of attention recently. In this paper, we provide an overview of the ongoing collaboration between OligoArchive, a European Union-funded Future and Emerging Technologies project, and the Danish National Archive in preserving culturally important digital data with synthetic DNA. In doing so, we highlight the challenges involved in using DNA for long-term preservation, and present a holistic data storage pipeline that brings together several novel techniques (standardized file storage, motif-based DNA encoding, and scalable read consensus, to name a few) to provide reliable, passive, obsolescence-free digital preservation using synthetic DNA.


Type: Book
Date: 2022-10-08
Department: Data Science
Eurecom Ref: 7088
Copyright: © Springer. Personal use of this material is permitted. The definitive version of this paper was published in Chapter 5 of "Transactions on Large-Scale Data- and Knowledge-Centered Systems LI", Part of the Lecture Notes in Computer Science book series, TLDKS, Vol. 13410, 2022 and is available at: https://doi.org/10.1007/978-3-662-66111-6_5

PERMALINK: https://www.eurecom.fr/publication/7088