Motif-based approaches for scaling read/write cost in DNA storage

Appuswamy, Raja

DBDS 2025, Invited Talk in New Trends in DNA-Based Data Storage Conference, 3-6 June 2025, Prague, Czech Republic

The surge in demand for cost-effective long-term archival media, coupled with the density limitations of contemporary magnetic media, has resulted in synthetic DNA emerging as a promising new alternative. Despite its benefits, storing data in DNA poses several challenges, because the technology used for writing data to DNA is very expensive and the technology used for reading it is error-prone. Thus, it is important to design pipelines that can efficiently use redundancy to mask errors without amplifying read/write cost. In this talk, we will present the benefits of using motifs, which are short sequences of nucleotides, as the fundamental building blocks for encoding/decoding digital data to/from DNA, instead of the traditional approach of using individual nucleotides. We first present CMOSS [1], a reliable, motif-based Columnar Molecular Storage System that focuses on reducing the read cost of DNA data storage. CMOSS differs from state-of-the-art approaches on three fronts. First, it uses a motif-based, columnar layout for designing oligos, in contrast to a nucleotide-based row layout. Second, thanks to the use of motifs, it performs integrated consensus calling and decoding to effectively handle reliability bias in DNA storage. Third, it provides a flexible, block-based data organization for random access over DNA storage, in contrast to the object-based organization used by state-of-the-art systems. Using results from large-scale wet-lab experiments, we demonstrate the benefit of CMOSS’ motif-based design in reducing read cost. Next, we focus on the benefit of using motifs for reducing the write cost of DNA storage. In particular, we will present composite motifs [2], a framework that uses mixtures of prefabricated motif sequences, assembled as the building blocks of synthesis, to reduce write cost by scaling logical density. We will provide an overview of synthesis and sequencing techniques, consensus calling methods, and encoding/decoding algorithms customized to the composite motifs framework. Using these tools, we will present results from wet-lab experiments that show how the use of motifs as building blocks can provide an order of magnitude reduction in write cost as well.
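
To make "scaling logical density" concrete, the short sketch below compares the information carried by one nucleotide position with that of one composite symbol. It rests on an illustrative assumption that is not taken from the abstract or from [2]: each composite symbol is modelled as an unordered set of k distinct motifs drawn from a prefabricated library of n motifs. The function name and the library sizes are hypothetical.

```python
from math import comb, log2


def bits_per_symbol(library_size: int, motifs_per_symbol: int) -> float:
    """Bits carried by one composite symbol.

    Assumption (illustrative only): a symbol is an unordered set of
    `motifs_per_symbol` distinct motifs chosen from `library_size` motifs.
    """
    return log2(comb(library_size, motifs_per_symbol))


# A single A/C/G/T position carries at most log2(4) = 2 bits.
NUCLEOTIDE_BITS = log2(4)

if __name__ == "__main__":
    # Hypothetical library sizes, chosen only to show the trend.
    for n, k in [(8, 4), (16, 8), (32, 16)]:
        bits = bits_per_symbol(n, k)
        print(f"library={n:2d} subset={k:2d} -> {bits:.1f} bits per symbol")
```

Under this assumption, the bits written per synthesis step grow with the library size, which is the sense in which a motif-based alphabet can carry more data per write operation than a four-letter nucleotide alphabet.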

Detail

Type: Talk
City: Prague
Date: 2025-06-03
Department: Data Science
Eurecom Ref: 8250