« How Much Information Can Be Stored Using Very Short Molecules ?»

Ran Tamir -
Communication systems

Date: -
Location: Eurecom

Abstract: From an information-theoretic point of view, the commonly adopted DNA storage channel, the shuffling-sampling channel, has two distinct operational regimes. If the stored molecules are relatively long, so that the number of molecule types is larger than the number of molecules that store the message, the channel capacity is positive. Alternatively, if the molecules are relatively short and so that the number of molecule types is smaller than the number of molecules that store the message, then the resulting storage system is characterized by capacity zero. In this short molecule regime, Shomorony and Heckel (2022) put forward a conjecture on the scaling of the number of information bits that can be reliably stored in the noiseless case. This conjecture was only partially proved by Gerzon, Weinberger, and Shomorony (2025). In the first part of this talk, I will describe a random-coding scheme which completes the proof of the conjecture. Since this random-coding scheme is computationally heavy, the second part of this talk is devoted to an alternative coding scheme which operates at a significantly lower computational complexity but achieves the optimal scaling, except for a specific range of very short molecules. Short bio: Ran Tamir is a post-doctoral fellow in the Department of Signal Theory and Communications, Universitat Politècnica de Catalunya. Previously, from 2021 to 2023 he was a postdoctoral fellow at ETH Zurich, and during 2024 he was a postdoctoral fellow at EPFL. He holds B.Sc. (2010), M.Sc. (2017), and Ph.D. (2021) degrees, all in Electrical Engineering, from the Faculty of Electrical and Computer Engineering, Technion - Israel Institute of Technology. His main research interests lie between information theory and mathematical data sciences.