Direct high-throughput deconvolution of non-canonical bases via nanopore sequencing and bootstrapped learning

Title:
Direct high-throughput deconvolution of non-canonical bases via nanopore sequencing and bootstrapped learning
Journal Title:
Nature Communications
Publication Date:
30 July 2025
Citation:
Perez, M., Kimoto, M., Rajakumar, P., Suphavilai, C., Peres da Silva, R., Tan, H. P., Ong, N. T. X., Nicholas, H., Hirao, I., Chew, W. L., & Nagarajan, N. (2025). Direct high-throughput deconvolution of non-canonical bases via nanopore sequencing and bootstrapped learning. Nature Communications, 16(1). https://doi.org/10.1038/s41467-025-62347-z
Abstract:
The discovery of non-canonical bases (NCBs) and development of synthetic xeno-nucleic acids (XNAs) has spawned interest in many applications in viral genomics, synthetic biology and DNA storage. However, inability to do high-throughput sequencing of NCBs has been a significant limitation. We demonstrate that XNAs with NCBs can be robustly sequenced on a MinION system (>2.3×10⁶ reads/flowcell) to obtain significantly distinct signals from controls (median fold-change >6×). To enable AI-model training, we synthesized and sequenced a complex pool of 1,024 NCB-containing oligonucleotides with varied 6-mer contexts and high purity (>90%). Bootstrapped models assisted in data preparation, and data augmentation with spliced reads provided high context diversity, enabling learning of generalizable models to decipher NCB-containing sequences with high accuracy (>80%) and specificity (99%). These results highlight the versatility of nanopore sequencing for interrogating unusual nucleic acids, and the potential to transform the study of genetic material beyond those that use canonical bases.
License type:
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Funding Info:
This research/project is supported by the Ministry of Health (MOH) - Joint Strategic Open Grant Call
Grant Reference no.: PREPARE-OC-ETM-Dx-2023-005

This research/project is supported by the Ministry of Health (MOH) - Joint Strategic Open Grant Call
Grant Reference no.: PREPARE-OC-ETM-Dx-2023-006

This research/project is supported by the Agency for Science, Technology and Research - Advanced Manufacturing and Engineering Programmatic grant
Grant Reference no.: A18A9b0060

This work was funded by the Advanced Manufacturing and Engineering (AME) Programmatic grant A18A9b0060 (to I.H., W.L.C., N.N.). Additional funding supporting manpower came from the Singapore Ministry of Health’s National Medical Research Council under its Open Fund – Individual Research Grants (NMRC/OFIRG/MOH-000649-00 to N.N.), the Agency for Science, Technology and Research (A*STAR)’s Manufacturing, Trade and Connectivity (MTC) Individual Research Grant (M24N7c0088 to W.L.C.), and the Programme for Research in Epidemic Preparedness and Response (PREPARE), under its Joint Strategic Open Grant Call (Environmental Transmission & Mitigation and Diagnostics Co-operatives; PREPARE-OC-ETM-Dx-2023-006 to N.N. and PREPARE-OC-ETM-Dx-2023-005 to W.L.C.). This research was also supported by a National Research Foundation Investigatorship grant NRFI09-0015 (to N.N.).
Description:
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
ISSN:
2041-1723