Potato (Solanum tuberosum) breeding relies heavily on advanced methods such as marker-assisted selection (MAS). Taking full advantage of MAS requires extensive knowledge of the genetic polymorphisms that can serve as genetic markers. The most comprehensive information on genetic polymorphisms comes from phased genome assemblies. Long-read whole-genome sequencing and Hi-C facilitate phased genome assembly even for organisms with complex polyploid genomes, such as potato. However, these tools are expensive and not always available. Here we demonstrate how partially phased sequences assembled from Illumina short reads can be used to discover novel genetic polymorphisms in autotetraploid potato.
In our search for novel genetic polymorphisms, we focused on nine genes encoding key enzymes of plant carbohydrate metabolism. Previously, our colleagues assembled the genomes of five Russian potato cultivars from Illumina short reads. Reads belonging to the selected genes were assembled de novo with SPAdes using default parameters and phased with freebayes and WhatsHap. Only two of the four haplotypes were resolved for each gene. The resolved haplotypes showed a different SNP distribution in each cultivar. To verify the SNPs predicted by the assemblies, we used a cleaved amplified polymorphic sequence (CAPS) assay. For most of the genes, we detected more allelic variants (3-4) than were predicted by the individual assemblies. Our results indicate that, although some SNPs may be hidden by partial phasing and thus designated as ‘non-polymorphic’, using assemblies of several cultivars can help alleviate this problem by providing additional information about these sites.
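The idea that pooling assemblies from several cultivars recovers sites hidden by partial phasing can be illustrated with a minimal Python sketch. It is not the pipeline used in this work. It assumes the resolved haplotypes of one gene are already aligned to a common coordinate system and have equal length; the function names, cultivar names, and toy sequences are hypothetical.

```python
def polymorphic_sites(haplotypes):
    """Return sorted 0-based positions where aligned haplotypes carry more than one base."""
    if not haplotypes:
        return []
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")
    return [i for i in range(length) if len({h[i] for h in haplotypes}) > 1]


def hidden_sites(resolved):
    """For each cultivar, list sites that look non-polymorphic among its own
    resolved haplotypes but vary once haplotypes from all cultivars are pooled."""
    pooled = [h for haps in resolved.values() for h in haps]
    pooled_sites = set(polymorphic_sites(pooled))
    return {
        cultivar: sorted(pooled_sites - set(polymorphic_sites(haps)))
        for cultivar, haps in resolved.items()
    }


# Toy input (hypothetical): two resolved haplotypes per cultivar for one gene.
resolved = {
    "cultivar_A": ["ACGTAC", "ACGTTC"],
    "cultivar_B": ["ACGTAC", "GCGTAC"],
}
pooled_polymorphic = polymorphic_sites(
    [h for haps in resolved.values() for h in haps]
)
hidden = hidden_sites(resolved)
```

In this toy case each cultivar shows a single variable site on its own, while the pooled set shows two, so each cultivar has one site that would be flagged as a candidate for follow-up verification, for example by a CAPS assay.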