Genome assembly of Colletotrichum lini from long Nanopore reads

by Sigova E.A. | Dvorianinova E.M. | Rozhmina T.A. | Kudryavtseva L.P. | Melnikova N.V. | Dmitriev A.A.
Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia | Federal
Research Center for Bast Fiber Crops, Torzhok, Russia

Motivation and Aim: Colletotrichum lini is the causative agent of flax anthracnose, a harmful disease of flax.
However, the lack of the C. lini whole genome sequence hinders extensive molecular
research on the pathogen. Therefore, our aim was to obtain the first genome assembly of C.
lini using the Oxford Nanopore Technologies (ONT) sequencing platform.
Methods and Algorithms: The highly pathogenic C. lini strain #811 was provided by the Institute
for Flax (Torzhok, Russia). Pure high-molecular-weight DNA was extracted according to our
previously developed protocol. The quality and quantity of the extracted DNA were evaluated
by spectrophotometry (NanoDrop) and fluorometry (Qubit 4). DNA libraries were prepared and
sequenced on the ONT platform (MinION instrument, FLO-MIN-106 R9.4.1 flow cell) according
to the manufacturer’s protocol. The obtained reads were basecalled using Guppy 6.0.1 with
different quality filtering thresholds (min_qscore from 7 to 10). Adapters were removed with
Porechop 0.2.4. Draft assemblies were generated using Canu 2.2, Flye 2.8.1, Raven 1.5.1,
Shasta 0.8.0, Wtdbg-cns 1.1 (Wtdbg2 0.0), NextDenovo 2.5.0, Miniasm 0.3-r179, Ra 0.2.1, and
SmartDenovo. BUSCO 5.3.2 and QUAST 5.0.2 were used to assess the quality of the obtained
assemblies.
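The effect of a min_qscore threshold can be illustrated with a minimal Python sketch. It assumes that the read-level Q-score is obtained by averaging the per-base error probabilities and converting the mean back to the Phred scale, which is how mean Q-scores of ONT reads are commonly described. The function names mean_qscore and filter_reads and the toy reads are hypothetical and are not part of Guppy.

```python
import math


def mean_qscore(quality_string, phred_offset=33):
    """Return the mean read Q-score from a FASTQ quality string.

    Assumption: the mean is taken over per-base error probabilities,
    not over the Q-values themselves.
    """
    if not quality_string:
        return 0.0
    # Convert each Phred character to an error probability: p = 10^(-Q/10).
    probs = [10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string]
    mean_error = sum(probs) / len(probs)
    # Convert the averaged error probability back to the Phred scale.
    return -10 * math.log10(mean_error)


def filter_reads(reads, min_qscore):
    """Keep (name, sequence, quality) records whose mean Q-score passes the threshold."""
    return [read for read in reads if mean_qscore(read[2]) >= min_qscore]


# Toy reads (hypothetical): uniform qualities of Q12, Q8 and Q4.
toy_reads = [
    ("r1", "ACGT", "----"),
    ("r2", "ACGT", "))))"),
    ("r3", "ACGT", "%%%%"),
]

for threshold in range(7, 11):
    kept = [name for name, _, _ in filter_reads(toy_reads, threshold)]
    print(f"min_qscore={threshold}: kept {kept}")
```

A stricter threshold discards more reads, so it trades sequencing depth for per-read accuracy, which is the trade-off explored by basecalling with thresholds from 7 to 10.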
Results: We obtained 1.7 Gbases (Gb) of raw ONT reads with an N50 of 15.7 kb. The
assemblers gave better results at lower min_qscore values (7-8). The assemblies with BUSCO
completeness > 80% had an average total length of 52.2 Mb (range 48.1-53.5 Mb). Flye produced
the most contiguous and complete assembly from the ONT data basecalled with min_qscore = 7:
42 contigs with a total length of 53.4 Mb and an N50 of 4.4 Mb.
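For readers unfamiliar with the N50 metric reported above, the following minimal Python sketch shows the standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function name n50 and the contig lengths are hypothetical toy values, not data from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths."""
    if not lengths:
        return 0
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # Unreachable for non-empty input; kept for completeness.


# Toy contig lengths (hypothetical), total 30: 10 + 8 = 18 covers half (15).
toy_contigs = [2, 4, 6, 8, 10]
print(n50(toy_contigs))  # prints 8
```

Because N50 depends only on the length distribution, it measures contiguity rather than correctness, which is why it is reported together with BUSCO completeness.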
Conclusion: We obtained the first C. lini genome assembly from long ONT reads. This
assembly is a starting point for further detailed research on C. lini and the flax-pathogen
interaction.
Acknowledgements: This work was financially supported by the Russian Science
Foundation, grant 22-16-00169.