Multifunctional software pipeline for processing HTS data obtained using highly multiplexed primer panels
by Boris Gukov, Ivan Stetsenko, Alina Matsvay, German Shipulin | Federal State Budgetary Institution «Centre for Strategic Planning and Management of Biomedical Health Risks» of the Federal Medical and Biological Agency, Moscow, Russia
Abstract ID: 222
Event: BGRS-abstracts
Sections: [Sym 2] Section “Mathematical epidemiology”

Viruses are the most abundant biological entities on Earth. Most viruses that cause disease in humans are zoonotic in origin, having been transmitted from animals. Bats receive particular attention as an important natural reservoir of potential zoonotic infections. The main challenge in combating future zoonoses is to proactively monitor and control the spread of known and emerging pathogens. DNA sequencing makes it possible not only to identify the etiological agent of a disease, but also to genotype it and reconstruct its genome. Highly multiplexed broad-screening primer panels for targeted enrichment increase the efficiency of a study by significantly reducing its cost. However, programs designed for shotgun sequencing data do not process such data effectively. The “pathogenid” program was developed to analyze the nucleic acids of pathogens. It filters raw reads, merges paired reads, annotates the data, performs clustering, and generates reports in Excel and JSON formats. The program supports second- and third-generation sequencing platforms. It was tested on 713 biosamples from 29 bat species, using seven primer panels that target 52 virus genera from 23 families. The analysis detected 217 known groups of viruses and 343 putatively novel viruses. The family Coronaviridae was the most common. When genomic data are available, the program also effectively determines the taxonomic affiliation of the virus hosts. Pathogenid is suitable for virome analysis of natural reservoirs of infection and for mass screening of biomaterials, identifying both known and novel viruses.
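The filtering, clustering, and JSON reporting steps named in the abstract can be illustrated with a minimal Python sketch. It is not the pathogenid implementation, whose internals the abstract does not describe. The function names, the Phred+33 quality encoding, the length and quality cut-offs, and the use of exact-sequence dereplication in place of real clustering are all assumptions made for illustration.

```python
import json
from collections import Counter


def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 is assumed)."""
    if not quality:
        return 0.0
    return sum(ord(c) - offset for c in quality) / len(quality)


def filter_reads(reads, min_len=50, min_q=20.0):
    """Keep (name, sequence, quality) records that pass the cut-offs.

    The default thresholds are hypothetical, not taken from pathogenid.
    """
    return [r for r in reads
            if len(r[1]) >= min_len and mean_phred(r[2]) >= min_q]


def dereplicate(reads):
    """Group identical sequences: a crude stand-in for real clustering."""
    counts = Counter(seq for _, seq, _ in reads)
    return [{"sequence": s, "reads": n} for s, n in counts.most_common()]


def build_report(reads, **cutoffs):
    """Filter, dereplicate, and summarize the reads as a JSON string."""
    kept = filter_reads(reads, **cutoffs)
    report = {
        "input_reads": len(reads),
        "passed_filter": len(kept),
        "clusters": dereplicate(kept),
    }
    return json.dumps(report, indent=2)
```

Calling `build_report(reads)` on a list of `(name, sequence, quality)` tuples returns a JSON string with the read counts and one entry per distinct sequence. A production pipeline would also need paired-read merging, primer-aware annotation, and similarity-based clustering, none of which this sketch attempts.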