Prediction of properties of biological sequences based on their structural formulas - a new approach to data representation in bioinformatics
by Alexey A. Lagunin (1, 2) | Smirnov A.S. (1) | Zadorozhny A.D. (1) | Lebedev N.V. (1) | Polusmak I.V. (1) | Zhuravleva S.I. (1) | Rudik A.V. (2) | Filimonov D.A. (2)
Affiliations: (1) Pirogov Russian National Research Medical University (RNRMU) | (2) Institute of Biomedical Chemistry (IBMC)
Abstract ID: 696
Event: BGRS-abstracts
Sections: [Sym 4] Section “Human medical genomics/genetics”

Our research offers an alternative way of representing amino acid and nucleotide sequences: as structural formulas of their fragments. This makes it possible to apply methods previously developed in chemoinformatics for identifying structure-property relationships. The structural formulas of macromolecule fragments are described with high-level substructural descriptors of multilevel atomic neighborhoods, MNA (Multilevel Neighborhoods of Atoms). These descriptors are implemented in MultiPASS, a program developed specifically for such data, which uses a Bayesian algorithm to identify structure-property relationships.
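
As a rough illustration of what a multilevel atomic neighborhood can look like, the sketch below builds neighborhood strings recursively on a toy molecular graph: the level 0 descriptor of an atom is its label, and each next level appends the sorted previous-level descriptors of its immediate neighbors. This is a simplified sketch and not the MultiPASS implementation. The function name `mna_descriptors`, the input format, and the omission of hydrogens and bond types are assumptions introduced here for illustration only.

```python
def mna_descriptors(atoms, bonds, level=2):
    """Return one neighborhood string per atom of a small molecular graph.

    atoms: list of element symbols, e.g. ["N", "C", "C", "O", "O"]
    bonds: list of (i, j) atom index pairs; bond orders are ignored here
           (a simplification assumed for this sketch).
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Level 0: the descriptor of an atom is just its label.
    current = list(atoms)
    for _ in range(level):
        # Level n: atom label followed by the sorted level n-1
        # descriptors of its immediate neighbors.
        current = [
            atoms[i]
            + "("
            + "".join(sorted(current[j] for j in neighbors[i]))
            + ")"
            for i in range(len(atoms))
        ]
    return current


# Toy fragment: heavy atoms of glycine, N-CA-C(=O)-O (hydrogens omitted).
atoms = ["N", "C", "C", "O", "O"]
bonds = [(0, 1), (1, 2), (2, 3), (2, 4)]

level1 = mna_descriptors(atoms, bonds, level=1)
level2 = mna_descriptors(atoms, bonds, level=2)
print(level1)  # ['N(C)', 'C(CN)', 'C(COO)', 'O(C)', 'O(C)']
```

In a setting like the one described above, the set of unique strings obtained for a sequence fragment would serve as the features passed to a Bayesian classifier; that step is not shown here.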

We have demonstrated the effectiveness of this approach in methods for predicting pathogenic amino acid substitutions, amino acid substitutions associated with tumor drug resistance, the epitope specificity of T-cell receptors, and post-translational modifications. The talk will also demonstrate the use of our approach to predict secondary structures and functional motifs of proteins, as well as microRNA targets and binding sites for human transcription factors.

The proposed approach represents a new direction in bioinformatics: studying the properties of biological macromolecules based on the structural formulas of their fragments.