Genome-wide Prediction of Transcription Start Site in Four Conifer Species
Poster: https://bgrssb.icgbio.ru/wp-content/uploads/2020/07/104.pdf
Eugeniia I. Bondar1, Dmitry A. Kuzmin2, Konstantin V. Krutovsky3, Vadim V. Sharov4, Tatiana V. Tatarinova5
1Laboratory of Forest Genomics, Siberian Federal University; Laboratory of Genomic Research and Biotechnology, FRC KSC SB RAS, Krasnoyarsk, Russia, bondar.ev@ksc.krasn.ru
2Laboratory of Forest Genomics, Siberian Federal University; Department of High Performance Computing, Siberian Federal University, Krasnoyarsk, Russia, dm.kuzmin@gmail.com
3Laboratory of Forest Genomics, Siberian Federal University, Krasnoyarsk, Russia; Department of Forest Genetics and Forest Tree Breeding, Georg-August University of Göttingen, Göttingen, Germany; Laboratory of Population Genetics, Vavilov Institute of General Genetics, Moscow, Russia; Department of Ecosystem Science and Management, Texas A&M University, College Station, TX, USA, konstantin.krutovsky@forst.uni-goettingen.de
4Laboratory of Genomic Research and Biotechnology, FRC KSC SB RAS; Laboratory of Forest Genomics, Siberian Federal University; Department of High Performance Computing, Siberian Federal University, Krasnoyarsk, Russia, sharvadim07@ya.ru
5Department of Biology, University of La Verne, La Verne, USA; Functional Genomics Group, Vavilov Institute for General Genetics, Moscow, Russia; Siberian Federal University, Krasnoyarsk, Russia; Bioinformatics Center of IITP RAS, Moscow, Russia, ttatarinova@laverne.edu
Current draft annotations of sequenced conifer genomes are preliminary and limited, but they provide opportunities for further structural and functional analysis. We attempted to improve the existing genome annotations by marking 5’-UTRs in four conifer species: Pinus taeda, Picea glauca, Picea abies and Larix sibirica. Transcription start sites (TSS) were predicted in the promoter sequences of genes with RNA or protein support using the TSS prediction program TSSPlant. The distribution of 5’-UTR lengths in the annotations of several model plants was used to select the best prediction per gene. The frequency of the TATA(A/T)A(A/T) motif in the predicted TSS-centered promoter regions showed a pronounced peak around 60 bp upstream of the TSS.
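As an illustration of the motif analysis described above, the following minimal Python sketch counts occurrences of TATA(A/T)A(A/T) by position relative to the TSS. It is not the authors' pipeline: the function name `motif_position_counts`, the `tss_index` parameter and the toy sequences are all hypothetical, and it assumes each promoter is given as a plain string with the predicted TSS at a known index.

```python
import re
from collections import Counter

# TATA(A/T)A(A/T) as a regular expression; the lookahead lets
# overlapping occurrences be counted separately.
TATA_BOX = re.compile(r"(?=TATA[AT]A[AT])")


def motif_position_counts(promoters, tss_index):
    """Count motif start positions relative to the TSS.

    Negative offsets are upstream of the TSS. `tss_index` is the
    (assumed) 0-based index of the TSS in every input sequence.
    """
    counts = Counter()
    for seq in promoters:
        for match in TATA_BOX.finditer(seq.upper()):
            counts[match.start() - tss_index] += 1
    return counts


# Made-up toy sequences (not real conifer data), TSS assumed at index 10.
promoters = [
    "GGTATAAATAGGCCGCGC",
    "CCTATATATGGGACGCGC",
    "GGGGGGGGGGGACGCGCG",
]
profile = motif_position_counts(promoters, tss_index=10)
for offset in sorted(profile):
    print(offset, profile[offset])
```

Applied to real TSS-centered promoter windows, plotting such a profile against the offset would show whether the motif is enriched at a particular distance upstream of the TSS.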
