Genome-wide Prediction of Transcription Start Site in Four Conifer Species

Poster

Eugeniia I. Bondar¹, Dmitry A. Kuzmin², Konstantin V. Krutovsky³, Vadim V. Sharov⁴, Tatiana V. Tatarinova⁵

¹ Laboratory of Forest Genomics, Siberian Federal University; Laboratory of Genomic Research and Biotechnology, FRC KSC SB RAS, Krasnoyarsk, Russia, bondar.ev@ksc.krasn.ru
² Laboratory of Forest Genomics, Siberian Federal University; Department of High Performance Computing, Siberian Federal University, Krasnoyarsk, Russia, dm.kuzmin@gmail.com
³ Laboratory of Forest Genomics, Siberian Federal University, Krasnoyarsk, Russia; Department of Forest Genetics and Forest Tree Breeding, Georg-August University of Göttingen, Göttingen, Germany; Laboratory of Population Genetics, Vavilov Institute of General Genetics, Moscow, Russia; Department of Ecosystem Science and Management, Texas A&M University, College Station, TX, USA, konstantin.krutovsky@forst.uni-goettingen.de
⁴ Laboratory of Genomic Research and Biotechnology, FRC KSC SB RAS; Laboratory of Forest Genomics, Siberian Federal University; Department of High Performance Computing, Siberian Federal University, Krasnoyarsk, Russia, sharvadim07@ya.ru
⁵ Department of Biology, University of La Verne, La Verne, USA; Functional Genomics Group, Vavilov Institute for General Genetics, Moscow, Russia; Siberian Federal University, Krasnoyarsk, Russia; Bioinformatics Center of IITP RAS, Moscow, Russia, ttatarinova@laverne.edu

Current draft annotations for sequenced conifer genomes are preliminary and limited, but they provide opportunities for further structural and functional analysis. We attempted to improve the existing genome annotations by marking 5’-UTRs in four conifer species: Pinus taeda, Picea glauca, Picea abies, and Larix sibirica. Transcription start sites (TSS) were predicted in the promoter sequences of genes with RNA or protein support using the TSS prediction program TSSPlant. The distribution of 5’-UTR lengths from the annotations of several model plants was used to select the best prediction per gene.
The frequency of the TATA(A/T)A(A/T) motif in the predicted TSS-centered promoter regions showed a pronounced peak around 60 bp upstream of the TSS.
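
The two analysis steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input layout (sense-strand promoter sequences sharing one TSS index), and the rule of scoring each candidate TSS by the reference frequency of its implied 5’-UTR length are all assumptions made for illustration. The sketch also ignores 5’-UTR introns and does not call TSSPlant; candidate positions are assumed to be available already.

```python
import re
from collections import Counter

# TATA(A/T)A(A/T) as a regular expression. The zero-width lookahead
# allows overlapping occurrences to be counted.
TATA_MOTIF = re.compile(r"(?=TATA[AT]A[AT])")


def motif_position_counts(promoters, tss_index):
    """Count motif start positions relative to the TSS.

    promoters: iterable of sense-strand sequences (5'->3').
    tss_index: 0-based index of the TSS in every sequence
               (assumed to be the same for all sequences).
    Returns a Counter mapping relative position to count;
    negative positions lie upstream of the TSS.
    """
    counts = Counter()
    for seq in promoters:
        for match in TATA_MOTIF.finditer(seq.upper()):
            counts[match.start() - tss_index] += 1
    return counts


def pick_best_tss(candidates, start_codon_pos, utr_length_freq):
    """Pick the candidate TSS whose implied 5'-UTR length is most frequent.

    candidates: candidate TSS positions on the sense strand.
    start_codon_pos: position of the annotated start codon.
    utr_length_freq: dict mapping 5'-UTR length -> reference frequency,
                     e.g. estimated from model plant annotations.
    Hypothetical scoring rule, not necessarily the one used in the study.
    Returns None when no candidate lies upstream of the start codon.
    """
    upstream = [c for c in candidates if c < start_codon_pos]
    if not upstream:
        return None
    return max(
        upstream,
        key=lambda c: utr_length_freq.get(start_codon_pos - c, 0.0),
    )
```

Plotting the counts returned by `motif_position_counts` against relative position would show where, if anywhere, the motif is enriched upstream of the predicted TSS. In `pick_best_tss`, a smoothed density could replace the raw frequency dictionary so that 5’-UTR lengths absent from the reference set do not score zero.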
