Motivation and Aim: Biofortification, which combines high-throughput phenotyping with genome-based selection, is a mainstream direction of post-genomic target-assisted plant breeding.
Methods and Algorithms: In this work, we used our previously developed web service Plant_SNP_TATA_Z-tester to run a uniform in silico analysis of how 871,707 single nucleotide polymorphisms (SNPs) within the 90-bp proximal promoter regions alter the transcription of 54,013 protein-coding transcripts from 32,833 Arabidopsis thaliana L. genes. We took the promoter DNA sequences from the Ensembl Plants database and the SNPs from TAIR, as depicted in Figure 1.
Results: The analysis identified 54,993 SNPs, each of which can significantly upregulate or downregulate the corresponding Arabidopsis gene by altering the binding affinity of the TATA-binding protein (TBP) for the promoter carrying the SNP. The existence of these SNPs in highly conserved proximal promoters may be explained as intraspecific diversity maintained by stabilizing natural selection. To support this, we used the PubMed database to hand-annotate papers on some of the Arabidopsis genes carrying these SNPs, or on their orthologs in other plant species, and demonstrated the effects of changes in the expression of these genes on vital plant traits. We integrated the in silico estimates of TBP-promoter affinity into the AtSNP_TATAdb knowledge base (https://www.sysbio.ru/AtSNP_TATAdb/) and showed that they correlate significantly with independent in vivo experimental data, as exemplified in Figure 2.
Conclusion: These correlations proved robust to variations in statistical criteria, the genomic environment of TATA box regions, plant species, and growing conditions.
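As an illustration of what robustness to the choice of statistical criterion can mean in practice, the following minimal Python sketch computes three common correlation coefficients (Pearson, Spearman, and Kendall tau-a) for the same pair of value lists. It does not reproduce the analysis behind this abstract: the function names and the toy numbers are assumptions made purely for illustration and are not data from the study or from AtSNP_TATAdb.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Toy placeholder values (NOT data from the study): a hypothetical
# in silico affinity score and a hypothetical in vivo measurement.
in_silico = [1.0, 2.0, 3.0, 4.0, 5.0]
in_vivo = [1.0, 1.4, 1.3, 2.1, 2.6]

for name, criterion in [("Pearson", pearson),
                        ("Spearman", spearman),
                        ("Kendall tau-a", kendall_tau_a)]:
    print(f"{name}: {criterion(in_silico, in_vivo):.3f}")
```

If all three coefficients have the same sign and comparable strength, the association does not hinge on a single criterion. A real analysis would also report significance levels, which this sketch omits.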
Funding: The study is supported by the Government Budget Project FWNR-2022-0020.