dc.description.abstract | The availability of genome-wide gene expression data provides a unique set of genes from which the mechanisms underlying a common transcriptional response can be deciphered. The identification of transcription factor binding sites provides valuable information on gene expression and regulation. Biological data and analytical methods have recently become available for the analysis of gene expression and transcriptional regulatory sequences. However, users must carry out a complicated, multi-step analysis: querying data from different databases, analyzing the gene upstream regions with different prediction tools, and finally converting among different data formats. Beyond methods for predicting transcriptional regulatory sites, new automated and integrated methods for higher-level analysis of gene upstream sequences are needed. Since the identification of regulatory sites requires a large set of biological databases, methods for efficient and integrated data management are also crucial. In this dissertation, we propose a predictive system, designated RgS-Miner, which takes a group of genes as input, i.e., a set of genes considered likely to share common regulatory mechanisms, and predicts transcriptional regulatory sites in eukaryotes and detects the co-occurrence of these sites. The system integrates several regulatory site detection methods, such as known site matching, over-represented oligonucleotide detection, and DNA motif discovery. Three case studies in the yeast and human genomes are examined with the proposed system. In addition, the system constructs a biological data warehouse that integrates a variety of heterogeneous biological databases. Compared with other systems, ours is a useful tool for analyzing transcriptional regulatory sites when users investigate the regulation of gene expression. | en_US |