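In the post-genomic era, epigenomics is an important research field for biologists. DNA methylation is a chemical modification attached to DNA, and studies indicate that the methylation state at CpG sites is associated with gene expression and with diseases such as cancer. If aberrant methylation occurs at a transcription factor binding site (TFBS), it may affect transcription factor binding and, in turn, gene expression, so identifying aberrantly methylated sites is very important. We use transcription factor binding sites and the occurrence counts of specific DNA sequence patterns as features for building prediction models. To understand how methylation differs among tissues and among DNA regions, we build separate prediction models and analyze the differences in the features they use. Our prediction model performs well: under 10-fold cross-validation, specificity is 80.54%, sensitivity is 80.54%, and accuracy is 86.01%. The models built for different tissues and different regions all achieve accuracy above 80%. Comparing the top 70 features across regions, shared features account for about 50%, which suggests that the features associated with methylation status differ between regions.

DNA methylation is a biochemical modification studied in epigenetics. About 80% of cytosines at CpG dinucleotides are methylated. DNA methylation plays an important role in gene expression and in cancer. Aberrant DNA methylation within transcription factor binding sites (TFBSs) can affect transcription factor binding, so identifying which sites are methylated is an important research problem. To reveal the effective features for different tissues and regions, we develop models to compare 4 regions and 12 tissues. TFBSs, DNA sequence properties, and their distribution serve as the classification features. Our results show that some TFBSs (e.g., SP1 and ZF5) discriminate methylated from unmethylated sites. Under 10-fold cross-validation, sensitivity, specificity, and accuracy are about 90.8%, 80.54%, and 86.07%, respectively. For the four regions and twelve tissues, the accuracy (ACC) of every model is above 80%. Because only about 50% of the top 70 features are shared across regions, we conjecture that the features associated with methylation differ between regions.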
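For readers unfamiliar with the reported quantities, the following is a minimal sketch of how sensitivity, specificity, and accuracy are conventionally defined from confusion-matrix counts, and of one plausible way to measure the overlap between two top-k feature rankings. The function names, the counts, and the placeholder feature names (`featA` to `featC`) are hypothetical and chosen only for illustration; they are not the data or the implementation used in this work.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: methylated sites predicted correctly / incorrectly.
    tn/fp: unmethylated sites predicted correctly / incorrectly.
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct
    return sensitivity, specificity, accuracy


def top_k_overlap(ranked_a, ranked_b, k):
    """Fraction of the top-k features shared by two ranked feature lists."""
    shared = set(ranked_a[:k]) & set(ranked_b[:k])
    return len(shared) / k


# Hypothetical counts, not results from this study.
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")

# Hypothetical rankings for two regions; only SP1 and ZF5 are named in the text.
region_1 = ["SP1", "ZF5", "featA", "featB"]
region_2 = ["ZF5", "featC", "SP1", "featB"]
print(top_k_overlap(region_1, region_2, k=2))
```

With k = 70, an overlap value near 0.5 would correspond to the roughly 50% of shared features described above.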