Genome-wide copy number variation (CNV) has drawn growing attention, and the number of related studies has increased, since the completion of the Human Genome Project (HGP). Studies that use the mouse as a model organism offer comprehensive microarray data and well-defined CNV differences between strains. We applied two algorithms, the Hidden Markov Model (HMM) and Pairwise Gaussian Merging (PGM), to call CNV segments across the mouse genome, and found that their results differ substantially in segment length, segment location, and number of segments. We attribute this to the different statistical theories underlying the two algorithms, which lead to different strategies for calling segments. Compared with HMM, PGM calls more CNV segments, and their lengths are more widely distributed. However, once the shortest segments from both algorithms are filtered out, the total numbers of segments called become almost the same. In future work, the HMM and PGM results can be compared further, for example by identifying the CNV segments that appear in only one of the two result sets and examining the associated gene symbols and annotations, or by extending the comparison to additional algorithms. Beyond comparing the running time and hardware requirements of these algorithms, the approach can also be applied to studies of genome-wide CNV in mice.
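The comparison described above (dropping short segments, then looking for segments that appear in only one result set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: the `Segment` type, the helper names `filter_short` and `unique_to`, the 1000 bp threshold, the rule that two segments "match" when they overlap by at least one base, and the toy coordinates, which are made up and not taken from the study.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """A called CNV segment (hypothetical representation)."""
    chrom: str
    start: int  # inclusive, in base pairs
    end: int    # exclusive, in base pairs

    @property
    def length(self) -> int:
        return self.end - self.start


def filter_short(segments: List[Segment], min_length: int) -> List[Segment]:
    """Keep only segments at least min_length base pairs long."""
    return [s for s in segments if s.length >= min_length]


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two segments share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def unique_to(first: List[Segment], second: List[Segment]) -> List[Segment]:
    """Segments in `first` that overlap no segment in `second`."""
    return [s for s in first if not any(overlaps(s, t) for t in second)]


# Made-up toy calls, standing in for HMM and PGM output.
hmm_calls = [Segment("chr1", 1000, 9000), Segment("chr2", 500, 700)]
pgm_calls = [
    Segment("chr1", 2000, 8000),
    Segment("chr1", 20000, 20100),
    Segment("chr3", 100, 5100),
    Segment("chr2", 10, 60),
]

MIN_LENGTH = 1000  # assumed threshold, chosen only for this example

hmm_kept = filter_short(hmm_calls, MIN_LENGTH)
pgm_kept = filter_short(pgm_calls, MIN_LENGTH)

print("before filtering:", len(hmm_calls), len(pgm_calls))
print("after filtering: ", len(hmm_kept), len(pgm_kept))
print("only in PGM:     ", unique_to(pgm_kept, hmm_kept))
```

In a real analysis the matching rule would likely need to be stricter (for instance, a minimum reciprocal overlap), and the segments unique to one caller would then be looked up against a gene annotation to retrieve gene symbols.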