Master's Thesis 105522047: Detailed Record




Author: 陳柏傑 (Po-Chieh Chen)   Department: Department of Computer Science and Information Engineering
Title: 運用深度學習方法預測癌症種類及存活死亡與治癒復發
(Deep Learning in Predicting Outcomes of Cancer Type, Overall Survival and Disease-Free Survival)
Related Theses
★ A Study on Integrating Deep Learning Methods to Predict Age and Aging-Related Genes
★ Using Deep Learning Methods to Predict Alzheimer's Disease Progression and Post-Stroke Surgical Survival
Full text: Not available online (permanently restricted)
Abstract (Chinese): Deep neural networks (DNNs) have achieved remarkable performance in many different fields, such as audio and image processing, and deep learning methods are now increasingly being applied in biomedical engineering. In this thesis, we use RNA-seq data from The Cancer Genome Atlas (TCGA), obtained through next-generation sequencing; because of its high throughput and low background noise, it allows gene expression levels to be measured very accurately.
This thesis has three main objectives:
1. Predicting cancer type from gene-expression data.
2. Classifying survival versus death (overall survival) for lung, breast, and brain cancer.
3. Classifying cure versus recurrence (disease-free survival) for lung, breast, and brain cancer.
We experimented with several machine-learning methods, namely decision trees, support vector machines (SVM), and XGBoost (Extreme Gradient Boosting), together with deep-learning methods, namely deep neural networks (DNN), autoencoders, and variational autoencoders (VAE), and compared the classification accuracy of each.
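As a rough illustration of the dimensionality-reduction step mentioned above, the following is a minimal sketch (not the thesis's actual code) of an autoencoder that compresses a normalized RNA-seq expression matrix into a low-dimensional representation for a downstream classifier; the layer sizes, latent dimension, and variable names are illustrative assumptions.

from tensorflow import keras
from tensorflow.keras import layers

n_genes = 20000     # hypothetical number of gene-expression features per sample
latent_dim = 64     # hypothetical size of the compressed representation

inputs = keras.Input(shape=(n_genes,))
h = layers.Dense(512, activation="relu")(inputs)
latent = layers.Dense(latent_dim, activation="relu")(h)
h = layers.Dense(512, activation="relu")(latent)
outputs = layers.Dense(n_genes, activation="sigmoid")(h)

autoencoder = keras.Model(inputs, outputs)   # trained to reconstruct its own input
encoder = keras.Model(inputs, latent)        # reused afterwards as a feature extractor
autoencoder.compile(optimizer="adam", loss="mse")

# Assumed usage, where X is a (samples x genes) expression matrix scaled to [0, 1]:
# autoencoder.fit(X, X, epochs=50, batch_size=32, validation_split=0.1)
# X_latent = encoder.predict(X)   # low-dimensional features for SVM, XGBoost, or a DNN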
Abstract (English): Deep neural networks (DNNs) have shown extraordinary performance in various fields, such as audio and image processing, and deep learning methods have recently been applied in biomedical engineering. In this thesis, we use RNA-sequencing data from TCGA (The Cancer Genome Atlas), generated from RNA by next-generation sequencing (NGS). Owing to its high throughput and low background noise, it enables highly accurate measurement of gene expression.
This thesis has three main directions:
1. Classification of cancer types based on RNA-sequencing data.
2. Prediction of overall survival (OS) for lung, breast, and brain cancer.
3. Prediction of disease-free survival (DFS) for lung, breast, and brain cancer.
We experimented with several machine learning methods, such as decision trees, support vector machines (SVM), and XGBoost (Extreme Gradient Boosting), as well as deep learning methods, including deep neural networks (DNN), autoencoders, and variational autoencoders (VAE). Our goal is to apply all of these methods and compare their recognition rates.
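To make the comparison described above concrete, here is a minimal, self-contained sketch (under stated assumptions, not the thesis pipeline) that trains the three classical classifiers named in the abstract and reports their test accuracy; the synthetic placeholder data, hyperparameters, and variable names are illustrative only.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Synthetic placeholder standing in for a TCGA expression matrix and cancer-type labels.
rng = np.random.default_rng(0)
X = rng.random((500, 1000))          # 500 samples x 1000 gene-expression features
y = rng.integers(0, 3, size=500)     # 3 hypothetical cancer-type classes

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=10, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", C=1.0),
    "XGBoost": XGBClassifier(n_estimators=200, max_depth=6),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {acc:.3f}")

The same train/test split and accuracy measure could then be reused for the deep-learning models (DNN, autoencoder plus classifier, VAE plus classifier) so that all methods are compared on an equal footing.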
Keywords (Chinese) ★ Machine learning
★ Cancer
★ Medical prediction
★ Deep learning
Keywords (English) ★ Machine Learning
★ Cancer
★ Medical prediction
★ Deep learning
Table of Contents
Abstract (Chinese) i
Abstract (English) ii
List of Figures iii
List of Tables iv
Table of Contents v
Chapter 1 Introduction 1
1.1 Research Background, Motivation, and Objectives 1
1.2 Research Methods and Chapter Overview 2
Chapter 2 Related Work and Literature Review 4
2.1 Deep Learning 4
2.1.1 The Perceptron 4
2.1.2 Backpropagation Neural Networks 6
2.1.3 Multilayer Perceptron Architecture 7
2.2 Classifiers 10
2.2.1 Support Vector Machine (SVM) 10
2.2.2 Decision Tree 13
2.2.3 Extreme Gradient Boosting (XGBoost) 16
Chapter 3 Dimensionality Reduction Methods 20
3.1 Principal Component Analysis (PCA) 20
3.2 Autoencoder 21
3.3 Variational Autoencoder (VAE) 25
3.3.1 Latent Variable Models 25
3.3.2 Constructing the VAE Objective Function 26
3.3.3 Optimizing the Objective Function 27
Chapter 4 Overall Experimental Framework and Methods 30
4.1 The TCGA Dataset 31
4.2 Normalization Methods 31
4.3 Activation Functions 33
Chapter 5 Experimental Results 35
5.1 Experimental Setup and Environment 35
5.2 Experimental Procedure 37
5.3 Comparison of Activation Functions 38
5.4 Comparison of Results 39
5.4.1 Prediction of Cancer Type 39
5.4.2 Prediction of Cancer Overall Survival 41
5.4.3 Prediction of Cancer Recurrence (Disease-Free Survival) 44
5.4.4 Overall Discussion of Results 46
Chapter 6 Conclusions and Future Research Directions 47
References 48
Advisors: 王家慶 (Jia-Ching Wang), 許藝瓊 (Yi-Chiung Hsu)   Date of Approval: 2018-08-07
