Semantic Segmentation of Retinal Edema Based on Attention Mechanism and Multi-Scale Information

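Name

黎氏芳 (Le Thi Phuong)

Department

Department of Computer Science and Information Engineering

Thesis Title

Semantic Segmentation of Retinal Edema Based on Attention Mechanism and Multi-Scale Information
(An ACPX Model for Retinal Edema Segmentation)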

Files

Full text in the system: never open to the public

Abstract

Optical coherence tomography (OCT) images have become widely available in recent years and are a useful tool that helps doctors diagnose diseases accurately and monitor patients' status in order to choose appropriate treatments. As a result, the demand for detecting and displaying the features that appear in OCT images has grown significantly, and meeting it is a challenging problem.
Recognizing that insight into the characteristics of each image is necessary, this thesis designs and analyzes an automated semantic segmentation system based on a retinal edema image dataset. The system detects whether relevant elements are present, which helps diagnose eye diseases as early as possible and supports reasonable treatment options. We approach this semantic image segmentation problem with computer vision, which has shown impressive results, and we take state-of-the-art methods into account. The resulting system includes image preprocessing that makes it possible to handle input images with large dimensions and many channels. Moreover, the new model is designed to be lightweight while achieving high accuracy.
In addition, we compare and evaluate our results against recent advanced methods in two respects: number of parameters and accuracy.
Finally, the ACPX model, which we designed ourselves, achieves an accuracy of 78.19%, an improvement of 29.75% over the baseline model.

Keywords

★ Optical Coherence Tomography

Table of Contents

ABSTRACT
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1: INTRODUCTION
1.1 Motivation
1.2 Segmentation
1.3 Contribution
CHAPTER 2: DEEP LEARNING
2.1 Deep learning
2.2 Overfitting
2.3 Transfer learning
2.4 Data imbalance
CHAPTER 3: CONVOLUTION NEURAL NETWORK
3.1 Architecture convolution neural networks
3.1.1 Convolutions
3.1.2 Non-linearity Functions
3.1.3 Pooling Layers
3.1.4 Fully Connected Layers
3.1.5 Hyperparameters
3.2 VGG architecture
3.3 ResNet architecture
3.4 Region Proposal Network (RPN)
3.4.1 Single Shot Detections (SSD)
3.4.2 YOLO
3.5 Feature Pyramid Networks (FPN)
CHAPTER 4: METHODOLOGY
4.1 Preprocessing
4.2 Post processing
4.3 An ACPX Model
CHAPTER 5: EXPERIMENT RESULT
5.1 Retinal Edema Dataset
5.2 Evaluation
5.3 Result
CHAPTER 6: CONCLUSIONS
REFERENCES

Advisor

王家慶 (Jia-Ching Wang)

Approval Date

2019-7-1
