Fish Image Segmentation and Classification System Design Based on Deep Learning

Name

Lu-Hsuan Chen (陳履軒)

Department

Department of Computer Science and Information Engineering

Thesis Title

Fish Image Segmentation and Classification System Design Based on Deep Learning
(基於深度學習的魚類影像分割和辨識)

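This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.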

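Abstract (Chinese)

Fish make up a large part of our daily lives, but without professional training it is hard
for us to tell fish species apart. We therefore need a fish image segmentation and recognition
system that helps ordinary people identify the species of every fish visible in a picture.
Developing such a system from scratch, however, consumes considerable labor and time.
This study proposes a system design for deep-learning-based fish image segmentation and
recognition. It follows the MIAT system design methodology, using IDEF0 for modular and
hierarchical system design and GRAFCET for discrete event modeling, so that the system
software can be built through high-level synthesis. We also present the image annotation tool
we used, together with our annotation workflow and rules. We validate the system built with
this methodology on a fish image database. Experimental results show that the system reaches
85% top-1 accuracy on the validation dataset and that its development was completed within
two months.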

Abstract (English)

Fish play an important role in our daily lives, but without professional training it is hard
to recognize the species of a fish. We therefore want to design a system that segments the
fish regions from an image and then classifies the species of each segment, so that ordinary
people can identify the fish they see. However, designing and implementing such a system from
scratch costs a great deal of human labor and time.
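The two-stage flow described above (segment the fish regions, then classify each segment) can be illustrated with a minimal sketch. It is not the thesis implementation: the segmenter and classifier are passed in as plain callables, the toy stand-ins below replace the trained deep networks, regions are simplified to bounding boxes rather than pixel masks, and all names (`segment_then_classify`, `toy_segmenter`, `toy_classifier`, the labels) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed conventions for this sketch only.
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), end-exclusive
Image = List[List[int]]           # toy grayscale image as nested lists


@dataclass
class FishPrediction:
    box: Box      # region produced by the segmentation stage
    label: str    # species name produced by the classification stage


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangular region described by box out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def segment_then_classify(
    image: Image,
    segmenter: Callable[[Image], List[Box]],
    classifier: Callable[[Image], str],
) -> List[FishPrediction]:
    """Run the segmenter once, then classify every region it returns."""
    return [
        FishPrediction(box, classifier(crop(image, box)))
        for box in segmenter(image)
    ]


def toy_segmenter(image: Image) -> List[Box]:
    """Stand-in for a trained segmentation model: returns two fixed regions."""
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_classifier(patch: Image) -> str:
    """Stand-in for a trained classifier: labels a patch by mean brightness."""
    pixels = [value for row in patch for value in row]
    mean = sum(pixels) / len(pixels)
    return "bright fish" if mean > 127 else "dark fish"


example_image: Image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
predictions = segment_then_classify(example_image, toy_segmenter, toy_classifier)
for prediction in predictions:
    print(prediction.box, prediction.label)
```

Passing the two stages in as callables keeps them independent, so either one could be swapped for a real model without touching the glue code.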

This thesis proposes the design of a fish image segmentation and classification system
based on deep learning. We adopt the core concepts of the MIAT Methodology to construct
the system, using IDEF0 for modular and hierarchical system design and GRAFCET for discrete
event modeling. We also demonstrate the image annotation tool we used to label the dataset
and state our image annotation protocols. We use a fish image dataset to verify the system,
which was built with the MIAT Methodology within two months and achieves a top-1
accuracy of 85%.
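For readers unfamiliar with the metric, top-1 accuracy is the fraction of samples whose highest-scoring predicted class equals the true label. The sketch below shows one way to compute it. The function name, the score dictionaries, and the species labels are illustrative assumptions, not data or code from the thesis.

```python
from typing import Dict, Sequence


def top1_accuracy(scores: Sequence[Dict[str, float]], labels: Sequence[str]) -> float:
    """Fraction of samples whose highest-scoring class matches the true label.

    scores: one dict per sample, mapping class name to predicted score.
    labels: the true class name for each sample, in the same order.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(
        1
        for sample_scores, truth in zip(scores, labels)
        if max(sample_scores, key=sample_scores.get) == truth
    )
    return correct / len(labels)


# Hypothetical scores for four validation images and two species.
example_scores = [
    {"grouper": 0.9, "mackerel": 0.1},
    {"grouper": 0.4, "mackerel": 0.6},
    {"grouper": 0.7, "mackerel": 0.3},  # wrong: true label is mackerel
    {"grouper": 0.2, "mackerel": 0.8},
]
example_labels = ["grouper", "mackerel", "mackerel", "mackerel"]
print(top1_accuracy(example_scores, example_labels))
```

Three of the four hypothetical samples are ranked correctly, so this toy call yields 0.75.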

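Keywords (Chinese)

★ deep learning (深度學習)
★ machine learning (機器學習)
★ image segmentation (影像分割)
★ image segmentation (影像切割)
★ image recognition (影像辨識)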

Keywords (English)

★ deep learning
★ Mask R-CNN
★ ResNet
★ machine learning
★ image segmentation
★ semantic segmentation
★ instance segmentation
★ image classification
★ image recognition

Table of Contents

Chinese Abstract
Abstract
List of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1. Background
1.2. Objective
1.3. Structure of This Thesis
Chapter 2. Review of Image Segmentation and Classification Methods Using Deep Convolutional Neural Networks
2.1. Deep Learning-based Image Segmentation
2.1.1. Region-based Semantic Segmentation
2.1.2. Fully Convolutional Network-based Semantic Segmentation
2.1.3. Weakly Supervised Semantic Segmentation
2.2. Image Classification with the Help of Deep Convolutional Neural Networks
2.3. Mask R-CNN
2.4. ResNet
2.5. Brief Summary
Chapter 3. System Architecture
3.1. MIAT Methodology
3.1.1. IDEF0 for Hierarchical and Modular Design of System
3.1.2. GRAFCET for Discrete Event System Modeling
3.2. The Architecture of Proposed System
3.2.1. Segmentation Model
3.2.2. Classification Model
3.2.3. IDEF0 of the Architecture
3.2.4. GRAFCET of the Proposed System
3.3. Brief Summary
Chapter 4. Fish Image Dataset Architecture and Annotation
4.1. Architecture of Fish Image Dataset
4.1.1. Dataset of Instance Segmentation Module
4.1.2. Dataset Used for Training of Fish Kind Classification
4.2. Countering Overfitting Problem
4.3. Introduction of VGG Image Annotator
4.4. Using VIA to Annotate Single Image
4.5. Annotation Protocols of Our Fish Dataset
Chapter 5. Experiments
5.1. Development Environment
5.2. Structure of Segmentation Part
5.3. Structure of Classification Part
5.4. Benchmark and Experiment Results of Each Module
5.5. Brief Summary
Chapter 6. Conclusion and Future Prospects
6.1. Conclusion
6.2. Future Prospects
References

Advisor

Ching-Han Chen (陳慶瀚)

Approval Date

2019-8-22
