Abstract (English) |
The Industrial Revolution began in the 18th and 19th centuries, when European and American countries replaced manual labor with machines. It is commonly divided into four successive industrial revolutions, and the current era is regarded as the fourth. This study focuses on automation, the core of these revolutions, which aims to improve production efficiency, reduce costs, and enhance quality, and in particular on the application of machine vision systems in manufacturing. Traditional three-dimensional object recognition methods often rely on two-dimensional multi-view images, but they fail to fully exploit the correlation between these images and do not account for the effect of real-world shooting conditions on image quality, both of which make recognition more difficult for the model. This study therefore proposes a system for recognizing three-dimensional products that combines a view-based convolutional neural network, image feature extraction, and contrastive learning. The specific objectives are to improve recognition efficiency, enhance the capture of key image features, and strengthen robustness in real-world scenarios. To achieve these goals, the study adopts a view-based convolutional neural network that aggregates information from multiple-view images, an attention mechanism that extracts important feature information, and supervised contrastive learning to train the network and improve the model's generalization. The detailed implementation of these methods is discussed in subsequent chapters. |
References |
[1] Bahdanau, D., K. Cho & Y. Bengio (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
[2] Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological cybernetics, 36(4), 193-202.
[3] Golnabi, H. & A. Asadpour (2007). Design and application of industrial machine vision systems. Robotics and Computer-Integrated Manufacturing, 23(6), 630-637.
[4] He, K., X. Zhang, S. Ren & J. Sun (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
[5] Khosla, P., P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, ... & D. Krishnan (2020). Supervised contrastive learning. Advances in neural information processing systems, 33, 18661-18673.
[6] Kipf, T. N. & M. Welling (2016). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
[7] Krizhevsky, A., I. Sutskever & G. E. Hinton (2012). Imagenet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84-90.
[8] LeCun, Y., L. Bottou, Y. Bengio & P. Haffner (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
[9] Mnih, V., N. Heess & A. Graves (2014). Recurrent models of visual attention. Advances in neural information processing systems, 27.
[10] Niu, Z., G. Zhong & H. Yu (2021). A review on the attention mechanism of deep learning. Neurocomputing, 452, 48-62.
[11] Qi, C. R., L. Yi, H. Su & L. J. Guibas (2017). Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Advances in neural information processing systems, 30.
[12] Simonyan, K. & A. Zisserman (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
[13] Su, H., S. Maji, E. Kalogerakis & E. Learned-Miller (2015). Multi-view convolutional neural networks for 3d shape recognition. In Proceedings of the IEEE international conference on computer vision (pp. 945-953).
[14] Szegedy, C., W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, ... & A. Rabinovich (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1-9).
[15] Thoben, K. D., S. Wiesner & T. Wuest (2017). “Industrie 4.0” and smart manufacturing - a review of research issues and application examples. International journal of automation technology, 11(1), 4-16.
[16] Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, ... & I. Polosukhin (2017). Attention is all you need. Advances in neural information processing systems, 30.
[17] Wei, X., R. Yu & J. Sun (2020). View-gcn: View-based graph convolutional network for 3d shape analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 1850-1859).
[18] Wu, Z., Y. Xiong, S. X. Yu & D. Lin (2018). Unsupervised feature learning via non-parametric instance discrimination. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3733-3742).
[19] Zeiler, M. D. & R. Fergus (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13 (pp. 818-833). Springer International Publishing. |