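Abstract: | For more than 20 years, we have worked on computer-vision technology, including visual detection and recognition for advanced driver assistance systems (ADAS), automatic optical inspection (AOI) of optoelectronic components, and visual detection and recognition of human features (face, hands, and body). In recent years we have applied deep learning to face detection/recognition, hand gesture recognition, pedestrian detection, automatic guidance of vision-based autonomous vehicles, forward vehicle detection and classification, reversing collision warning, printed circuit board defect detection/classification, and removal of embedded text from images with generative adversarial networks, all with very good results; we therefore planned this research project on deep-learning-based 3D object detection, recognition, segmentation, and positioning. This is a three-year project that aims to use deep learning to improve the performance of some traditional 3D object detection, recognition, segmentation, and positioning applications. Each year has two technical development items on convolutional neural networks (CNNs) and two application items on 3D objects. The first-year theoretical improvements are: 1. developing and improving a CNN with detection and recognition functions, 2. developing a CNN that can correct the range errors of 3D cameras; the practical applications are: 1. CNN-based object detection and recognition on 2D images, 2. 6 DoF positioning of 3D objects by geometric computation. The second-year theoretical improvements are: 1. modifying and comparing CNNs so that they can take and fuse 2D and 3D images as input, 2. modifying the CNN to add a segmentation function; the practical applications are: 1. CNN-based object detection and recognition on 2D/3D images, 2. CNN-based object detection, recognition, and segmentation. The third-year theoretical improvements are: 1. developing a CNN that directly estimates the pose of 3D objects, 2. improving the speed and size of the CNN; the practical applications are: 1. a robot-arm pick-and-place (bin-picking) system for small 3D objects, 2. 3D object pose estimation for indoor autonomous vehicles. This research builds on our past research and practical results and targets specific problems, developing deep CNN techniques for detection, recognition, segmentation, and 3D positioning problems on which significant breakthroughs have been hard to achieve. The principal investigator has more than thirty years of research experience in computer vision and several years of experience applying deep learning to computer vision; in 2018 he also assisted three listed companies and the mechanical laboratory of the Industrial Technology Research Institute (ITRI, 工研院機械所), each separately, in developing applications of deep learning to computer vision, so we have the confidence and ability to carry out this project. ;For decades, we have been developing computer-vision techniques and related applications, such as visual detection and recognition in advanced driver assistance systems, automatic optical inspection of optical/electronic parts, and visual detection and recognition of human features such as the face, hands, and body. In recent years, we have developed deep-learning techniques and applied them to face detection/recognition, hand gesture recognition, pedestrian detection, vision-guided autonomous vehicles, forward vehicle detection and classification, backward collision detection, defect detection and inspection on PCBs, and embedded text removal in images using GANs. In these studies, we obtained far better results than with traditional approaches, which inspired us to submit this research proposal, “3D object detection, recognition, segmentation, & position using deep learning”. This is a three-year research project. In it, we aim to develop deep-learning techniques that improve the effectiveness and efficiency of 3D object detection, recognition, segmentation, and positioning applications.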
In each year of the study, we propose two theoretical topics on developing CNNs and two application topics on 3D objects. In the first year, the two theoretical research topics are: 1. developing and improving a CNN with detection and recognition functions, and 2. developing a CNN to improve the accuracy of range data from 3D cameras; the two application topics are: 1. CNN-based object detection and recognition using RGB images, and 2. using computational geometry to acquire the 6 DoF position of 3D objects. In the second year, the two theoretical research topics are: 1. modifying and comparing CNNs with RGBD input and fusion, and 2. modifying the CNN by adding a segmentation function; the two application topics are: 1. CNN-based object detection and recognition using RGBD data, and 2. CNN-based 3D object detection, recognition, and segmentation. In the third year, the two theoretical research topics are: 1. developing a CNN to directly acquire the 3D object position (i.e., an amodal system), and 2. improving the speed and reducing the size of the CNNs; the two application topics are: 1. developing a bin-picking robot arm system, and 2. 3D object position estimation for autonomous indoor vehicles. This study builds on the results of our previous work and focuses on specific topics, developing specialized CNN systems to solve difficult problems in visual detection, recognition, segmentation, and positioning. The principal investigator of this project is a long-time computer-vision researcher who has studied computer vision techniques for more than thirty years and has several years of experience applying deep learning to computer vision problems. In 2018, he collaborated separately with three companies and ITRI to develop CNN techniques for object detection/recognition and defect inspection on PCBs; thus we are confident that we have the ability to carry out this research project. |