3D Object Detection, Recognition, Segmentation, & Position Using Deep Learning (II)

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/84705

Title:	深度學習的 3D 物件偵測、辨識、分割、與定位應用技術發展 (II); 3D Object Detection, Recognition, Segmentation, & Position Using Deep Learning (II)
Author:	曾定章 (Din-Chang Tseng)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	deep learning; convolutional neural network; computer vision; 3D object detection; 3D object recognition; 3D object segmentation; 3D object position
Date:	2020-12-08
Upload time:	2020-12-09 10:45:33 (UTC+8)
Publisher:	Ministry of Science and Technology
Abstract:	This work was proposed last year as a three-year project but received only one year of funding; this proposal revises the research content and reapplies for the two years that were not carried out. The research focuses on improving existing techniques for specific application domains, raising detection or recognition rates to 99%, 99.5%, 99.9%, or even 99.99%. It does not aim to develop new techniques that are tested on general-purpose datasets such as PASCAL VOC, ImageNet, and MS COCO, where accuracy is around 70% or 80% and the goal is to beat other methods by 1% or 3%. Pushing existing techniques to very high performance in a specific domain requires a solid grounding in the theory of convolutional neural networks (CNNs). Approaching retirement, I have studied more than one hundred related papers, organized more than 50 well-known CNN models, and supervised students of average ability in modifying network architectures, modules, functions, and algorithms, combined with image processing techniques, to reach 99.5% detection and recognition rates in specific domains; this has left little time to write papers for top journals and conferences.
This is a two-year project that uses deep learning to improve selected traditional applications of 3D object detection, recognition, segmentation, and positioning. Each year comprises two technical development topics on CNNs and two application topics on 3D objects. In the past year, the theoretical topics were (1) improving the detection and recognition network so that it handles both large and small objects and (2) analyzing the performance of several ways of fusing 2D and 3D images; the application topics were (1) comparing CNN-based object detection and recognition on 2D (RGB) images and (2) CNN-based object detection and recognition on 3D (RGBD) images. In the first year of the new project, the theoretical topics are (1) developing a 3D CNN that estimates the 9 DoF parameters of 3D objects and (2) using a generative adversarial network (GAN) to correct the distance error of a 3D camera; the application topics are (1) 9 DoF positioning of 3D objects and (2) robot-arm pick-and-place of small 3D objects (bin-picking). In the second year, the theoretical topics are (1) improving the accuracy and speed of the 3D CNN and (2) adding a segmentation function to the 3D CNN; the application topics are (1) 3D object detection, recognition, and segmentation and (2) 3D object orientation estimation and segmentation for indoor automated guided vehicles.
This study builds on our previous research and practical results. It targets specific problems and develops deep CNN techniques for detection, recognition, segmentation, and 3D positioning problems on which significant breakthroughs have been difficult. The principal investigator has more than thirty years of research experience in computer vision and several years of experience applying deep learning to computer vision problems. In the past two years he has collaborated separately with three publicly listed companies and with the mechanical research division of ITRI on CNN techniques for object detection and recognition and for defect inspection on PCBs. We are therefore confident that we can carry out this project.
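Relation:	Science & Technology Policy Research and Information Center, National Applied Research Laboratories
Appears in collections:	[Department of Computer Science and Information Engineering] Research Projects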

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	158

All items in NCUIR are protected by the original copyright.
