Driver Behavior Recognition based on Multiple Depth Cameras using Recurrent Neural Network

Name

Ying-Wei Chuang (莊英瑋)

Department

Department of Communication Engineering

Thesis title

Driver Behavior Recognition based on Multiple Depth Cameras using Recurrent Neural Network
(基於遞迴神經網路於多重深度攝影機架構下之駕駛動作辨識)

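Access rights for this electronic thesis: approved for immediate open access.
Open-access electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.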

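Abstract (translated from Chinese)

This thesis addresses behavior recognition for drivers inside a vehicle. Recognizing driver behavior serves two purposes. First, it is closely tied to driving safety: the system can issue a reminder when the driver is inattentive or in danger. Second, it can be applied to the control of in-car entertainment. We propose using two Kinect cameras to capture images from different viewpoints, preprocessing them, and training a recurrent neural network, a deep learning architecture, to perform recognition. Using images from different viewpoints reduces the self-occlusion problem that arises with a single viewpoint, and the long short-term memory (LSTM) architecture lets the network learn information that changes over time. Applied to the VAP multi-view driver behavior dataset, which we recorded ourselves, the system achieves good recognition accuracy.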
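To illustrate how an LSTM can carry information across time steps, the following is a minimal pure-Python sketch of a single LSTM cell applied to a sequence of feature vectors. It is not the thesis's implementation: the function names, the `params` layout (one weight matrix and bias per gate, acting on the concatenation of the previous hidden state and the current input), and the toy sizes are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vector):
    """Return W @ vector + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step. params maps a gate name to (W, b) (hypothetical layout)."""
    z = h_prev + x  # concatenate previous hidden state and current input
    i = [sigmoid(a) for a in affine(*params["input"], z)]
    f = [sigmoid(a) for a in affine(*params["forget"], z)]
    o = [sigmoid(a) for a in affine(*params["output"], z)]
    g = [math.tanh(a) for a in affine(*params["candidate"], z)]
    # The cell state keeps part of the old memory and adds gated new content.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_sequence(frames, params, hidden_size):
    """Feed per-frame feature vectors through the cell; return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    return h
```

In a classifier, the final hidden state would typically be passed to an output layer that scores each behavior class; that layer and the training procedure are omitted here.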

Abstract (English)

This thesis addresses in-car driver behavior recognition. One purpose is driving safety, because it is dangerous when a driver does not concentrate while driving. The other is the control of in-car entertainment. We propose a multi-view driver behavior recognition (MDBR) system. Point clouds are captured from different views, and the raw data are preprocessed by rotation, calibration, merging, and sampling. A long short-term memory (LSTM) network, a type of recurrent neural network, then serves as the classifier. The system is evaluated on the VAP multi-view driver behavior dataset, which we collected ourselves and which contains 10 driver behaviors. Using multi-view data effectively reduces the influence of the occlusion problem. The MDBR system achieves good recognition accuracy.
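The preprocessing steps named in the abstract (rotation, merging, sampling) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the thesis's pipeline: it assumes the second camera's pose relative to the first is already known from calibration and can be expressed as a rotation about the vertical axis plus a translation, and every function name and parameter here is hypothetical.

```python
import math
import random


def rotate_y(point, angle_rad):
    """Rotate one (x, y, z) point about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)


def to_reference_frame(cloud, angle_rad, translation):
    """Rotate, then translate, every point so both views share one frame."""
    tx, ty, tz = translation
    aligned = []
    for p in cloud:
        x, y, z = rotate_y(p, angle_rad)
        aligned.append((x + tx, y + ty, z + tz))
    return aligned


def merge_and_sample(cloud_a, cloud_b, n_points, seed=0):
    """Concatenate two aligned clouds and draw a fixed-size random subset."""
    merged = cloud_a + cloud_b
    rng = random.Random(seed)
    if len(merged) >= n_points:
        return rng.sample(merged, n_points)
    # Too few points: sample with replacement so the output size stays fixed.
    return [rng.choice(merged) for _ in range(n_points)]


# Usage: align the second view to the first, then build one fixed-size frame.
view_a = [(0.0, 0.0, 1.0), (0.1, 0.2, 1.1)]
view_b = [(1.0, 0.0, 0.0)]
aligned_b = to_reference_frame(view_b, math.pi / 2, (0.0, 0.0, 0.5))
frame = merge_and_sample(view_a, aligned_b, n_points=2)
```

Sampling to a fixed number of points gives every frame the same input size, which is convenient when frames are fed to a sequence model one time step at a time.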

Keywords (translated from Chinese)

★ driver behavior recognition
★ depth camera
★ deep learning
★ multi-view capture

Keywords (English)

★ driver behavior recognition
★ depth camera
★ deep learning
★ RNN

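Table of contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Introduction to Depth Cameras and Action Recognition
2.1 Depth Cameras
2.1.1 Kinect Depth Camera
2.1.2 Hardware Specifications
2.1.3 Technology and Functions
2.1.4 Development Tools: Kinect SDK
2.2 Action Recognition
2.2.1 Literature on Action Recognition
2.2.2 In-Vehicle Behavior Recognition
Chapter 3 Fundamentals of Deep Learning
3.1 Neural Networks
3.1.1 Biological Neurons
3.1.2 Artificial Neurons
3.1.3 Artificial Neural Networks
3.2 Deep Learning
3.2.1 Deep Neural Networks
3.2.2 Recurrent Neural Networks
3.2.3 Long Short-Term Memory (LSTM)
Chapter 4 Proposed In-Vehicle Driver Behavior Recognition System
4.1 System Architecture
4.2 Driver Behavior Recognition Using Skeletons as Features
4.3 Driver Behavior Recognition Using Multi-View Point Clouds as Features
4.4 VAP Multi-View Driver Behavior Dataset
Chapter 5 Experimental Results and Discussion
5.1 Experimental Environment
5.2 Experimental Results
5.2.1 Results with Skeleton Feature Input
5.2.2 Results with Multi-View Point Cloud Feature Input
5.3 Comparison and Discussion
Chapter 6 Conclusions and Future Work
References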

Advisor

Pao-Chi Chang (張寶基)

Approval date

2018-08-02
