Thesis 105522070: Detailed Record




Author: Chun-Ting Cheng (鄭俊廷)    Department: Computer Science and Information Engineering
Title: Applications of Deep Learning in Student Concentration Analysis
(深度學習於學生專注度分析之應用)
Related theses
★ A Q-learning-based swarm intelligence algorithm and its applications
★ Development of a rehabilitation system for children with developmental delays
★ Comparing teacher assessment and peer assessment from the perspective of cognitive style: from English writing to game production
★ A diabetic nephropathy prediction model based on laboratory test values
★ Design of a fuzzy neural network-based remote sensing image classifier
★ A hybrid clustering algorithm
★ Development of assistive devices for people with disabilities
★ A study on fingerprint classifiers
★ A study on backlight image compensation and color quantization
★ Application of neural networks to business income tax audit case selection
★ A new online learning system and its application to tax audit case selection
★ An eye-tracking system and its applications in human-machine interfaces
★ A study on data visualization combining swarm intelligence and self-organizing maps
★ Development of a pupil-tracking system as a human-machine interface for people with disabilities
★ An artificial immune system-based online-learning neuro-fuzzy system and its applications
★ Application of genetic algorithms to speech descrambling
  1. The full electronic text of this thesis has been authorized for immediate open access.
  2. The open-access electronic full text is licensed solely for personal, non-profit retrieval, reading, and printing for academic research purposes.
  3. Please observe the relevant provisions of the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese) With rapid advances in technology, students often use laptops and cellphones in class to take notes or look up information. In this situation, however, students with poor self-control are easily distracted by their electronic devices, and students with poor concentration may become absent-minded or look around the room. The goal of this thesis is therefore to analyze student concentration from color images, helping teachers understand how their students are learning. We use face detection and a neural network to locate facial landmarks, from which we determine facial information and estimate the direction the face is pointing. We also extract features from the skeleton data of a pose estimation system and, using a neural network together with object detection, classify the current posture. Student concentration is then analyzed from these results.
For the facial information, we define two fatigue behaviors whose occurrence is determined from the facial landmarks, and we extract landmark features to train a neural network that predicts the face orientation angle with an average angular error within 10 degrees. For motion recognition, we define eight postures commonly seen among students, four of which involve using an object, and collect a dataset on this basis to train and test the neural network. The recognition rate in person-independent and angle generalization tests is nearly 80%, and the recognition rate in real-classroom tests is also good, showing that the system provides accurate facial information and motion recognition.
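As a concrete illustration of the landmark-based fatigue cues mentioned above: eye closure and blinking are commonly measured with an eye aspect ratio (EAR) computed over the standard 68-point facial landmark scheme. The Python sketch below is an assumption-laden illustration rather than the thesis's implementation; the eye indices follow the usual 68-point convention, and the 0.2 threshold is a typical heuristic, not the author's tuned value.

```python
import numpy as np

# Eye indices in the standard 68-point facial landmark scheme
# (zero-based: points 36-41 form the right eye, 42-47 the left).
RIGHT_EYE = list(range(36, 42))
LEFT_EYE = list(range(42, 48))

def eye_aspect_ratio(eye):
    """EAR = (|p1-p5| + |p2-p4|) / (2|p0-p3|) for one six-point eye."""
    eye = np.asarray(eye, dtype=float)
    v1 = np.linalg.norm(eye[1] - eye[5])  # first vertical distance
    v2 = np.linalg.norm(eye[2] - eye[4])  # second vertical distance
    h = np.linalg.norm(eye[0] - eye[3])   # horizontal eye width
    return (v1 + v2) / (2.0 * h)

def eyes_closed(landmarks, threshold=0.2):
    """True when the mean EAR of both eyes drops below the threshold.

    `landmarks` is a (68, 2) array from any facial landmark detector;
    0.2 is a commonly used heuristic, not the thesis's value.
    """
    ear = (eye_aspect_ratio(landmarks[RIGHT_EYE]) +
           eye_aspect_ratio(landmarks[LEFT_EYE])) / 2.0
    return ear < threshold
```

A blink can then be counted as a short run of closed-eye frames, while a long run suggests drowsiness; yawn detection can be treated analogously with an aspect ratio over the mouth landmarks.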
Abstract (English) With the rapid advance of technology, students often use laptops and cellphones to take notes or look up information. In this situation, however, students with poor self-control are easily distracted by electronic devices, and students with poor concentration may be absent-minded and let their gaze wander. The purpose of this thesis is therefore to analyze students' concentration through color images and help teachers understand their students' situations. We use a face detector and a neural network to find facial landmarks, from which we determine facial information and estimate the face orientation. We also use a neural network and object detection to classify skeleton features extracted from the skeleton data of a pose estimation system. Finally, we analyze the students' concentration according to these results.
Two kinds of fatigue behaviors are defined on the facial information, and their occurrence is determined from the facial landmarks. We also use features extracted from the facial landmarks to train a neural network that estimates the face orientation angles, with an average angular error of less than 10 degrees. In motion recognition, we design eight kinds of common student postures, four of which involve human-object interaction, and collect a dataset to train and test the neural network. In the experiments, the recognition rate of the person-independent and angle-independent tests is nearly 80%, and the recognition rate in the real classroom setting is also good.
These experiments show that the system provides accurate facial information and motion recognition.
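For the posture branch, the abstract describes normalizing skeletons from a pose estimation system and classifying them with a neural network. The Keras sketch below is a minimal stand-in under assumptions the record does not confirm: an 18-joint OpenPose-style (COCO) skeleton, normalization by centering on the neck and scaling by the neck-hip distance, and a small 1D convolutional classifier over the joint list. The thesis's actual features, normalization, and architecture may differ.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

NUM_JOINTS = 18   # OpenPose COCO-style skeleton (assumption)
NUM_CLASSES = 8   # the eight classroom postures

def normalize_skeleton(joints):
    """Center the skeleton on the neck and scale by torso length.

    `joints` is an (18, 2) array of (x, y) image coordinates; joint 1
    is the neck and joint 8 a hip in the COCO layout (assumption).
    """
    joints = np.asarray(joints, dtype=float)
    centered = joints - joints[1]
    scale = np.linalg.norm(joints[8] - joints[1]) or 1.0  # avoid /0
    return centered / scale          # -> normalized (18, 2) features

def build_classifier():
    """A small illustrative 1D CNN over the ordered joint list."""
    model = keras.Sequential([
        layers.Input(shape=(NUM_JOINTS, 2)),
        layers.Conv1D(32, kernel_size=3, activation="relu"),
        layers.Conv1D(64, kernel_size=3, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With training data X of shape (N, 18, 2) built from normalized skeletons and integer posture labels y in 0-7, `build_classifier().fit(X, y)` would train such a classifier.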
Keywords (Chinese) ★ student concentration
★ computer vision
★ activity recognition
★ neural networks
Keywords (English) ★ students’ attention
★ computer vision
★ activity recognition
★ neural networks
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
1-1 Research Motivation
1-2 Research Objectives
1-3 Thesis Organization
Chapter 2: Related Work
2-1 Concentration Assessment
2-2 Face Orientation Estimation
2-3 Motion Recognition
2-4 Pose Estimation
2-5 Neural Networks
2-5-1 Back-Propagation Neural Networks
2-5-2 Convolutional Neural Networks
Chapter 3: Methodology
3-1 Software Pipeline Architecture
3-2 Facial Information Determination
3-2-1 Facial Landmark Detection
3-2-2 Normalization
3-2-3 Eye-Closure and Blink Detection
3-2-4 Yawn Detection
3-2-5 Face Orientation Estimation
1. Distance variations between landmarks
2. Angle variations between landmarks
3-3 Motion Recognition
3-3-1 Normalization
3-3-2 CNN Feature Extraction
3-3-3 CNN Architecture
3-3-4 Object Detection
3-3-5 Post-Processing
3-4 Concentration Analysis
Chapter 4: Experimental Design and Results
4-1 Yawn Detection
4-1-1 Dataset
4-1-2 Experimental Results
4-2 Face Orientation Estimation
4-2-1 Dataset
4-2-2 Comparison of Different Neural Networks
4-2-3 Experimental Results
4-3 Motion Recognition
4-3-1 Dataset Recording Setup
4-3-2 Classroom Postures
4-3-3 Dataset Skeletons
4-3-4 Comparison of Different CNNs
4-3-5 Feature Extraction Tests
4-3-6 Person-Independent Generalization Tests
4-3-7 Adding Object Detection
4-3-8 Angle Generalization Tests
4-3-9 Sliding-Window Tests
4-3-10 Real-Classroom Tests
Chapter 5: Conclusions and Future Work
5-1 Conclusions
5-2 Future Work
References
Advisor: Mu-Chun Su (蘇木春)    Date of Approval: 2018-8-21
