Name 黃柏鈞 (Po-Chun Huang)  Department Department of Information Management
Thesis title 應用深度學習以心理特徵與自然語言預測網路霸凌行為
(Prediction of Cyberbullying by Deep Learning with Natural Language and Psychology Features)
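Abstract (Chinese) This study proposes a framework that combines psychological features (numerical values) with text and applies deep learning to predict cyberbullying behavior. Unlike previous studies, which trained prediction models on a single source, either text or psychological features (numerical values), this study calls the IBM API to analyze the emotion and sentiment expressed in online posts and the Big Five personality tendencies of their authors. For text, the word embedding models used are GloVe and Word2Vec, and the study proposes, based on the experimental results, the word embedding model better suited to this work.
  Feeding text combined with psychological features (numerical values) into a bidirectional long short-term memory (Bi-LSTM) network concatenated with a fully connected network outperforms feeding psychological features (numerical values) alone into a fully connected network on every metric. For the number of epochs (training passes), this study experimented with 10, 20, and 30, and every metric was best at 20 epochs. Comparing both models at 20 epochs, the loss drops from 0.3673 to 0.2842 and the F1 score improves by nearly 6%. Besides predicting cyberbullying behavior, this study also examines the correlation between each psychological feature and cyberbullying behavior.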
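As a reading aid for the F1 score reported above, here is a minimal sketch of how the metric is computed for a binary bullying / not-bullying label. The function name `f1_score` and the toy label lists are illustrative assumptions, not data or code from the thesis.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration only (1 = bullying, 0 = not bullying).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 balances precision against recall, it is a common choice when the positive class (here, bullying posts) is less frequent than the negative class.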
Abstract (English) This research proposes a new method for predicting cyberbullying behavior by combining psychological features with text. Every post on the Internet carries its author's emotions, personality, and sentiment at the moment of posting, and these can be identified through applications. For text, the word embedding methods used in this study are GloVe and Word2Vec; we compared them and propose the one better suited to this study. This approach differs from past research, which trained prediction models on a single source of information, either text or psychological feature values.
  Compared with using numerical data alone, combining text with psychological features lowers the loss from 0.3673 to 0.2842 and raises the F1 score by nearly 5% at the same number of epochs. We experimented with 10, 20, and 30 epochs (training passes), and every metric was best at 20 epochs. In addition to predicting cyberbullying, this research also explores the correlation between individual psychological features and cyberbullying.
  The results confirm that the proposed approach is a useful detection mechanism for social media websites, either to provide early warning or to screen out malicious posts on the Internet.
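The correlation analysis between psychological features and cyberbullying can be illustrated with the Pearson correlation coefficient, which the table of contents names in section 4.4. The sketch below is a plain-Python version under stated assumptions: the feature name `anger`, the label list `is_bullying`, and all values are hypothetical and do not come from the thesis.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-post scores for one emotion feature and a binary label
# (1 = bullying, 0 = not bullying). Invented for illustration only.
anger = [0.1, 0.8, 0.3, 0.9, 0.2, 0.7]
is_bullying = [0, 1, 0, 1, 0, 1]
r = pearson_r(anger, is_bullying)
print(f"r = {r:.3f}")
```

With a binary label, this quantity is the point-biserial correlation; a value near +1 or -1 indicates a strong linear association, while a value near 0 indicates little or none.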
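Keywords (Chinese) ★ 深度學習 (deep learning)
★ 情緒 (emotion)
★ 情感極性 (sentiment polarity)
★ 網路霸凌 (cyberbullying)
★ 五型人格 (Big Five personalities)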
Keywords (English) ★ Deep learning
★ Emotion
★ Sentiment polarity
★ Cyberbullying
★ Big Five personalities
Table of contents Abstract
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2. Literature Review
2.1 Research on Cyberbullying
2.2 Research Applying Machine Learning to Cyberbullying
2.3 Machine Learning and Deep Learning
2.4 Big Five Personality Traits
2.5 Research on Natural Language Processing
Chapter 3. Research Method
3.1 Dataset Source
3.2 Research Process
3.3 IBM Tone Analyzer
3.4 Data Filtering
3.5 Text Data Preprocessing
3.6 Neural Network Construction
3.7 Experimental Environment Setup
Chapter 4. Experimental Results and Analysis
4.1 Dataset Analysis
4.2 Construction of the Bi-LSTM and Fully-connected Neural Networks
4.3 Experimental Results of the Neural Networks
4.4 SHAP (Shapley Value Method) Feature Importance and Pearson Correlation Coefficient
Chapter 5. Conclusion
5.1 Conclusion
5.2 Research Limitations and Future Research Directions
5.3 Research Contributions
Advisor 周惠文 (Huey-Wen Chou)  Approval date 2021-8-2
