Thesis 109522072: Detailed Record




Author Chyn Lin (林斳)   Department Computer Science and Information Engineering
Thesis title Chinese Multi-modal Out-of-context Dataset
(中文多模態圖文不符資料集)
Full text Viewable in the system after 2026-01-18 (embargoed until then)
Abstract (Chinese) Verifying fake news is a long-standing problem. With the growth of social media in recent years, information spreads ever more easily. Swapping the caption of a real image for misleading text is a low-cost way to produce fairly convincing fake news; we call this type of fake news "out-of-context".
For out-of-context data, prior work has used text, images, and scene changes to build English out-of-context datasets from real news, but there is still no Chinese out-of-context dataset and no Chinese model.
We therefore created a challenging and usable non-random image-text matching technique for automatically generating a Chinese fake-news dataset, adapted a model used for English out-of-context detection to Chinese to test the dataset's usability, and analyzed the performance of the English model after adaptation to Chinese.
Abstract (English) The verification of fake news has always been a common problem. With the development of social media in recent years, it has become easier to disseminate information. Replacing the captions of real pictures with misleading text can produce convincing fake news at low cost; we call this type of fake news out-of-context.
For out-of-context data, prior work has used text, pictures, and scene changes to produce English out-of-context datasets from real news. However, there is still no Chinese out-of-context dataset or Chinese out-of-context model.
We therefore create a challenging and usable non-random out-of-context matching technique to automatically generate Chinese out-of-context datasets. We further adapt a model designed for English out-of-context detection to Chinese to test the usability of the dataset, and analyze the performance of the adapted model on Chinese.
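The non-random matching the abstract describes (a semantically similar caption paired with an image from a different scene) can be sketched as follows. This is a minimal illustration under assumed representations, not the thesis's actual pipeline: `pick_out_of_context`, the 0.5 threshold, and the toy vectors are all hypothetical; in practice the caption embeddings would come from a sentence encoder and the scene labels from a scene classifier.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pick_out_of_context(query, candidates, sim_threshold=0.5):
    """Return the index of the candidate whose caption embedding is most
    similar to the query caption (making the mismatch hard to detect)
    while its scene label differs (making the pairing genuinely out of
    context), or None if no candidate clears the threshold."""
    best_idx, best_sim = None, -1.0
    for idx, (embedding, scene) in enumerate(candidates):
        if scene == query["scene"]:
            continue  # same scene: the swap would not be out of context
        sim = cosine(query["embedding"], embedding)
        if sim >= sim_threshold and sim > best_sim:
            best_idx, best_sim = idx, sim
    return best_idx

# Toy example: candidate 1 has a similar caption but a different scene.
query = {"embedding": [1.0, 0.0], "scene": "indoor"}
candidates = [
    ([0.9, 0.1], "indoor"),   # similar caption, same scene -> skipped
    ([0.8, 0.2], "outdoor"),  # similar caption, different scene -> chosen
    ([0.0, 1.0], "outdoor"),  # different scene, but caption too dissimilar
]
print(pick_out_of_context(query, candidates))  # → 1
```

Selecting the most similar caption from a different scene is what would make such a generated mismatch non-random, and therefore harder for a detector than a random image swap.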
Keywords (Chinese) ★ Out-of-context (圖文不符)
★ Dataset (資料集)
Keywords (English) ★ Out-of-context
★ Dataset
Table of Contents Chinese Abstract
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
2.1 Fake News and Rumor Detection
2.2 Automated Fact Checking
2.3 Multi-modal Message Validation
2.4 Multi-modal and Out-of-context Datasets
3 Method
3.1 Preprocessing
3.2 Generation Method
3.2.1 Semantic Comparison
3.2.2 Scene Comparison
3.2.3 Example
3.3 Data Statistics and Balance
3.4 Method Design and Trial
4 Experiments and Analysis
4.1 Challenge and Usability
4.2 Analysis
4.3 Subjective Evaluation
5 Conclusion
Bibliography
Advisor Tzong-Han Tsai (蔡宗翰)   Approval date 2023-02-02
