Hybrid Architecture for Intent Detection through Integration of Data Augmentation in Text and Feature Space

Name	Ming-Hao Chiang (蔣銘皓)	Department	Department of Information Management
Thesis title	Hybrid Architecture for Intent Detection through Integration of Data Augmentation in Text and Feature Space
File	Full text available for browsing in the system after 2026-8-1
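Abstract (Chinese)	Insufficient training data is one of the biggest challenges in Natural Language Processing (NLP) tasks. Intent detection is a classic NLP task that spans multiple domains and is an important component of natural language understanding in dialogue systems, yet it also frequently suffers from data scarcity. Previous studies have increased the amount of training data with text space data augmentation methods such as back translation and Easy Data Augmentation (EDA), or with feature space methods such as CVAE, extrapolation, and Gaussian noise, addressing data scarcity from different angles. Kumar et al. (2021), however, note in their future work that their proposed data augmentation technique could be combined with latent space data augmentation, and that unifying data augmentation methods from different perspectives could yield stronger augmentation methods and open a new line of thinking. We therefore propose a hybrid architecture that includes both text space and feature space data augmentation. The hybrid structure aims to strengthen the model's generalization ability so that it can be applied broadly across domains, and to improve the efficiency of data augmentation. In our experiments we also examine the generation quality of the augmented data and the effect of data augmentation on intent label classification performance. The experimental results show that, compared with a setup that applies only text space data augmentation, integrating data augmentation methods from different perspectives through the proposed hybrid architecture yields consistent and stable improvements on all datasets. The results thus support the generalization ability and effectiveness of the proposed architecture.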
Abstract (English)	One of the biggest challenges in Natural Language Processing (NLP) tasks is the scarcity of training data. Intent detection is a classic NLP task that spans multiple domains, yet it also suffers from data scarcity. Previous studies have tackled this issue with text space methods, such as back translation and Easy Data Augmentation (EDA) (Wei and Zou, 2019), and with feature space approaches, including CVAE and extrapolation. Kumar et al. (2021) mention in their future work that their proposed data augmentation technique could be combined with latent space data augmentation, in the hope that unifying different data augmentation (DA) methods would inspire new approaches to universal data augmentation. Hence, we propose a hybrid architecture that performs data augmentation in both text space and feature (latent) space. The purpose of this hybrid structure is to enhance model generalization across various domains. Additionally, we examine the quality of the data generated through augmentation and how it affects the classification performance of intent labels. Compared with the baseline setup that implements only text space data augmentation, experimental results demonstrate consistent improvement across all three datasets when applying the proposed hybrid architecture, which integrates text space and feature space data augmentation. Our approach shows strong generalization capability and effectiveness.
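As a rough illustration of what feature space data augmentation can look like, the sketch below perturbs sentence feature vectors with Gaussian noise and with extrapolation away from a same-class neighbor, in the spirit of DeVries and Taylor (2017). It is not the thesis's implementation: the record does not state which feature space operators, hyperparameters, or encoder outputs the hybrid architecture uses, so the function names, the `sigma` and `lam` values, the toy two-dimensional vectors, and the intent labels are all assumptions made for illustration. In practice the vectors would more plausibly come from an encoder such as a fine-tuned BERT model, and the augmented pairs would feed a downstream classifier.

```python
import random
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def add_gaussian_noise(vec: Sequence[float], sigma: float, rng: random.Random) -> Vector:
    """Perturb a feature vector with zero-mean Gaussian noise (assumed operator)."""
    return [x + rng.gauss(0.0, sigma) for x in vec]


def extrapolate(vec: Sequence[float], neighbor: Sequence[float], lam: float) -> Vector:
    """Push vec away from a same-class neighbor: v' = v + lam * (v - neighbor)."""
    return [x + lam * (x - n) for x, n in zip(vec, neighbor)]


def augment_features(
    features: Sequence[Sequence[float]],
    labels: Sequence[str],
    sigma: float = 0.1,   # hypothetical noise scale
    lam: float = 0.5,     # hypothetical extrapolation strength
    seed: int = 0,
) -> List[Tuple[Vector, str]]:
    """Return the original vectors plus noisy and extrapolated copies, keeping labels."""
    rng = random.Random(seed)
    by_label: Dict[str, List[Vector]] = {}
    for vec, lab in zip(features, labels):
        by_label.setdefault(lab, []).append(list(vec))

    out: List[Tuple[Vector, str]] = []
    for lab, vecs in by_label.items():
        for i, vec in enumerate(vecs):
            out.append((vec, lab))                                   # original
            out.append((add_gaussian_noise(vec, sigma, rng), lab))   # noisy copy
            if len(vecs) > 1:
                # Extrapolation needs a different example of the same class.
                j = rng.choice([k for k in range(len(vecs)) if k != i])
                out.append((extrapolate(vec, vecs[j], lam), lab))
    return out


# Toy usage with made-up 2-d "sentence features" and made-up intent labels.
feats = [[0.0, 1.0], [0.2, 0.8], [1.0, 0.0]]
labs = ["book_flight", "book_flight", "play_music"]
augmented = augment_features(feats, labs)
print(len(augmented), "feature vectors after augmentation")
```

Because augmentation happens on vectors rather than on text, the label of each synthetic example is simply inherited from the example it was derived from; classes with a single example receive only the noisy copy, since extrapolation has no same-class neighbor to work with.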
Keywords	★ natural language processing ★ transfer learning ★ text classification ★ intent detection ★ data augmentation ★ few-shot learning ★ large language models (LLMs) ★ feature space data augmentation (FDA)
Table of contents	Chinese Abstract
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
List of Appendixes
1. Introduction
1.1. Background
1.2. Motivation
1.3. Objectives
2. Related Work
2.1. Intent Detection
2.2. Few-shot Learning
2.2.1. FSL in Intent Detection
2.3. Transfer learning
2.3.1. BERT pre-trained model
2.4. Data Augmentation
2.4.1. Data Augmentation in Text Space
2.4.2. DA in Feature space
3. Method
3.1. Model Architecture
3.2. DA in Text Space
3.2.2. Fine-tuning BERT
3.2.3. DA in Feature Space
3.2.4. Multilayer Perceptron Classifier (MLP Classifier)
3.3. Experimental Setup
3.3.1. Few-shot Integration
3.3.2. Data Preprocessing
3.3.3. Parameter Configuration
3.4. Experiment Design
3.4.1. Experiment 1: Hybrid Architecture Combining Text and Feature Space
3.4.2. Datasets
3.5. Evaluation Metrics
3.5.1. Confusion Matrix
4. Experiment
4.1. Experiment: Hybrid Architecture Combining Text and Feature Space
4.2. Ablation Study
4.2.1. Text Space & Feature Space Only
4.2.2. Quality Assessment of text-space DA outputs
4.2.3. Evaluation of Feature Space Data Augmentation
5. Conclusion
5.1. Overall Summary
5.2. Contributions
5.3. Further Discussion
References
Appendix
References	Alain, G., Bengio, Y., Yao, L., Yosinski, J., Thibodeau-Laufer, E., Zhang, S., Vincent, P., 2015. GSNs: Generative Stochastic Networks.
Anaby-Tavor, A., Carmeli, B., Goldbraich, E., Kantor, A., Kour, G., Shlomov, S., Tepper, N., Zwerdling, N., 2020. Do Not Have Enough Data? Deep Learning to the Rescue! Proc. AAAI Conf. Artif. Intell. 34, 7383–7390. https://doi.org/10.1609/aaai.v34i05.6233
Bayer, M., Kaufhold, M.-A., Reuter, C., 2023. A Survey on Data Augmentation for Text Classification. ACM Comput. Surv. 55, 1–39. https://doi.org/10.1145/3544558
Bengio, Y., Mesnil, G., Dauphin, Y., Rifai, S., n.d. Better Mixing via Deep Representations.
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 16, 321–357. https://doi.org/10.1613/jair.953
Chen, H., Liu, X., Yin, D., Tang, J., 2017. A Survey on Dialogue Systems: Recent Advances and New Frontiers. ACM SIGKDD Explor. Newsl. 19, 25–35. https://doi.org/10.1145/3166054.3166058
Chen, Z., Liu, B., Hsu, M., Castellanos, M., Ghosh, R., n.d. Identifying Intention Posts in Discussion Forums.
Cheung, T.-H., Yeung, D.-Y., 2021. MODALS: Modality-Agnostic Automated Data Augmentation in the Latent Space.
Coucke, A., Saade, A., Ball, A., Bluche, T., Caulier, A., Leroy, D., Doumouro, C., Gisselbrecht, T., Caltagirone, F., Lavril, T., Primet, M., Dureau, J., 2018. Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces.
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K., 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
DeVries, T., Taylor, G.W., 2017. Dataset Augmentation in Feature Space.
Feng, S.Y., Gangal, V., Wei, J., Chandar, S., Vosoughi, S., Mitamura, T., Hovy, E., 2021. A Survey of Data Augmentation Approaches for NLP.
Genkin, A., Lewis, D.D., Madigan, D., 2007. Large-Scale Bayesian Logistic Regression for Text Categorization. Technometrics 49, 291–304. https://doi.org/10.1198/004017007000000245
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative Adversarial Networks.
Haffner, P., Tur, G., Wright, J.H., 2003. Optimizing SVMs for complex call classification, in: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’03). IEEE, Hong Kong, China, pp. I-632–I-635. https://doi.org/10.1109/ICASSP.2003.1198860
Hochreiter, S., Schmidhuber, J., 1997. Long Short-Term Memory. Neural Comput. 9, 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Kim, Y., 2014. Convolutional Neural Networks for Sentence Classification, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pp. 1746–1751. https://doi.org/10.3115/v1/D14-1181
Kingma, D.P., Ba, J., 2017. Adam: A Method for Stochastic Optimization.
Kingma, D.P., Welling, M., 2022. Auto-Encoding Variational Bayes.
Kobayashi, S., 2018. Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Association for Computational Linguistics, New Orleans, Louisiana, pp. 452–457. https://doi.org/10.18653/v1/N18-2072
Kumar, V., Choudhary, A., Cho, E., 2021. Data Augmentation using Pre-trained Transformer Models.
Kumar, V., Glaude, H., de Lichy, C., Campbell, W., 2019. A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification.
Larson, S., Mahendran, A., Peper, J.J., Clarke, C., Lee, A., Hill, P., Kummerfeld, J.K., Leach, K., Laurenzano, M.A., Tang, L., Mars, J., 2019. An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction.
Lee, K., Guu, K., He, L., Dozat, T., Chung, H.W., 2021. Neural Data Augmentation via Example Extrapolation.
Li, X., Roth, D., 2002. Learning question classifiers, in: Proceedings of the 19th International Conference on Computational Linguistics. Association for Computational Linguistics, Taipei, Taiwan, pp. 1–7. https://doi.org/10.3115/1072228.1072378
Lin, Y.-T., Papangelis, A., Kim, S., Lee, S., Hazarika, D., Namazifar, M., Jin, D., Liu, Y., Hakkani-Tur, D., 2023. Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information.
Liu, J., Li, Y., Lin, M., 2019. Review of Intent Detection Methods in the Human-Machine Dialogue System. J. Phys. Conf. Ser. 1267, 012059. https://doi.org/10.1088/1742-6596/1267/1/012059
Liu, T., Ding, X., Qian, Y., Chen, Y., 2017. Identification method of user’s travel consumption intention in chatting robot. Sci. Sin. Informationis 47, 997. https://doi.org/10.1360/N112016-00306
Louvan, S., Magnini, B., 2020. Simple is Better! Lightweight Data Augmentation for Low Resource Slot Filling and Intent Classification.
McCallum, A., Nigam, K., n.d. A Comparison of Event Models for Naive Bayes Text Classification.
Ozair, S., Bengio, Y., 2014. Deep Directed Generative Autoencoders.
Pan, S.J., Yang, Q., 2010. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 22, 1345–1359. https://doi.org/10.1109/TKDE.2009.191
Pearson, K., n.d. On the theory of contingency and its relation to association and normal correlation.
Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L., 2018. Deep Contextualized Word Representations, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics, New Orleans, Louisiana, pp. 2227–2237. https://doi.org/10.18653/v1/N18-1202
Popescu, M.-C., Balas, V.E., Perescu-Popescu, L., Mastorakis, N., 2009. Multilayer Perceptron and Neural Networks 8.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., n.d. Language Models are Unsupervised Multitask Learners.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J., n.d. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.
Ravuri, S., Stolcke, A., 2015. Recurrent neural network and LSTM models for lexical utterance classification, in: Interspeech 2015. ISCA, pp. 135–139. https://doi.org/10.21437/Interspeech.2015-42
Santoso, N., Wibowo, W., Hikmawati, H., 2019. Integration of synthetic minority oversampling technique for imbalanced class. Indones. J. Electr. Eng. Comput. Sci. 13, 102. https://doi.org/10.11591/ijeecs.v13.i1.pp102-108
Schlüter, J., Grill, T., n.d. Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks.
Sennrich, R., Haddow, B., Birch, A., 2016. Improving Neural Machine Translation Models with Monolingual Data, in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Berlin, Germany, pp. 86–96. https://doi.org/10.18653/v1/P16-1009
Tiedemann, J., n.d. OPUS – Parallel Corpora for Everyone.
Wei, J., Zou, K., 2019. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks.
Xia, C., Xiong, C., Yu, P., Socher, R., 2020. Composed Variational Natural Language Generation for Few-shot Intents, in: Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, pp. 3379–3388. https://doi.org/10.18653/v1/2020.findings-emnlp.303
Yang, Y., Malaviya, C., Fernandez, J., Swayamdipta, S., Le Bras, R., Wang, J.-P., Bhagavatula, C., Choi, Y., Downey, D., 2020. Generative Data Augmentation for Commonsense Reasoning, in: Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, pp. 1008–1025. https://doi.org/10.18653/v1/2020.findings-emnlp.90
Ye, J., Xu, N., Wang, Y., Zhou, J., Zhang, Q., Gui, T., Huang, X., 2024. LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition.
Zhang, J., Bui, T., Yoon, S., Chen, X., Liu, Z., Xia, C., Tran, Q.H., Chang, W., Yu, P., 2021. Few-Shot Intent Detection via Contrastive Pre-Training and Fine-Tuning, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp. 1906–1912. https://doi.org/10.18653/v1/2021.emnlp-main.144
Zhang, J., Hashimoto, K., Liu, W., Wu, C.-S., Wan, Y., Yu, P., Socher, R., Xiong, C., 2020. Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Online, pp. 5064–5082. https://doi.org/10.18653/v1/2020.emnlp-main.411
Zhu, Y., Kiros, R., Zemel, R., Salakhutdinov, R., Urtasun, R., Torralba, A., Fidler, S., 2015. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, in: 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, Santiago, Chile, pp. 19–27. https://doi.org/10.1109/ICCV.2015.11
Advisors	Huey-Wen Chou (周惠文), Shi-Wen Ke (柯士文)	Approval date	2024-7-26

Master's/doctoral thesis 111423009: detailed record