Master's/Doctoral Thesis 108522104: Detailed Record




Name: Yi-Shiang Lai (賴議翔)    Department: Computer Science and Information Engineering
Thesis Title: An Audio Call Classification System Based on Fine-Tuned BERT
Related Theses
★ Dynamic Overlay Construction for Mobile Target Detection in Wireless Sensor Networks
★ A Simple Detour Strategy for Vehicle Navigation
★ Improving Localization Using Transmitter Voltage
★ Constructing a Virtual Backbone in Vehicular Networks Using Vehicle Classification
★ Why Topology-based Broadcast Algorithms Do Not Work Well in Heterogeneous Wireless Networks?
★ Efficient Wireless Sensor Networks for Mobile Targets
★ An Articulation-Point-Based Distributed Topology Control Method for Wireless Ad Hoc Networks
★ A Review of Existing Web Frameworks
★ A Distributed Algorithm for Partitioning Sensor Networks into Greedy Blocks
★ Range-free Distance Measurement in Wireless Networks
★ Inferring Floor Plan from Trajectories
★ An Indoor Collaborative Pedestrian Dead Reckoning System
★ Dynamic Content Adjustment in Mobile Ad Hoc Networks
★ An Image-Based Localization System
★ A Distributed Data Compression and Collection Algorithm for Large-Scale Wireless Sensor Networks
★ Collision Analysis in Vehicular WiFi Networks
  1. The author has agreed to make this electronic thesis available immediately.
  2. Electronic full texts that have reached their release date are licensed to users only for personal, non-profit retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the content without authorization.

Abstract (Chinese) A telemarketing company relies heavily on its telemarketers making a large number of calls to promote the company's products. To prioritize prospective customers who are more willing to buy and to review telemarketer performance, a mechanism that can objectively determine which promotion stage a sales call currently belongs to is very important to the company. In this thesis, we design an audio call classification system based on fine-tuned BERT that automatically classifies each telemarketer's call into the appropriate stage. The proposed system consists of five components: data collection, data pre-processing, pre-trained model fine-tuning, call-level classification, and a web service. In data collection, audio calls are converted into the corresponding transcripts by Kaldi speech recognition. In data pre-processing, the transcripts go through stopword removal, segmentation, and manual labeling. In pre-trained model fine-tuning, four BERT-based pre-trained models are adapted via transfer learning to obtain segment-level classification models. In call-level classification, a rule-based method is applied to a call's segments to obtain the classification result (stage) of the whole call. Finally, we provide a web service so that the company can use our system easily. Extensive experiments show that the proposed system achieves a 97% Macro F1 score for call-level classification, 13% higher than TextCNN.
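To make the pre-processing step described above concrete, the following is a minimal sketch of stopword removal and transcript segmentation. This record does not list the thesis's stopword set, segment length, or code, so the example stopwords, the 128-token segment size, and the helper names below are illustrative assumptions rather than the thesis implementation.

```python
# Minimal pre-processing sketch: stopword removal and segmentation of a
# Kaldi transcript. Stopword set, segment size, and names are assumptions.

def remove_stopwords(transcript: str, stopwords: set) -> str:
    """Drop stopword tokens from a whitespace-tokenized transcript."""
    return " ".join(tok for tok in transcript.split() if tok not in stopwords)

def split_into_segments(transcript: str, max_tokens: int = 128) -> list:
    """Cut a long transcript into fixed-size segments that fit a BERT input."""
    tokens = transcript.split()
    return [" ".join(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]

# Example: one call transcript becomes several segments to be labeled manually.
stopwords = {"的", "了", "嗯", "喔"}                      # assumed stopword list
cleaned = remove_stopwords("嗯 您好 這裡 是 ... 的 產品", stopwords)
segments = split_into_segments(cleaned)
```

Each resulting segment, rather than the whole call, is what the fine-tuned models classify; the call-level stage is derived afterwards from the segment-level results.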
Abstract (English) A telemarketing company relies heavily on its telemarketers to make numerous calls to customers in order to promote the company's products. To prioritize the potential customers and evaluate the performance of telemarketers, an objective mechanism to identify which stage of promotion a call belongs to is crucial to a telemarketing company. In this thesis, we design an audio call classification system based on fine-tuned BERT [1] to automatically classify each telemarketer's call into an appropriate stage. The five components of the proposed system are data collection, data pre-processing, pre-trained model fine-tuning, call-level classification, and the web service. In data collection, the audio calls are converted into the corresponding transcripts via Kaldi speech recognition. In data pre-processing, transcripts are processed to remove stopwords, split into segments, and assign labels manually. In pre-trained model fine-tuning, four BERT-based models are retrained to obtain segment-level classification models. In call-level classification, a rule-based method is performed to obtain the call-level classification (i.e., stage) of a call from the classification results of the corresponding segments of the call. Finally, a web service is provided to allow the company to access the system easily. The extensive experiments show that the proposed system reaches a 97% Macro F1 score for the call-level classification.
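As a rough illustration of the pre-trained model fine-tuning component, the sketch below fine-tunes a single BERT-based checkpoint for segment-level classification using the Hugging Face transformers library on top of PyTorch [24]. The thesis evaluates four BERT-based pre-trained models; the checkpoint name, the stage labels, the toy training data, and the hyperparameters below are placeholders chosen for illustration, not the thesis configuration.

```python
# Hedged sketch of segment-level fine-tuning with Hugging Face transformers.
# Checkpoint, stage labels, training data, and hyperparameters are placeholders.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

STAGES = ["opening", "product_introduction", "closing"]   # assumed promotion stages

class SegmentDataset(Dataset):
    """Wraps labeled transcript segments for the Trainer."""
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             max_length=128, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

# One possible Chinese BERT checkpoint (whole word masking, see [3]).
checkpoint = "hfl/chinese-bert-wwm"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=len(STAGES))

train_texts = ["您好 這裡 是 ...", "這個 方案 優惠 ..."]   # placeholder labeled segments
train_labels = [0, 1]
train_ds = SegmentDataset(train_texts, train_labels, tokenizer)

args = TrainingArguments(output_dir="segment-classifier", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```

In the thesis, fine-tuning of this kind is applied to each of the four BERT-based pre-trained models, and the resulting segment-level predictions feed the rule-based call-level classification.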
Keywords
★ BERT
★ Transfer learning (遷移學習)
★ Call classification (通話分類)
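The abstract above states only that a rule-based method derives the call-level stage from the classification results of a call's segments, and the table of contents mentions a misclassification threshold (Section 5.4.2). The exact rule is not given in this record, so the following is just one plausible aggregation, a confidence-thresholded majority vote, with every name and constant being an assumption.

```python
# One plausible rule-based aggregation from segment-level predictions to a
# call-level stage; the actual thesis rule is not described in this record.
from collections import Counter

def classify_call(segment_stages, segment_confidences, threshold=0.5):
    """Return a call-level stage from per-segment (stage, confidence) results.

    Segments whose confidence is below `threshold` are treated as likely
    misclassifications and ignored; the most frequent remaining stage wins.
    """
    kept = [s for s, c in zip(segment_stages, segment_confidences) if c >= threshold]
    if not kept:                 # every segment filtered out: fall back to all of them
        kept = segment_stages
    return Counter(kept).most_common(1)[0][0]

# Example: a call whose five segments were classified independently.
stage = classify_call(
    ["opening", "product_introduction", "product_introduction",
     "closing", "product_introduction"],
    [0.9, 0.8, 0.95, 0.4, 0.85])
print(stage)   # -> "product_introduction"
```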
Table of Contents
1 Introduction 1
2 Related Work 3
2.1 Machine learning approaches 3
2.2 Deep learning approaches 3
3 Preliminary 5
3.1 Kaldi 5
3.2 Bidirectional Encoder Representations from Transformers (BERT) 6
3.2.1 Background of BERT 6
3.2.2 Transformer Encoder 6
3.2.3 Pre-training tasks 7
3.2.4 Improved BERT 8
3.3 Flask 9
4 Design 10
4.1 Data Collection 11
4.2 Data pre-processing 12
4.3 Pre-trained Model Fine-Tuning 14
4.4 Call-level Classification 16
4.5 Web Service 17
5 Performance 18
5.1 Experimental Environment 18
5.2 Dataset Description 18
5.3 Evaluation Metrics 19
5.4 Experiment Results and Analysis 22
5.4.1 Segment-level Evaluation Result 22
5.4.2 Misclassification Threshold Tuning 23
5.4.3 Call-level Evaluation Result 24
6 Conclusions 25
Reference 27
References
[1] BERT. https://github.com/google-research/bert.
[2] Cocolong. https://www.cocolong.com.tw/zh-tw.
[3] Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, and Guoping Hu. Pre-training with whole word masking for Chinese BERT, 2019.
[4] Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, and Guoping Hu. Revisiting pre-trained models for Chinese natural language processing. Findings of the Association for Computational Linguistics: EMNLP 2020, 2020.
[5] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
[6] Y. Zhao, Y. Qian, and C. Li. Improved KNN text classification algorithm with MapReduce implementation. In 2017 4th International Conference on Systems and Informatics (ICSAI), pages 1417–1422, 2017.
[7] S. Wei, J. Guo, Z. Yu, P. Chen, and Y. Xian. The instructional design of Chinese text classification based on SVM. In 2013 25th Chinese Control and Decision Conference (CCDC), pages 5114–5117, 2013.
[8] Q. Jiang, W. Wang, X. Han, S. Zhang, X. Wang, and C. Wang. Deep feature weighting in naive Bayes for Chinese text classification. In 2016 4th International Conference on Cloud Computing and Intelligence Systems (CCIS), pages 160–164, 2016.
[9] Yoon Kim. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746–1751, Doha, Qatar, October 2014. Association for Computational Linguistics.
[10] Junmei Zhong and William Li. Predicting customer call intent by analyzing phone
call transcripts based on CNN for multi-class classification. CoRR, abs/1907.03715,
2019.
[11] Changshun Du and Lei Huang. Text classification research with attention-based recurrent neural networks. International Journal of Computers Communications & Control, 13:50, February 2018.
[12] Xuewei Li and Hongyun Ning. Chinese text classification based on hybrid model of CNN and LSTM. In Proceedings of the 3rd International Conference on Data Science and Information Technology, DSIT 2020, pages 129–134, New York, NY, USA, 2020. Association for Computing Machinery.
[13] Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer, and Karel Vesely. The Kaldi speech recognition toolkit. In IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, 2011.
[14] Apache License, Version 2.0. https://www.apache.org/licenses/LICENSE-2.0.
[15] OpenFst library. http://www.openfst.org/twiki/bin/view/FST/WebHome.
[16] Basic Linear Algebra Subprograms (BLAS). http://www.netlib.org/blas/.
[17] Linear Algebra PACKage (LAPACK). http://www.netlib.org/lapack/.
[18] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. CoRR, abs/1706.03762, 2017.
[19] Guillaume Lample and Alexis Conneau. Cross-lingual language model pretraining,
2019.
[20] Flask. https://github.com/pallets/flask.
[21] Werkzeug. https://github.com/pallets/werkzeug.
[22] Jinja. https://github.com/pallets/jinja.
[23] Stopwords. https://www.overcoded.net/stop-words-lists-removal-195521/.
[24] PyTorch. https://pytorch.org/.
[25] Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. How to fine-tune BERT for
text classification? CoRR, abs/1905.05583, 2019.
Advisor: Min-Te Sun (孫敏德)    Date of Approval: 2021-7-27
