A Comparative Study on Feature Representations and Classifiers for Chinese Address Classification: A Case Study of Company F

Please use this permanent URL to cite or link to this document: https://ir.lib.ncu.edu.tw/handle/987654321/98209

Title:	A Comparative Study on Feature Representations and Classifiers for Chinese Address Classification: A Case Study of Company F
Author:	Chen, Chih-Chung (陳志忠)
Contributor:	In-service Master's Program, Department of Information Management
Keywords:	Address Classification; Traditional Chinese; Feature Representation; SVM; BERT; Smart Warehousing; Last-Mile Delivery
Date:	2025-06-23
Upload time:	2025-10-17 12:29:44 (UTC+8)
Publisher:	National Central University
Abstract:	As e-commerce and smart logistics grow rapidly, processing address information accurately and efficiently becomes increasingly important. This is especially true in last-mile delivery, where address recognition errors often lead to high operating costs and lower customer satisfaction. A reliable and effective address classification technique has therefore become an important foundation for smart warehousing and logistics information systems. This study examines how different feature representation methods and classification models perform on Traditional Chinese address classification, and evaluates their feasibility and potential in logistics practice. Bag-of-Words, TF-IDF, Word2Vec, and BERT are used as feature representations and paired with kNN, SVM, Random Forest, LSTM, and BiLSTM classifiers in cross-combination experiments on multi-class and binary datasets. The results show that semantic features, BERT in particular, significantly improve classification accuracy and outperform traditional vectorization methods. Even in an era dominated by deep learning, traditional models such as SVM remain valuable in practice, especially where computing resources or deployment conditions are constrained. The study finds that the choice of feature representation affects classification results far more than the choice of model. Across all tested combinations, BERT paired with SVM offered the best balance of performance and resource efficiency. These observations provide a concrete reference for developing future address classification systems and can help improve the overall efficiency of logistics systems that depend on address information processing.
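
The pairing of a feature representation with a classifier described in the abstract can be sketched with the simplest combination it names, Bag-of-Words with kNN. This is a minimal illustration and not the thesis's implementation: the character-bigram tokenization, the cosine similarity measure, the value of `k`, the function names, and the toy addresses and labels in `TRAIN` are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Split an address into overlapping character n-grams (assumed tokenization)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def bow_vector(text, n=2):
    """Bag-of-Words: map each character n-gram to its count in the address."""
    return Counter(char_ngrams(text, n))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, query, k=3):
    """Label a query address by majority vote among its k most similar training addresses."""
    q = bow_vector(query)
    ranked = sorted(train, key=lambda item: cosine(q, bow_vector(item[0])), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (address, delivery-region label). Not from the thesis.
TRAIN = [
    ("台北市信義區市府路1號", "Taipei"),
    ("台北市大安區和平東路二段106號", "Taipei"),
    ("台北市中正區重慶南路一段122號", "Taipei"),
    ("桃園市中壢區中大路300號", "Taoyuan"),
    ("桃園市桃園區縣府路1號", "Taoyuan"),
    ("桃園市中壢區環中東路100號", "Taoyuan"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "桃園市中壢區中央西路50號"))
```

Replacing `bow_vector` with a TF-IDF, Word2Vec, or BERT encoder while keeping the classifier fixed is the kind of cross-combination comparison the abstract reports.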
Appears in Collections:	[In-service Master's Program, Department of Information Management] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	60

All items in NCUIR are protected by original copyright.
