Since word embeddings have become a standard technique that works excellently in various natural language processing (NLP) tasks, there has been much research on improving their quality. We argue that word embeddings are learned mainly from contextual information and ignore the relationships among words compiled by humans (e.g., synonyms, antonyms, and knowledge graphs). We speculate that including such human-compiled information may improve the quality of the word embeddings. Unlike previous work, we propose a listwise method that considers the relations between a word and its synonyms and antonyms.
Experimental results show that word embeddings trained on a small corpus and adjusted by our approach yield results comparable to, and sometimes better than, word embeddings trained on a large corpus. We also find that synonyms are more useful than antonyms for word-similarity tasks, and that more synonym and antonym information is not always better: both the quantity and the quality of synonyms and antonyms affect the performance of our method. Finally, we show that models utilizing global information outperform those utilizing local information in most cases.
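The abstract does not spell out the listwise objective, so the following is only a rough sketch of what an adjustment of this kind could look like, written in Python with NumPy. Every name and hyperparameter here (listwise_adjust, alpha, beta, gamma, lr, epochs) is an illustrative assumption rather than the objective actually used in this thesis: each word vector is pulled toward the centroid of its synonym list, pushed away from the centroid of its antonym list, and regularized toward its pretrained vector.

```python
import numpy as np

def listwise_adjust(embeddings, synonyms, antonyms,
                    alpha=1.0, beta=1.0, gamma=0.5, lr=0.1, epochs=10):
    """Illustrative listwise adjustment of pretrained word vectors.

    embeddings: dict mapping word -> np.ndarray (pretrained vector)
    synonyms / antonyms: dict mapping word -> list of related words
    Hypothetical sketch; weights alpha/beta/gamma are assumed, not taken
    from the thesis.
    """
    original = {w: v.copy() for w, v in embeddings.items()}
    adjusted = {w: v.copy() for w, v in embeddings.items()}
    for _ in range(epochs):
        for w, v in adjusted.items():
            # Stay close to the pretrained vector.
            grad = alpha * (v - original[w])
            # Pull toward the centroid of the synonym list.
            syns = [adjusted[s] for s in synonyms.get(w, []) if s in adjusted]
            if syns:
                grad += beta * (v - np.mean(syns, axis=0))
            # Push away from the centroid of the antonym list.
            ants = [adjusted[a] for a in antonyms.get(w, []) if a in adjusted]
            if ants:
                grad -= gamma * (v - np.mean(ants, axis=0))
            adjusted[w] = v - lr * grad
    return adjusted
```

As a usage illustration under the same assumptions, calling listwise_adjust({'good': v1, 'great': v2, 'bad': v3}, {'good': ['great']}, {'good': ['bad']}) would move the vector for 'good' toward 'great' and away from 'bad' while keeping it near its pretrained position.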