Machine translation has been researched for a long time. Although most sentences can be translated correctly, named entities such as personal names or locations in a sentence still leave room for improvement, especially between non-English languages. Named entity transliteration is a way to address this problem.
Transliteration is a key part of machine translation. However, in practice we often have only limited parallel data between the source language and the target language. If we take a widely used language as a pivot language, it is, in contrast, much easier to collect language pairs from the source language to the pivot language and from the pivot language to the target language. It is intuitive to extract the common pivot-language entities from these corpora to generate three-language parallel data comprising the source, pivot, and target languages. We can then perform the bilingual transliteration task using this parallel data; nevertheless, a large amount of data is wasted by this method.
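The pivot-based joining step described above can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: the function name, the tuple-list data format, and the toy entries are all assumptions made for the example. It also surfaces the "wasted" data the abstract mentions, i.e. the pairs whose pivot entity has no match on the other side.

```python
# Hedged sketch: join source->pivot and pivot->target name pairs on the
# shared pivot-language entity to build trilingual parallel data.
# All names and data here are illustrative, not from the thesis.

def build_trilingual_pairs(src_to_pivot, pivot_to_tgt):
    """src_to_pivot / pivot_to_tgt: lists of (entity_a, entity_b) tuples.

    Returns (trilingual, leftover_src, leftover_tgt): the matched
    source-pivot-target triples, plus the unmatched pairs that the
    naive trilingual approach would discard.
    """
    # Index target-side pairs by their pivot entity.
    pivot_index = {}
    for pivot, tgt in pivot_to_tgt:
        pivot_index.setdefault(pivot, []).append(tgt)

    trilingual, leftover_src = [], []
    matched_pivots = set()
    for src, pivot in src_to_pivot:
        if pivot in pivot_index:
            matched_pivots.add(pivot)
            for tgt in pivot_index[pivot]:
                trilingual.append((src, pivot, tgt))
        else:
            leftover_src.append((src, pivot))

    # Target-side pairs whose pivot entity never appeared on the source side.
    leftover_tgt = [(p, t) for p, t in pivot_to_tgt if p not in matched_pivots]
    return trilingual, leftover_src, leftover_tgt

# Toy example with English as the pivot language:
sp = [("納瓦尼", "Navalny"), ("基輔", "Kyiv")]
pt = [("Navalny", "ナワリヌイ"), ("London", "ロンドン")]
tri, rest_sp, rest_pt = build_trilingual_pairs(sp, pt)
# tri keeps only the "Navalny" triple; the other two pairs end up in the
# leftover lists, which is exactly the data the proposed model tries to reuse.
```
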
We propose a modified attention-based sequence-to-sequence model that also applies transfer learning techniques. Our model effectively utilizes the remaining data besides the parallel data to improve the performance of named entity transliteration from the source language to the target language.
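To make the attention mechanism concrete, here is a minimal NumPy sketch of one decoder attention step (dot-product attention over encoder states). The shapes, the softmax formulation, and the toy values are standard textbook assumptions; this is an illustration of the general technique, not the thesis's exact architecture.

```python
# Hedged sketch of one attention step in a seq2seq decoder: score each
# encoder state against the current decoder state, softmax the scores,
# and take the weighted sum as the context vector. NumPy only.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

def attend(decoder_state, encoder_states):
    """decoder_state: shape (d,); encoder_states: shape (T, d).

    Returns the context vector (d,) and attention weights (T,).
    """
    scores = encoder_states @ decoder_state   # (T,) dot-product scores
    weights = softmax(scores)                 # (T,) non-negative, sum to 1
    context = weights @ encoder_states        # (d,) weighted sum of states
    return context, weights

# Toy check: the encoder state most similar to the decoder state
# should receive most of the attention mass.
enc = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 0.0]])
dec = np.array([1.0, 0.0])
ctx, w = attend(dec, enc)
```

In a transliteration model, each "state" would be a hidden vector over source-name characters, letting the decoder focus on the relevant source characters when emitting each target character.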