References
Ankerst, M., Breunig, M. M., Kriegel, H.-P., & Sander, J. (1999). OPTICS: Ordering Points to Identify the Clustering Structure. Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, 49–60. https://doi.org/10.1145/304182.304187
Arora, P., Deepali, & Varshney, S. (2016). Analysis of K-Means and K-Medoids Algorithm For Big Data. Procedia Computer Science, 78, 507–512. https://doi.org/10.1016/j.procs.2016.02.095
Arthur, D., & Vassilvitskii, S. (2006). k-means++: The Advantages of Careful Seeding. Technical Report, Stanford University.
Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer Normalization. ArXiv:1607.06450 [Cs, Stat]. http://arxiv.org/abs/1607.06450
Babhulgaonkar, A. R., & Bharad, S. V. (2017). Statistical Machine Translation. 2017 1st International Conference on Intelligent Systems and Information Management (ICISIM), 62–67. https://doi.org/10.1109/ICISIM.2017.8122149
Bahdanau, D., Cho, K., & Bengio, Y. (2016). Neural Machine Translation by Jointly Learning to Align and Translate. ArXiv:1409.0473 [Cs, Stat]. http://arxiv.org/abs/1409.0473
Chen, B., & Cherry, C. (2014). A Systematic Comparison of Smoothing Techniques for Sentence-Level BLEU. Proceedings of the Ninth Workshop on Statistical Machine Translation, 362–367. https://doi.org/10.3115/v1/W14-3346
Child, R., Gray, S., Radford, A., & Sutskever, I. (2019). Generating Long Sequences with Sparse Transformers. ArXiv:1904.10509 [Cs, Stat]. http://arxiv.org/abs/1904.10509
Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1724–1734. https://doi.org/10.3115/v1/D14-1179
Cordonnier, J.-B., Loukas, A., & Jaggi, M. (2020). Multi-Head Attention: Collaborate Instead of Concatenate. ArXiv:2006.16362 [Cs, Stat]. http://arxiv.org/abs/2006.16362
Dauphin, Y. N., Fan, A., Auli, M., & Grangier, D. (2017). Language Modeling with Gated Convolutional Networks. ArXiv:1612.08083 [Cs]. http://arxiv.org/abs/1612.08083
Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), 226–231.
Garg, A., & Agarwal, M. (2018). Machine Translation: A Literature Review. ArXiv:1901.01122 [Cs]. http://arxiv.org/abs/1901.01122
Gehring, J., Auli, M., Grangier, D., & Dauphin, Y. N. (2017). A Convolutional Encoder Model for Neural Machine Translation. ArXiv:1611.02344 [Cs]. http://arxiv.org/abs/1611.02344
Gehring, J., Auli, M., Grangier, D., Yarats, D., & Dauphin, Y. N. (2017). Convolutional Sequence to Sequence Learning. ArXiv:1705.03122 [Cs]. http://arxiv.org/abs/1705.03122
Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep Sparse Rectifier Neural Networks. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 315–323. http://proceedings.mlr.press/v15/glorot11a.html
Graves, A., Wayne, G., & Danihelka, I. (2014). Neural Turing Machines. ArXiv:1410.5401 [Cs]. http://arxiv.org/abs/1410.5401
Gu, J., Wang, C., & Zhao, J. (2019). Levenshtein Transformer. ArXiv:1905.11006 [Cs]. http://arxiv.org/abs/1905.11006
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778. https://doi.org/10.1109/CVPR.2016.90
He, P., Liu, X., Gao, J., & Chen, W. (2021). DeBERTa: Decoding-enhanced BERT with Disentangled Attention. ArXiv:2006.03654 [Cs]. http://arxiv.org/abs/2006.03654
Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ArXiv:1502.03167 [Cs]. http://arxiv.org/abs/1502.03167
Jin, X., & Han, J. (2010). K-Means Clustering. In C. Sammut & G. I. Webb (Eds.), Encyclopedia of Machine Learning (pp. 563–564). Springer US. https://doi.org/10.1007/978-0-387-30164-8_425
Kalchbrenner, N., & Blunsom, P. (2013). Recurrent Continuous Translation Models. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 1700–1709. https://www.aclweb.org/anthology/D13-1176
Kaufman, L., & Rousseeuw, P. J. (2009). Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons.
Kodinariya, T., & Makwana, P. (2013). Review on Determining Number of Cluster in K-Means Clustering. International Journal of Advance Research in Computer Science and Management Studies, 1, 90–95.
Lakew, S. M., Cettolo, M., & Federico, M. (2018). A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation. Proceedings of the 27th International Conference on Computational Linguistics, 641–652. https://www.aclweb.org/anthology/C18-1054
Lample, G., & Conneau, A. (2019). Cross-lingual Language Model Pretraining. ArXiv:1901.07291 [Cs]. http://arxiv.org/abs/1901.07291
Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-Based Neural Machine Translation. ArXiv:1508.04025 [Cs]. http://arxiv.org/abs/1508.04025
MacQueen, J. (1967). Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, 281–297.
Mehdad, Y., Negri, M., & Federico, M. (2012). Match without a Referee: Evaluating MT Adequacy without Reference Translations. Proceedings of the Seventh Workshop on Statistical Machine Translation, 171–180. https://www.aclweb.org/anthology/W12-3122
Meng, F., Lu, Z., Li, H., & Liu, Q. (2016). Interactive Attention for Neural Machine Translation. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 2174–2185. https://www.aclweb.org/anthology/C16-1205
Na, S., Xumin, L., & Yong, G. (2010). Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm. 2010 Third International Symposium on Intelligent Information Technology and Security Informatics, 63–67. https://doi.org/10.1109/IITSI.2010.74
Okpor, M. D. (2014). Machine Translation Approaches: Issues and Challenges. International Journal of Computer Science Issues (IJCSI), 11(5).
Papineni, K., Roukos, S., Ward, T., & Zhu, W.-J. (2002). Bleu: A Method for Automatic Evaluation of Machine Translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318. https://doi.org/10.3115/1073083.1073135
Popescu-Belis, A. (2019). Context in Neural Machine Translation: A Review of Models and Evaluations. ArXiv:1901.09115 [Cs]. http://arxiv.org/abs/1901.09115
Raganato, A., Scherrer, Y., & Tiedemann, J. (2020). Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation. ArXiv:2002.10260 [Cs]. http://arxiv.org/abs/2002.10260
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning Representations by Back-Propagating Errors. Nature, 323(6088), 533–536. https://doi.org/10.1038/323533a0
Rush, A. (2018). The Annotated Transformer. Proceedings of Workshop for NLP Open Source Software (NLP-OSS), 52–60. https://doi.org/10.18653/v1/W18-2509
Singh, S. P., Kumar, A., Darbari, H., Singh, L., Rastogi, A., & Jain, S. (2017). Machine Translation Using Deep Learning: An overview. 2017 International Conference on Computer, Communications and Electronics (Comptelix), 162–167. https://doi.org/10.1109/COMPTELIX.2017.8003957
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. ArXiv:1409.3215 [Cs]. http://arxiv.org/abs/1409.3215
Tan, Z., Wang, S., Yang, Z., Chen, G., Huang, X., Sun, M., & Liu, Y. (2020). Neural Machine Translation: A Review of Methods, Resources, and Tools. ArXiv:2012.15515 [Cs]. http://arxiv.org/abs/2012.15515
Tay, Y., Dehghani, M., Bahri, D., & Metzler, D. (2020). Efficient Transformers: A Survey. ArXiv:2009.06732 [Cs]. http://arxiv.org/abs/2009.06732
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. ArXiv:1706.03762 [Cs]. http://arxiv.org/abs/1706.03762
Voita, E., Talbot, D., Moiseev, F., Sennrich, R., & Titov, I. (2019). Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned. ArXiv:1905.09418 [Cs]. http://arxiv.org/abs/1905.09418
Wang, B., Wang, A., Chen, F., Wang, Y., & Kuo, C.-C. J. (2019). Evaluating Word Embedding Models: Methods and Experimental Results. APSIPA Transactions on Signal and Information Processing, 8. https://doi.org/10.1017/ATSIP.2019.12
Wang, Z., Ma, Y., Liu, Z., & Tang, J. (2019). R-Transformer: Recurrent Neural Network Enhanced Transformer. ArXiv:1907.05572 [Cs, Eess]. http://arxiv.org/abs/1907.05572
Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., Klingner, J., Shah, A., Johnson, M., Liu, X., Kaiser, Ł., Gouws, S., Kato, Y., Kudo, T., Kazawa, H., … Dean, J. (2016). Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. https://arxiv.org/abs/1609.08144v2
Xin, M., & Wang, Y. (2019). Research on Image Classification Model Based on Deep Convolution Neural Network. EURASIP Journal on Image and Video Processing, 2019(1), 40. https://doi.org/10.1186/s13640-019-0417-8
Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., & Hovy, E. (2016). Hierarchical Attention Networks for Document Classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1480–1489. https://doi.org/10.18653/v1/N16-1174
Zaheer, M., Guruganesh, G., Dubey, A., Ainslie, J., Alberti, C., Ontanon, S., Pham, P., Ravula, A., Wang, Q., Yang, L., & Ahmed, A. (2021). Big Bird: Transformers for Longer Sequences. ArXiv:2007.14062 [Cs, Stat]. http://arxiv.org/abs/2007.14062
林佳蒼 (2020). 多向注意力機制於翻譯任務改進之研究 [A study on improving the multi-head attention mechanism for translation tasks]. Master's thesis, Institute of Information Management, National Central University, Taoyuan, Taiwan.