References
[1] Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).
[2] Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
[3] Liu, Yinhan, et al. "RoBERTa: A robustly optimized BERT pretraining approach." arXiv preprint arXiv:1907.11692 (2019).
[4] Karpukhin, Vladimir, et al. "Dense passage retrieval for open-domain question answering." arXiv preprint arXiv:2004.04906 (2020).
[5] Lewis, Patrick, et al. "PAQ: 65 million probably-asked questions and what you can do with them." Transactions of the Association for Computational Linguistics 9 (2021): 1098-1115.
[6] Xue, Linting, et al. "mT5: A massively multilingual pre-trained text-to-text transformer." arXiv preprint arXiv:2010.11934 (2020).
[7] Oğuz, Barlas, et al. "Domain-matched pre-training tasks for dense retrieval." arXiv preprint arXiv:2107.13602 (2021).
[8] Martineau, Justin, and Tim Finin. "Delta TFIDF: An improved feature space for sentiment analysis." Proceedings of the International AAAI Conference on Web and Social Media. Vol. 3. No. 1. 2009.
[9] Practical BM25 - Part 2: The BM25 Algorithm and its Variables, https://www.elastic.co/cn/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables
[10] Lewis, Mike, et al. "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension." arXiv preprint arXiv:1910.13461 (2019).
[11] Radford, Alec, et al. "Improving language understanding by generative pre-training." (2018).
[12] Radford, Alec, et al. "Language models are unsupervised multitask learners." OpenAI blog 1.8 (2019): 9.
[13] Brown, Tom, et al. "Language models are few-shot learners." Advances in neural information processing systems 33 (2020): 1877-1901.
[14] Cui, Yiming, et al. "Revisiting pre-trained models for Chinese natural language processing." arXiv preprint arXiv:2004.13922 (2020).
[15] Cui, Yiming, Ziqing Yang, and Ting Liu. "PERT: pre-training BERT with permuted language model." arXiv preprint arXiv:2203.06906 (2022).
[16] Cui, Yiming, et al. "LERT: A Linguistically-motivated Pre-trained Language Model." arXiv preprint arXiv:2211.05344 (2022).
[17] Guo, Zhenliang, et al. "CNA: A Dataset for Parsing Discourse Structure on Chinese News Articles." 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2022.
[18] News2016zh: https://opendatalab.com/News2016zh
[19] Zaremba, Wojciech, Ilya Sutskever, and Oriol Vinyals. "Recurrent neural network regularization." arXiv preprint arXiv:1409.2329 (2014).
[20] Shi, Xingjian, et al. "Convolutional LSTM network: A machine learning approach for precipitation nowcasting." Advances in neural information processing systems 28 (2015).
[21] Cui, Yiming, et al. "A span-extraction dataset for Chinese machine reading comprehension." arXiv preprint arXiv:1810.07366 (2018).
[22] Shao, Chih Chieh, et al. "DRCD: A Chinese machine reading comprehension dataset." arXiv preprint arXiv:1806.00920 (2018).
[23] Li, Peng, et al. "Dataset and neural recurrent sequence labeling model for open-domain factoid question answering." arXiv preprint arXiv:1607.06275 (2016).
[24] CAIL2019: https://github.com/china-ai-law-challenge/CAIL2019
[25] Chinese Squad: https://github.com/junzeng-pluto/ChineseSquad
[26] Rajpurkar, Pranav, et al. "SQuAD: 100,000+ questions for machine comprehension of text." arXiv preprint arXiv:1606.05250 (2016).
[27] ChatGPT: https://openai.com/blog/chatgpt
[28] Che, Wanxiang, et al. "N-LTP: An open-source neural language technology platform for Chinese." arXiv preprint arXiv:2009.11616 (2020).
[29] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Communications of the ACM 60.6 (2017): 84-90.
[30] Clark, Kevin, et al. "ELECTRA: Pre-training text encoders as discriminators rather than generators." arXiv preprint arXiv:2003.10555 (2020).
[31] Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013).
[32] Pennington, Jeffrey, Richard Socher, and Christopher D. Manning. "GloVe: Global vectors for word representation." Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014): 1532-1543.
[33] Peters, Matthew E., et al. "Deep contextualized word representations." Proceedings of NAACL-HLT. 2018.
[34] Yang, Zhilin, et al. "XLNet: Generalized autoregressive pretraining for language understanding." Advances in neural information processing systems 32 (2019).
[35] Dong, Li, et al. "Unified language model pre-training for natural language understanding and generation." Advances in neural information processing systems 32 (2019).