References
Agibetov, A., Blagec, K., Xu, H., Samwald, M., 2018. Fast and scalable neural embedding models for biomedical sentence classification. BMC Bioinformatics 19, 541. https://doi.org/10.1186/s12859-018-2496-4
Sun, A., Lim, E.-P., 2001. Hierarchical text classification and evaluation, in: Proceedings 2001 IEEE International Conference on Data Mining. pp. 521–528. https://doi.org/10.1109/ICDM.2001.989560
Arora, S., Liang, Y., Ma, T., 2019. A simple but tough-to-beat baseline for sentence embeddings. Presented at the 5th International Conference on Learning Representations, ICLR 2017.
Badimala, P., Mishra, C., Modam Venkataramana, R.K., Bukhari, S., Dengel, A., 2019. A Study of Various Text Augmentation Techniques for Relation Classification in Free Text. pp. 360–367. https://doi.org/10.5220/0007311003600367
Bojanowski, P., Grave, E., Joulin, A., Mikolov, T., 2017. Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics 5, 135–146. https://doi.org/10.1162/tacl_a_00051
Bouthillier, X., Konda, K., Vincent, P., Memisevic, R., 2016. Dropout as data augmentation. arXiv:1506.08700 [cs, stat].
Bowman, S.R., Angeli, G., Potts, C., Manning, C.D., 2015. A large annotated corpus for learning natural language inference, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2015, Association for Computational Linguistics, Lisbon, Portugal, pp. 632–642. https://doi.org/10.18653/v1/D15-1075
Cambria, E., Poria, S., Gelbukh, A., Thelwall, M., 2017. Sentiment Analysis Is a Big Suitcase. IEEE Intell. Syst. 32, 74–80. https://doi.org/10.1109/MIS.2017.4531228
Cer, D., Yang, Y., Kong, S., Hua, N., Limtiaco, N., St. John, R., Constant, N., Guajardo-Cespedes, M., Yuan, S., Tar, C., Strope, B., Kurzweil, R., 2018a. Universal Sentence Encoder for English, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Brussels, Belgium, pp. 169–174. https://doi.org/10.18653/v1/D18-2029
Cer, D., Yang, Y., Kong, S., Hua, N., Limtiaco, N., St. John, R., Constant, N., Guajardo-Céspedes, M., Yuan, S., Tar, C., Sung, Y., Strope, B., Kurzweil, R., 2018b. Universal Sentence Encoder. arXiv:1803.11175 [cs].
Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A., 2014. Return of the Devil in the Details: Delving Deep into Convolutional Nets. arXiv:1405.3531 [cs].
Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078 [cs, stat].
Conneau, A., Kiela, D., Schwenk, H., Barrault, L., Bordes, A., 2017a. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2017, Association for Computational Linguistics, Copenhagen, Denmark, pp. 670–680. https://doi.org/10.18653/v1/D17-1070
Das, A., Yenala, H., Chinnakotla, M., Shrivastava, M., 2016. Together we stand: Siamese Networks for Similar Question Retrieval, in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Presented at the ACL 2016, Association for Computational Linguistics, Berlin, Germany, pp. 378–387. https://doi.org/10.18653/v1/P16-1036
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K., 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs].
Edunov, S., Ott, M., Auli, M., Grangier, D., 2018. Understanding Back-Translation at Scale, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, pp. 489–500. https://doi.org/10.18653/v1/D18-1045
Fedus, W., Goodfellow, I., Dai, A., 2018. MaskGAN: Better Text Generation via Filling in the ____.
Fernández Anta, A., Núñez Chiroque, L., Morere, P., Santos Méndez, A., 2013. Sentiment analysis and topic detection of Spanish tweets: a comparative study of NLP techniques.
Garay-Maestre, U., Gallego, A.-J., Calvo-Zaragoza, J., 2019. Data Augmentation via Variational Auto-Encoders, in: Vera-Rodriguez, R., Fierrez, J., Morales, A. (Eds.), Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, Lecture Notes in Computer Science. Springer International Publishing, Cham, pp. 29–37. https://doi.org/10.1007/978-3-030-13469-3_4
Goodfellow, I., Shlens, J., Szegedy, C., 2015. Explaining and Harnessing Adversarial Examples, in: International Conference on Learning Representations.
Han, D., Liu, Q., Fan, W., 2018. A new image classification method using CNN transfer learning and web data augmentation. Expert Systems with Applications 95, 43–56. https://doi.org/10.1016/j.eswa.2017.11.028
Han, S., Gao, J., Ciravegna, F., 2019. Data Augmentation for Rumor Detection Using Context-Sensitive Neural Language Model With Large-Scale Credibility Corpus.
He, Z., Xie, L., Chen, X., Zhang, Y., Wang, Y., Tian, Q., 2019. Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data. arXiv:1909.09148 [cs, stat].
Hou, Y., Liu, Y., Che, W., Liu, T., 2018. Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding, in: Proceedings of the 27th International Conference on Computational Linguistics. Presented at the COLING 2018, Association for Computational Linguistics, Santa Fe, New Mexico, USA, pp. 1234–1245.
Howard, A.G., 2013. Some Improvements on Deep Convolutional Neural Network Based Image Classification. arXiv:1312.5402 [cs].
Howard, J., Ruder, S., 2018. Universal Language Model Fine-tuning for Text Classification, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Presented at the ACL 2018, Association for Computational Linguistics, Melbourne, Australia, pp. 328–339. https://doi.org/10.18653/v1/P18-1031
Joachims, T., 1998. Text categorization with Support Vector Machines: Learning with many relevant features, in: Nédellec, C., Rouveirol, C. (Eds.), Machine Learning: ECML-98, Lecture Notes in Computer Science. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 137–142. https://doi.org/10.1007/BFb0026683
Johnstone, I.M., Titterington, D.M., 2009. Statistical challenges of high-dimensional data. Proc. R. Soc. A 367, 4237–4253. https://doi.org/10.1098/rsta.2009.0159
Kafle, K., Yousefhussien, M.A., Kanan, C., 2017. Data Augmentation for Visual Question Answering, in: INLG. https://doi.org/10.18653/v1/w17-3529
Kim, Y., 2014. Convolutional Neural Networks for Sentence Classification. arXiv:1408.5882 [cs].
Kingma, D.P., Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Kiros, J., Chan, W., 2018. InferLite: Simple Universal Sentence Representations from Natural Language Inference Data, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2018, Association for Computational Linguistics, Brussels, Belgium, pp. 4868–4874. https://doi.org/10.18653/v1/D18-1524
Kiros, R., Zhu, Y., Salakhutdinov, R., Zemel, R.S., Torralba, A., Urtasun, R., Fidler, S., 2015. Skip-thought Vectors, in: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, NIPS’15. MIT Press, Cambridge, MA, USA, pp. 3294–3302.
Kobayashi, S., 2018. Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations. arXiv:1805.06201 [cs].
Konno, T., Iwazume, M., 2018. Icing on the Cake: An Easy and Quick Post-Learnig Method You Can Try After Deep Learning. arXiv:1807.06540 [cs, stat].
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems. pp. 1097–1105.
Le, Q., Mikolov, T., 2014. Distributed Representations of Sentences and Documents, in: Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML’14. JMLR.org, pp. II-1188–II-1196.
Li, J., Jia, R., He, H., Liang, P., 2018. Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Presented at the NAACL-HLT 2018, Association for Computational Linguistics, New Orleans, Louisiana, pp. 1865–1874. https://doi.org/10.18653/v1/N18-1169
Liu, P., Qiu, X., Huang, X., 2016. Recurrent neural network for text classification with multi-task learning, in: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI’16. AAAI Press, New York, New York, USA, pp. 2873–2879.
Liu, P.J., Saleh, M.A., Pot, E., Goodrich, B., Sepassi, R., Kaiser, L., Shazeer, N., 2018. Generating Wikipedia by Summarizing Long Sequences.
Lu, S., Zhu, Y., Zhang, W., Wang, J., Yu, Y., 2018. Neural Text Generation: Past, Present and Beyond. arXiv:1803.07133 [cs].
Malandrakis, N., Shen, M., Goyal, A., Gao, S., Sethi, A., Metallinou, A., 2019. Controlled Text Generation for Data Augmentation in Intelligent Artificial Agents. arXiv:1910.03487 [cs, stat].
Manning, C.D., Schütze, H., 1999. Foundations of statistical natural language processing. MIT Press.
Mariani, G., Scheidegger, F., Istrate, R., Bekas, C., Malossi, C., 2018. BAGAN: Data Augmentation with Balancing GAN. arXiv:1803.09655 [cs, stat].
McCann, B., Bradbury, J., Xiong, C., Socher, R., 2017. Learned in Translation: Contextualized Word Vectors, in: NIPS.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J., 2013. Distributed Representations of Words and Phrases and Their Compositionality, in: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS’13. Curran Associates Inc., USA, pp. 3111–3119.
Miller, G.A., 1995. WordNet: a lexical database for English. Commun. ACM 38, 39–41. https://doi.org/10.1145/219717.219748
Mitra, T., Gilbert, E., 2015. Credbank: A large-scale social media corpus with associated credibility annotations, in: Ninth International AAAI Conference on Web and Social Media.
Moreno-Barea, F.J., Strazzera, F., Jerez, J.M., Urda, D., Franco, L., 2018. Forward Noise Adjustment Scheme for Data Augmentation, in: 2018 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, Bangalore, India, pp. 728–734. https://doi.org/10.1109/SSCI.2018.8628917
Nicosia, M., Moschitti, A., 2017. Learning Contextual Embeddings for Structural Semantic Similarity using Categorical Information, in: Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017). Presented at the CoNLL 2017, Association for Computational Linguistics, Vancouver, Canada, pp. 260–270. https://doi.org/10.18653/v1/K17-1027
Pagliardini, M., Gupta, P., Jaggi, M., 2018. Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Presented at the NAACL-HLT 2018, Association for Computational Linguistics, New Orleans, Louisiana, pp. 528–540. https://doi.org/10.18653/v1/N18-1049
Pang, B., Lee, L., 2005. Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales, in: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05). Presented at the ACL 2005, Association for Computational Linguistics, Ann Arbor, Michigan, pp. 115–124. https://doi.org/10.3115/1219840.1219855
Pang, B., Lee, L., 2004. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts, in: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04). Presented at the ACL 2004, Barcelona, Spain, pp. 271–278. https://doi.org/10.3115/1218955.1218990
Papadaki, M., Chalkidis, I., Michos, A., 2017. Data Augmentation Techniques for Legal Text Analytics.
Pennington, J., Socher, R., Manning, C., 2014. GloVe: Global Vectors for Word Representation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pp. 1532–1543. https://doi.org/10.3115/v1/D14-1162
Pérez, L.A., 2019. The Effect of Embeddings on SQuAD v2.0.
Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L., 2018. Deep contextualized word representations. arXiv:1802.05365 [cs].
Polson, N.G., Scott, S.L., 2011. Data augmentation for support vector machines. Bayesian Analysis 6, 1–23.
Radford, A., 2018. Improving Language Understanding by Generative Pre-Training.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 9.
Reimers, N., Gurevych, I., 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Presented at the EMNLP-IJCNLP 2019, Association for Computational Linguistics, Hong Kong, China, pp. 3980–3990. https://doi.org/10.18653/v1/D19-1410
Rong, X., 2014. word2vec parameter learning explained. arXiv preprint arXiv:1411.2738.
Schuller, B.W., 2018. Data Augmentation and Deep Learning for Hate Speech Detection.
Sennrich, R., Haddow, B., Birch, A., 2016. Improving Neural Machine Translation Models with Monolingual Data, in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Berlin, Germany, pp. 86–96. https://doi.org/10.18653/v1/P16-1009
Set (mathematics), 2020. Wikipedia.
Shen, T., Lei, T., Barzilay, R., Jaakkola, T., 2017. Style Transfer from Non-Parallel Text by Cross-Alignment. arXiv:1705.09655 [cs].
Shorten, C., Khoshgoftaar, T.M., 2019. A survey on Image Data Augmentation for Deep Learning. J Big Data 6, 60. https://doi.org/10.1186/s40537-019-0197-0
Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C.D., Ng, A., Potts, C., 2013. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, in: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2013, Association for Computational Linguistics, Seattle, Washington, USA, pp. 1631–1642.
Takahashi, N., Gygli, M., Pfister, B., Van Gool, L., 2016. Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection. arXiv:1604.07160 [cs].
Tang, D., Qin, B., Liu, T., 2015. Document Modeling with Gated Recurrent Neural Network for Sentiment Classification, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Lisbon, Portugal, pp. 1422–1432.
Turney, P.D., Pantel, P., 2010. From frequency to meaning: Vector space models of semantics. Journal of artificial intelligence research 37, 141–188.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I., 2017. Attention Is All You Need. arXiv:1706.03762 [cs].
Wang, S., Manning, C., 2012. Baselines and Bigrams: Simple, Good Sentiment and Topic Classification, in: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Presented at the ACL 2012, Association for Computational Linguistics, Jeju Island, Korea, pp. 90–94.
Wang, W.Y., Yang, D., 2015. That’s So Annoying!!!: A Lexical and Frame-Semantic Embedding Based Data Augmentation Approach to Automatic Categorization of Annoying Behaviors using #petpeeve Tweets, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Lisbon, Portugal, pp. 2557–2563. https://doi.org/10.18653/v1/D15-1306
Wei, J., Zou, K., 2019. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. arXiv:1901.11196 [cs].
Wiebe, J., Wilson, T., Cardie, C., 2005. Annotating Expressions of Opinions and Emotions in Language. Language Res Eval 39, 165–210. https://doi.org/10.1007/s10579-005-7880-9
Wong, S.C., Gatt, A., Stamatescu, V., McDonnell, M.D., 2016. Understanding Data Augmentation for Classification: When to Warp?, in: 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA). IEEE, Gold Coast, Australia, pp. 1–6. https://doi.org/10.1109/DICTA.2016.7797091
Wu, R., Yan, S., Shan, Y., Dang, Q., Sun, G., 2015. Deep image: Scaling up image recognition. arXiv preprint arXiv:1501.02876.
Xie, Q., Dai, Z., Hovy, E., Luong, M.-T., Le, Q.V., 2019. Unsupervised Data Augmentation for Consistency Training. arXiv:1904.12848 [cs, stat].
Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., Le, Q.V., 2019. XLNet: Generalized autoregressive pretraining for language understanding, in: Advances in Neural Information Processing Systems. pp. 5753–5763.
Young, T., Hazarika, D., Poria, S., Cambria, E., 2018. Recent trends in deep learning based natural language processing. IEEE Computational Intelligence Magazine 13, 55–75.
Yu, A.W., Dohan, D., Luong, M.-T., Zhao, R., Chen, K., Norouzi, M., Le, Q.V., 2018. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. arXiv:1804.09541 [cs].
Zhang, X., Zhao, J., LeCun, Y., 2015a. Character-level Convolutional Networks for Text Classification, in: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (Eds.), Advances in Neural Information Processing Systems 28. Curran Associates, Inc., pp. 649–657.
Zhang, X., Zhao, J., LeCun, Y., 2015b. Character-level convolutional networks for text classification, in: Advances in Neural Information Processing Systems. pp. 649–657.