References
[1] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. “Imagenet classification with deep convolutional neural networks”, In Advances in neural information processing systems, 2012.
[2] Karen Simonyan and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition”, In International Conference on Learning Representations, 2015.
[3] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna, “Rethinking the Inception Architecture for Computer Vision”, In IEEE conference on computer vision and pattern recognition, 2016.
[4] Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, In International Conference on Machine Learning, 2015.
[5] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep residual learning for image recognition”, In IEEE conference on computer vision and pattern recognition, 2016.
[6] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger, “Densely Connected Convolutional Networks”, In IEEE conference on computer vision and pattern recognition, 2017.
[7] Ross Girshick, Jeff Donahue, Trevor Darrell and Jitendra Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation”, In IEEE conference on computer vision and pattern recognition, 2014.
[8] J. Uijlings, K. van de Sande, T. Gevers, and A. Smeulders. “Selective search for object recognition”, International journal of computer vision, 2013.
[9] Johan A. K. Suykens and Joos Vandewalle. “Least squares support vector machine classifiers”, Neural Processing Letters, 1999.
[10] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. “Faster r-cnn: Towards real-time object detection with region proposal networks”, In Advances in neural information processing systems, 2015.
[11] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. “You only look once: Unified, real-time object detection”, In IEEE conference on computer vision and pattern recognition, 2016.
[12] Jonathan Long, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation”, In IEEE conference on computer vision and pattern recognition, 2015.
[13] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. “Mask R-CNN”, In Proceedings of the IEEE international conference on computer vision, 2017.
[14] Zhi Tian, Weilin Huang, Tong He, Pan He and Yu Qiao, “Detecting Text in Natural Image with Connectionist Text Proposal Network”, In European Conference on Computer Vision, 2016.
[15] Sepp Hochreiter and Jürgen Schmidhuber, “Long short-term memory”, In Neural Computation, 1997.
[16] Minghui Liao, Pengyuan Lyu, Minghang He, Cong Yao, Wenhao Wu and Xiang Bai, “Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes”, In European Conference on Computer Vision, 2018.
[17] Baoguang Shi, Xiang Bai, and Cong Yao. “An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition”, In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
[18] Alex Graves, Santiago Fernández, Faustino Gomez and Jürgen Schmidhuber, “Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks”, In International Conference on Machine Learning, 2006.
[19] Fenfen Sheng, Zhineng Chen, and Bo Xu, “NRTR: A No-Recurrence Sequence-to-Sequence Model for Scene Text Recognition”, In International Conference on Document Analysis and Recognition, 2019.
[20] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser and Illia Polosukhin, “Attention is All you Need”, In Advances in neural information processing systems, 2017.
[21] Jeonghun Baek, Geewook Kim, Junyeop Lee, Sungrae Park, Dongyoon Han, Sangdoo Yun, Seong Joon Oh and Hwalsuk Lee, “What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis”, In Proceedings of the IEEE international conference on computer vision, 2019.
[22] Max Jaderberg, Karen Simonyan, Andrea Vedaldi and Andrew Zisserman, “Reading Text in the Wild with Convolutional Neural Networks”, International journal of computer vision, 2016.
[23] Ankush Gupta, Andrea Vedaldi and Andrew Zisserman, “SynthText in the Wild Dataset”, In IEEE conference on computer vision and pattern recognition, 2016.
[24] Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun and Hwalsuk Lee, “Character Region Awareness for Text Detection”, In IEEE conference on computer vision and pattern recognition, 2019.
[25] Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li and Shi-Min Hu, “Chinese Text in the Wild”, In IEEE conference on computer vision and pattern recognition, 2018.
[26] Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff and Hartwig Adam, “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”, In European Conference on Computer Vision, 2018.
[27] Fisher Yu and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions”, In International Conference on Learning Representations, 2016.
[28] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy and Alan L. Yuille, “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs”, In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
[29] “認識中文字元碼” (Understanding Chinese Character Codes), http://idv.sinica.edu.tw/bear/charcodes/Section05.htm
[30] Max Jaderberg, Karen Simonyan, and Andrew Zisserman. “Spatial transformer networks”, In Advances in neural information processing systems, 2015.
[31] Jie Hu, Li Shen, and Gang Sun. “Squeeze-and-excitation networks”, In IEEE conference on computer vision and pattern recognition, 2018.
[32] Xavier Glorot and Yoshua Bengio. “Understanding the difficulty of training deep feedforward neural networks”, In International Conference on Artificial Intelligence and Statistics, 2010.
[33] Diederik P. Kingma and Jimmy Ba. “Adam: A method for stochastic optimization”, In International Conference on Learning Representations, 2015.
[34] Junyeop Lee, Sungrae Park, Jeonghun Baek, Seong Joon Oh, Seonghyeon Kim and Hwalsuk Lee. “On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention”, In IEEE conference on computer vision and pattern recognition workshops, 2020.
[35] Baoguang Shi, Xinggang Wang, Pengyuan Lyu, Cong Yao, and Xiang Bai. “Robust scene text recognition with automatic rectification”, In IEEE conference on computer vision and pattern recognition, 2016.
[36] Baoguang Shi, Mingkun Yang, Xinggang Wang, Pengyuan Lyu, Cong Yao, and Xiang Bai. “Aster: An attentional scene text recognizer with flexible rectification”, In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
[37] Wei Liu, Chaofeng Chen, and Kwan-Yee K. Wong. “Charnet: A character-aware neural network for distorted scene text recognition”, In AAAI Conference on Artificial Intelligence, 2018.
[38] Wei Liu, Chaofeng Chen, Kwan-Yee K. Wong, Zhizhong Su, and Junyu Han. “Star-net: A spatial attention residue network for scene text recognition”, In British Machine Vision Conference, 2016.
[39] Yunze Gao, Yingying Chen, Jinqiao Wang, Zhen Lei, XiaoYu Zhang, and Hanqing Lu. “Recurrent calibration network for irregular text recognition”, In IEEE conference on computer vision and pattern recognition, 2018.
[40] Zhanzhan Cheng, Yangliu Xu, Fan Bai, Yi Niu, Shiliang Pu, and Shuigeng Zhou. “AON: Towards arbitrarily-oriented text recognition”, In IEEE conference on computer vision and pattern recognition, 2018.
[41] Hui Li, Peng Wang, Chunhua Shen, and Guyu Zhang. “Show, attend and read: A simple and strong baseline for irregular text recognition”, In AAAI Conference on Artificial Intelligence, 2019.
[42] Xiao Yang, Dafang He, Zihan Zhou, Daniel Kifer, and C. Lee Giles. “Learning to read irregular text with attention mechanisms”, In International Joint Conferences on Artificial Intelligence, 2017.
[43] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel and Yoshua Bengio, “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention”, In International Conference on Machine Learning, 2015.
[44] Dzmitry Bahdanau, Kyunghyun Cho and Yoshua Bengio, “Neural Machine Translation by Jointly Learning to Align and Translate”, In International Conference on Learning Representations, 2015.
[45] Minh-Thang Luong, Hieu Pham, and Christopher D. Manning, “Effective Approaches to Attention-based Neural Machine Translation”, In Empirical Methods in Natural Language Processing, 2015.
[46] S. M. Lucas, A. Panaretos, L. Sosa, A. Tang, S. Wong and R. Young, “ICDAR 2003 Robust Reading Competitions”, In International Conference on Document Analysis and Recognition, 2003.
[47] Dimosthenis Karatzas, Faisal Shafait, Seiichi Uchida, Masakazu Iwamura, Lluis Gomez i Bigorda, Sergi Robles Mestre, Joan Mas, David Fernandez Mota, Jon Almazàn Almazàn and Lluís Pere de las Heras, “ICDAR 2013 Robust Reading Competition”, In International Conference on Document Analysis and Recognition, 2013.
[48] Dimosthenis Karatzas, Lluis Gomez-Bigorda, Anguelos Nicolaou, Suman Ghosh, Andrew Bagdanov, Masakazu Iwamura, Jiri Matas, Lukas Neumann, Vijay Ramaseshan Chandrasekhar, Shijian Lu, Faisal Shafait, Seiichi Uchida and Ernest Valveny, “ICDAR 2015 competition on Robust Reading”, In International Conference on Document Analysis and Recognition, 2015.
[49] Raul Gomez, Baoguang Shi, Lluis Gomez, Lukas Neumann, Andreas Veit, Jiri Matas, Serge Belongie and Dimosthenis Karatzas, “ICDAR2017 Robust Reading Challenge on COCO-Text”, In International Conference on Document Analysis and Recognition, 2017.