References
[1] M. Yang et al., "Symmetry-constrained rectification network for scene text recognition," in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 9147-9156.
[2] A. Singh, N. Thakur, and A. Sharma, "A review of supervised machine learning algorithms," in 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), 2016: IEEE, pp. 1310-1315.
[3] Z.-H. Zhou, "A brief introduction to weakly supervised learning," National science review, vol. 5, no. 1, pp. 44-53, 2018.
[4] N. Nayef et al., "Icdar2017 robust reading challenge on multi-lingual scene text detection and script identification-rrc-mlt," in 2017 14th IAPR international conference on document analysis and recognition (ICDAR), 2017, vol. 1: IEEE, pp. 1454-1459.
[5] C. K. Ch'ng and C. S. Chan, "Total-text: A comprehensive dataset for scene text detection and recognition," in 2017 14th IAPR international conference on document analysis and recognition (ICDAR), 2017, vol. 1: IEEE, pp. 935-942.
[6] L. Yuliang, J. Lianwen, Z. Shuaitao, and Z. Sheng, "Detecting curve text in the wild: New dataset and new solution," arXiv preprint arXiv:1712.02170, 2017.
[7] D. G. Lowe, "Object recognition from local scale-invariant features," in Proceedings of the seventh IEEE international conference on computer vision, 1999, vol. 2: IEEE, pp. 1150-1157.
[8] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), 2005, vol. 1: IEEE, pp. 886-893.
[9] M. A. Hearst, S. T. Dumais, E. Osuna, J. Platt, and B. Schölkopf, "Support vector machines," IEEE Intelligent Systems and their applications, vol. 13, no. 4, pp. 18-28, 1998.
[10] L. Neumann and J. Matas, "Text localization in real-world images using efficiently pruned exhaustive search," in 2011 International Conference on Document Analysis and Recognition, 2011: IEEE, pp. 687-691.
[11] B. Epshtein, E. Ofek, and Y. Wexler, "Detecting text in natural scenes with stroke width transform," in 2010 IEEE computer society conference on computer vision and pattern recognition, 2010: IEEE, pp. 2963-2970.
[12] M. Liao, B. Shi, X. Bai, X. Wang, and W. Liu, "Textboxes: A fast text detector with a single deep neural network," in Proceedings of the AAAI conference on artificial intelligence, 2017, vol. 31, no. 1.
[13] M. Liao, Z. Wan, C. Yao, K. Chen, and X. Bai, "Real-time scene text detection with differentiable binarization," in Proceedings of the AAAI conference on artificial intelligence, 2020, vol. 34, no. 07, pp. 11474-11481.
[14] S. Albawi, T. A. Mohammed, and S. Al-Zawi, "Understanding of a convolutional neural network," in 2017 international conference on engineering and technology (ICET), 2017: IEEE, pp. 1-6.
[15] S. Ren, K. He, R. Girshick, and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," Advances in neural information processing systems, vol. 28, 2015.
[16] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 779-788.
[17] W. Liu et al., "Ssd: Single shot multibox detector," in Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14, 2016: Springer, pp. 21-37.
[18] Z. Tian, W. Huang, T. He, P. He, and Y. Qiao, "Detecting text in natural image with connectionist text proposal network," in Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VIII 14, 2016: Springer, pp. 56-72.
[19] A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM networks," in Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, 2005, vol. 4: IEEE, pp. 2047-2052.
[20] M. Liao, Z. Zhu, B. Shi, G.-s. Xia, and X. Bai, "Rotation-sensitive regression for oriented scene text detection," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 5909-5918.
[21] M. Liao, B. Shi, and X. Bai, "Textboxes++: A single-shot oriented scene text detector," IEEE transactions on image processing, vol. 27, no. 8, pp. 3676-3690, 2018.
[22] B. Shi, X. Bai, and S. Belongie, "Detecting oriented text in natural images by linking segments," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 2550-2558.
[23] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3431-3440.
[24] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, 2015: Springer, pp. 234-241.
[25] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs," IEEE transactions on pattern analysis and machine intelligence, vol. 40, no. 4, pp. 834-848, 2017.
[26] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 2881-2890.
[27] D. Deng, H. Liu, X. Li, and D. Cai, "Pixellink: Detecting scene text via instance segmentation," in Proceedings of the AAAI conference on artificial intelligence, 2018, vol. 32, no. 1.
[28] W. Wang et al., "Shape robust text detection with progressive scale expansion network," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 9336-9345.
[29] P. Wang et al., "A single-shot arbitrarily-shaped text detector based on context attended multi-task learning," in Proceedings of the 27th ACM international conference on multimedia, 2019, pp. 1277-1285.
[30] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask r-cnn," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2961-2969.
[31] J. Liu, X. Liu, J. Sheng, D. Liang, X. Li, and Q. Liu, "Pyramid mask text detector," arXiv preprint arXiv:1903.11800, 2019.
[32] Y.-H. Hou, "Exploiting Distance to Boundary for Segmentation-based Scene-Text Spotting," 2021.
[33] K. Sun, B. Xiao, D. Liu, and J. Wang, "Deep high-resolution representation learning for human pose estimation," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 5693-5703.
[34] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, "Aggregated residual transformations for deep neural networks," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492-1500.
[35] Y. Baek, B. Lee, D. Han, S. Yun, and H. Lee, "Character region awareness for text detection," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 9365-9374.
[36] S. Liu and W. Deng, "Very deep convolutional neural network based image classification using small training sample size," in 2015 3rd IAPR Asian conference on pattern recognition (ACPR), 2015: IEEE, pp. 730-734.
[37] L. Xing, Z. Tian, W. Huang, and M. R. Scott, "Convolutional character networks," in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 9126-9136.
[38] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770-778.
[39] H. Law and J. Deng, "Cornernet: Detecting objects as paired keypoints," in Proceedings of the European conference on computer vision (ECCV), 2018, pp. 734-750.
[40] J. Ye, Z. Chen, J. Liu, and B. Du, "TextFuseNet: Scene Text Detection with Richer Fused Features," in IJCAI, 2020, vol. 20, pp. 516-522.
[41] C. K. Chng et al., "Icdar2019 robust reading challenge on arbitrary-shaped text-rrc-art," in 2019 International Conference on Document Analysis and Recognition (ICDAR), 2019: IEEE, pp. 1571-1576.
[42] A. Gupta, A. Vedaldi, and A. Zisserman, "Synthetic data for text localisation in natural images," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2315-2324.
[43] D. Karatzas et al., "ICDAR 2013 robust reading competition," in 2013 12th international conference on document analysis and recognition, 2013: IEEE, pp. 1484-1493.
[44] D. Karatzas et al., "ICDAR 2015 competition on robust reading," in 2015 13th international conference on document analysis and recognition (ICDAR), 2015: IEEE, pp. 1156-1160.
[45] C. Barnes, E. Shechtman, A. Finkelstein, and D. B. Goldman, "PatchMatch: A randomized correspondence algorithm for structural image editing," ACM Trans. Graph., vol. 28, no. 3, p. 24, 2009.
[46] R. Fabbri, L. D. F. Costa, J. C. Torelli, and O. M. Bruno, "2D Euclidean distance transform algorithms: A comparative survey," ACM Computing Surveys (CSUR), vol. 40, no. 1, pp. 1-44, 2008.
[47] D. Bautista and R. Atienza, "Scene text recognition with permuted autoregressive sequence models," in Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXVIII, 2022: Springer, pp. 178-196.
[48] L. Tong, "Designs of the Traditional Chinese Scene Text Dataset and Performance Evaluation for Text Detection and Recognition," 2022.
[49] M. Ye et al., "Deepsolo: Let transformer decoder with explicit points solo for text spotting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 19348-19357.