References
[1] A. Krizhevsky, I. Sutskever, and G. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. of Neural Information Processing Systems (NIPS), Harrahs and Harveys, Lake Tahoe, NV, Dec.3-8, 2012, pp.1106-1114.
[2] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” arXiv:1405.0312.
[3] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and F.-F. Li, “ImageNet large scale visual recognition challenge,” Int. Journal of Computer Vision (IJCV), vol.115, no.3, pp.211-252, 2015.
[4] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, “The PASCAL visual object classes (VOC) challenge,” Int. Journal of Computer Vision (IJCV), vol.88, no.2, pp.303-338, 2010.
[5] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional neural networks,” in Proc. European Conf. on Computer Vision (ECCV), Zurich, Switzerland, Sep.6-12, 2014, pp.818-833.
[6] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, MA, Jun.7-12, 2015, pp.1-9.
[7] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. of Int. Conf. on Learning Representations (ICLR), San Diego, CA, May 7-9, 2015, pp.1-14.
[8] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proc. of Int. Conf. on Machine Learning (ICML), Lille, France, Jul.7-9, 2015, vol.37, pp.448-456.
[9] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, Jun.27-30, 2016, pp.770-778.
[10] R. K. Srivastava, K. Greff, and J. Schmidhuber, “Training very deep networks,” in Proc. of Neural Information Processing Systems (NIPS), Montréal, Canada, Dec.7-12, 2015, pp.2377-2385.
[11] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, Jun.23-28, 2014, pp.580-587.
[12] R. Girshick, “Fast R-CNN,” in Proc. of IEEE Int. Conf. on Computer Vision (ICCV), Santiago, Chile, Dec.11-18, 2015, pp.1440-1448.
[13] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol.39, no.6, pp.1137-1149, 2017.
[14] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, Jun.27-30, 2016, pp.779-788.
[15] J. Redmon and A. Farhadi, “YOLO9000: Better, faster, stronger,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, Jul.21-26, 2017, pp.6517-6525.
[16] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv:1804.02767.
[17] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” in Proc. European Conf. on Computer Vision (ECCV), Amsterdam, The Netherlands, Oct.8-16, 2016, pp.21-37.
[18] C.-Y. Fu, W. Liu, A. Ranga, A. Tyagi, and A. C. Berg, “DSSD: Deconvolutional single shot detector,” arXiv:1701.06659.
[19] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” arXiv:1708.02002.
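[20] 陳世翔, 3D Object Detection, Recognition, and Orientation Estimation with Deep Learning (in Chinese), Master's thesis, Dept. of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan, Jun. 2020.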
[21] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, Jun.27-30, 2016, pp.770-778.
[22] J. Tremblay, T. To, and S. Birchfield, “Falling Things: A synthetic dataset for 3D object detection and pose estimation,” arXiv:1804.06534.
[23] Y. Chen, C. Han, N. Wang, and Z. Zhang, “Revisiting feature alignment for one-stage object detection,” arXiv:1908.01570.
[24] A. Neubeck and L. Van Gool, “Efficient non-maximum suppression,” in Proc. of IEEE Int. Conf. on Pattern Recognition (ICPR), Hong Kong, Aug.20-24, 2006, pp.850-855.
[25] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei, “Deformable convolutional networks,” in Proc. of IEEE Int. Conf. on Computer Vision (ICCV), Venice, Italy, Oct.22-29, 2017, pp.764-773.
[26] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” in Proc. of IEEE Int. Conf. on Computer Vision (ICCV), Venice, Italy, Oct.22-29, 2017, pp.2980-2988.
[27] J. Uijlings, K. Sande, T. Gevers, and A. Smeulders, “Selective search for object recognition,” Int. Journal of Computer Vision (IJCV), vol.104, no.2, pp.154-171, 2013.
[28] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” in Proc. European Conf. on Computer Vision (ECCV), Zurich, Switzerland, Sep.6-12, 2014, pp.346-361.
[29] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proc. 5th Berkeley Symp. on Mathematical Statistics and Probability, Berkeley, CA, Jun.21-Jul.18, vol.1, 1967, pp.281-297.
[30] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, Jul.21-26, 2017, pp.936-944.
[31] S. Zhang, L. Wen, X. Bian, Z. Lei, and S. Li, “Single shot refinement neural network for object detection,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, Jun.18-23, 2018, pp.4203-4212.
[32] Z. Cai and N. Vasconcelos, “Cascade R-CNN: Delving into high quality object detection,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, Jun.18-23, 2018, pp.6154-6162.
[33] X. Liu, D. Liang, S. Yan, D. Chen, Y. Qiao, and J. Yan, “FOTS: Fast oriented text spotting with a unified network,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, Jun.18-23, 2018, pp.5676-5685.
[34] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol.9, no.8, pp.1735-1780, 1997.
[35] X. Yang, J. Yan, Z. Feng, and T. He, “R3Det: Refined single-stage detector with feature refinement for rotating object,” arXiv:1908.05612.
[36] Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox, “PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes,” arXiv:1711.00199.
[37] B. Tekin, S. N. Sinha, and P. Fua, “Real-time seamless single shot 6D object pose prediction,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, Jun.18-23, 2018, pp.292-301.
[38] S. Shi, X. Wang, and H. Li, “PointRCNN: 3D object proposal generation and detection from point cloud,” in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, Jun.16-20, 2019, pp.770-779.
[39] C. R. Qi, L. Yi, H. Su, and L. J. Guibas, “PointNet++: Deep hierarchical feature learning on point sets in a metric space,” in Proc. of Neural Information Processing Systems (NIPS), Long Beach, CA, Dec.4-9, 2017, pp.5105-5114.
[40] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Proc. of Int. Conf. on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, Oct.5-9, 2015, pp.234-241.
[41] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv:1412.6980.