References |
[1] K. He, G. Gkioxari, P. Dollár and R. Girshick, "Mask R-CNN," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 2980-2988.
[2] T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan and S. Belongie, "Feature Pyramid Networks for Object Detection," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 936-944.
[3] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 1 June 2017.
[4] M. de La Gorce, D. J. Fleet and N. Paragios, "Model-Based 3D Hand Pose Estimation from Monocular Video," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 9, pp. 1793-1805, Sept. 2011.
[5] P. Krejov and R. Bowden, "Multi-touchless: Real-time fingertip detection and tracking using geodesic maxima," 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Shanghai, 2013, pp. 1-7.
[6] H. Liang, J. Yuan and D. Thalmann, "3D Fingertip and Palm Tracking in Depth Image Sequences," MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia, 2012, pp. 785-788.
[7] Y. Cao, X. Niu and Y. Dou, "Region-based convolutional neural networks for object detection in very high resolution remote sensing images," 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Changsha, 2016, pp. 548-554.
[8] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 779-788.
[9] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 6517-6525.
[10] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv:1804.02767, Apr. 2018.
[11] X. Liu, Y. Huang, X. Zhang and L. Jin, "Fingertip in the Eye: A cascaded CNN pipeline for the real-time fingertip detection in egocentric videos," arXiv:1511.02282, Nov. 2015.
[12] Y. Huang, X. Liu, X. Zhang and L. Jin, "A Pointing Gesture Based Egocentric Interaction System: Dataset, Approach and Application," 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, 2016, pp. 370-377.
[13] S. Mukherjee, A. Ahmed, D. P. Dogra, S. Kar and P. P. Roy, "Fingertip Detection and Tracking for Recognition of Air-Writing in Videos," arXiv:1809.03016, Sept. 2018.
[14] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. E. Reed, C.-Y. Fu and A. C. Berg, "SSD: Single Shot MultiBox Detector," ECCV, 2016.
[15] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv:1409.1556, Sept. 2014.
[16] K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 770-778.
[17] G. Huang, Z. Liu, L. van der Maaten and K. Q. Weinberger, "Densely Connected Convolutional Networks," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 2261-2269.
[18] C. Szegedy et al., "Going deeper with convolutions," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, pp. 1-9.
[19] F. Chollet, "Xception: Deep Learning with Depthwise Separable Convolutions," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 1800-1807.
[20] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto and H. Adam, "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv:1704.04861, Apr. 2017.
[21] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov and L. Chen, "MobileNetV2: Inverted Residuals and Linear Bottlenecks," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, pp. 4510-4520.
[22] A. Gupta, A. Vedaldi and A. Zisserman, "Synthetic Data for Text Localisation in Natural Images," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 2315-2324.
[23] C. Zimmermann and T. Brox, "Learning to Estimate 3D Hand Pose from Single RGB Images," arXiv:1705.01389, May 2017.
[24] J. Zhang, J. Jiao, M. Chen, L. Qu, X. Xu and Q. Yang, "A hand pose tracking benchmark from stereo matching," 2017 IEEE International Conference on Image Processing (ICIP), Beijing, 2017, pp. 982-986.
[25] G. von Zitzewitz, "Deep Learning and Real-Time Computer Vision for Mobile Platforms," 2018.
[26] G. Yin, "Deep Learning: Separable Convolution," https://yinguobing.com/separable-convolution
[27] J. Xu, "Deep Learning for Object Detection: A Comprehensive Review," https://towardsdatascience.com/deep-learning-for-object-detection-a-comprehensive-review-73930816d8d9
[28] J. F. Henriques, R. Caseiro, P. Martins and J. Batista, "High-speed tracking with kernelized correlation filters," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583-596, 2015.
[29] Z. Kalal, K. Mikolajczyk and J. Matas, "Tracking-learning-detection," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 7, pp. 1409-1422, 2012.
[30] B. Babenko, M.-H. Yang and S. Belongie, "Robust object tracking with online multiple instance learning," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8, pp. 1619-1632, 2011.
|