References
[1] L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, and P. H. S. Torr, “Fully-convolutional Siamese networks for object tracking,” in Proc. European Conference on Computer Vision Workshops, pp. 850-865, Oct. 2016.
[2] B. Li, J. Yan, W. Wu, Z. Zhu, and X. Hu, “High performance visual tracking with Siamese region proposal network,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 8971-8980, June 2018.
[3] B. Li, W. Wu, Q. Wang, F. Zhang, J. Xing, and J. Yan, “SiamRPN++: Evolution of Siamese visual tracking with very deep networks,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4282-4291, June 2019.
[4] D. Guo, J. Wang, Y. Cui, Z. Wang, and S. Chen, “SiamCAR: Siamese fully convolutional classification and regression for visual tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 6268-6276, June 2020.
[5] Z. Chen, B. Zhong, G. Li, S. Zhang, and R. Ji, “Siamese box adaptive network for visual tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 6667-6676, June 2020.
[6] Z. Zhang, H. Peng, J. Fu, B. Li, and W. Hu, “Ocean: Object-aware anchor-free tracking,” in Proc. European Conference on Computer Vision, pp. 771-787, Aug. 2020.
[7] M. Danelljan, G. Bhat, F. S. Khan, and M. Felsberg, “ATOM: Accurate tracking by overlap maximization,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4655-4664, June 2019.
[8] G. Bhat, M. Danelljan, L. Van Gool, and R. Timofte, “Learning discriminative model prediction for tracking,” in Proc. IEEE International Conference on Computer Vision, pp. 6181-6190, Oct. 2019.
[9] M. Danelljan, L. Van Gool, and R. Timofte, “Probabilistic regression for visual tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7181-7190, June 2020.
[10] X. Chen, B. Yan, J. Zhu, D. Wang, X. Yang, and H. Lu, “Transformer tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, June 2021.
[11] N. Wang, W. Zhou, J. Wang, and H. Li, “Transformer meets tracker: Exploiting temporal context for robust visual tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, June 2021.
[12] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proc. International Conference on Neural Information Processing Systems, pp. 6000-6010, Dec. 2017.
[13] T. Yang and A. B. Chan, “Learning dynamic memory networks for object tracking,” in Proc. European Conference on Computer Vision, pp. 152-167, Oct. 2018.
[14] J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah, “Signature verification using a ‘Siamese’ time delay neural network,” in Proc. Conference on Neural Information Processing Systems, Vol. 6, pp. 737-744, Nov. 1993.
[15] S. Chopra, R. Hadsell, and Y. LeCun, “Learning a similarity metric discriminatively, with application to face verification,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 539-546, June 2005.
[16] G. Koch, R. Zemel, and R. Salakhutdinov, “Siamese neural networks for one-shot image recognition,” in Proc. International Conference on Machine Learning, Deep Learning Workshop, Vol. 2, July 2015.
[17] D. Held, S. Thrun, and S. Savarese, “Learning to track at 100 FPS with deep regression networks,” in Proc. European Conference on Computer Vision, pp. 749-765, Oct. 2016.
[18] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 39, No. 6, pp. 1137-1149, June 2017.
[19] Y. Wu, J. Lim, and M. Yang, “Object tracking benchmark,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 37, No. 9, pp. 1834-1848, Sept. 2015.
[20] M. Kristan, A. Leonardis, J. Matas, M. Felsberg, R. Pflugfelder, L. C. Zajc, T. Vojir, G. Bhat, A. Lukezic, A. Eldesokey, G. Fernandez, et al., “The sixth visual object tracking VOT2018 challenge results,” in Proc. European Conference on Computer Vision Workshops, pp. 3-53, Jan. 2019.
[21] H. Fan, L. Lin, F. Yang, P. Chu, G. Deng, S. Yu, H. Bai, Y. Xu, C. Liao, and H. Ling, “LaSOT: A high-quality benchmark for large-scale single object tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5369-5378, June 2019.
[22] M. Muller, A. Bibi, S. Giancola, S. Al-Subaihi, and B. Ghanem, “TrackingNet: A large-scale dataset and benchmark for object tracking in the wild,” in Proc. European Conference on Computer Vision, pp. 300-317, Oct. 2018.
[23] Z. Tian, C. Shen, H. Chen, and T. He, “FCOS: Fully convolutional one-stage object detection,” in Proc. IEEE International Conference on Computer Vision, pp. 9626-9635, Oct. 2019.
[24] L. Huang, X. Zhao, and K. Huang, “GOT-10k: A large high-diversity benchmark for generic object tracking in the wild,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 43, No. 5, pp. 1562-1577, Dec. 2019.
[25] F. Du, P. Liu, W. Zhao, and X. Tang, “Correlation-guided attention for corner detection based visual tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 6835-6844, June 2020.
[26] Y. Yu, Y. Xiong, W. Huang, and M. R. Scott, “Deformable Siamese attention networks for visual object tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 6727-6736, June 2020.
[27] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” in Proc. International Conference on Learning Representations, May 2015.
[28] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” in Proc. Conference on Empirical Methods in Natural Language Processing, pp. 1724-1734, Oct. 2014.
[29] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, June 2016.
[30] A. Graves, “Generating sequences with recurrent neural networks,” arXiv preprint arXiv:1308.0850, June 2014.
[31] X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local neural networks,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794-7803, June 2018.
[32] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132-7141, June 2018.
[33] A. He, C. Luo, X. Tian, and W. Zeng, “A twofold Siamese network for real-time object tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4834-4843, June 2018.
[34] D. C. Luvizon, H. Tabia, and D. Picard, “Human pose regression by combining indirect part detection and contextual information,” Computers & Graphics, Vol. 85, pp. 15-22, Dec. 2019.
[35] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818-2826, June 2016.
[36] D. Ulyanov, A. Vedaldi, and V. Lempitsky, “Instance normalization: The missing ingredient for fast stylization,” arXiv preprint arXiv:1607.08022, July 2016.
[37] Y. Xu, Z. Wang, Z. Li, Y. Yuan, and G. Yu, “SiamFC++: Towards robust and accurate visual tracking with target estimation guidelines,” in Proc. AAAI Conference on Artificial Intelligence, Vol. 34, No. 7, pp. 12549-12556, Apr. 2020.
[38] S. Liu and W. Deng, “Very deep convolutional neural network based image classification using small training sample size,” in Proc. IAPR Asian Conference on Pattern Recognition, pp. 730-734, Nov. 2015.
[39] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al., “ImageNet large scale visual recognition challenge,” International Journal of Computer Vision, Vol. 115, No. 3, pp. 211-252, Apr. 2015.
[40] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proc. the 32nd International Conference on Machine Learning, Vol. 37, pp. 448-456, July 2015.
[41] T. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” in Proc. IEEE International Conference on Computer Vision, pp. 2999-3007, Oct. 2017.
[42] J. Yu, Y. Jiang, Z. Wang, Z. Cao, and T. Huang, “UnitBox: An advanced object detection network,” in Proc. the 24th ACM International Conference on Multimedia, pp. 516-520, Oct. 2016.
[43] M. Mueller, N. Smith, and B. Ghanem, “A benchmark and simulator for UAV tracking,” in Proc. European Conference on Computer Vision, pp. 445-461, Oct. 2016.
[44] H. K. Galoogahi, A. Fagg, C. Huang, D. Ramanan, and S. Lucey, “Need for speed: A benchmark for higher frame rate object tracking,” in Proc. IEEE International Conference on Computer Vision, pp. 1134-1143, Oct. 2017.