References
[1] Soleimanitaleb, Z., and M. A. Keyvanrad. "Single object tracking: A survey of methods, datasets, and evaluation metrics." arXiv preprint arXiv:2201.13066 (2022).
[2] Bolme, David S., et al. "Visual object tracking using adaptive correlation filters." 2010 IEEE computer society conference on computer vision and pattern recognition. IEEE, 2010.
[3] Henriques, Joao F., et al. "Exploiting the circulant structure of tracking-by-detection with kernels." Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part IV 12. Springer Berlin Heidelberg, 2012.
[4] Henriques, João F., et al. "High-speed tracking with kernelized correlation filters." IEEE transactions on pattern analysis and machine intelligence 37.3 (2014): 583-596.
[5] Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05). Vol. 1. IEEE, 2005.
[6] Danelljan, Martin, et al. "Adaptive color attributes for real-time visual tracking." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
[7] Bertinetto, Luca, et al. "Fully-convolutional siamese networks for object tracking." Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part II 14. Springer International Publishing, 2016.
[8] Bromley, Jane, et al. "Signature verification using a 'siamese' time delay neural network." Advances in neural information processing systems 6 (1993).
[9] Li, Bo, et al. "High performance visual tracking with siamese region proposal network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
[10] Zhu, Zheng, et al. "Distractor-aware siamese networks for visual object tracking." Proceedings of the European conference on computer vision (ECCV). 2018.
[11] Zhang, Zhipeng, and Houwen Peng. "Deeper and wider siamese networks for real-time visual tracking." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
[12] Li, Bo, et al. "Siamrpn++: Evolution of siamese visual tracking with very deep networks." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
[13] Xu, Yinda, et al. "Siamfc++: Towards robust and accurate visual tracking with target estimation guidelines." Proceedings of the AAAI conference on artificial intelligence. Vol. 34. No. 07. 2020.
[14] Hu, Weiming, et al. "Siammask: A framework for fast online object tracking and segmentation." IEEE Transactions on Pattern Analysis and Machine Intelligence 45.3 (2023): 3072-3089.
[15] Voigtlaender, Paul, et al. "Siam r-cnn: Visual tracking by re-detection." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
[16] Dosovitskiy, Alexey, et al. "An image is worth 16x16 words: Transformers for image recognition at scale." International Conference on Learning Representations. 2021.
[17] Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).
[18] Bao, Hangbo, et al. "BEiT: BERT pre-training of image transformers." International Conference on Learning Representations. 2021.
[19] Caron, Mathilde, et al. "Emerging properties in self-supervised vision transformers." Proceedings of the IEEE/CVF international conference on computer vision. 2021.
[20] Carion, Nicolas, et al. "End-to-end object detection with transformers." European conference on computer vision. Cham: Springer International Publishing, 2020.
[21] Li, Yanghao, et al. "Exploring plain vision transformer backbones for object detection." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022.
[22] Chen, Xin, et al. "Transformer tracking." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021.
[23] Yan, Bin, et al. "Learning spatio-temporal transformer for visual tracking." Proceedings of the IEEE/CVF international conference on computer vision. 2021.
[24] Kugarajeevan, Janani, et al. "Transformers in single object tracking: An experimental survey." IEEE Access (2023).
[25] Fan, Heng, et al. "Lasot: A high-quality benchmark for large-scale single object tracking." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
[26] Yan, Bin, et al. "LightTrack: Finding lightweight neural networks for object tracking via one-shot architecture search." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021.
[27] Borsuk, Vasyl, et al. "FEAR: Fast, efficient, accurate and robust visual tracker." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022.
[28] Kang, Ben, et al. "Exploring lightweight hierarchical vision transformers for efficient visual tracking." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.
[29] Lin, Weidong, et al. "CAT: cross-attention transformer for one-shot object detection." arXiv preprint arXiv:2104.14984 (2021).
[30] Ren, Qiang, et al. "A robust and accurate end-to-end template matching method based on the Siamese network." IEEE Geoscience and Remote Sensing Letters 19 (2021): 1-5.
[31] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems 25 (2012).
[32] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[33] Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
[34] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
[35] Sandler, Mark, et al. "Mobilenetv2: Inverted residuals and linear bottlenecks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
[36] Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems 28 (2015).
[37] Khan, Salman, et al. "Transformers in vision: A survey." ACM Computing Surveys 54 (2022): 1-41.
[38] Han, Kai, et al. "A survey on vision transformer." IEEE transactions on pattern analysis and machine intelligence 45.1 (2023): 87-110.
[39] Carion, Nicolas, et al. "End-to-end object detection with transformers." European conference on computer vision. Cham: Springer International Publishing, 2020.
[40] Chen, Xin, et al. "Seqtrack: Sequence to sequence learning for visual object tracking." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
[41] Blatter, Philippe, et al. "Efficient visual tracking with exemplar transformers." Proceedings of the IEEE/CVF Winter conference on applications of computer vision. 2023.
[42] Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
[43] Woo, Sanghyun, et al. "Cbam: Convolutional block attention module." Proceedings of the European conference on computer vision (ECCV). 2018.
[44] Zheng, Zhaohui, et al. "Distance-IoU loss: Faster and better learning for bounding box regression." Proceedings of the AAAI conference on artificial intelligence. Vol. 34. No. 07. 2020.
[45] Chiu, Yu-Chen, et al. "Mobilenet-SSDv2: An improved object detection model for embedded systems." 2020 International conference on system science and engineering (ICSSE). IEEE, 2020.
[46] Tsai, T.-H., and W.-C. Wan. "NL-DSE: Non-local neural network with decoder-squeeze-and-excitation for monocular depth estimation." 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023.
[47] Lin, Tsung-Yi, et al. "Microsoft coco: Common objects in context." Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13. Springer International Publishing, 2014.
[48] Fan, Heng, et al. "Lasot: A high-quality benchmark for large-scale single object tracking." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
[49] Muller, Matthias, et al. "Trackingnet: A large-scale dataset and benchmark for object tracking in the wild." Proceedings of the European conference on computer vision (ECCV). 2018.
[50] Huang, Lianghua, Xin Zhao, and Kaiqi Huang. "Got-10k: A large high-diversity benchmark for generic object tracking in the wild." IEEE transactions on pattern analysis and machine intelligence 43.5 (2019): 1562-1577.
[51] Paszke, Adam, et al. "Pytorch: An imperative style, high-performance deep learning library." Advances in neural information processing systems 32 (2019).
[52] Loshchilov, Ilya, and Frank Hutter. "Decoupled weight decay regularization." arXiv preprint arXiv:1711.05101 (2017).
[53] Wu, Yi, Jongwoo Lim, and Ming-Hsuan Yang. "Online object tracking: A benchmark." Proceedings of the IEEE conference on computer vision and pattern recognition. 2013.
[54] Danelljan, Martin, et al. "Atom: Accurate tracking by overlap maximization." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.