References
[1] K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask R-CNN. In
Proceedings of the IEEE international conference on computer vision,
pages 2961–2969, 2017.
[2] S. Jetley, N. A. Lord, N. Lee, and P. H. Torr. Learn to pay attention.
arXiv preprint arXiv:1804.02391, 2018.
[3] J. Hu, L. Shen, and G. Sun. Squeeze-and-excitation networks. In
Proceedings of the IEEE conference on computer vision and pattern
recognition, pages 7132–7141, 2018.
[4] O. Koller, J. Forster, and H. Ney. Continuous sign language recognition:
Towards large vocabulary statistical recognition systems handling
multiple signers. Computer Vision and Image Understanding,
141:108–125, Dec. 2015.
[5] J. Pu, W. Zhou, J. Zhang, and H. Li. Sign language recognition based
on trajectory modeling with HMMs. In International Conference on
Multimedia Modeling, pages 686–697. Springer, 2016.
[6] L. Lamberti and F. Camastra. Real-time hand gesture recognition
using a color glove. In International Conference on Image Analysis
and Processing, pages 365–373. Springer, 2011.
[7] L.-J. Kau, W.-L. Su, P.-J. Yu, and S.-J. Wei. A real-time portable
sign language translation system. In 2015 IEEE 58th International
Midwest Symposium on Circuits and Systems (MWSCAS), pages 1–4.
IEEE, 2015.
[8] L. Jing, E. Vahdani, M. Huenerfauth, and Y. Tian. Recognizing
American Sign Language manual signs from RGB-D videos. arXiv
preprint arXiv:1906.02851, 2019.
[9] D.-Y. Huang, W.-C. Hu, and S.-H. Chang. Vision-based hand gesture
recognition using PCA+Gabor filters and SVM. In 2009 Fifth International
Conference on Intelligent Information Hiding and Multimedia
Signal Processing, pages 1–4. IEEE, 2009.
[10] K. Pearson. LIII. On lines and planes of closest fit to systems of points
in space. The London, Edinburgh, and Dublin Philosophical Magazine
and Journal of Science, 2(11):559–572, 1901.
[11] C. Cortes and V. Vapnik. Support-vector networks. Machine learning,
20(3):273–297, 1995.
[12] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image
recognition. In Proceedings of the IEEE conference on computer
vision and pattern recognition, pages 770–778, 2016.
[13] K. Hara, H. Kataoka, and Y. Satoh. Can spatiotemporal 3D CNNs
retrace the history of 2D CNNs and ImageNet? In Proceedings of the
IEEE conference on computer vision and pattern recognition, pages
6546–6555, 2018.
[14] M.-T. Luong, H. Pham, and C. D. Manning. Effective approaches
to attention-based neural machine translation. arXiv preprint
arXiv:1508.04025, 2015.
[15] D. Britz, A. Goldie, M.-T. Luong, and Q. Le. Massive exploration
of neural machine translation architectures. arXiv preprint
arXiv:1703.03906, 2017.
[16] J. Cheng, L. Dong, and M. Lapata. Long short-term memory-networks
for machine reading. arXiv preprint arXiv:1601.06733,
2016.
[17] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation
by jointly learning to align and translate. arXiv preprint
arXiv:1409.0473, 2014.
[18] P. Sermanet, A. Frome, and E. Real. Attention for fine-grained categorization.
arXiv preprint arXiv:1412.7054, 2014.
[19] X. Liu, T. Xia, J. Wang, Y. Yang, F. Zhou, and Y. Lin. Fully
convolutional attention networks for fine-grained recognition. arXiv
preprint arXiv:1603.06765, 2016.
[20] J. Ba, V. Mnih, and K. Kavukcuoglu. Multiple object recognition
with visual attention. arXiv preprint arXiv:1412.7755, 2014.
[21] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov,
R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption
generation with visual attention. In International conference
on machine learning, pages 2048–2057. PMLR, 2015.
[22] V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu. Recurrent models
of visual attention. arXiv preprint arXiv:1406.6247, 2014.
[23] R. S. Sutton. Learning to predict by the methods of temporal differences.
Machine learning, 3(1):9–44, 1988.
[24] L. Wright. Ranger - a synergistic optimizer.
https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer, 2019.
[25] D. R. Cox. The regression analysis of binary sequences. Journal
of the Royal Statistical Society: Series B (Methodological), 20(2):
215–232, 1958.
[26] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature
hierarchies for accurate object detection and semantic segmentation.
In Proceedings of the IEEE conference on computer vision and pattern
recognition, pages 580–587, 2014.
[27] R. Girshick. Fast R-CNN. In Proceedings of the IEEE international
conference on computer vision, pages 1440–1448, 2015.
[28] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time
object detection with region proposal networks. arXiv preprint
arXiv:1506.01497, 2015.
[29] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. You only look
once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 779–
788, 2016.
[30] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro,
G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow,
A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser,
M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray,
C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar,
P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals,
P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow:
Large-scale machine learning on heterogeneous systems, 2015.
URL https://www.tensorflow.org/. Software available from tensorflow.org.