參考文獻 |
[1] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by
back-propagating errors,” nature, vol. 323, no. 6088, pp. 533–536, 1986.
[2] M. Jaderberg, W. M. Czarnecki, S. Osindero, et al., “Decoupled neural interfaces
using synthetic gradients,” in International conference on machine learning, PMLR,
2017, pp. 1627–1635.
[3] S. Hochreiter, “The vanishing gradient problem during learning recurrent neural
nets and problem solutions,” International Journal of Uncertainty, Fuzziness and
Knowledge-Based Systems, vol. 6, no. 02, pp. 107–116, 1998.
[4] M. A. Nielsen, Neural networks and deep learning. Determination press San Francisco, CA, USA, 2015, vol. 25.
[5] C. Szegedy, W. Liu, Y. Jia, et al., “Going deeper with convolutions,” in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun.
2015.
[6] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep
bidirectional transformers for language understanding,” in Proceedings of the 2019
Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers),
Minneapolis, Minnesota: Association for Computational Linguistics, Jun. 2019,
pp. 4171–4186. doi: 10.18653/v1/N19-1423.
[7] Y.-W. Kao and H.-H. Chen, “Associated learning: Decomposing end-to-end backpropagation based on autoencoders and target propagation,” Neural Computation,
vol. 33, no. 1, pp. 174–193, 2021.
[8] D. Y. Wu, D. Lin, V. Chen, and H.-H. Chen, “Associated learning: An alternative to end-to-end backpropagation that works on cnn, rnn, and transformer,” in
International Conference on Learning Representations, 2021.
[9] S. Teerapittayanon, B. McDanel, and H.-T. Kung, “Branchynet: Fast inference via
early exiting from deep neural networks,” in 2016 23rd International Conference
on Pattern Recognition (ICPR), IEEE, 2016, pp. 2464–2469.
[10] H. Mostafa, V. Ramesh, and G. Cauwenberghs, “Deep supervised learning using
local errors,” Frontiers in neuroscience, p. 608, 2018.
[11] C.-K. Wang, “Decomposing end-to-end backpropagation based on scpl,” 碩士論文,
國立中央大學軟體工程研究所, 2022.
[12] A. Nøkland and L. H. Eidnes, “Training neural networks with local error signals,”
in International conference on machine learning, PMLR, 2019, pp. 4839–4850.
[13] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, “A simple framework for contrastive learning of visual representations,” in International conference on machine
learning, PMLR, 2020, pp. 1597–1607.
[14] K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick, “Momentum contrast for unsupervised visual representation learning,” in Proceedings of the IEEE/CVF conference
on computer vision and pattern recognition, 2020, pp. 9729–9738.
[15] P. Khosla, P. Teterwak, C. Wang, et al., “Supervised contrastive learning,” Advances in Neural Information Processing Systems, vol. 33, pp. 18 661–18 673, 2020.
[16] Y. Huang, Y. Cheng, A. Bapna, et al., Gpipe: Efficient training of giant neural
networks using pipeline parallelism, 2019. arXiv: 1811.06965 [cs.CV].
[17] A. Paszke, S. Gross, F. Massa, et al., “Pytorch: An imperative style, high-performance
deep learning library,” in Advances in Neural Information Processing Systems, H.
Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett,
Eds., vol. 32, Curran Associates, Inc., 2019.
[18] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale
image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[19] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,”
in Proceedings of the IEEE conference on computer vision and pattern recognition,
2016, pp. 770–778.
[20] A. Krizhevsky, G. Hinton, et al., “Learning multiple layers of features from tiny
images,” 2009.
[21] Y. Le and X. Yang, “Tiny imagenet visual recognition challenge,” CS 231N, vol. 7,
no. 7, p. 3, 2015.
[22] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, Nov. 1997, issn: 0899-7667. doi: 10.1162/neco.
1997.9.8.1735.
[23] A. Vaswani, N. Shazeer, N. Parmar, et al., “Attention is all you need,” in Advances
in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, et
al., Eds., vol. 30, Curran Associates, Inc., 2017.
74
[24] X. Zhang, J. Zhao, and Y. LeCun, “Character-level convolutional networks for text
classification,” in Advances in Neural Information Processing Systems, C. Cortes, N.
Lawrence, D. Lee, M. Sugiyama, and R. Garnett, Eds., vol. 28, Curran Associates,
Inc., 2015.
[25] A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, Oregon, USA: Association for Computational Linguistics, Jun. 2011,
pp. 142–150.
[26] R. Socher, A. Perelygin, J. Wu, et al., “Recursive deep models for semantic compositionality over a sentiment treebank,” in Proceedings of the 2013 Conference
on Empirical Methods in Natural Language Processing, Seattle, Washington, USA:
Association for Computational Linguistics, Oct. 2013, pp. 1631–1642.
[27] J. Pennington, R. Socher, and C. D. Manning, “Glove: Global vectors for word
representation,” in Empirical Methods in Natural Language Processing (EMNLP),
2014, pp. 1532–1543.
[28] J.-B. Grill, F. Strub, F. Altché, et al., “Bootstrap your own latent - a new approach
to self-supervised learning,” in Advances in Neural Information Processing Systems,
H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33, Curran
Associates, Inc., 2020, pp. 21 271–21 284.
[29] X. Chen and K. He, “Exploring simple siamese representation learning,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition,
2021, pp. 15 750–15 758.
[30] A. Bardes, J. Ponce, and Y. LeCun, “VICReg: Variance-invariance-covariance regularization for self-supervised learning,” in International Conference on Learning
Representations, 2022.
[31] C.-H. Yeh, C.-Y. Hong, Y.-C. Hsu, T.-L. Liu, Y. Chen, and Y. LeCun, “Decoupled
contrastive learning,” in Computer Vision–ECCV 2022: 17th European Conference,
Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXVI, Springer, 2022,
pp. 668–684. |