References
[1] Zoph, Barret; Le, Quoc V. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016.
[2] Baker, Bowen, et al. Designing neural network architectures using reinforcement learning. arXiv preprint arXiv:1611.02167, 2016.
[3] Miikkulainen, Risto, et al. Evolving deep neural networks. In: Artificial Intelligence in the Age of Neural Networks and Brain Computing. Academic Press, p. 293-312, 2019.
[4] Saxena, Shreyas; Verbeek, Jakob. Convolutional neural fabrics. In: Advances in Neural Information Processing Systems. p. 4053-4061, 2016.
[5] Schaffer, J. David; Whitley, Darrell; Eshelman, Larry J. Combinations of genetic algorithms and neural networks: A survey of the state of the art. In: [Proceedings] COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks. IEEE, p. 1-37, 1992.
[6] Stanley, Kenneth O.; Miikkulainen, Risto. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10.2: 99-127, 2002.
[7] Verbancsics, Phillip; Harguess, Josh. Generative neuroevolution for deep learning. arXiv preprint arXiv:1312.5355, 2013.
[8] Liu, Hanxiao, et al. Hierarchical representations for efficient architecture search. arXiv preprint arXiv:1711.00436, 2017.
[9] Tomassini, Marco. Parallel and distributed evolutionary algorithms: A review. 1999.
[10] Koutník, Jan; Schmidhuber, Jürgen; Gomez, Faustino. Evolving deep unsupervised convolutional networks for vision-based reinforcement learning. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation. ACM, p. 541-548, 2014.
[11] Tsitsiklis, John N. Asynchronous stochastic approximation and Q-learning. Machine learning, 16.3: 185-202, 1994.
[12] Bertsekas, Dimitri. Distributed dynamic programming. IEEE Transactions on Automatic Control, 27.3: 610-616, 1982.
[13] Cai, Han, et al. Efficient architecture search by network transformation. In: Thirty-Second AAAI Conference on Artificial Intelligence. 2018.
[14] Bello, Irwan, et al. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940, 2016.
[15] Pham, Hieu, et al. Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268, 2018.
[16] Shahriari, Bobak, et al. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104.1: 148-175, 2015.
[17] Bergstra, James; Yamins, Daniel; Cox, David Daniel. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. 2013.
[18] Domhan, Tobias; Springenberg, Jost Tobias; Hutter, Frank. Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In: Twenty-Fourth International Joint Conference on Artificial Intelligence. 2015.
[19] Snoek, Jasper; Larochelle, Hugo; Adams, Ryan P. Practical bayesian optimization of machine learning algorithms. In: Advances in neural information processing systems, p. 2951-2959. 2012.
[20] Kevin Swersky, Jasper Snoek, and Ryan P Adams. Multi-task bayesian optimization. NIPS, pp. 2004–2012, 2013.
[21] Bergstra, James S., et al. Algorithms for hyper-parameter optimization. In: Advances in neural information processing systems. p. 2546-2554. 2011.
[22] Vilalta, Ricardo; Drissi, Youssef. A perspective view and survey of meta-learning. Artificial intelligence review, 18.2: 77-95, 2002.
[23] Hochreiter, Sepp; Younger, A. Steven; Conwell, Peter R. Learning to learn using gradient descent. In: International Conference on Artificial Neural Networks. Springer, Berlin, Heidelberg, p. 87-94. 2001.
[24] Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." Advances in Neural Information Processing Systems. 2016.
[25] Liu, Xu-Ying; Wu, Jianxin; Zhou, Zhi-Hua. Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39.2: 539-550, 2008.
[26] Schapire, Robert E. A brief introduction to boosting. In: Ijcai. p. 1401-1406. 1999.
[27] Özdemir, Ahmet Turan, and Billur Barshan. “Detecting Falls with Wearable Sensors Using Machine Learning Techniques.” Sensors (Basel, Switzerland) 14.6 (2014): 10691–10708. PMC. Web. 23 Apr. 2017.
[28] Ian J Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron C Courville, and Yoshua Bengio. Maxout networks. ICML (3), 28:1319–1327, 2013.
[29] Min Lin, Qiang Chen, and Shuicheng Yan. Network in network. arXiv preprint arXiv:1312.4400, 2013.
[30] Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, and Yoshua Bengio. Fitnets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550, 2014.
[31] Zhong, Zhao, Junjie Yan, and Cheng-Lin Liu. "Practical network blocks design with q-learning." arXiv preprint arXiv:1708.05552 (2017).
[32] Swersky, Kevin, Jasper Snoek, and Ryan P. Adams. "Multi-task bayesian optimization." Advances in neural information processing systems. 2013.
[33] Wan, Li, et al. "Regularization of neural networks using dropconnect." International conference on machine learning. 2013.
[34] Shahriari, Bobak, et al. "Taking the human out of the loop: A review of bayesian optimization." Proceedings of the IEEE 104.1 (2016): 148-175.
[35] Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529.
[36] Lin, Long-Ji. Reinforcement learning for robots using neural networks. No. CMU-CS-93-103. Carnegie Mellon University, School of Computer Science, 1993.
[37] Kaelbling, Leslie Pack, Michael L. Littman, and Andrew W. Moore. "Reinforcement learning: A survey." Journal of Artificial Intelligence Research 4 (1996): 237-285.
[38] Vermorel, Joannes, and Mehryar Mohri. "Multi-armed bandit algorithms and empirical evaluation." European conference on machine learning. Springer, Berlin, Heidelberg, 2005.
[39] Galstyan, Aram, Karl Czajkowski, and Kristina Lerman. "Resource allocation in the grid using reinforcement learning." Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 3. IEEE Computer Society, 2004.
[40] Gomes, Eduardo Rodrigues, and Ryszard Kowalczyk. "Learning the IPA market with individual and social rewards." Web Intelligence and Agent Systems: An International Journal 7.2 (2009): 123-138.
[41] Ziogos, N. P., et al. "A reinforcement learning algorithm for market participants in FTR auctions." 2007 IEEE Lausanne Power Tech. IEEE, 2007.
[42] Bertsekas, Dimitri P. Convex optimization algorithms. Belmont: Athena Scientific, 2015.
[43] Watkins, Christopher John Cornish Hellaby. Learning from delayed rewards. Diss. King's College, Cambridge, 1989.
[44] Dean, Jeffrey, et al. "Large scale distributed deep networks." Advances in Neural Information Processing Systems. 2012.
[45] Gu, Shixiang, et al. "Continuous deep q-learning with model-based acceleration." arXiv preprint arXiv:1603.00748 (2016).
[46] Van Hasselt, Hado, Arthur Guez, and David Silver. "Deep Reinforcement Learning with Double Q-Learning." AAAI. 2016.
[47] Narendra, Kumpati S., Yu Wang, and Snehasis Mukhopadhyay. "Fast Reinforcement Learning using Multiple Models." 2016 Control and Decision Conference, Las Vegas, 2016.
[48] Narendra, Kumpati S., Snehasis Mukhopadhyay, and Yu Wang. "Improving the Speed of Response of Learning Algorithms Using Multiple Models: An Introduction." The 17th Yale Workshop on Adaptive and Learning Systems.
[49] S. Levine, C. Finn, T. Darrell, and P. Abbeel, “End-to-end training of deep visuomotor policies,” arXiv:1504.00702 [cs.LG], 2015.
[50] J.-A. M. Assael, N. Wahlström, T. B. Schön, and M. P. Deisenroth, “Data-efficient learning of feedback policies from image pixels using deep dynamical models,” arXiv:1510.02173 [cs.AI], 2015.
[51] J. Ba, V. Mnih, and K. Kavukcuoglu, "Multiple object recognition with visual attention," arXiv:1412.7755 [cs.LG], 2014.