Master's/Doctoral Thesis 110423020: Detailed Record




Name: Cheng-Yu Chen (陳正育)    Department: Information Management
Thesis Title: GANAS: A GAN-based Neural Architecture Search
Related Theses
★ Taiwan 50 Trend Analysis: Prediction Based on a Multiple LSTM Architecture
★ Gold Price Prediction Analysis Based on Multiple Recurrent Neural Network Models
★ Incremental Learning for Defect Detection in Industry 4.0
★ A Study on Recurrent Neural Networks for Predicting Computer Component Sales Prices
★ A Study on LSTM Networks for Phishing Website Prediction
★ A Study on Deep-Learning-Based Frequency-Hopping Signal Recognition
★ Opinion Leader Discovery in Dynamic Social Networks
★ Deep Learning Models for Virtual Metrology of Machines in Industry 4.0
★ A Novel NMF-Based Movie Recommendation with Time Decay
★ Category-Based Sequence-to-Sequence Models for POI Travel Itinerary Recommendation
★ A DQN-Based Reinforcement Learning Model for Neural Network Architecture Search
★ Neural Network Architecture Optimization Based on Virtual Reward Reinforcement Learning
★ Generative Adversarial Network Architecture Search
★ Optimizing Neural Architecture Search with a Progressive Genetic Algorithm
★ Enhanced Model Agnostic Meta Learning with Meta Gradient Memory
★ Stock Price Prediction Using Recurrent Neural Networks with Leading Industrial Wastewater Indicators
Files: Full text viewable in the system after 2028-07-01 (embargoed).
Abstract (Chinese) AI plays an increasingly important role in today's society, and its applications across many domains have gained growing recognition, attracting more and more researchers to the field. Among these, CNNs have achieved remarkable results in image and video processing. However, the accuracy of these CNN models typically depends on their underlying architecture. To improve model performance, domain experts often have to design architectures by hand, spending considerable time and effort fine-tuning model structures. To address this problem, this study proposes GANAS, an RL-based GAN method for neural architecture search. From existing CNN models, GANAS learns how to generate better model architectures. Compared with neural architecture search methods based on other strategies, GANAS completes training in less time. Moreover, the diversity of GANs together with beam search allows it to find more feasible solutions, addressing NAS's problems of returning a single optimal solution and of unstable search performance.
Abstract (English) AI has seen widespread adoption across diverse domains, attracting a growing community of researchers to the field. Among these advances, CNNs have achieved impressive results in image and video processing. However, the accuracy of these CNN models is typically contingent upon their underlying architectures. To improve model performance, domain experts often resort to manual architectural design, entailing significant time and effort spent fine-tuning model structures. To tackle this problem, our study proposes an RL-based GAN neural architecture search method, GANAS. Capitalizing on pre-existing CNN models, GANAS learns to generate enhanced model architectures within the GAN framework. Compared with neural architecture search methods based on other strategies, GANAS completes training in less time. Additionally, by harnessing the diversity of GANs together with beam search, we remedy the limitations of NAS characterized by a single optimal solution and unstable search performance.
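The full text is embargoed until July 2028, so the abstract is the only description of the method available here. As a rough illustration of the kind of search loop it outlines, the sketch below pairs a generator that emits architectures as discrete operation sequences with a discriminator whose score serves as a REINFORCE reward (the standard SeqGAN-style workaround for the non-differentiability of discrete sampling), and decodes several candidate architectures with beam search. Everything in it (the operation vocabulary OPS, the sequence length, the model sizes, and all class and function names) is assumed for illustration and is not GANAS's actual implementation.

```python
# Hypothetical sketch only: none of these names, shapes, or hyperparameters
# come from the (embargoed) GANAS thesis itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

OPS = ["conv3x3", "conv5x5", "maxpool3x3", "avgpool3x3", "identity"]  # assumed op vocabulary
SEQ_LEN, HIDDEN, BATCH = 8, 64, 16                                    # assumed sizes

class Generator(nn.Module):
    """Emits an architecture as a discrete sequence of operation tokens."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(len(OPS), HIDDEN)
        self.rnn = nn.GRUCell(HIDDEN, HIDDEN)
        self.head = nn.Linear(HIDDEN, len(OPS))

    def sample(self, batch):
        """Autoregressively sample token sequences; returns tokens and log-probs."""
        tok = torch.zeros(batch, dtype=torch.long)         # fixed start token
        h = torch.zeros(batch, HIDDEN)
        toks, logps = [], []
        for _ in range(SEQ_LEN):
            h = self.rnn(self.embed(tok), h)
            dist = torch.distributions.Categorical(logits=self.head(h))
            tok = dist.sample()
            toks.append(tok)
            logps.append(dist.log_prob(tok))
        return torch.stack(toks, 1), torch.stack(logps, 1)  # both (B, T)

class Discriminator(nn.Module):
    """Scores a token sequence: close to 1 = resembles a known-good CNN."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(len(OPS), HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, toks):                                # (B, T) -> (B,)
        _, h = self.rnn(self.embed(toks))
        return torch.sigmoid(self.head(h[-1])).squeeze(-1)

@torch.no_grad()
def beam_search(gen, width=4):
    """Decode several high-likelihood architectures instead of a single one."""
    beams = [([0], torch.zeros(1, HIDDEN), 0.0)]            # (token seq, state, log-prob)
    for _ in range(SEQ_LEN):
        cand = []
        for seq, h, score in beams:
            h2 = gen.rnn(gen.embed(torch.tensor([seq[-1]])), h)
            logp = F.log_softmax(gen.head(h2), dim=-1).squeeze(0)
            top = torch.topk(logp, width)
            cand += [(seq + [t.item()], h2, score + lp.item())
                     for lp, t in zip(top.values, top.indices)]
        beams = sorted(cand, key=lambda b: -b[2])[:width]   # keep best `width` beams
    return [([OPS[i] for i in seq[1:]], score) for seq, _, score in beams]

G, D = Generator(), Discriminator()
g_opt = torch.optim.Adam(G.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(D.parameters(), lr=1e-3)
# Stand-in for encodings of existing, known-good CNN architectures.
real = torch.randint(len(OPS), (BATCH, SEQ_LEN))

for step in range(3):                                       # a few illustrative steps
    # Discriminator update: real architectures -> 1, generated -> 0.
    fake, _ = G.sample(BATCH)
    d_loss = (F.binary_cross_entropy(D(real), torch.ones(BATCH))
              + F.binary_cross_entropy(D(fake), torch.zeros(BATCH)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: REINFORCE with the discriminator score as reward,
    # since sampling discrete tokens blocks ordinary backpropagation.
    fake, logp = G.sample(BATCH)
    reward = D(fake).detach()
    g_loss = -(logp.sum(1) * (reward - reward.mean())).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

for arch, score in beam_search(G):
    print(f"{score:.2f}", arch)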
Keywords ★ Deep Learning (深度學習)
★ Generative Adversarial Networks (生成對抗網路)
★ Neural Architecture Search (神經架構搜索)
Table of Contents
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
2. Related Works
2.1 Convolutional Neural Network (CNN)
2.2 Neural Architecture Search (NAS)
2.3 Generative Adversarial Network (GAN)
3. Proposed Method
3.1 Model Encoding and Predictor
3.2 OuterGAN
3.3 InnerGAN
4. Experiments and Evaluation
4.1 Baselines and Metrics
4.2 Performance Comparison
4.3 Memory Influence Discussion
4.4 Predictor Architecture Analysis
4.5 Ablation Study
4.6 Parameter Settings
4.7 Case Study
5. Conclusion
References
Advisor: Yi-Cheng Chen (陳以錚)    Review Date: 2023-07-20
