Graduate Thesis 110522113: Detailed Record




Author: Yu-Hsiu Chang (張毓修)    Department: Computer Science and Information Engineering
Thesis Title: A Neural Architecture Search Method Based on a Targeted Training Strategy and Strong Predictors
(HTTP-NAS: Highly Targeted Training Strategy with Strong Predictors-based Method for Neural Architecture Search)
Related theses:
★ A Novel Lightweight Object Detection System for Edge Computing
★ A 3D Point Cloud Classification Network Based on Self-Attention and Fitted-Plane-Aware Local Geometry
★ Enhancing the Performance of Transformer-Based Object Detection with an ε-greedy Strategy
File: full text viewable in the system after 2028-7-11 (embargoed)
Abstract (Chinese): As is well known, designing a neural network architecture requires a great deal of manual effort, which has driven the development of Neural Architecture Search (NAS). However, training and validating every candidate architecture takes an enormous amount of time, so finding the best-performing architecture at the lowest possible time cost is a key measure of merit in NAS research. Recent work adopts iterative training strategies (e.g., BRP-NAS, WeakNAS) or combines them with zero-cost proxies (e.g., ProxyBO) so that high-performing architectures are preferentially selected to train the predictor, and such predictors have indeed proven stronger than predictors trained on randomly sampled architectures under the same budget. This motivated a further conjecture: within an iterative training strategy, if only a subset of high-scoring architectures is kept to train the predictor under the same training budget, will the result be stronger than a predictor trained on the entire budget? We ran a series of experiments that confirmed this conjecture, with remarkably strong results, and combined the finding with the iterative training strategy to propose the Highly Targeted Training Strategy (HTTS). On the predictor side, we analyze and optimize the strong predictor architecture based on the bidirectional graph convolutional network (Bi-GCN) used in predictor-based NAS. In this thesis we propose a stronger predictor, Fully-BiGCN, which greatly strengthens the predictor's attention to the features of every layer. Pairing the Fully-BiGCN predictor with HTTS, we develop a new NAS method: HTTP-NAS. Compared with the current state of the art in predictor-based NAS (WeakNAS) on the NAS-Bench-201 benchmark, the predictor finds the globally optimal architecture with only 27.1% (CIFAR-10), 49.0% (CIFAR-100), and 51.75% (ImageNet16-120) of WeakNAS's training budget.
Abstract (English): As we know, designing a neural network architecture requires a significant amount of manual effort, which has spurred the development of Neural Architecture Search (NAS). However, training and evaluating each candidate architecture requires a tremendous amount of time, so finding the best-performing neural network architecture at minimal computation cost is a crucial objective in NAS research. Recently, researchers have adopted iterative training strategies (e.g., BRP-NAS, WeakNAS) or combined them with zero-cost proxies (e.g., ProxyBO) to steer predictor training toward high-performance architectures; such predictors have been observed to outperform predictors trained on randomly sampled architectures at the same cost. This leads to a hypothesis: if we train a predictor on only a subset of high-scoring architectures within the same training budget, will it be stronger than a predictor trained on the entire budget? We conducted a series of experiments that validate this hypothesis and show significant improvements. Combining this finding with the iterative training strategy, we propose the Highly Targeted Training Strategy (HTTS). On the predictor side, we analyze and optimize the strong predictor architecture based on the Bidirectional Graph Convolutional Network (Bi-GCN) in the field of predictor-based NAS. In this thesis, we propose a more powerful predictor called Fully-BiGCN, which significantly strengthens the predictor's emphasis on each layer's features. Combining the Fully-BiGCN predictor with HTTS yields a new NAS method, HTTP-NAS. HTTP-NAS achieves remarkable results compared with the state of the art in predictor-based NAS (WeakNAS): on the NAS-Bench-201 benchmark, it requires only 27.1% (CIFAR-10), 49.0% (CIFAR-100), and 51.75% (ImageNet16-120) of WeakNAS's training cost to find the globally optimal architecture.
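The abstracts describe HTTS only at a high level: grow a pool of evaluated architectures iteratively, but fit the predictor on only the top-scoring fraction of that pool. As a rough illustration of that idea, a minimal sketch under stated assumptions follows. Everything in it (the toy search space, the synthetic accuracy function, the nearest-neighbour stand-in predictor, and names such as highly_targeted_search and top_fraction) is a hypothetical reconstruction from the abstract, not the thesis's actual implementation.

```python
import random

random.seed(0)

# Toy search space: each "architecture" is a tuple of layer widths, and
# its measured "accuracy" is a synthetic function we pretend is expensive.
SEARCH_SPACE = [tuple(random.choices([16, 32, 64, 128], k=4)) for _ in range(500)]

def measure_accuracy(arch):
    # Stand-in for fully training and evaluating one architecture.
    return sum(arch) / 512 + random.gauss(0, 0.02)

def fit_predictor(archs, scores):
    # Stand-in for training the performance predictor (e.g., a GCN).
    # Here: 1-nearest-neighbour on layer widths, purely for illustration.
    def predict(arch):
        nearest = min(archs, key=lambda a: sum((x - y) ** 2 for x, y in zip(a, arch)))
        return scores[archs.index(nearest)]
    return predict

def highly_targeted_search(space, budget=50, iters=5, top_fraction=0.5):
    per_iter = budget // iters
    # Bootstrap: spend the first budget slice on a random sample.
    evaluated = {a: measure_accuracy(a) for a in random.sample(space, per_iter)}
    for _ in range(iters - 1):
        # Fit the predictor on only the top-scoring fraction of the
        # evaluated pool: the "highly targeted" part of the strategy.
        ranked = sorted(evaluated, key=evaluated.get, reverse=True)
        keep = ranked[: max(1, int(len(ranked) * top_fraction))]
        predict = fit_predictor(keep, [evaluated[a] for a in keep])
        # Spend the next budget slice on the predictor's top unevaluated picks.
        pool = [a for a in space if a not in evaluated]
        pool.sort(key=predict, reverse=True)
        for arch in pool[:per_iter]:
            evaluated[arch] = measure_accuracy(arch)
    return max(evaluated, key=evaluated.get)

print("best architecture found:", highly_targeted_search(SEARCH_SPACE))
```

The point of the sketch is the data selection, not the predictor: at a fixed evaluation budget, discarding low-scoring training examples concentrates the predictor's capacity on the high-accuracy region where the search actually needs to rank candidates.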
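Likewise, the abstracts say only that Fully-BiGCN builds on Bi-GCN (message passing along the cell's DAG edges in both directions) and raises the weight given to every layer's features. A minimal NumPy forward-pass sketch of one plausible reading, with untrained random weights and all names hypothetical, is shown below; the thesis's actual layer equations and readout may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(adj):
    # Row-normalized adjacency with self-loops: D^-1 (A + I).
    adj = adj + np.eye(adj.shape[0])
    return adj / adj.sum(axis=1, keepdims=True)

def bigcn_layer(h, a_fwd, a_bwd, w_fwd, w_bwd):
    # Bi-GCN style: propagate node features along the DAG's edges in both
    # directions (adjacency and its transpose) and merge the two messages.
    return np.maximum(a_fwd @ h @ w_fwd + a_bwd @ h @ w_bwd, 0.0)  # ReLU

def fully_bigcn_score(adj, feats, depth=3, hidden=16):
    a_fwd, a_bwd = normalize(adj), normalize(adj.T)
    h, per_layer = feats, []
    dims = [feats.shape[1]] + [hidden] * depth
    for d_in, d_out in zip(dims, dims[1:]):
        w_f = rng.normal(0, 0.1, (d_in, d_out))
        w_b = rng.normal(0, 0.1, (d_in, d_out))
        h = bigcn_layer(h, a_fwd, a_bwd, w_f, w_b)
        per_layer.append(h.mean(axis=0))  # graph-level readout per layer
    # "Fully": the score is computed from every layer's readout,
    # not just the final layer's, so no layer's features are discarded.
    z = np.concatenate(per_layer)
    return float(z @ rng.normal(0, 0.1, z.shape[0]))

# Toy cell: 4 nodes in a DAG, one-hot operation features per node.
adj = np.array([[0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 0, 0]], dtype=float)
print("predicted score:", fully_bigcn_score(adj, np.eye(4)))
```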
Keywords (Chinese): ★ Neural Architecture Search (神經網路架構搜索)    Keywords (English): ★ Neural Architecture Search
Table of Contents:
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
Chapter 2: Related Work
2-1 Neural Architecture Search (NAS)
2-2 Bayesian-Optimization-based NAS (BO-based NAS)
2-3 Predictor-based NAS
Chapter 3: Methodology
3-1 Hypothesis and Experiments on Targeted Training
3-2 Highly Targeted Training Strategy (HTTS)
3-3 Fully-BiGCN Predictor
3-4 Variants of HTTP-NAS
Chapter 4: Experimental Results and Discussion
4-1 Experimental Environment and Hyperparameter Settings
4-2 Datasets
4-2-1 NAS-Bench-101: Towards Reproducible Neural Architecture Search
4-2-2 NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search
4-3 Experimental Results
Chapter 5: Conclusion
5-1 Conclusion
5-2 Future Work
References
References:
[1] Wei Wen, Hanxiao Liu, Yiran Chen, Hai Li, Gabriel Bender, and Pieter-Jan Kindermans. Neural predictor for neural architecture search. In European Conference on Computer Vision, pages 660–676. Springer, 2020.
[2] Lukasz Dudziak, Thomas Chau, Mohamed Abdelfattah, Royson Lee, Hyeji Kim, and Nicholas Lane. BRP-NAS: Prediction-based NAS using GCNs. Advances in Neural Information Processing Systems, 33:10480–10490, 2020.
[3] Junru Wu, Xiyang Dai, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Ye Yu, Zhangyang Wang, Zicheng Liu, Mei Chen, and Lu Yuan. Stronger NAS with weaker predictors. Advances in Neural Information Processing Systems, 34:28904–28918, 2021.
[4] Yu Shen, Yang Li, Jian Zheng, Wentao Zhang, Peng Yao, Jixiang Li, Sen Yang, Ji Liu, and Bin Cui. ProxyBO: Accelerating neural architecture search via Bayesian optimization with zero-cost proxies. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 9792–9801, 2023.
[5] Hidenori Tanaka, Daniel Kunin, Daniel L. Yamins, and Surya Ganguli. Pruning neural networks without any data by iteratively conserving synaptic flow. Advances in Neural Information Processing Systems, 33:6377–6389, 2020.
[6] Xuanyi Dong and Yi Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations (ICLR), 2020.
[7] Barret Zoph and Quoc V. Le. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016.
[8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[9] Gustav Larsson, Michael Maire, and Gregory Shakhnarovich. FractalNet: Ultra-deep neural networks without residuals. In International Conference on Learning Representations (ICLR), 2017.
[10] Sergey Zagoruyko and Nikos Komodakis. Wide residual networks. In The British Machine Vision Conference (BMVC), 2016.
[11] Golnaz Ghiasi, Tsung-Yi Lin, and Quoc V. Le. NAS-FPN: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7036–7045, 2019.
[12] Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. Neural architecture search: A survey. Journal of Machine Learning Research, 20:1–21, 2019.
[13] Chris Ying, Aaron Klein, Eric Christiansen, Esteban Real, Kevin Murphy, and Frank Hutter. NAS-Bench-101: Towards reproducible neural architecture search. In International Conference on Machine Learning, pages 7105–7114. PMLR, 2019.
[14] Julien Niklas Siems, Lucas Zimmer, Arber Zela, Jovita Lukasik, Margret Keuper, and Frank Hutter. NAS-Bench-301 and the case for surrogate benchmarks for neural architecture search. In International Conference on Learning Representations (ICLR), 2021.
[15] Abhinav Mehrotra, Alberto Gil C. P. Ramos, Sourav Bhattacharya, Łukasz Dudziak, Ravichander Vipperla, Thomas Chau, Mohamed S. Abdelfattah, Samin Ishtiaq, and Nicholas Donald Lane. NAS-Bench-ASR: Reproducible neural architecture search for speech recognition. In International Conference on Learning Representations (ICLR), 2021.
[16] Tony Duan, Avati Anand, Daisy Yi Ding, Khanh K. Thai, Sanjay Basu, Andrew Ng, and Alejandro Schuler. NGBoost: Natural gradient boosting for probabilistic prediction. In International Conference on Machine Learning, pages 2690–2700. PMLR, 2020.
[17] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
[18] Colin White, Willie Neiswanger, and Yash Savani. BANANAS: Bayesian optimization with neural architectures for neural architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 10293–10301, 2021.
[19] Namhoon Lee, Thalaiyasingam Ajanthan, and Philip H. S. Torr. SNIP: Single-shot network pruning based on connection sensitivity. In International Conference on Learning Representations (ICLR), 2019.
[20] Joe Mellor, Jack Turner, Amos Storkey, and Elliot J. Crowley. Neural architecture search without training. In International Conference on Machine Learning, pages 7588–7598. PMLR, 2021.
[21] H. M. Dipu Kabir. Reduction of class activation uncertainty with background information. In Computer Vision and Pattern Recognition (CVPR), 2023.
[22] Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V. Le. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4780–4789, 2019.
[23] Han Cai, Ligeng Zhu, and Song Han. ProxylessNAS: Direct neural architecture search on target task and hardware. In International Conference on Learning Representations (ICLR), 2019.
Advisors: Kuo-Chin Fan (范國清), Jun-Wei Hsieh (謝君偉)    Date of Approval: 2023-7-13
