Name: Yen-Lin Peng (彭彥霖)
Department: Computer Science and Information Engineering (資訊工程學系)
Thesis Title: Adaptive Inference and Dynamic Network Accumulation based on Associated Learning (基於關聯式學習的動態自適應推論及動態網路擴增)
- The author has agreed to make the electronic full text openly available immediately.
- The open-access electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
- Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Abstract (Chinese): Associated Learning (AL) modularizes a traditional multi-layer neural network into several smaller blocks, each with its own local objective. Because these objectives are mutually independent, AL can train the parameters of different layers simultaneously, which makes training more efficient. Although the AL architecture has been shown to match traditional neural networks on a variety of tasks, several of its advantages remain experimentally unverified.
The AL architecture allows the number of layers to grow dynamically: the parameters of newly added AL layers can be trained, while already-trained parameters are left unchanged, to reach better prediction accuracy. By contrast, dynamically growing the parameter count of a traditional neural network is very difficult.
In addition, the AL architecture reserves redundant shortcuts in each layer block; these shortcuts give the data flow multiple paths to choose from at inference time.
This thesis investigates AL's Dynamic Layer Accumulation, Early Exit, and Adaptive Inference properties, implements improved versions of AL, and compares various AL inference methods.
We further propose an architecture that dynamically adds training features, allowing appended AL layers to receive features that the original AL layers lacked, for better training results. Our experiments use a variety of classic RNN and CNN models as backbone networks for the AL architecture and evaluate them on public text classification and image classification datasets.

Abstract (English): Associated Learning (AL) modularizes traditional multi-layer neural networks into smaller blocks, each with its own local objective. These independent objectives enable AL to train the parameters of different layers simultaneously and improve training efficiency. Despite achieving performance comparable to traditional neural networks on various tasks, AL possesses several unexplored advantages.
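The block-local training idea can be illustrated with a deliberately tiny, pure-Python sketch (illustrative names only, not the thesis implementation): each block owns its parameter and its own local objective, so updates never require a global backward pass and could in principle run in parallel.

```python
# Minimal sketch of block-local objectives (toy one-parameter "blocks",
# not the thesis code): each block minimizes its own local loss, and no
# gradient flows between blocks.

class ALBlock:
    def __init__(self, weight=1.0):
        self.weight = weight          # the block's only parameter

    def forward(self, x):
        return self.weight * x        # toy "layer"

    def local_update(self, x, target, lr=0.1):
        # gradient of the local loss (w*x - t)^2 w.r.t. w is 2*(w*x - t)*x
        grad = 2 * (self.forward(x) - target) * x
        self.weight -= lr * grad

blocks = [ALBlock(), ALBlock()]
# Each block fits its own local target independently; an update never
# touches another block's parameter, so the loops could be parallelized.
for _ in range(200):
    blocks[0].local_update(x=1.0, target=2.0)
    blocks[1].local_update(x=2.0, target=6.0)

print(round(blocks[0].weight, 2))  # ≈ 2.0
print(round(blocks[1].weight, 2))  # ≈ 3.0
```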
The AL framework allows dynamic layer stacking, enabling the addition of AL layers without modifying the already trained parameters. This approach focuses on training the parameters of the newly added AL layers to achieve better prediction accuracy. In contrast, dynamically increasing the parameter size in traditional neural networks is challenging.
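As a rough sketch of this layer-stacking idea (toy one-parameter "blocks" with hypothetical names, not the thesis code): the already-trained prefix is used only for forward passes, while gradient updates touch the newly appended block alone.

```python
# Sketch of dynamic layer accumulation: append a block to a trained
# model and train only the new block's parameter; earlier weights are
# frozen (never updated).

class Block:
    def __init__(self, w=0.5):
        self.w = w

    def forward(self, x):
        return self.w * x

def predict(blocks, x):
    for b in blocks:
        x = b.forward(x)
    return x

def train_last_block(blocks, x, target, lr=0.05, steps=300):
    *frozen, new = blocks
    h = predict(frozen, x)            # frozen prefix: forward pass only
    for _ in range(steps):            # gradient of (w*h - t)^2 w.r.t. w
        grad = 2 * (new.w * h - target) * h
        new.w -= lr * grad            # only the new block's weight moves

model = [Block(w=2.0)]                # pretend this block is already trained
model.append(Block())                 # dynamically accumulated layer
train_last_block(model, x=1.0, target=6.0)
print(round(model[0].w, 2), round(model[1].w, 2))  # 2.0 3.0
```

Because the frozen prefix is reused as-is, accuracy can be improved by paying only for the new block's training.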
Furthermore, the AL architecture incorporates redundant shortcuts at each layer block, providing multiple paths for data flow during the inference stage.
This thesis explores the characteristics of AL, including Dynamic Layer Accumulation, Early Exit, and Adaptive Inference; it implements improved versions of AL and compares various AL inference methods.
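A minimal sketch of threshold-based adaptive inference (the confidence-threshold mechanism is assumed here by analogy with early-exit networks; names are illustrative): each block's shortcut yields a prediction with a confidence score, and inference stops at the first block that clears the threshold.

```python
# Sketch of adaptive inference with early exit: shallow blocks answer
# easy samples via their shortcuts; hard samples fall through to the
# full path.

def adaptive_inference(block_outputs, threshold=0.9):
    """block_outputs: per-block (label, confidence) pairs, shallow first.
    Returns the chosen label and how many blocks were evaluated."""
    for depth, (label, confidence) in enumerate(block_outputs, start=1):
        if confidence >= threshold:
            return label, depth       # early exit via the shortcut
    return block_outputs[-1][0], len(block_outputs)  # full-path fallback

# "Easy" sample: the first block is already confident -> exit at depth 1.
easy = [("cat", 0.97), ("cat", 0.99), ("cat", 0.99)]
# "Hard" sample: no block clears the threshold -> full path, depth 3.
hard = [("cat", 0.40), ("dog", 0.55), ("dog", 0.80)]

print(adaptive_inference(easy))   # ('cat', 1)
print(adaptive_inference(hard))   # ('dog', 3)
```

Raising the threshold trades inference time for accuracy: more samples travel the full path, fewer exit early.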
We also propose a framework for dynamically adding training features, allowing the appended AL layers to receive features not present in the original AL layers. Because the design incorporates new features without retraining the entire network, it improves training effectiveness in dynamic environments where new features appear over time.
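A toy illustration of the idea (hypothetical names; not the thesis architecture): the appended block consumes both the frozen original block's output and features that did not exist when the original block was trained.

```python
# Sketch of dynamic feature addition: the original block keeps seeing
# only the base features, while the appended block mixes in newly
# available extra features.

class OriginalBlock:
    def __init__(self):
        self.w = 1.5                  # pretend this was already trained

    def forward(self, base_features):
        return self.w * sum(base_features)

class AppendedBlock:
    def __init__(self, n_extra):
        # one weight for the original block's output, plus one weight
        # per newly available feature
        self.w_prev = 1.0
        self.w_extra = [0.0] * n_extra

    def forward(self, prev_out, extra_features):
        return self.w_prev * prev_out + sum(
            w * f for w, f in zip(self.w_extra, extra_features))

base = OriginalBlock()
extended = AppendedBlock(n_extra=2)
extended.w_extra = [0.5, -0.25]       # imagine these were just trained

prev = base.forward([1.0, 2.0])       # original pipeline, base features only
out = extended.forward(prev, extra_features=[4.0, 2.0])
print(out)  # 4.5 + 2.0 - 0.5 = 6.0
```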
Our experiments employ classic RNN and CNN models as the backbone networks for the AL architecture and conduct evaluations on publicly available text classification and image classification datasets.

Keywords
★ Associated Learning
★ Dynamic Neural Networks
★ Early Exit
★ Adaptive Inference
★ Dynamic Layer Accumulation

Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
1. Introduction
2. Related Work
2.1 Associated Learning
2.2 Dynamic Architectures
2.2.1 Early Exit
2.2.2 Dynamically Growing Networks
3. Models and Methods
3.1 Inference Paths in AL
3.1.1 Full-path Inference
3.1.2 Shortcut Inference
3.1.3 Adaptive Inference
3.2 Dynamic Layer Addition
3.3 Dynamic Feature Addition
4. Experimental Results and Analysis
4.1 Experimental Setup and Implementation Details
4.1.1 Experimental Setup
4.1.2 Implementation Details
4.2 Early Exit and Dynamic Inference
4.2.1 Text Classification
4.2.2 Image Classification
4.2.3 Accuracy and Inference Time
4.3 Dynamically Adding Training Layers
4.3.1 Text Classification
4.3.2 Image Classification
4.4 Dynamically Adding Training Features
4.4.1 Text Classification
4.5 Discussion
4.5.1 FLOPs per Layer
4.5.2 Sample Distribution across Layers
4.5.3 Threshold Setting for Adaptive Inference
5. Conclusion
5.1 Conclusions
5.2 Future Work
References
Appendix A: Experiment Code
Advisor: Hung-Hsuan Chen (陳弘軒)  Review Date: 2023-07-25