NCU Institutional Repository (中大機構典藏) — theses and dissertations, past exam questions, journal articles, and research projects: Item 987654321/93270


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93270


    Title: 基於關聯式學習的動態自適應推論及動態網路擴增;Adaptive Inference and Dynamic Network Accumulation based on Associated Learning
    Authors: 彭彥霖;Peng, Yen-Lin
    Contributors: Department of Computer Science and Information Engineering (資訊工程學系)
    Keywords: 關聯式學習;動態網路;提早輸出;動態推論;動態擴增模型;Associated Learning;Dynamic Neural Networks;Early Exit;Adaptive Inference;Dynamic Layer Accumulation
    Date: 2023-07-25
    Issue Date: 2024-09-19 16:51:20 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract: Associated Learning (AL) modularizes traditional multi-layer neural networks into smaller blocks, each with its own local objective. These independent objectives enable AL to train the parameters of different layers simultaneously, improving training efficiency. Despite achieving performance comparable to traditional neural networks on various tasks, AL possesses several unexplored advantages.
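The abstract does not give an implementation; the following is a minimal NumPy sketch (all class and variable names are hypothetical, not from the thesis) of the key property it describes: each block optimizes its own local loss, so no gradient crosses block boundaries and the per-block updates could run in parallel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two AL-style blocks. Each block has a forward transform AND a local
# "bridge" that tries to predict the target from the block's own output,
# so no gradient flows between blocks.
class LocalBlock:
    def __init__(self, d_in, d_out, d_target):
        self.W = rng.normal(0, 0.1, (d_in, d_out))      # forward transform
        self.B = rng.normal(0, 0.1, (d_out, d_target))  # local bridge to target

    def forward(self, x):
        return x @ self.W

    def local_step(self, x, y, lr=0.1):
        # One gradient step on the block's OWN loss ||x W B - y||^2;
        # returns the loss measured before the update.
        h = x @ self.W
        err = h @ self.B - y
        gW = x.T @ (err @ self.B.T) / len(x)
        gB = h.T @ err / len(x)
        self.W -= lr * gW
        self.B -= lr * gB
        return float((err ** 2).mean())

x = rng.normal(size=(32, 8))
y = rng.normal(size=(32, 2))

b1 = LocalBlock(8, 6, 2)
b2 = LocalBlock(6, 4, 2)

# Because each loss is local, the two updates are independent given the
# activations; here we just show both local losses decreasing.
loss1_before = b1.local_step(x, y)
loss2_before = b2.local_step(b1.forward(x), y)
for _ in range(50):
    loss1_after = b1.local_step(x, y)
    loss2_after = b2.local_step(b1.forward(x), y)
```

Note that `b2` only ever sees `b1.forward(x)` as a plain array, mirroring how AL detaches activations between blocks.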
    The AL framework allows dynamic layer stacking, enabling the addition of AL layers without modifying the already trained parameters. This approach focuses on training the parameters of the newly added AL layers to achieve better prediction accuracy. In contrast, dynamically increasing the parameter size in traditional neural networks is challenging.
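As a rough sketch of the dynamic-stacking idea (a simplified stand-in, not the thesis's exact procedure), a new block with its own local bridge can be fitted on top of an earlier block that is used only as a frozen feature extractor:

```python
import numpy as np

rng = np.random.default_rng(1)

def train_block(x, y, d_out, steps=100, lr=0.1):
    """Fit one linear block plus its local bridge to the target."""
    W = rng.normal(0, 0.1, (x.shape[1], d_out))
    B = rng.normal(0, 0.1, (d_out, y.shape[1]))
    for _ in range(steps):
        h = x @ W
        err = h @ B - y
        W -= lr * (x.T @ (err @ B.T)) / len(x)
        B -= lr * (h.T @ err) / len(x)
    return W, B

x = rng.normal(size=(64, 10))
y = rng.normal(size=(64, 3))

# Stage 1: train the first block.
W1, B1 = train_block(x, y, d_out=8)
W1_snapshot = W1.copy()

# Stage 2: dynamically accumulate a new block. Only the new parameters
# are trained; the earlier block serves as a frozen feature extractor.
h1 = x @ W1
W2, B2 = train_block(h1, y, d_out=6)

assert np.array_equal(W1, W1_snapshot)  # earlier parameters untouched
```

The frozen-parameter check at the end is the point of the sketch: growing the model never revisits already-trained weights, which is what makes this hard to replicate with end-to-end backpropagation.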
    Furthermore, the AL architecture incorporates redundant shortcuts at each layer block, providing multiple paths for data flow during the inference stage.
    This thesis explores the characteristics of AL, including Dynamic Layer Accumulation, Early Exit, and Adaptive Inference; it implements improved versions of AL and compares various AL inference methods.
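Because every block carries a bridge to the output, each depth offers an exit point. A minimal sketch of confidence-thresholded early exit (names and the softmax-confidence criterion are illustrative assumptions, not necessarily the thesis's exact policy):

```python
import numpy as np

def adaptive_inference(blocks, bridges, x, threshold=0.9):
    """Early-exit inference: after each block, the local bridge produces a
    prediction; stop as soon as its confidence passes the threshold."""
    h = x
    for depth, (W, B) in enumerate(zip(blocks, bridges), start=1):
        h = np.tanh(h @ W)
        logits = h @ B
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        if probs.max() >= threshold:        # confident enough: exit early
            return int(probs.argmax()), depth
    return int(probs.argmax()), depth       # fell through: use final exit

rng = np.random.default_rng(2)
blocks = [rng.normal(0, 0.5, (4, 4)) for _ in range(3)]
bridges = [rng.normal(0, 0.5, (4, 3)) for _ in range(3)]
x = rng.normal(size=4)

label_hi, depth_hi = adaptive_inference(blocks, bridges, x, threshold=0.99)
label_lo, depth_lo = adaptive_inference(blocks, bridges, x, threshold=0.0)
```

Lowering the threshold trades accuracy for compute: a threshold of 0 always exits at the first block, while a strict threshold pushes easy inputs out early and hard inputs through the full depth.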
    We further propose a framework for dynamically adding training features, allowing the appended AL layers to receive additional features not available to the original AL layers. Since the design can incorporate new features without retraining the entire network, it improves training effectiveness in a dynamic environment where new features may appear over time.
    Our experiments employ classic RNN and CNN models as the backbone networks for the AL architecture and conduct evaluations on publicly available text classification and image classification datasets.
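One plausible reading of feature accumulation (a sketch under our own assumptions, not the thesis's confirmed design) is that an appended block consumes the frozen representation of the old stack concatenated with the newly arrived raw features:

```python
import numpy as np

rng = np.random.default_rng(3)

x_old = rng.normal(size=(64, 8))    # features available at initial training
x_new = rng.normal(size=(64, 4))    # features that appear later

# Frozen original block (stands in for an already-trained AL stack).
W1 = rng.normal(0, 0.1, (8, 6))
h1 = np.tanh(x_old @ W1)

# The appended block consumes the frozen representation PLUS the new
# features, so new information enters without retraining W1.
h_aug = np.concatenate([h1, x_new], axis=1)   # shape (64, 6 + 4)
W2 = rng.normal(0, 0.1, (10, 5))
h2 = np.tanh(h_aug @ W2)
```

Only `W2` (and its local bridge, omitted here) would be trained, keeping the earlier stack untouched while the new features still reach the predictor.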
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File: index.html (HTML, 0 Kb)


    All items in NCUIR are protected by copyright, with all rights reserved.
