    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89943


    Title: Design and Implementation of Low-Power, Energy-Efficient Neural Network Training Hardware Accelerators Based on Brain Floating-Point Computing and Sparsity Awareness
    Author: Lin, Ding-Bang (林定邦)
    Contributor: Department of Electrical Engineering
    Keywords: Fully Connected Layers; AI Accelerator; Optimized Memory Access; Sparsity; Low Power
    Date: 2022-08-03
    Upload Date: 2022-10-04 12:05:23 (UTC+8)
    Publisher: National Central University
    Abstract: In recent years, with advances in technology and the arrival of the big-data era, deep learning has brought revolutionary progress to many fields. Techniques such as image pre-processing, image enhancement, face recognition, and speech recognition have gradually replaced traditional algorithms, showing that the rise of neural networks has driven the transformation of artificial intelligence in these areas. However, GPUs are costly, which makes GPU-based products expensive, and their high power consumption leads to low energy efficiency when running neural network inference. Because neural network algorithms are computationally intensive, real-time operation requires hardware acceleration, which has motivated extensive research in recent years on digital circuit hardware design for accelerating deep neural networks.
    In this thesis, we propose an efficient and flexible training processor named EESA. The proposed training processor features low power consumption, high throughput, and high energy efficiency. EESA exploits the sparsity of neuron activations to reduce both the number of memory accesses and the memory storage space, realizing an efficient training accelerator. The processor uses a novel reconfigurable computing architecture that maintains high performance during both the forward-propagation (FP) and backward-propagation (BP) passes. Implemented in a TSMC 40 nm process, the chip operates at 294 MHz and consumes 87.12 mW at a 0.9 V core voltage. All numerical computation uses the 16-bit brain floating-point (bfloat16) format, and the processor achieves an energy efficiency of 1.72 TOPS/W.
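
    The abstract pairs two mechanisms: 16-bit brain floating-point (bfloat16) arithmetic and sparsity-aware handling of neuron activations. As a minimal software sketch only (assuming an illustrative (index, value) pair layout that is not the processor's documented memory format, with hypothetical function names), the Python below shows both ideas: bfloat16 keeps float32's sign bit and 8 exponent bits but only the top 7 mantissa bits, and post-ReLU activations are stored compressed so that zero entries cost neither storage nor memory accesses.

    # Minimal software sketch of bfloat16 conversion and sparse activation
    # storage; illustrative only, not the EESA hardware's actual datapath.
    import struct

    def f32_to_bf16(x: float) -> int:
        # Keep sign + 8 exponent bits + top 7 mantissa bits of float32,
        # rounding the dropped low 16 bits to nearest even.
        bits = struct.unpack('<I', struct.pack('<f', x))[0]
        bias = 0x7FFF + ((bits >> 16) & 1)
        return ((bits + bias) >> 16) & 0xFFFF

    def bf16_to_f32(h: int) -> float:
        # Re-expand a bfloat16 bit pattern by zero-padding the low 16 bits.
        return struct.unpack('<f', struct.pack('<I', (h & 0xFFFF) << 16))[0]

    def compress_activations(acts):
        # Keep only nonzero activations as (index, bfloat16) pairs, so
        # zeros consume no storage and trigger no memory traffic.
        return [(i, f32_to_bf16(a)) for i, a in enumerate(acts) if a != 0.0]

    if __name__ == '__main__':
        acts = [0.0, 1.5, 0.0, 0.0, 0.40625, 0.0, 2.75, 0.0]  # post-ReLU, mostly zero
        packed = compress_activations(acts)
        print(packed)  # 3 (index, value) pairs instead of 8 dense words
        print([(i, bf16_to_f32(h)) for i, h in packed])

    The rationale for bfloat16 in training is that it preserves float32's exponent range, which matters for the widely varying gradient magnitudes seen during backpropagation, while halving storage and datapath width; the sparse encoding pays off because post-ReLU activation maps are typically majority-zero.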
    Appears in Collections: [Graduate Institute of Electrical Engineering] Master's and Doctoral Theses

    Files in This Item:

    File          Description    Size    Format    Views
    index.html                   0 KB    HTML      41

    All items in NCUIR are protected by copyright.
