NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, research projects, and more available for download): Item 987654321/95335


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/95335


    Title: 採用潛在擴散模型的影像去模糊;Image Deblurring Using Latent Diffusion Models
    Authors: 吳政佳;Wu, Cheng-Chia
    Contributors: Department of Communication Engineering
    Keywords: Image deblurring; diffusion model; pre-trained model; prompt tuning
    Date: 2024-07-22
    Issue Date: 2024-10-09 16:40:43 (UTC+8)
    Publisher: National Central University
    Abstract: Most image deblurring methods are trained with pixel-level loss functions that reduce the distortion between the restored result and the ground truth. However, such methods overlook human perception of image quality, so the restored results lack detail. Diffusion models, which have been highly successful in image synthesis, have recently been applied to image deblurring as well. Although existing diffusion-based deblurring methods improve perceptual quality, they require more computation or processing time during inference. This thesis proposes a method that uses a large pre-trained latent diffusion model to enhance an existing image deblurring network (e.g., FFTformer). The latent diffusion model is used only during training, where it improves the perceptual quality of the deblurring network's output. Before it is used, the pre-trained latent diffusion model is adapted to the deblurring network with the prompt tuning method proposed in this thesis. Compared with fine-tuning the entire model, prompt tuning trains fewer parameters and effectively preserves the prior knowledge that the latent diffusion model acquired during pre-training. On the GoPro dataset, compared with the original FFTformer, the proposed method lowers PSNR by 0.64 dB but improves the perceptual metrics: LPIPS decreases by 0.012, NIQE by 0.51, and FID by 0.63, while CLIP-IQA increases by 0.002 and CLIP-IQA^+ by 0.01.
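To give a sense of scale for the reported 0.64 dB PSNR change, the following is a minimal sketch that assumes the standard definition PSNR = 10 * log10(MAX^2 / MSE). The function names `psnr` and `mse_ratio_for_psnr_drop` and the pixel values are hypothetical illustrations, not code or data from the thesis, and the thesis's exact evaluation protocol is not stated on this page.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized pixel sequences.

    Assumes the standard definition 10 * log10(max_value^2 / MSE).
    """
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_psnr_drop(drop_db):
    """Factor by which MSE grows when PSNR falls by drop_db at a fixed peak value."""
    return 10.0 ** (drop_db / 10.0)


# Toy 4-pixel example with made-up values (not thesis data).
sharp = [52, 55, 61, 59]
restored = [54, 55, 60, 57]
print(f"PSNR of toy example: {psnr(sharp, restored):.2f} dB")
print(f"MSE ratio for a 0.64 dB drop: {mse_ratio_for_psnr_drop(0.64):.3f}")
```

Under this definition, a 0.64 dB drop in PSNR corresponds to a mean squared error roughly 16% higher, which is the distortion cost the abstract sets against the gains in LPIPS, NIQE, FID, and CLIP-IQA.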
    Appears in Collections:[Graduate Institute of Communication Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File         Description   Size   Format
    index.html                 0Kb    HTML


    All items in NCUIR are protected by copyright, with all rights reserved.
