NCU Institutional Repository (中大機構典藏): Item 987654321/95335


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/95335


    Title: 採用潛在擴散模型的影像去模糊; Image Deblurring Using Latent Diffusion Models
    Author: 吳政佳; Wu, Cheng-Chia
    Contributor: Department of Communication Engineering
    Keywords: Image deblurring; diffusion model; pre-trained model; prompt tuning
    Date: 2024-07-22
    Upload time: 2024-10-09 16:40:43 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract: In image deblurring, most methods use pixel-level loss functions to reduce the distortion between the restored result and the ground truth. However, such methods overlook human perception of image quality, which leaves the restored results lacking in detail. Recently, diffusion models, which have achieved impressive success in image synthesis, have also been applied to image deblurring. Although existing diffusion-based deblurring methods can address the perception issue, they require more computation or processing time during inference. In this thesis, we propose a method that employs a pre-trained large latent diffusion model to enhance an existing image deblurring network (e.g., FFTformer). The latent diffusion model is used only during training, where it improves the perceptual quality of the deblurring network's output. The pre-trained latent diffusion model is first adapted to aid the deblurring network through the prompt tuning method proposed in this thesis. Compared with fine-tuning the entire model, this requires fewer trainable parameters and effectively preserves the prior knowledge that the latent diffusion model acquired during pre-training. On the GoPro dataset, compared with the original FFTformer, the proposed method lowers PSNR by 0.64 dB but improves perceptual metrics: LPIPS decreases by 0.012, NIQE by 0.51, and FID by 0.63, while CLIP-IQA increases by 0.002 and CLIP-IQA^+ by 0.01.
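The abstract's contrast between prompt tuning and full fine-tuning can be illustrated with a toy sketch. This is not the thesis's implementation: the frozen linear map standing in for the pre-trained model, the additive prompt vector, and every name below (`make_frozen_weights`, `prompt_tuning_step`, and so on) are assumptions made purely for illustration. It only shows the general idea that the pre-trained weights stay untouched while a much smaller set of prompt parameters is trained.

```python
import random


def make_frozen_weights(dim, seed=0):
    """Hypothetical stand-in for pre-trained weights; never updated below."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]


def forward(weights, prompt, x):
    """Frozen linear map applied to the input shifted by the learnable prompt."""
    z = [xi + pi for xi, pi in zip(x, prompt)]
    return [sum(w * zj for w, zj in zip(row, z)) for row in weights]


def loss(weights, prompt, x, target):
    """Half squared error between the model output and the target."""
    y = forward(weights, prompt, x)
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def prompt_tuning_step(weights, prompt, x, target, lr):
    """One gradient step on the prompt only; weights are read, not written."""
    y = forward(weights, prompt, x)
    err = [yi - ti for yi, ti in zip(y, target)]
    # d(loss)/d(prompt) = W^T (y - target)
    grad = [sum(weights[i][j] * err[i] for i in range(len(err)))
            for j in range(len(prompt))]
    return [p - lr * g for p, g in zip(prompt, grad)]


dim = 4
weights = make_frozen_weights(dim)
weights_before = [row[:] for row in weights]  # copy to confirm nothing changes
x = [1.0, -0.5, 0.25, 2.0]
target = forward(weights, [0.3, -0.2, 0.1, 0.4], x)  # reachable by some prompt
prompt = [0.0] * dim

initial_loss = loss(weights, prompt, x, target)
for _ in range(200):
    prompt = prompt_tuning_step(weights, prompt, x, target, lr=0.01)
final_loss = loss(weights, prompt, x, target)

trainable_prompt = len(prompt)                   # dim parameters
trainable_full = sum(len(r) for r in weights)    # dim * dim parameters
```

In this toy setting only `dim` prompt values are trained, against `dim * dim` weights for full fine-tuning, and the weights are identical before and after training. In a real latent diffusion model the gap in parameter count would be far larger.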
    Appears in collections: [Graduate Institute of Communication Engineering] Electronic Thesis & Dissertation

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    12


    All items in NCUIR are protected by original copyright.
