Despite the powerful generative capabilities of diffusion models, how they use training data during generation remains largely opaque. To address this, we propose a new explainability framework that analyzes the latent trajectories of input samples along the model’s internal reverse diffusion process to identify how well each sample aligns semantically with the distribution the model has learned. We define a set of anchor samples and extract representations at different timesteps from an intermediate layer of the U-Net to construct trajectories in latent space. By comparing these trajectories with those of the model’s generated outputs using intrinsic and extrinsic distance measures, we infer how familiar the model is with each anchor. The method yields interpretability measures from both an internal and an external perspective and supports post hoc attribution of training data without access to the original training set or additional supervision. We validate the effectiveness of the method on models trained on LSUN-bedroom and ImageNet, and further extend the analysis to Consistency Models, demonstrating the scalability and generality of our explanation framework. This work reveals intrinsic properties of the semantic structure learned by diffusion models and offers a new direction for the data attribution problem in generative AI.

Despite their remarkable generative capabilities, diffusion models remain largely opaque with respect to how they utilize training data during generation. In this work, we propose a novel explainability framework that analyzes the internal reverse diffusion trajectories of input samples to identify their semantic alignment with the model’s learned distribution. We define a set of anchor samples and extract intermediate representations from the bottleneck of the U-Net across all timesteps to form latent trajectories. By comparing these trajectories to those of generated outputs using both intrinsic (latent-space) and extrinsic (output-space) distance metrics, we infer the degree of model familiarity for each anchor. Our method provides a dual-perspective interpretability measure and enables post hoc identification of training-data influence without requiring access to the original training set or additional supervision. We demonstrate the effectiveness of our approach on models trained on CIFAR-10 and ImageNet, and further extend our analysis to one-step Consistency Models, highlighting the scalability and generality of our interpretability framework. Our findings offer insights into the semantic structure learned by diffusion models and open new directions for data attribution in generative AI.
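As an illustration of the trajectory-based comparison described above, the sketch below collects bottleneck features along the reverse process and measures an intrinsic (latent-space) and an extrinsic (output-space) distance. It is a minimal sketch, assuming a diffusers-style scheduler interface and a hypothetical `capture_bottleneck` helper (e.g., a forward hook on the U-Net mid-block); it is not the exact implementation used in this work.

```python
# Minimal sketch of the trajectory-comparison idea (illustrative, not the thesis implementation).
# Assumptions: a diffusers-style scheduler exposing `timesteps` and `step(...).prev_sample`,
# and a hypothetical `capture_bottleneck(unet, x_t, t)` helper that returns the noise
# prediction together with the U-Net bottleneck activation at timestep t.
import torch


def latent_trajectory(unet, scheduler, x_T, capture_bottleneck):
    """Run the reverse process from noise x_T; record one bottleneck feature per timestep."""
    x_t = x_T
    feats = []
    for t in scheduler.timesteps:  # e.g. T-1, ..., 0
        eps_pred, bottleneck = capture_bottleneck(unet, x_t, t)
        feats.append(bottleneck.flatten(start_dim=1))        # (B, D) feature per timestep
        x_t = scheduler.step(eps_pred, t, x_t).prev_sample   # one reverse-diffusion step
    return torch.stack(feats), x_t                           # (T, B, D) trajectory, final sample


def intrinsic_distance(traj_a, traj_b):
    """Latent-space view: mean per-timestep L2 distance between two trajectories."""
    return (traj_a - traj_b).norm(dim=-1).mean()


def extrinsic_distance(x0_a, x0_b):
    """Output-space view: L2 distance between the final denoised samples."""
    return (x0_a - x0_b).flatten(start_dim=1).norm(dim=-1).mean()
```

Under this sketch, an anchor whose trajectory stays close to those of generated samples under both distances would be read as more familiar to the model, mirroring the dual-perspective measure described in the abstract.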