

Please use this persistent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/95504


Title: 以大型多模態語言模型進行穿搭推薦 (Using Large Multi-modality Language Model for Outfit Recommendation)
Author: 姜道宣 (Chiang, Tao-Shuan)
Contributor: Department of Information Management (資訊管理學系)
Keywords: Large Multi-modal Models; Large Language Models; Outfit Compatibility; Outfit Recommendation
Date: 2024-07-16
Upload time: 2024-10-09 16:54:36 (UTC+8)
Publisher: National Central University (國立中央大學)
Abstract: Outfit coordination is the most direct way for people to express themselves. However, judging the compatibility between tops and bottoms requires weighing multiple factors such as color and style, a process that is time-consuming and prone to errors. In recent years, the development of large language models and large multi-modal models has transformed many application fields. This study explores how to leverage large multi-modal models to achieve breakthroughs in fashion outfit recommendation.
This research combines the keyword response text produced by the large language model Gemini on a Vision Question Answering (VQA) task with the deep feature fusion of the large multi-modal model BEiT-3. Given only clothing images, the system scores the compatibility of a top and a bottom, making it convenient for users. Our proposed model, Large Multi-modality Language Model for Outfit Recommendation (LMLMO), outperforms previously proposed models on the FashionVC and Evaluation3 datasets. Moreover, experimental results show that different types of keyword responses affect the model differently, offering new directions and insights for future research.
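The abstract describes a pipeline in which each garment image is paired with LLM-generated keyword text, the two are fused by a multimodal encoder, and the fused top and bottom embeddings are scored for compatibility. The sketch below is a rough PyTorch illustration of that kind of scoring pipeline only; the class and function names (MultimodalEncoder, CompatibilityHead, score_outfit) and the 768-dimensional features are assumptions made for this example, not the thesis's actual implementation.

```python
# Minimal, runnable sketch of a VQA-keyword + feature-fusion compatibility scorer.
# All names and dimensions here are illustrative assumptions, not LMLMO itself.

import torch
import torch.nn as nn


class MultimodalEncoder(nn.Module):
    """Stand-in for a BEiT-3-style encoder: fuses a garment's image features
    with text features derived from the LLM's VQA keyword answers."""

    def __init__(self, dim: int = 768):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, image_feat: torch.Tensor, text_feat: torch.Tensor) -> torch.Tensor:
        # Real deep fusion is far richer; a projected sum keeps the sketch short.
        return self.proj(image_feat + text_feat)


class CompatibilityHead(nn.Module):
    """Maps a fused top embedding and bottom embedding to a score in (0, 1)."""

    def __init__(self, dim: int = 768):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, top_emb: torch.Tensor, bottom_emb: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.mlp(torch.cat([top_emb, bottom_emb], dim=-1)))


def score_outfit(encoder, head, top_img, top_txt, bottom_img, bottom_txt):
    """Fuse each garment's image and keyword-text features, then score the pair."""
    return head(encoder(top_img, top_txt), encoder(bottom_img, bottom_txt))


if __name__ == "__main__":
    # Dummy 768-dim features; in practice they would come from an image backbone
    # and from embedding the keyword text returned by a VQA query to the LLM.
    enc, head = MultimodalEncoder(), CompatibilityHead()
    feats = [torch.randn(1, 768) for _ in range(4)]
    print(score_outfit(enc, head, *feats).item())
```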
Appears in Collections: [Graduate Institute of Information Management] Master's and Doctoral Theses

Files in This Item:

File          Description    Size    Format    Views
index.html                   0Kb     HTML      22


All items in NCUIR are protected by the original copyright.

