Automatic Nature Scene Image Generation with Time and Place Descriptions

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/81184

Title:	符合時間與場景描述之自動影像生成模型; Automatic Nature Scene Image Generation with Time and Place Descriptions
Author:	劉亞昇; Liu, Ya-Sheng
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Generative Adversarial Network (GAN); Image Generation; Attention; Imagination
Date:	2019-07-25
Upload time:	2019-09-03 15:38:44 (UTC+8)
Publisher:	National Central University
Abstract:	With the rapid development of artificial intelligence, machine learning has achieved excellent results in image recognition, semantic recognition, image generation, and other tasks. As the name suggests, "artificial intelligence" is intelligence created by humans. Having computers learn so that they acquire a certain capacity for logical judgment is what has been achieved so far, but viewed closely, the field has not yet reached true intelligence. This thesis aims to simulate the imaginative ability of the human brain in order to increase the diversity of a generative model. Text-to-image generation has already been studied in recent work such as StackGAN, StackGAN++, and AttnGAN, but those models were originally trained and optimized on the bird (CUB-200) and flower (102Flowers) datasets. When people imagine something, they usually attach a description to it, and the ultimate goal of this work is to use such a description to generate an image that tells a story. At the current stage, a scene dataset is collected so that the neural network can generate a scene image that matches the description, with improved diversity. To make the generated images more varied rather than limited to a few specific kinds, this thesis uses hidden-layer information from the image to initialize the RNN memory cell. Experimental results show that, compared with directly applying the original AttnGAN architecture, the proposed method does help to increase the diversity of the generated images.
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	202

All items in NCUIR are protected by the original copyright.
