Abstract (English) |
With the development of artificial intelligence in recent years, machine learning has been applied to a growing number of fields. Deep learning is the most prominent of these techniques and has become the mainstream of machine learning.
This paper focuses on the automatic generation of cartoon images and proposes a region clustering system for combined image generation. Most image generation models in recent papers are based on deep learning, such as the Generative Adversarial Network (GAN) and the Variational Autoencoder (VAE). These models generate images very well, but they usually require a large amount of training data, long computation times, and expensive computing equipment. As a result, the general public usually has to rely on others to train a single-category generation model and cannot freely create images in multiple categories according to personal preference.
The region clustering system proposed in this paper is intended for modular cartoon image creation. We use a pre-trained convolutional neural network to extract features from the regions of the input images, and then estimate the number of clusters from those features with a shallow network. Finally, we group the regions by unsupervised learning using that cluster number. Because the system uses a shallow neural network, its computational cost and data requirements are lower than those of deep learning, and it needs no labels. By reducing the need for training data, the image generation system can more easily achieve multi-category image generation. The experimental results show that the system can automatically estimate a suitable number of groups and obtain good grouping results. |
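The last two steps of the pipeline described in the abstract (estimating a cluster number, then grouping regions without labels) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the abstract: the region features are stood in for by toy 2-D vectors from a hypothetical `make_demo_features` helper rather than real CNN features; the cluster number is chosen with a silhouette score, which is swapped in here for the shallow-network estimate; and plain k-means with a farthest-point initialization is assumed as the unsupervised grouping step. All function names are hypothetical.

```python
import math


def make_demo_features():
    # Hypothetical stand-in for CNN region features: three tight 3x3 grids
    # of 2-D points around well-separated centers (9 points per group).
    offsets = (-0.5, 0.0, 0.5)
    blobs = ((0.0, 0.0), (10.0, 0.0), (0.0, 10.0))
    return [(cx + dx, cy + dy)
            for cx, cy in blobs for dx in offsets for dy in offsets]


def farthest_point_init(points, k):
    # Deterministic initialization: start from the first point, then keep
    # adding the point that is farthest from its nearest chosen center.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iters=100):
    # Plain Lloyd iterations: assign each point to its nearest center,
    # then move each center to the mean of its assigned points.
    centers = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(max_iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    # Mean silhouette value; points in singleton clusters contribute 0.
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        others = [c for key, c in clusters.items() if key != l]
        if len(own) < 2 or not others:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in c) / len(c) for c in others)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def estimate_and_cluster(points, k_max=5):
    # Try k = 2..k_max and keep the grouping with the highest silhouette.
    best = None
    for k in range(2, k_max + 1):
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best[1], best[2]
```

Calling `estimate_and_cluster(make_demo_features())` returns the selected cluster number together with one label per feature vector. In a real setting the toy vectors would be replaced by features extracted from image regions, and the silhouette criterion by whichever cluster-number estimator the system actually uses.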