dc.description.abstract | Scene recognition is an important part of computer vision. Current machine learning methods are much more effective than traditional processing methods. However, applying a neural network directly to classification often loses information about objects, spatial layout, and background, which results in poor classification. An important challenge in scene classification is therefore to capture the information about objects, spatial layout, and background, and to merge these features effectively to classify the scene.
The method proposed in this paper first performs semantic segmentation on the image. A neural network model then extracts features from the semantic segmentation image and from the original image separately. Next, an attention module fuses the semantic segmentation features with the original image features. Finally, the image is classified according to the fused features.
The experimental results show that our method achieves the best result on the Hotel Indoor Scene dataset. Furthermore, on the public 15-Scene dataset, our method outperforms existing methods. These results indicate that semantic segmentation captures the information about objects, spatial layout, and background, and that fusing features with the attention module improves accuracy in scene recognition. | en_US |