Towards Transparent Deep Image Aesthetics Assessment with Tag-based Content Descriptors

Title:
Towards Transparent Deep Image Aesthetics Assessment with Tag-based Content Descriptors
Journal Title:
IEEE Transactions on Image Processing
Publication Date:
30 August 2023
Citation:
Hou, J., Lin, W., Fang, Y., Wu, H., Chen, C., Liao, L., & Liu, W. (2023). Towards Transparent Deep Image Aesthetics Assessment with Tag-based Content Descriptors. IEEE Transactions on Image Processing, 1–1. https://doi.org/10.1109/tip.2023.3308852
Abstract:
Deep learning approaches for Image Aesthetics Assessment (IAA) have shown promising results in recent years, but the internal mechanisms of these models remain unclear. Previous studies have demonstrated that image aesthetics can be predicted using semantic features, such as pre-trained object classification features. However, these semantic features are learned implicitly, and therefore, previous works have not elucidated what the semantic features are representing. In this work, we aim to create a more transparent deep learning framework for IAA by introducing explainable semantic features. To achieve this, we propose Tag-based Content Descriptors (TCDs), where each value in a TCD describes the relevance of an image to a human-readable tag that refers to a specific type of image content. This allows us to build IAA models from explicit descriptions of image contents. We first propose the explicit matching process to produce TCDs that adopt predefined tags to describe image contents. We show that a simple MLP-based IAA model with TCDs based only on predefined tags can achieve an SRCC of 0.767, which is comparable to most state-of-the-art methods. However, predefined tags may not be sufficient to describe all possible image contents that the model may encounter. Therefore, we further propose the implicit matching process to describe image contents that cannot be described by predefined tags. By integrating components obtained from the implicit matching process into TCDs, the IAA model further achieves an SRCC of 0.817, which significantly outperforms existing IAA methods. Both the explicit matching process and the implicit matching process are realized by the proposed TCD generator. To evaluate the performance of the proposed TCD generator in matching images with predefined tags, we also labeled 5101 images with photography-related tags to form a validation set. Experimental results show that the proposed TCD generator can meaningfully assign photography-related tags to images.
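The abstract's setup — a simple MLP that maps a TCD vector (per-tag relevance scores) to an aesthetic score, evaluated with SRCC — can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the layer shapes, weight names, and the tie-free rank computation are all assumptions.

```python
import numpy as np

def srcc(pred, target):
    """Spearman rank correlation (the SRCC metric reported in the abstract).
    Minimal version: assumes no tied values, so ranks need no averaging."""
    def ranks(x):
        order = np.argsort(x)
        r = np.empty(len(x))
        r[order] = np.arange(len(x))
        return r
    rp, rt = ranks(pred), ranks(target)
    rp -= rp.mean()
    rt -= rt.mean()
    return float((rp @ rt) / np.sqrt((rp @ rp) * (rt @ rt)))

def mlp_score(tcd, w1, b1, w2, b2):
    """One-hidden-layer MLP mapping a TCD vector to a scalar aesthetic score.
    `tcd` holds one relevance value per predefined tag (hypothetical shape)."""
    h = np.maximum(tcd @ w1 + b1, 0.0)  # ReLU hidden layer
    return float(h @ w2 + b2)

# Toy usage: a 4-tag TCD pushed through randomly initialized weights.
rng = np.random.default_rng(0)
tcd = np.array([0.9, 0.1, 0.4, 0.0])        # e.g. relevance to 4 tags
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
score = mlp_score(tcd, w1, b1, w2, b2)
```

A perfectly monotone prediction yields an SRCC of 1.0, which is why the abstract's 0.767 and 0.817 figures can be read as rank agreement with human aesthetic ratings.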
License type:
Publisher Copyright
Funding Info:
This study is supported under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).
Description:
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
1941-0042 (electronic)
1057-7149 (print)
Files uploaded:

towards-transparent-deep-image-aesthetics-assessment-with-tag-based-content-descriptors-final-submission-7.pdf (8.95 MB, PDF)