S. Fan, T.-T. Ng, B. Koenig, M. Jiang, Q. Zhao, “A Paradigm for Building Generalized Models of Human Visual Perception through Data Fusion,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
Abstract:
In many sub-fields, researchers collect datasets of human ground truth that are used to create a new algorithm. For example, in research on image perception, datasets have been collected for topics such as what makes an image aesthetic or memorable. Despite high costs for human data collection, datasets are infrequently reused. Moreover, the algorithms built from them are domain-specific (predict a small set of attributes) and usually unconnected to one another. In this paper, we present a paradigm for building generalized and expandable models of human visual perception. First, we fuse multiple fragmented and partially-overlapping datasets through data imputation. We then create a theoretically structured statistical model of human visual perception that is fit to the fused datasets. The resulting model has many advantages. (1) It is generalized, going beyond the content of the constituent datasets, and can be easily expanded by fusing additional datasets. (2) It provides a new ontology usable as a network to expand human data in a cost-effective way. (3) It can guide the design of a generalized computational algorithm for multi-dimensional visual perception. Indeed, experimental results show that a model-based algorithm outperforms state-of-the-art methods on predicting visual sentiment, visual realism and interestingness. Our paradigm can be used in various disciplines.