Image instance retrieval pipelines compare vectors known as global image descriptors between a query image and the database images. While CNN-based descriptors generally achieve good retrieval performance, they exhibit a number of drawbacks, including a lack of robustness to common object transformations such as rotations, compared with their interest-point-based Fisher Vector counterparts. In this paper, we propose a method for computing invariant global descriptors from CNNs. Our method implements a recently proposed mathematical theory of invariance in a sensory cortex modeled as a feedforward neural network. The resulting global descriptors can be made invariant to multiple arbitrary transformation groups while retaining good discriminative power. Through a thorough empirical evaluation on several publicly available datasets, we show that our method significantly and consistently improves retrieval results each time a new type of invariance is incorporated.
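The core idea of group-invariant pooling can be illustrated with a minimal sketch: pool a descriptor over the orbit of a transformation group so that any group element applied to the input leaves the pooled vector unchanged. Everything below is illustrative only, not the paper's actual method; `toy_descriptor` is a hypothetical stand-in for a CNN feature extractor, and the group is restricted to 90-degree rotations so the orbit is finite.

```python
import numpy as np

def toy_descriptor(image):
    # Hypothetical stand-in for a CNN feature extractor:
    # per-column means (deliberately NOT rotation invariant on its own).
    return image.mean(axis=0)

def rotations(image):
    # Orbit of the image under the 90-degree rotation group.
    return [np.rot90(image, k) for k in range(4)]

def invariant_descriptor(image, extractor=toy_descriptor):
    # Group-average pooling: average the descriptor over the orbit.
    # Rotating the input only permutes the orbit, so the average
    # (and any other symmetric pooling, e.g. higher moments) is
    # invariant to the group action.
    descs = np.stack([extractor(t) for t in rotations(image)])
    return descs.mean(axis=0)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
d1 = invariant_descriptor(img)
d2 = invariant_descriptor(np.rot90(img))  # rotated query
assert np.allclose(d1, d2)  # same descriptor despite rotation
```

Replacing the mean with a set of moments over the orbit, as in the invariance theory the paper builds on, trades a little pooling simplicity for more discriminative power while keeping the same invariance guarantee.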