When Text and Images Don’t Mix: Bias-Correcting Language-Image Similarity Scores for Anomaly Detection

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/21445

Journal Title:

British Machine Vision Conference

DOI:

10.48550/arXiv.2407.17083

Publication URL:

https://bmvc2024.org/proceedings/406/

Authors:

Adam Goodge, Bryan Hooi, Wee Siong Ng

Publication Date:

25 November 2024

Citation:

Goodge, Adam, Bryan Hooi, and Wee Siong Ng. "When Text and Images Don't Mix: Bias-Correcting Language-Image Similarity Scores for Anomaly Detection." British Machine Vision Conference (2024).

Abstract:

Contrastive Language-Image Pre-training (CLIP) achieves remarkable performance in various downstream tasks through the alignment of image and text input embeddings, and holds great promise for anomaly detection. However, our empirical experiments show that the embeddings of text inputs unexpectedly cluster tightly together, far away from image embeddings, contrary to the model’s contrastive training objective of aligning image-text input pairs. We show that this phenomenon induces a ‘similarity bias’, in which false negative and false positive errors occur due to bias in the similarities between images and the normal class label text embeddings. To address this bias, we propose a novel methodology called BLISS, which directly accounts for this similarity bias through the use of an auxiliary, external set of text inputs. BLISS is simple: it requires neither strong inductive biases about anomalous behaviour nor an expensive training process, and it significantly outperforms baseline methods on benchmark image datasets, even when access to normal data is extremely limited.
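
The abstract does not state the scoring formula, so the following is only a minimal sketch of the general idea of correcting an image-text similarity with an auxiliary text set. It is not the BLISS method itself. The function names, the toy 2-D embeddings, and the choice to subtract the mean similarity to the auxiliary texts are all assumptions made for illustration; in practice the embeddings would come from a CLIP image and text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def corrected_similarity(image_emb, label_emb, aux_text_embs):
    """Hypothetical bias correction (an assumption, not the paper's formula).

    Subtracts the image's mean similarity to a set of auxiliary text
    embeddings from its similarity to the normal class label embedding,
    so that an image that is close to *all* text is not rewarded for it.
    """
    raw = cosine(image_emb, label_emb)
    baseline = sum(cosine(image_emb, t) for t in aux_text_embs) / len(aux_text_embs)
    return raw - baseline


# Toy embeddings (illustrative values only).
label = [1.0, 0.0]                       # normal class label text
aux_texts = [[0.0, 1.0], [0.0, -1.0]]    # auxiliary, external text inputs
image = [1.0, 0.0]                       # image aligned with the label

score = corrected_similarity(image, label, aux_texts)
```

Under this sketch, a lower corrected score would suggest a more anomalous image; how the actual method combines its scores is described in the paper, not here.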

License type:

Attribution 4.0 International (CC BY 4.0)

Funding Info:

Funded by EC-2023-071 - AFMS2 Urban Freight Trip Generation (FTG) with Causal Analysis (AFMS Phase 2)

URI:

https://oar.a-star.edu.sg/communities-collections/articles/21445

Collections:

Institute for Infocomm Research

Manuscripts in This Item:

File	Size	Format
paper.pdf	449.53 KB	PDF