ThinkEval: Practical Evaluation of Knowledge Leakage in LLM Editing using Thought-based Knowledge Graphs

Title:
ThinkEval: Practical Evaluation of Knowledge Leakage in LLM Editing using Thought-based Knowledge Graphs
Journal Title:
Transactions on Machine Learning Research
Publication Date:
01 February 2026
Citation:
Baser, M., Divakaran, D.M. and Gurusamy, M., "ThinkEval: Practical Evaluation of Knowledge Leakage in LLM Editing using Thought-based Knowledge Graphs," Transactions on Machine Learning Research, 2026
Abstract:
Robust model-editing techniques are essential for deploying large language models (LLMs) in practical applications, as they enable cost-effective ways to deal with challenges such as privacy breaches, bias mitigation, and misinformation spread. For example, an LLM-based healthcare assistant may need to update outdated or incorrect knowledge to prevent harmful recommendations. However, many editing techniques focus on isolated facts, which critically fail to prevent indirect knowledge leakage—the unintended reconstruction of edited-out information through persistent causal links and contextual relationships. To assist users in selecting the right editing technique, we develop and present ThinkEval, a framework to systematically quantify indirect knowledge leakage and ripple effects in model-editing. ThinkEval builds and employs specialized knowledge graphs to analyze the causal structure of facts before and after editing. To support this approach, we present KnowGIC, a benchmark dataset comprising multi-step reasoning paths that precisely measure these complex knowledge transformation effects. We evaluate five editing techniques—AlphaEdit, RECT, ROME, MEMIT, and PRUNE—across multiple LLMs. Our results show that these techniques struggle to balance indirect fact suppression with the preservation of related knowledge, compromising the contextual integrity of a model's knowledge. Our dataset is available at: https://github.com/manitbaser/KnowGIC.
License type:
Publisher Copyright
Funding Info:
There was no specific funding for this research.
Description:
DOI:
https://doi.org/10.48550/arXiv.2506.01386
Files uploaded:

File: thinkeval-tmlr-2026.pdf (2.36 MB, PDF)