Towards More Efficient Data Valuation in Healthcare Federated Learning Using Ensembling

Title:
Towards More Efficient Data Valuation in Healthcare Federated Learning Using Ensembling
Journal Title:
Lecture Notes in Computer Science
Keywords:
Publication Date:
07 October 2022
Citation:
Kumar, S., Lakshminarayanan, A., Chang, K., Guretno, F., Mien, I. H., Kalpathy-Cramer, J., Krishnaswamy, P., & Singh, P. (2022). Towards More Efficient Data Valuation in Healthcare Federated Learning Using Ensembling. Distributed, Collaborative, and Federated Learning, and Affordable AI and Healthcare for Resource Diverse Global Health, 119–129. https://doi.org/10.1007/978-3-031-18523-6_12
Abstract:
Federated Learning (FL), wherein multiple institutions collaboratively train a machine learning model without sharing data, is becoming popular. Participating institutions might not contribute equally: some contribute more data, some better-quality data, and some more diverse data. To fairly rank the contributions of different institutions, the Shapley value (SV) has emerged as the method of choice. Exact SV computation is impossibly expensive, especially when there are hundreds of contributors, so existing SV computation techniques use approximations. However, in healthcare, where the number of contributing institutions is unlikely to be of colossal scale, computing exact SVs is still exorbitantly expensive, but not impossible. For such settings, we propose an efficient SV computation technique called SaFE (Shapley Value for Federated Learning using Ensembling). We empirically show that SaFE computes values that are close to exact SVs, and that it performs better than current SV approximations. This is particularly relevant in medical imaging settings, where heterogeneity across institutions is widespread and fast, accurate data valuation is required to determine the contribution of each participant in multi-institutional collaborative learning.
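To illustrate why exact SV computation becomes expensive as the number of contributors grows, the following is a minimal sketch of the textbook Shapley value formula, not of SaFE itself. The function `exact_shapley`, the `toy_utility` function, and the institution scores are hypothetical names and values introduced only for this example. In an FL setting the utility of a coalition would typically be the performance of a model trained on that coalition's data, so every one of the 2^n coalitions would need its own evaluation.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Exact Shapley values via the subset-weighted formula.

    `utility` maps a frozenset of players to a number. It is called on
    coalitions drawn from all 2^n subsets, which is the costly part.
    """
    n = len(players)
    values = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                values[p] += weight * (utility(s | {p}) - utility(s))
    return values


# Hypothetical toy utility (an assumption for illustration only):
# each institution adds a fixed score to any coalition it joins.
SCORES = {"A": 0.5, "B": 0.3, "C": 0.2}


def toy_utility(coalition):
    return sum(SCORES[member] for member in coalition)


shapley = exact_shapley(list(SCORES), toy_utility)
print(shapley)
```

With an additive utility like this one, each institution's Shapley value equals its own score, and the values sum to the utility of the full coalition, which gives a quick sanity check on the implementation.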
License type:
Publisher Copyright
Funding Info:
This research / project is supported by the National Science Foundation (NSF) - Assistive Integrative Support Tool for Retinopathy of Prematurity
Grant Reference no.: NSF1622542

This research / project is supported by the National Institutes of Health (NIH) - Quantitative MRI of Glioblastoma Response
Grant Reference no.: U01CA154601

This research / project is supported by the National Institutes of Health (NIH) - Informatics Tools for Optimized Imaging Biomarkers for Cancer Research & Discovery
Grant Reference no.: U24CA180927

This research / project is supported by the National Institutes of Health (NIH) - Quantitative Image Informatics for Cancer Research (QIICR)
Grant Reference no.: U24CA180918
Description:
This version of the article has been accepted for publication, after peer review and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/978-3-031-18523-6_12
ISBN:
9783031185236
9783031185229
Files uploaded:

File: data-valuation-in-federated-learning.pdf
Size: 568.59 KB
Format: PDF