Wan, Z., Lin, Z., Rashid, S., Ng, S. Y.-H., Yin, R., Senthilnath, J., & Kwoh, C.-K. (2023, December 5). PESI: Paratope-Epitope Set Interaction for SARS-CoV-2 Neutralization Prediction. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). https://doi.org/10.1109/bibm58861.2023.10386059
Abstract:
Prediction of neutralizing antibodies is important for the development of effective vaccines and antibody-based therapeutics. Traditional methods rely on features based on first principles derived from the binding interface. However, they are burdened by arduous data preprocessing and by the limited quantity of available protein structures. In comparison, deep learning allows automatic substructure characterization and representation without hand-crafted feature engineering. In particular, methods based on large language models (LLMs) predict neutralization using Fv sequences of antibody and antigen. Despite the success of LLMs, incorporating full-length Fv sequences suffers from: 1) inaccurate sequence-level labels in existing datasets, 2) inefficient modeling due to noisy non-contributing motifs, and 3) neglect of the non-bonded interactions that play a key role in facilitating epitope-paratope pairing. In this paper, we propose a novel approach that incorporates only the paratope and epitope for antibody-antigen neutralization prediction while adopting a novel set-modeling formulation that treats the paratope and epitope as bags of residues. Specifically, we hand-crafted a dataset containing neutralizing paratope-epitope pairs where epitopes are potentially generalizable to future unseen variants of SARS-CoV-2. Training on such a dataset enables deep learning models to predict neutralizing antibodies for prospective mutated variants of SARS-CoV-2, while also addressing the problem of inaccurate sequence-level labels. A higher modeling efficiency is also achieved by disregarding non-contributing motifs. Furthermore, we propose paratope-epitope set interaction (PESI), a set-based model inspired by first principles that learns intra-inter non-covalent interactions through a global attention mechanism. To validate PESI, we perform a 10-fold cross-validation on our dataset. Experimental results show that PESI achieves a more balanced overall performance and a significant improvement in Matthews correlation coefficient (MCC) compared with existing architectures.
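For readers unfamiliar with the MCC metric named in the abstract, the following is a minimal sketch of how it is computed for binary labels. It is not the authors' evaluation code: the function name `matthews_corrcoef`, the label convention (1 = neutralizing, 0 = non-neutralizing), and the toy data are assumptions made for illustration only.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Assumed convention (illustrative): 1 = neutralizing, 0 = non-neutralizing.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By common convention, MCC is reported as 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy, made-up data: an imbalanced set where everything is predicted positive.
y_true = [1] * 9 + [0]
y_pred = [1] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
mcc = matthews_corrcoef(y_true, y_pred)
print(accuracy, mcc)
```

On this toy input the accuracy is high while MCC is zero, which is why MCC is often preferred as a summary of balanced performance on imbalanced data.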
License type:
Publisher Copyright
Funding Info:
This research is supported by the Ministry of Education - Academic Research Fund Tier 2
Grant Reference no.: MOE2019-T2-2-175