DocOIE: A Document-level Context-Aware Dataset for OpenIE

Publication Date:
04 August 2021
Dong, K., Yilin, Z., Sun, A., Kim, J.-J., Li, X. (2021). DocOIE: A Document-level Context-Aware Dataset for OpenIE. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. doi:10.18653/v1/2021.findings-acl.210
Open Information Extraction (OpenIE) aims to extract structured relational tuples (subject, relation, object) from sentences and plays a critical role in many downstream NLP applications. Existing solutions perform extraction at the sentence level, without referring to any additional contextual information. In reality, however, a sentence typically exists as part of a document rather than standing alone, and we often need relevant contextual information from around the sentence before we can interpret it accurately. As no document-level context-aware OpenIE dataset is available, we manually annotate 800 sentences from 80 documents in two domains (Healthcare and Transportation) to form the DocOIE dataset for evaluation. In addition, we propose DocIE, a novel document-level context-aware OpenIE model. Our experimental results based on DocIE demonstrate that incorporating document-level context helps improve OpenIE performance. Both the DocOIE dataset and the DocIE model will be released upon paper acceptance.
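To make the abstract's terms concrete, the following is a minimal Python sketch of a (subject, relation, object) tuple and of gathering neighbouring sentences as context for a target sentence. Everything in it is an assumption made for illustration: the names `RelationTuple` and `context_window`, the fixed-size window, the sample sentences, and the hand-written tuple are hypothetical and are not taken from the DocOIE dataset or the DocIE model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RelationTuple:
    """One OpenIE extraction: (subject, relation, object). Hypothetical type."""
    subject: str
    relation: str
    obj: str


def context_window(sentences: List[str], index: int, size: int = 2) -> List[str]:
    """Return up to `size` sentences on each side of sentences[index].

    A fixed window is only an illustrative stand-in for document-level
    context; it is NOT how the paper's DocIE model selects context.
    """
    if not 0 <= index < len(sentences):
        raise IndexError("index out of range")
    start = max(0, index - size)
    before = sentences[start:index]
    after = sentences[index + 1:index + 1 + size]
    return before + after


# Invented sample document; not drawn from the dataset.
document = [
    "The vehicle includes a braking system.",
    "The system has a sensor and a controller.",
    "It sends a signal to the controller.",
]

# The last sentence is hard to interpret alone because of the pronoun "It".
target = 2
context = context_window(document, target, size=1)

# Hand-written tuple a human might produce after reading the context;
# it is not the output of any model.
example = RelationTuple(subject="the sensor", relation="sends", obj="a signal")
```

The window here simply shows why a sentence-level extractor lacks information: the subject of the target sentence is only recoverable from the preceding sentence.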
License type:
Attribution 4.0 International (CC BY 4.0)
Funding Info:
This research is supported by the Agency for Science, Technology and Research under the AME Programmatic Funding Scheme.
Grant Reference Nos.: #A19E2b0098 and #A18A2b0046