Identifying the Proverbial “Needle in the Arm” with Natural Language Processing
Department of Research, United Network for Organ Sharing, Richmond, VA
Meeting: 2019 American Transplant Congress
Abstract number: D228
Keywords: Allocation, High-risk, Infection, Risk factors
Session Information
Session Name: Poster Session D: Non-Organ Specific: Public Policy & Allocation
Session Type: Poster Session
Date: Tuesday, June 4, 2019
Session Time: 6:00pm-7:00pm
Presentation Time: 6:00pm-7:00pm
Location: Hall C & D
*Purpose: Donors with a history of intravenous drug use (IVDU) are associated with increased disease transmission risk; however, OPTN forms only request broad historical details relating to donor drug use. Hence, it is not possible to estimate the number of donors who present a risk of unexpected transmission due to recent IVDU, even though such estimates are crucial to understanding the underlying risk of an organ offer, particularly in light of the current opioid epidemic. We propose leveraging donor text via natural language processing and big data methods to allow the OPTN to estimate prevalence rates for recent IVDU.
*Methods: Using donor text data submitted as admission course notes, highlights, and med/soc histories on Deceased Donor Registration forms, we created word vectors. Next, we developed ontologies for IVDU by employing seed words supplemented by the word vectors. Then, 1000 deceased donors were sampled from every year between 2014 and 2017. Applying the IVDU ontology, we scored each donor's text using a term frequency-inverse document frequency weighting scheme. Afterwards, we determined a cut-off score from training data via annotation and evaluation. Using bootstrapping to resample randomized subsets of donors not in the training data from each year between 2014 and 2017, we estimated the prevalence (number and percentage) of donors associated with recent IVDU.
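The scoring step described above can be illustrated with a minimal sketch. It assumes whitespace tokenization, a plain `tf * log(N / df)` weighting, and a single flat ontology; the abstract does not specify the exact weighting, tokenization, or score scale, so the function names, example texts, and cut-off below are hypothetical and not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_ontology_scores(docs, ontology):
    """Score each document by summing TF-IDF weights of ontology terms.

    docs: list of free-text strings (hypothetical donor notes).
    ontology: set of lowercase terms (hypothetical IVDU vocabulary).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each ontology term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens) & ontology)

    scores = []
    for tokens in tokenized:
        # Term frequency restricted to ontology terms.
        tf = Counter(t for t in tokens if t in ontology)
        score = sum(count * math.log(n_docs / df[term])
                    for term, count in tf.items())
        scores.append(score)
    return scores


def flag_by_cutoff(scores, cutoff):
    """Flag documents whose score meets or exceeds the cut-off."""
    return [s >= cutoff for s in scores]


# Illustrative (made-up) texts and ontology.
docs = [
    "history of heroin injection heroin",
    "no drug history",
    "hypertension and diabetes",
]
ontology = {"heroin", "injection"}
scores = tfidf_ontology_scores(docs, ontology)
flags = flag_by_cutoff(scores, cutoff=1.0)
```

In this sketch, a term that appears in every document receives a weight of zero, so only terms that discriminate between donors contribute to the score. The cut-off would in practice be tuned against annotated training data, as the abstract describes.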
*Results: We identified 122,219 donors with text, and developed 56,333 word vectors from 503,523 unique words and 44,873,276 total words. The resulting ontologies generated for drug, modality, and context consisted of 45, 22, and 24 words, respectively. The cut-off score was 43, which had a positive predictive value of 47.6%. Based on these results, the estimated mean prevalence of recent IVDU was 0.028 (95% CI: 0.022 – 0.033) in 2014 and 0.044 (95% CI: 0.037 – 0.053) in 2017.
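As a rough illustration of how a bootstrap mean and 95% interval of this kind can be produced, the sketch below resamples a list of per-donor flags with replacement and takes percentile bounds. The percentile method, the number of resamples, and the synthetic flags are assumptions made for illustration; the abstract does not state how its intervals were computed or whether any adjustment for positive predictive value was applied.

```python
import random


def bootstrap_prevalence(flags, n_resamples=2000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) for the proportion of True flags.

    Uses a simple percentile bootstrap; all parameters are illustrative.
    """
    rng = random.Random(seed)
    n = len(flags)
    estimates = []
    for _ in range(n_resamples):
        # Resample donors with replacement and recompute the proportion.
        sample = rng.choices(flags, k=n)
        estimates.append(sum(sample) / n)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    mean = sum(estimates) / n_resamples
    return mean, lower, upper


# Synthetic example: 44 flagged donors out of 1000 (made-up data).
synthetic_flags = [True] * 44 + [False] * 956
mean, lower, upper = bootstrap_prevalence(synthetic_flags)
```

The fixed seed makes the sketch reproducible. With real data, the flags would come from applying the cut-off score to donors held out of the training set.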
*Conclusions: Despite noise within our estimates due to the complex task of identifying recent IVDU through text, we believe this novel methodology allows us to ascertain that recent IVDU increased significantly from 2014 to 2017 and to provide some directional results as to the scale of that increase. Compared to observed numbers of donors with drug-related deaths during those periods, our results suggest that drug-related deaths form a larger group that may not possess the same level of risk as donors with recent IVDU. These methods show promise in leveraging donor text to provide directional answers to pressing questions about emerging issues that OPTN data collection forms have historically been unable to answer.
To cite this abstract in AMA style:
Vece GR, Foutz J, Placona A. Identifying the Proverbial “Needle in the Arm” with Natural Language Processing [abstract]. Am J Transplant. 2019; 19 (suppl 3). https://atcmeetingabstracts.com/abstract/identifying-the-proverbial-needle-in-the-arm-with-natural-language-processing/. Accessed January 18, 2025.