
Lehto, Nanda research published in IISE journal

(l) Mark Lehto & (r) Gaurav Nanda
IISE Transactions on Healthcare Systems Engineering published research by a Purdue IE professor and IE alumnus on identifying rare causes of injuries from emergency room data by combining the strengths of artificial intelligence and human expertise.

The August 2019 article, "Semi-automated text mining strategies for identifying rare causes of injuries from emergency room triage data," was written by Gaurav Nanda, Kirsten Vallmuur, and Mark Lehto. Lehto is a professor in the School of Industrial Engineering, and Nanda (PhD 2017) is a Purdue IE alumnus and an Assistant Professor of Practice in the Purdue School of Engineering Technology.


In many organizations conducting injury surveillance, human coders routinely assign external-cause-of-injury codes (E-codes) to short narratives describing the incident, which are transcribed by triage nurses or others in hospital emergency rooms or other settings. Machine learning (ML) models trained on coded injury narratives can accurately assign E-codes to a large portion of the data, but tend to perform poorly on cases falling into rare categories. In this study, we examined several ways of filtering the predictions of Logistic Regression and Naïve Bayes classifiers to select, for human review, the cases that were likely to belong to rare categories. The data were a manually coded emergency department triage dataset of approximately 500,000 cases, collected between 2002 and 2012 and provided by the Queensland Injury Surveillance Unit. The ML models were trained using 90% of the data, and the filtering approaches were evaluated on a prediction set composed of the remaining cases. A cost analysis was also performed to compare the efficiency of each filtering method. The results showed that each filtering method greatly improved the ability to detect rare categories. Filtering using expert-designed causal linguistic rules combined with Logistic Regression prediction strength was found to be the most efficient approach. Several completely automated filtering approaches were also found to be effective.
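
The filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "prediction strength" means the classifier's highest class probability, that a case is sent to a human coder when either a rule fires or the strength falls below a threshold, and that the keyword stems, the 0.6 threshold, and the function names are hypothetical placeholders, since the article does not give the actual rules or cutoffs.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical keyword stems standing in for expert-designed linguistic rules;
# the article does not list the rules that were actually used.
RARE_CAUSE_STEMS: Tuple[str, ...] = ("electr", "firework", "drown")

# A case is (narrative text, mapping of E-code label -> predicted probability).
Case = Tuple[str, Dict[str, float]]


def needs_human_review(
    narrative: str,
    class_probs: Dict[str, float],
    strength_threshold: float = 0.6,  # assumed cutoff, for illustration only
    stems: Sequence[str] = RARE_CAUSE_STEMS,
) -> bool:
    """Return True if the case should be routed to a human coder."""
    # Assumption: prediction strength is the highest predicted class probability.
    strength = max(class_probs.values())
    text = narrative.lower()
    rule_hit = any(stem in text for stem in stems)
    # Assumption: a rule hit OR a weak prediction triggers review.
    return rule_hit or strength < strength_threshold


def split_cases(cases: List[Case]) -> Tuple[List[str], List[str]]:
    """Split narratives into machine-coded cases and cases for human review."""
    auto_coded: List[str] = []
    for_review: List[str] = []
    for narrative, class_probs in cases:
        if needs_human_review(narrative, class_probs):
            for_review.append(narrative)
        else:
            auto_coded.append(narrative)
    return auto_coded, for_review


# Made-up narratives and probabilities, not taken from the study's dataset.
cases = [
    ("fell off ladder while painting", {"fall": 0.93, "struck": 0.07}),
    ("electric shock from kettle", {"burn": 0.81, "other": 0.19}),
    ("hurt at park, unsure how", {"fall": 0.41, "struck": 0.35, "other": 0.24}),
]
auto_coded, for_review = split_cases(cases)
```

In this toy run, the ladder fall is confidently predicted and stays machine-coded, while the other two narratives are flagged: one because a keyword stem matches despite a strong prediction, the other because the prediction is weak.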

Gaurav Nanda, Kirsten Vallmuur & Mark Lehto (2019) Semi-automated text mining strategies for identifying rare causes of injuries from emergency room triage data, IISE Transactions on Healthcare Systems Engineering, 9:2, 157-171, DOI: 10.1080/24725579.2019.1567628