Developing and evaluating an automated appendicitis risk stratification algorithm for pediatric patients in the emergency department

From Clinfowiki



"To evaluate a proposed natural language processing (NLP) and machine-learning based automated method to risk stratify abdominal pain patients by analyzing the content of the electronic health record (EHR).


We analyzed the EHRs of a random sample of 2100 pediatric emergency department (ED) patients with abdominal pain, including all with a final diagnosis of appendicitis. We developed an automated system to extract relevant elements from ED physician notes and lab values and to automatically assign a risk category for acute appendicitis (high, equivocal, or low), based on the Pediatric Appendicitis Score. We evaluated the performance of the system against a manually created gold standard (chart reviews by ED physicians) for recall, specificity, and precision.


The system achieved an average F-measure of 0.867 (0.869 recall and 0.863 precision) for risk classification, which was comparable to physician experts. Recall/precision were 0.897/0.952 in the low-risk category, 0.855/0.886 in the high-risk category, and 0.854/0.766 in the equivocal-risk category. The information that the system required as input to achieve high F-measure was available within the first 4 h of the ED visit.


Automated appendicitis risk categorization based on EHR content, including information from clinical notes, shows comparable performance to physician chart reviewers as measured by their inter-annotator agreement and represents a promising new approach for computerized decision support to promote application of evidence-based medicine at the point of care."[1]


The authors of this paper developed a method to triage pediatric patients with abdominal pain into three risk tiers for possible appendicitis. The method uses natural language processing (NLP) to gather data from the electronic health record (EHR) for analysis.

Materials and Methods

The researchers classified the patients by two methods. In the first, ED physicians manually reviewed the records of the sampled patients, calculated the Pediatric Appendicitis Score (PAS) for each, and placed each patient into one of three groups (low-risk, equivocal-risk, or high-risk) for appendicitis. In the second, an NLP algorithm developed by the team mined the EHR for PAS elements, which were then used to compute a PAS. The researchers used the computer-generated PAS to assign patients to the same three groups, and then compared the computer-generated classification against the manual one.
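
As a rough illustration of the scoring and grouping step described above, here is a minimal sketch. It is not the authors' system: the NLP extraction is skipped entirely and the findings are supplied by hand. The point weights follow the commonly cited Pediatric Appendicitis Score (maximum 10), while the key names, the function names `pas_score` and `risk_category`, and the cutoffs used in the example are all assumptions made for illustration, since this review does not state the thresholds the paper used.

```python
# Point weights from the commonly cited Pediatric Appendicitis Score (max 10).
# The dictionary keys are hypothetical labels chosen for this sketch.
PAS_WEIGHTS = {
    "cough_percussion_hopping_tenderness": 2,
    "rlq_tenderness": 2,
    "anorexia": 1,
    "fever": 1,
    "nausea_or_vomiting": 1,
    "leukocytosis": 1,
    "neutrophilia": 1,
    "pain_migration": 1,
}


def pas_score(findings):
    """Sum the weights of every PAS element marked present in `findings`."""
    unknown = set(findings) - set(PAS_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown PAS elements: {sorted(unknown)}")
    return sum(w for name, w in PAS_WEIGHTS.items() if findings.get(name, False))


def risk_category(score, *, low_max, high_min):
    """Map a PAS to low / equivocal / high using caller-supplied cutoffs."""
    if low_max >= high_min:
        raise ValueError("low_max must be smaller than high_min")
    if score <= low_max:
        return "low"
    if score >= high_min:
        return "high"
    return "equivocal"


# Hand-entered findings standing in for what the NLP step would extract.
example_findings = {"rlq_tenderness": True, "fever": True, "leukocytosis": True}
example_score = pas_score(example_findings)

# Cutoffs below are illustrative assumptions, not values taken from the paper.
example_category = risk_category(example_score, low_max=3, high_min=7)
print(example_score, example_category)
```

The cutoffs are deliberately required arguments rather than defaults, so that anyone reusing the sketch has to supply thresholds from their own clinical pathway.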


The most important result, in my view, was that the automated risk classification was comparable to the manual classification by physicians. The computer method was best at "classifying low-risk patients (0.897 recall; 0.952 precision)" and had "lowest performance in the equivocal-risk class (0.854 recall; 0.766 precision)". Furthermore, the computer method "achieved 0.853 recall and 0.878 precision" in detecting the PAS elements.
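
To show how the recall and precision figures relate to the F-measure quoted in the abstract, the short sketch below computes the balanced F-measure (the harmonic mean of recall and precision) for each risk category, using the three recall/precision pairs given in the abstract. Taking the unweighted mean across the three categories is my own assumption; this review does not say how the paper averaged its results.

```python
def f_measure(recall, precision):
    """Balanced F-measure: the harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision) per risk category, as quoted in the abstract.
reported = {
    "low": (0.897, 0.952),
    "high": (0.855, 0.886),
    "equivocal": (0.854, 0.766),
}

per_class_f = {name: f_measure(r, p) for name, (r, p) in reported.items()}

# Unweighted mean over the three categories (an assumed averaging scheme).
mean_f = sum(per_class_f.values()) / len(per_class_f)

for name, value in per_class_f.items():
    print(f"{name:>9}: F = {value:.3f}")
print(f"unweighted mean F = {mean_f:.3f}")
```

Under that assumption the mean comes out at about 0.867, which is in line with the average F-measure reported in the abstract, and the equivocal-risk category is clearly the weakest of the three.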


Overall, the authors felt that the computer method performed very well. They note that it could be used as a "second opinion", giving the attending physician another tool for evaluation and consideration. However, the tool needs further evaluation. In particular, the authors felt it should be studied prospectively, since this study was retrospective.


Having worked in the emergency department myself, I think tools such as these will be invaluable. They will help in a fast-paced ED where clinicians see many patients in a day.


  1. Deleger, L., Brodzinski, H., Zhai, H., Li, Q., Lingren, T., Kirkendall, E. S., … Solti, I. (2013). Developing and evaluating an automated appendicitis risk stratification algorithm for pediatric patients in the emergency department. Journal of the American Medical Informatics Association : JAMIA, 20(e2), e212–e220.