Problem List Automation

From Clinfowiki

Introduction & Background

Problem list population in Electronic Health Records (EHR) is integral to driving many key functions of clinical decision support systems. Accurate problem lists can facilitate many functions, some of which are the following:

  1. Summarize and describe medical conditions and their longitudinal progression
  2. Initiate/encourage provider actions
    1. clinical alerts
    2. provide disease guidance
    3. health maintenance reminders
    4. billing and other administrative purposes

Despite this, problem lists are often underutilized and lack the necessary data to unlock the benefits of programmed support. The reasons for this vary, but many factors for this failure revolve around the time constraints and workflow of clinical care providers. Providers often only view patient care in an episodic fashion rather than around a longitudinal framework in which a problem list would seem to hold more merit. Methodologies surrounding secondary entry of problem list data have been proposed to address this deficit. Goals of these methodologies include timely, precise, and accurate data entry with minimal workload burden on responsible providers. Natural language processing (NLP) has been proposed as a potential solution to this issue, particularly because of the attractive notion that natural language serves as the input medium in these systems.

Natural Language Processing is a newer research technique that attempts to automate the understanding of natural human language via the processing and complex computation of unstructured free text. Several research groups have postulated and attempted to demonstrate that refined NLP systems can be successful in extracting structured data from the clinical narrative text that resides in EHRs. NLP systems process text looking for patterns and candidate phrases that can be mapped to controlled terminologies such as SNOMED and the UMLS Metathesaurus. Codified terms can then drive clinical decision support systems to provide beneficial care that may otherwise have been overlooked or forgotten.
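The phrase-to-terminology mapping step described above can be illustrated with a minimal sketch. It assumes a tiny hand-made dictionary and a greedy longest-match lookup; the concept IDs (DEMO-001 and so on) and the function name find_concepts are invented for illustration and are not real SNOMED or UMLS identifiers, nor the method of any system discussed in this article.

```python
import re

# Hypothetical mini-terminology: phrase -> placeholder concept ID.
# These IDs are invented for illustration; they are NOT real SNOMED or UMLS codes.
TERMINOLOGY = {
    "congestive heart failure": "DEMO-001",
    "heart failure": "DEMO-002",
    "diabetes mellitus": "DEMO-003",
    "hypertension": "DEMO-004",
}

# Longest phrase length (in words) present in the dictionary.
MAX_PHRASE_WORDS = max(len(phrase.split()) for phrase in TERMINOLOGY)


def find_concepts(text):
    """Return (phrase, concept_id) pairs using greedy longest-match lookup."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first so that
        # "congestive heart failure" wins over "heart failure".
        for n in range(min(MAX_PHRASE_WORDS, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in TERMINOLOGY:
                found.append((phrase, TERMINOLOGY[phrase]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return found


if __name__ == "__main__":
    note = "History of congestive heart failure and hypertension."
    for phrase, concept_id in find_concepts(note):
        print(phrase, "->", concept_id)
```

Exact dictionary lookup of this kind misses synonyms, abbreviations, and misspellings, which is part of why production systems rely on much richer lexical and syntactic processing.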

Early Attempts At Computerized Problem List Generation

Scherpbier et al (1) created a very early and very simple computerized system to capture a clinical problem list in their EHR. They utilized pick lists of frequently used diagnoses that were generated based on provider type and displayed depending on provider role. One limitation of this method was that the number of items on the pick list was restricted by a technical constraint of their clinical information system at the time. The incentive for users to adopt the pick list was increased efficiency in billing and discharge documentation. Franco et al devised a similar method for computerized entry, utilizing patient data to drive pick lists in a neonatal system (2). Wang, Bates et al's group at Partners Healthcare System took a different approach, developing a local problem list dictionary and a data entry tool that was integrated into an outpatient EHR system. The tool's search algorithm was made available through three different interfaces. Their study included an analysis of coding and utilization rates and the effect the various interfaces had on these rates. They found that the user interface had a very significant effect on user acceptance, and that deficiencies in the data dictionary were also a large contributor to uncoded problems. In addition, their group struggled to increase utilization rates despite implementing a number of decision-support incentives in their system (3).

MetaMap and MedLee

Perhaps the two best known early NLP mapping systems are MetaMap and MedLee. MetaMap was developed by Alan Aronson at the US National Library of Medicine and maps free text to UMLS terms. MedLee (Medical Language Extraction and Encoding) is Friedman et al's alternative approach that differs by parsing entire documents. MedLee has been extensively studied for its performance on mammography, neuroradiology, and pathology reports, as well as discharge summaries (4-8). Another notable difference between the two systems is in how they handle negation; that is, when problems are listed as absent or not present. MedLee accounts for these modifiers directly, while MetaMap utilizes a secondary system to handle negated concepts. The requirement of accounting for negation underscores the complexity of the English language and the grand challenges of refining NLP systems to be sensitive and specific enough to be clinically accepted and successfully utilized.
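To make the negation problem concrete, the following is a deliberately simplified sketch of trigger-phrase negation detection. It is not the mechanism used by MedLee or by MetaMap's secondary system; the trigger list, the five-word scope window, and the function name is_negated are all assumptions chosen for illustration.

```python
import re

# Assumed trigger phrases; a real system would need a far larger, validated list.
NEGATION_TRIGGERS = ("no", "denies", "without", "negative for", "no evidence of")

# Assumed scope: how many words after a trigger are treated as negated.
WINDOW = 5


def _normalize(text):
    """Lowercase the text and keep only alphabetic words, space-separated."""
    return " ".join(re.findall(r"[a-z]+", text.lower()))


def is_negated(sentence, problem):
    """Return True if `problem` appears within WINDOW words after a trigger."""
    text = _normalize(sentence)
    target = _normalize(problem)
    for trigger in NEGATION_TRIGGERS:
        pattern = r"\b" + re.escape(trigger) + r"\b"
        for match in re.finditer(pattern, text):
            # Words immediately following the trigger form its scope.
            scope = " ".join(text[match.end():].split()[:WINDOW])
            if re.search(r"\b" + re.escape(target) + r"\b", scope):
                return True
    return False


if __name__ == "__main__":
    print(is_negated("Patient denies chest pain or shortness of breath.", "chest pain"))
    print(is_negated("Patient has chest pain.", "chest pain"))
```

A fixed window is a crude stand-in for real scope detection: it will wrongly negate problems that follow a trigger in an unrelated clause and will miss negations expressed after the problem (for example, "pneumonia was ruled out").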

Evaluation Results of Current NLP Systems

Meystre and Haug have published a large portion of the recent articles on the use of MetaMap to automate problem list population. Most of their studies have focused on performance evaluations of extraction and negation, with the aim of proposing problems to a user for inclusion on the problem list. In short, they have found decreasing performance measures as their systems attempted to accommodate an increasing number of problems. This is an important issue to consider when making assumptions about generalization and scalability (9,10). Imre Solti et al have begun to address this issue with a proof-of-concept model that attempts to address an infinite number of problems. Their tool, however, was limited to the cardiovascular domain (11).
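Performance evaluations of this kind typically compare the problems a system proposes against a reference list produced by human reviewers. The sketch below computes precision, recall (sensitivity), and F1 for one such comparison; the function name evaluate and the example problem sets are hypothetical and are not drawn from any of the cited studies.

```python
def evaluate(proposed, reference):
    """Compare system-proposed problems against a reviewer-defined reference set."""
    proposed, reference = set(proposed), set(reference)
    true_positives = len(proposed & reference)

    # Precision: fraction of proposed problems that are correct.
    precision = true_positives / len(proposed) if proposed else 0.0
    # Recall (sensitivity): fraction of reference problems that were found.
    recall = true_positives / len(reference) if reference else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical example data for a single patient record.
    proposed = {"hypertension", "diabetes mellitus", "asthma", "pneumonia"}
    reference = {"hypertension", "diabetes mellitus", "heart failure"}
    print(evaluate(proposed, reference))
```

Tracking these measures as the set of target problems grows is one way to quantify the scalability concern raised above.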

A 2004 article by Friedman et al led its evaluators to observe that most of the errors in encoding were attributable to ambiguous terms. This point should lead future investigations towards improving disambiguation. Performance measures also decreased slightly during the study, but were reported to have rebounded after some minor adjustments and matched the performance of earlier studies at different institutions (12).

Current Limitations of Natural Language Processing

The first issue is accuracy. Until the performance characteristics of NLP techniques reach reasonably high thresholds, validation of automatically generated problem list candidates will remain necessary. User adoption of problem list tools will also remain a challenge unless users are properly incentivized and the tools have demonstrated their effectiveness. Encouraging users to adopt the tools via careful interface design and workflow efficiency incentives should lead to improved adoption and more accurate and complete problem lists.

The second issue is scalability. As mentioned above, performance has already been noted to drop when NLP techniques have been generalized to wider domains. Although adjustments have corrected some performance measures as larger domains were included in studies, a full deployment in all medical arenas will require significant improvements in current methodologies.

Timeliness of NLP techniques should also be considered as a barrier to full-scale deployment. Current NLP techniques are computationally complex and require considerable computing power. In order to deliver advanced decision support to clinicians in a timely manner, patient problems will need to be identified as quickly as possible so that interventions and orders can be suggested at a beneficial point in the care process.


  1. Scherpbier, H., et al., A simple approach to physician entry of patient problem list. Proc Annu Symp Comput Appl Med Care, 1994: p. 206-10.
  2. Franco, A., et al., "NEONATE"--an expert application for the "HELP" system: comparison of the computer's and the physician's problem list. J Med Syst, 1990. 14(5): p. 297-306.
  3. Wang, S., et al., Automated coded ambulatory problem lists: evaluation of a vocabulary and a data entry tool. Int J Med Inform, 2003. 72(1-3): p. 17-28.
  4. Elkins, J.S., et al., Coding neuroradiology reports for the Northern Manhattan Stroke Study: a comparison of natural language processing and manual review. Comput Biomed Res, 2000. 33(1): p. 1-10.
  5. Cimino, J.J., et al., Knowledge-based approaches to the maintenance of a large controlled medical terminology. J Am Med Inform Assoc, 1994. 1(1): p. 35-50.
  6. Friedman, C., et al., Automating a severity score guideline for community-acquired pneumonia employing medical language processing of discharge summaries. Proc AMIA Symp, 1999: p. 256-60.
  7. Jain, N.L. and C. Friedman, Identification of findings suspicious for breast cancer based on natural language processing of mammogram reports. Proc AMIA Annu Fall Symp, 1997: p. 829-33.
  8. Xu, H. and C. Friedman, Facilitating research in pathology using natural language processing. AMIA Annu Symp Proc, 2003: p. 1057.
  9. Meystre, S. and P. Haug, Comparing natural language processing tools to extract medical problems from narrative text. AMIA Annu Symp Proc, 2005: p. 525-9.
  10. Meystre, S. and P. Haug, Natural language processing to extract medical problems from electronic clinical documents: performance evaluation. J Biomed Inform, 2006. 39(6): p. 589-99.
  11. Solti, I., et al., Building an automated problem list based on natural language processing: lessons learned in the early phase of development. AMIA Annu Symp Proc, 2008: p. 687-91.
  12. Hripcsak, G., G.J. Kuperman, and C. Friedman, Extracting findings from narrative reports: software transferability and sources of physician disagreement. Methods Inf Med, 1998. 37(1): p. 1-7.

Submitted by Eric Kirkendall