Performance of a Web-Based Clinical Diagnosis Support System for Internists

Mark L. Graber, MD and Ashlei Mathew

BACKGROUND: Clinical Diagnosis Support Systems (CDSS) are aimed at helping clinicians make the correct diagnosis and reduce medical errors. The early systems (e.g., QMR, Iliad, DXplain) proved somewhat useful but were very time-consuming and had limited sensitivity and specificity. A new, second-generation, Web-based CDSS called Isabel was launched in the early 2000s. Its full name is “Isabel Diagnosis Reminder and Knowledge Mobilizing System,” and it is used for both pediatric and adult internal medicine cases. It accepts key clinical findings or whole-text entries, uses natural language processing, and is linked to major medical journals and textbooks. It interfaces with leading outpatient and inpatient EMR systems.

According to the developer, Isabel Health Care, Inc., the CDSS Isabel “has undergone a robust peer-reviewed validation process over 7 years to demonstrate its accuracy, effectiveness, and value.” In 2005 it received the HIT Product Innovation Award from Frost & Sullivan, a global growth consulting company. It is used by hospitals and academic centers in the US and abroad.

OBJECTIVE: This article describes the investigators’ evaluation of Isabel in diagnosing complex internal medicine cases involving adults. Both the effectiveness of this CDSS and its speed are considered.

METHOD: The investigators tested Isabel on 50 consecutive internal medicine case records published in the New England Journal of Medicine. The first method involved entering 3-6 key clinical findings from the case. The second method used whole-text entry of the entire case history. With both methods, the intent was to establish how often the correct diagnosis was included in the list of 30 diagnoses provided by Isabel.

RESULTS: When key clinical findings were entered, the correct diagnosis was suggested in 48 of 50 cases (96%). When the entire history was pasted in, it was suggested in 37 of 50 cases (74%). Both methods were very fast, yielding results in 2-3 seconds.
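
The percentages above follow directly from the reported counts. As a minimal sketch (the function name `sensitivity` is only illustrative and is not taken from the article), sensitivity here is the share of cases whose correct diagnosis appeared anywhere in Isabel's list of 30 suggestions:

```python
def sensitivity(correct_in_list: int, total_cases: int) -> float:
    """Fraction of cases whose correct diagnosis appeared in the suggestion list."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return correct_in_list / total_cases


# Counts reported in the RESULTS paragraph above.
key_findings = sensitivity(48, 50)  # 3-6 key clinical findings entered
whole_text = sensitivity(37, 50)    # entire case history pasted in

print(f"Key findings: {key_findings:.0%}")  # Key findings: 96%
print(f"Whole text:   {whole_text:.0%}")    # Whole text:   74%
```

Note that this measure says nothing about where in the list the correct diagnosis appeared, only that it was present.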

CONCLUSIONS: First-generation CDSSs had sensitivities in the range of 50%-60% and took several minutes to produce results. Although Google has been suggested as an alternative because of its speed, its sensitivity has been found to be no better than that of the older CDSSs. Isabel, by contrast, was found to be both extremely fast and highly sensitive, especially when key clinical findings were entered instead of whole text. Because Isabel performed so well in the experimental setting, the investigators recommended further evaluation in the natural environment of clinical practice.

Submitted by Nicolas Thireos

Performance of a Web-Based Clinical Diagnosis Support System for Internists, by Mark Graber and Ashlei Mathew


Does a proprietary Clinical Decision Support System help find the correct diagnosis without extensive data input?


The goal is to measure the sensitivity and speed of Isabel (Isabel Healthcare Inc, USA), a second-generation web-based CDS system. Isabel is intended to provide CDS to internists, who enter either key findings (usually 3-6) or a complete clinical narrative, taking from 2-3 seconds up to 1-2 minutes. The outcome was judged by the presence of the correct diagnosis in the first 30 diagnoses suggested by Isabel, which are displayed 10 per page. As its basis of medical knowledge, Isabel uses two key textbooks (the Oxford Textbook of Medicine, 4th Edition, and the Oxford Textbook of Geriatric Medicine) and 46 major journals covering general and subspecialty medicine and toxicology.


The source material was 61 consecutive “Case Records of the Massachusetts General Hospital” published in the New England Journal of Medicine, from Vol. 350, p. 166 (2004) through Vol. 353, pp. 189-198 (2005). Only cases that involved patients older than 10 years and that discussed the actual diagnosis on the basis of clinical and physical findings were included as test cases. A total of 50 case reports were used to test Isabel. Two methods of data entry were used: entering 3-6 key findings, or copying and pasting the full text of the case report into Isabel.


When 3-6 findings were entered, the correct diagnosis was included in the top 30 suggestions in 48 of 50 cases. In the 2 cases where it was not found, the correct diagnosis was not in the Isabel database, so there was no way for Isabel to suggest it. When the whole text was copied and pasted, Isabel’s top 30 diagnosis list included the correct diagnosis in 37 of 50 cases. For 51% of those, the correct diagnosis was found in the first 10 diagnoses displayed. For 71% of the test cases, the correct diagnosis was included in the first 20 diagnoses, displayed on two separate pages. Entering data manually required about 1 minute, and copying and pasting took about 5 seconds.
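
The top-10, top-20, and top-30 figures above are all instances of one measure: the share of cases whose correct diagnosis is ranked at position k or better. The sketch below illustrates that calculation only. The ranks in `example_ranks` are hypothetical, made up for illustration, since the per-case ranks are not given in this summary, and the name `top_k_rate` is likewise an assumption.

```python
from typing import Optional, Sequence


def top_k_rate(ranks: Sequence[Optional[int]], k: int) -> float:
    """Share of cases whose correct diagnosis is ranked at position k or better.

    `ranks` holds, per case, the 1-based position of the correct diagnosis
    in the suggestion list, or None if it was not suggested at all.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Hypothetical ranks for ten cases (NOT data from the study).
example_ranks = [1, 4, 12, 25, None, 8, 19, 30, 2, None]

for k in (10, 20, 30):  # one, two, or three pages of 10 suggestions
    print(f"top-{k}: {top_k_rate(example_ranks, k):.0%}")
```

Because each larger k includes all the hits of the smaller ones, the rate can only stay the same or rise as more pages of suggestions are counted.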


Isabel suggested the correct diagnosis within its first 30 choices in 37 of 50 cases with pasted text and in 48 of 50 cases with key findings. Negative findings in the clinical narrative were missed: for example, “No chest pain” was translated to “chest pain” and misguided Isabel.

Commentary by Reviewer Timothy H Hartzog MD, FAAP

This article reads more like a glossy sales presentation than a journal article. The underlying technology is never discussed, and the methods are skewed by non-blinded testers who knew the correct diagnosis before testing the system.

NEJM case reports tend to center on rare and unusual cases rather than typical ones, and rare cases tend to have unique findings, which increase the chances of finding a match. A second weakness is the study’s definition of success as one positive match anywhere in the top 30 displayed diagnoses. This translates into a specificity of only about 3% (1 in 30), which is not much better than randomly choosing 30 diagnoses or pages from any large medical textbook. Third, there is no rank order, which is a problem because ordering tests to cover 30 diagnoses would be an unnecessary waste of resources and a source of pain for the patient. Fourth, there is no indication of the spread of the diagnoses: were the other 29 diagnoses even reasonable choices, and were the diagnoses grouped by organ system or by some other method?

Timothy Hartzog MD, FAAP, Pediatric Hospitalist, Medical University of South Carolina, Medical Director of Information Technology