Temporal reasoning with medical data

Zhou L, Hripcsak G. Temporal reasoning with medical data: a review with emphasis on medical natural language processing. J Biomed Inform. 2007 Apr;40(2):183-202.

Natural language processing (NLP) is a subfield of artificial intelligence and linguistics that strives to create computer systems that can extract computable information from human languages or convert information from computer databases into human languages.

Medical language processing (MLP) is a sub-specialty of NLP that focuses on medical narrative text. In the article “Temporal reasoning with medical data: a review with emphasis on medical natural language processing”, the authors review the history and current status of research into medical language processing, with a focus on how information about time is processed.

Time is very important in medicine. The expression of time is vital in assessing the effectiveness of treatments and the progress of disease. For example, the length of time that a patient has had symptoms may be an important clue in diagnosis. The ability to extract time information from medical narratives into computable forms could be very beneficial in clinical information systems such as EHRs and decision support systems.

Unfortunately, it is very challenging to extract time information from natural language. Information about time can be conveyed through different grammatical devices, such as verb tense, conjunctions (while, whenever, before, as soon as), and adverbs (then, soon, recently). Some expressions of temporal information are also vague (e.g., by the time, hardly ever, so far). It is even more difficult to process time information in medical language than in general language. Medical language can be terse: it often omits verbs or consists of sentence fragments, and abbreviations and misspellings are common. Medical language also contains specialized time terms such as “q.i.d.” (four times a day) and “post-op #6” (6 days after surgery).
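To make the difficulty concrete, the following is a minimal sketch, not a method from the article, of how the two specialized terms mentioned above might be turned into computable values using simple pattern matching. The function name `extract_temporal_facts`, the lookup table `FREQUENCY_PER_DAY`, and the output labels are all hypothetical and chosen only for illustration. Real clinical notes would need far more robust handling.

```python
import re

# Hypothetical lookup table (an assumption for this sketch):
# common dosing-frequency abbreviations mapped to doses per day.
FREQUENCY_PER_DAY = {"q.d.": 1, "b.i.d.": 2, "t.i.d.": 3, "q.i.d.": 4}

# Matches shorthand such as "post-op #6" or "postop #6".
POST_OP_PATTERN = re.compile(r"post-?op\s*#\s*(\d+)", re.IGNORECASE)


def extract_temporal_facts(note):
    """Return (label, value) pairs for the temporal shorthand found in a note."""
    facts = []

    # "post-op #N" is read as N days after surgery.
    for match in POST_OP_PATTERN.finditer(note):
        facts.append(("days_after_surgery", int(match.group(1))))

    lowered = note.lower()
    for abbreviation, times in FREQUENCY_PER_DAY.items():
        # Tolerate a missing trailing period, as in "q.i.d".
        stem = re.escape(abbreviation.rstrip("."))
        pattern = r"(?<![\w.])" + stem + r"\.?(?!\w)"
        if re.search(pattern, lowered):
            facts.append(("doses_per_day", times))

    return facts


if __name__ == "__main__":
    print(extract_temporal_facts("Post-op #6, afebrile. Amoxicillin q.i.d"))
```

A sketch like this handles only fixed shorthand. Vague expressions such as “so far” or “hardly ever”, and time conveyed through verb tense, cannot be captured by pattern matching alone, which is part of why the problem is hard.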

In this article, the authors follow a top-down approach in explaining medical language processing of temporal information. They first discuss the branch of artificial intelligence that studies temporal representation and reasoning (TRR). They briefly describe the main terms and concepts in TRR such as ways to classify time (natural, conventional, or logical), ways to classify the structure of time (line, branching time, circular time, parallel time), and methodologies such as Situation Calculus, Event Calculus, Fluent Calculus, and TimeML. They also provide a timeline of the major research studies of temporal processing in natural language.

Then they narrow their focus from general language to medical narrative text. There has been far less research on processing time in medical language than in general language. The authors describe various approaches to modeling temporal information in medical applications. They chronicle MLP research systems, such as the Linguistic String Project in the 1980s, the Medical Language Extraction and Encoding System (MedLEE) in the 1990s, and MedSyndikate in the 2000s. For each system, they explain whether and how it handles temporal information. They also mention the efforts of groups such as CEN, HL7, and openEHR to standardize temporal concepts in the EMR to facilitate data exchange and integration.

This article provides a very good, high-level introduction to MLP, a field that draws on many disciplines “including philosophy, artificial intelligence (AI), database management, computational linguistics, and biomedical informatics”. Although the authors do not explain many of the terms they use, they provide references for further exploration into this field. The authors make clear the importance and difficulty of processing time references in medical language.

They emphasize the need for continuing research in this field, in areas such as:

  1. Study of how temporal information is conveyed in medical narratives.
  2. Analysis of text at the pragmatic level.
  3. Research into the best ways to evaluate MLP systems’ handling of temporal expressions.
  4. Integration of narrative information with structured database information.

This article could be useful to anyone interested in natural language processing of medical data, such as biomedical informaticists, clinical information system developers, linguists, and clinicians.

Maria Reiss