Difference between revisions of "Applied ontology"

(Created page with "== Ontology == Ontology is the study of entities that exist and the properties of their existence. Applied ontology concerns itself with the application of such principles to bui...")
 
 
(5 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
'''Applied ontology''' (a subfield of [[ontology]]) applies ontological principles to build knowledge frameworks for specific domains such as medicine, biology, and geography. In recent years the development of ontologies has been moving from the realm of artificial-intelligence laboratories to the desktops of domain experts, and ontologies have become common on the World Wide Web.
 +
 
== Ontology ==
 
Ontology is the study of entities that exist and the properties of their existence. Applied ontology concerns itself with the application of such principles to build knowledge frameworks for specific domains such as medicine, biology, geography, etc. The main motivations for building ontologies are to make propositions with precise meaning and to enable computers to automate data processing. ICD-9, SNOMED, and MeSH often serve as examples, but the principles of building those terminologies are not often explored. This article will review the main themes in Applied Ontology, a book by Katherine Munn and Barry Smith, which provides an overview of the philosophies and principles involved in the tasks of knowledge development and curation.
 
  
== Philosophy ==
+
The main motivations for building ontologies are to make propositions with precise meaning and to enable computers to automate data processing. [[ICD]], [[SNOMED]], and MeSH often serve as examples, but the principles of building those terminologies are not often explored. This article will review the main themes in Applied Ontology, a book by Katherine Munn and Barry Smith, which provides an overview of the philosophies and principles involved in the tasks of knowledge development and curation.  
Munn and Smith consider two prevailing schools of thought: the metaphysical (term-oriented, or realism) and the epistemological (concept-oriented, or conceptualism). “Aristotle believed that reality in its entirety could be represented with a single system.” Kant, however, believed that we reconcile reality through the concepts we form in our minds. For example, how can we perceive that a beaver has a flat tail without first defining a beaver as a rodent with a flat tail? The other main argument for conceptualism is that reality is too complex to be adequately specified under one terminology. An argument against conceptualism is that the ontology must then accommodate entities that have no counterpart in reality, such as caloric or the ether.
+
The response to this was that realism can be fallible and that multiple perspectives can be true. Fallible realism states that we can assert propositions we believe to be true, and change our beliefs, or the ontology, when we find evidence indicating otherwise. Perspectivism acknowledges that facts can be partitioned in ways that are different but nonetheless true concerning reality. The realist perspective also treats terms about non-existent entities as empty terms. Given this, Munn and Smith prefer a realist stance when developing a knowledge base.
+
Definition
+
An ontology is made up of its entities and the relationships between them. An entity is specified by its preferred term, synonyms, and definition. Terms can refer either to real-world entities or to concepts concerning real-world entities. If we want to build a usable ontology, Cimino’s desiderata propose some useful properties to adhere to:
+
  
* Concepts which form the nodes of the terminology must correspond to at least one meaning (non-vagueness)
+
Why would someone want to develop an ontology? Some of the reasons are:
* Concepts must correspond to no more than one meaning (non-ambiguity)
+
* Meanings must themselves correspond to no more than one concept (non-redundancy)
+
  
Munn and Smith observe that Cimino uses the term “concept” to mean a “plurality of words”, and that context can influence the meaning of those words.
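The three desiderata listed above can be read as constraints on the mapping between concepts and meanings. The following is a minimal sketch, assuming a terminology is represented as a dictionary from concept identifiers to sets of meanings. The function name `check_desiderata`, the representation, and the sample identifiers are hypothetical and are not taken from Cimino or from Munn and Smith.

```python
from collections import defaultdict


def check_desiderata(concept_meanings):
    """Return violations of non-vagueness, non-ambiguity and non-redundancy.

    concept_meanings maps a concept identifier to the set of meanings
    assigned to it (an illustrative representation, not a standard one).
    """
    violations = {"vague": [], "ambiguous": [], "redundant": []}
    concepts_by_meaning = defaultdict(list)

    for concept, meanings in concept_meanings.items():
        if len(meanings) == 0:
            violations["vague"].append(concept)       # concept has no meaning
        elif len(meanings) > 1:
            violations["ambiguous"].append(concept)   # concept has several meanings
        for meaning in meanings:
            concepts_by_meaning[meaning].append(concept)

    for meaning, concepts in concepts_by_meaning.items():
        if len(concepts) > 1:
            violations["redundant"].append(meaning)   # meaning has several concepts
    return violations


# Hypothetical toy terminology.
toy = {
    "C1": {"common cold"},
    "C2": {"common cold", "low temperature"},
    "C3": set(),
}
report = check_desiderata(toy)
print(report)
```

In this toy input, `C3` is vague, `C2` is ambiguous, and the meaning "common cold" is redundant because two concepts carry it.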
+
      • To share common understanding of the structure of information among people or software agents
 +
      • To enable reuse of domain knowledge
 +
      • To make domain assumptions explicit
 +
      • To separate domain knowledge from the operational knowledge
 +
      • To analyze domain knowledge3
  
The other aspects to consider are whether a reference to an entity is general or specific, and whether the entity is an existent or an occurrent. A general entity or term refers to the universal representation of itself, whereas a specific entity is tied to a particular instantiation. For example, if one were to consider the term cancer, one would think of all the general characteristics ascribed to that disease. If one were to consider an individual’s cancer, one would view the disease with respect to its instantiation in that individual. An existent refers to items that have no temporal constraint, that is, no beginning or ending. An occurrent applies a time frame to the entity. The concept of a horse is timeless in the realm of the mind until one considers a particular horse that has a concrete birth and death associated with it.
+
''Sharing common understanding of the structure of information among people or software agents'' is one of the more common goals in developing ontologies.4 For example, suppose several different Web sites contain medical information or provide medical e-commerce services. If these Web sites share and publish the same underlying ontology of the terms they all use, then computer agents can extract and aggregate information from these different sites. The agents can use this aggregated information to answer user queries or as input data to other applications.3
These aspects are part of a more general set of ontological relations: inhere_in, characterize, instantiate, and exemplify. These represent interactions between existents (universals) and occurrents (individuals) as they occur in Aristotle’s ontological square. Other relationships exist, such as is_a, is_not_a, part_of, has_part, located_in, and has_participant. Further relationships can be devised if they make sense within the context of the problem the ontology is addressing.
+
 
 +
''Enabling reuse of domain knowledge'' was one of the driving forces behind the recent surge in ontology research. For example, models for many different domains need to represent the notion of time. This representation includes the notions of time intervals, points in time, relative measures of time, and so on. If one group of researchers develops such an ontology in detail, others can simply reuse it for their domains.[3]
 +
 
 +
''Making explicit domain assumptions'' underlying an implementation makes it possible to change these assumptions easily if our knowledge about the domain changes. Hard-coding assumptions about the world in programming-language code makes these assumptions not only hard to find and understand but also hard to change, in particular for someone without programming expertise. In addition, explicit specifications of domain knowledge are useful for new users who must learn what terms in the domain mean.[3]
 +
 
 +
''Separating the domain knowledge from the operational knowledge'' is another common use of ontologies. We can describe a task of configuring a product from its components according to a required specification and implement a program that does this configuration independent of the products and components themselves.5
 +
 
 +
''Analyzing domain knowledge'' is possible once a declarative specification of the terms is available. Formal analysis of terms is extremely valuable both when attempting to reuse existing ontologies and when extending them.[6]
 +
 
 +
== Developing the ontology ==
 +
 
 +
Some ontology-design ideas originated from the literature on object-oriented design. However, ontology development is different from designing classes and relations in object-oriented programming. Object-oriented programming centers primarily around methods on classes—a programmer makes design decisions based on the operational properties of a class, whereas an ontology designer makes these decisions based on the structural properties of a class.3
 +
 
 +
Pinto and Martins discuss generic ontology development stages: specification, conceptualization, formalization, implementation, and maintenance. Kuziemsky expanded upon these concepts for ontology-based health information system design. He described a participatory design (PD) method, used not only to design a product but also to ensure its usability and utility by engaging end users in the design process, and a grounded theory (GT) methodology, a research method that uses a systematic set of procedures to develop an inductively derived theory about a phenomenon. Kuziemsky describes a hybrid GT-PD approach that draws upon the strengths of both. PD provides a means of user engagement to obtain a rich perspective on clinical practice, and it requires domain experts. GT codes data into concepts and categories based upon an understanding of the data. The methodological thoroughness of the GT-PD approach enables the ontology concepts and problem-solving approaches to be traced back to the source data. That traceability provides realistic validation of the ontology and potentially promotes sharing of the ontology in different settings.[7][8]
 +
 
 +
There is no single “correct” way or methodology for developing ontologies. The methodology described here uses an iterative approach. The best solution almost always depends on the application you have in mind and the extensions you anticipate. Concepts in the ontology should be close to the objects (physical or logical) and relationships in your domain of interest.[3]
 +
 
 +
'''Step 1. Determine the domain and scope of the ontology'''
 +
We suggest starting the development of an ontology by defining its domain and scope. One way to determine the scope of the ontology is to sketch a list of questions that a knowledge base built on the ontology should be able to answer, known as competency questions.
 +
That is, answer several basic questions:
 +
 
 +
      • What is the domain that the ontology will cover?
 +
      • For what are we going to use the ontology?
 +
      • For what types of questions should the information in the ontology provide answers?
 +
      • Who will use and maintain the ontology?9
 +
 
 +
'''Step 2. Consider reusing existing ontologies'''
 +
It is almost always worth considering what someone else has done and checking if we can refine and extend existing sources for our particular domain and task. Reusing existing ontologies may be a requirement if our system needs to interact with other applications that have already committed to particular ontologies or controlled vocabularies.
 +
 
 +
'''Step 3. Enumerate important terms in the ontology'''
 +
It is useful to write down a list of all terms we would like either to make statements about or to explain to a user. What are the terms we would like to talk about? What properties do those terms have? What would we like to say about those terms?3
 +
 
 +
'''Step 4. Define the classes and the class hierarchy'''
 +
There are several possible approaches in developing a class hierarchy. [10] A top-down development process starts with the definition of the most general concepts in the domain and subsequent specialization of the concepts. A bottom-up development process starts with the definition of the most specific classes, the leaves of the hierarchy, with subsequent grouping of these classes into more general concepts. A combination development process is a combination of the top-down and bottom-up approaches: We define the more salient concepts first and then generalize and specialize them appropriately.3
 +
 
 +
'''Step 5. Define the properties of classes—slots'''
 +
The classes alone will not provide enough information to answer the competency questions from Step 1. Once we have defined some of the classes, we must describe the internal structure of concepts.
 +
 
 +
'''Step 6. Define the facets of the slots'''
 +
Slots can have different facets describing the value type, allowed values, the number of values (cardinality), and other features of the values the slot can take. Slot cardinality defines how many values a slot can have. The value-type facet describes what types of values can fill the slot. Here is a list of the more common value types:
 +
 
 +
      • String is the simplest value type which is used for slots such as name: the value is a simple string
 +
      • Number (sometimes more specific value types of Float and Integer are used) describes slots with numeric values.  
 +
      • Boolean slots are simple yes–no flags.
 +
      • Enumerated slots specify a list of specific allowed values for the slot.
 +
      • Instance-type slots allow definition of relationships between individuals.3
 +
 
 +
'''Step 7. Create instances'''
 +
The last step is creating individual instances of classes in the hierarchy. Defining an individual instance of a class requires (1) choosing a class, (2) creating an individual instance of that class, and (3) filling in the slot values.3
  
 
== Use Case ==
 
 +
 
The goal of specifying these relations is to establish the most general language possible with which to perform set operations or searches on the data, such as first order logic (FOL). FOL is composed of individual terms, predicates, logical connectives, and quantifiers. Using RxNorm as an example, RxNorm has a set of terms related to medications and the relationships between them. So, we can search for:  
 
 
SBDF has_tradename(SCDF has_ingredient (x))   
 
Line 23: Line 71:
  
 
== Conclusion ==
 
Ontologies are similar to applications of natural language processing (NLP) in that we develop both to discover, structure, and automate the knowledge we have accumulated so far. The main difference is in approach: ontologies specify knowledge, whereas NLP discovers it, usually through the statistical interplay of words in a corpus. While ontologies concern themselves with the broader task of representing entities in the real world, they still play a key role in enabling the development of semantically rich user interfaces in clinical information systems. SNOMED, ICD-9, RxNorm, and similar terminologies represent initial steps toward making the processing of large volumes of clinical data feasible.
+
 
 +
Ontologies are similar to applications of [[natural language processing (NLP)]] in that we develop both to discover, structure, and automate the knowledge we have accumulated so far. The main difference is in approach: ontologies specify knowledge, whereas NLP discovers it, usually through the statistical interplay of words in a corpus. While ontologies concern themselves with the broader task of representing entities in the real world, they still play a key role in enabling the development of semantically rich user interfaces in clinical information systems. SNOMED, ICD-9, [[RxNorm]], and similar terminologies represent initial steps toward making the processing of large volumes of clinical data feasible.
 +
 
 +
A well-designed ontology can address the lack of a unifying reference terminology by providing an adequate link between the data models and the terminology models. A well-designed ontology is easily extensible, localizable, and maintainable. After an initial version of the ontology has been defined, it can be evaluated and debugged by using it in applications or problem-solving methods, by discussing it with experts in the field, or both. Design, and therefore evaluation, is an iterative process that runs throughout the ontology lifecycle, in both management and support activities. The purpose of ontology evaluation should be to assess a given ontology against a particular application criterion, in order to determine which ontology best fits a given purpose.
  
 
== References ==
 
[1] Munn K, Smith B. Applied ontology: an introduction. ontos verlag; 2008.
 
  
[2] Cimino JJ, others. Desiderata for controlled medical vocabularies in the twenty-first century. Methods of Information in Medicine-Methodik der Information in der Medizin. 1998;37(4):394–403.  
+
# Munn K, Smith B. Applied ontology: an introduction. ontos verlag; 2008.
 +
# Cimino JJ. Desiderata for controlled medical vocabularies in the twenty-first century. Methods of Information in Medicine. 1998;37(4):394–403.
 +
# Noy NF, McGuinness DL. Ontology Development 101: A Guide to Creating Your First Ontology. Stanford Knowledge Systems Laboratory Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880; 2001.
 +
# Gruber, T. R. A translation approach to portable ontology specifications. Stanford University. Computer Science Dept. Knowledge Systems Laboratory. 1993.
 +
# McGuinness DL, Wright J. Conceptual Modeling for Configuration: A Description Logic-based Approach. Artificial Intelligence for Engineering Design, Analysis, and Manufacturing - special issue on Configuration. 1998.
 +
# McGuinness DL, Fikes R, Rice J, Wilder S. An Environment for Merging and Testing Large Ontologies. Principles of Knowledge Representation and Reasoning: Proceedings of the Seventh International Conference (KR2000). A. G. Cohn, F. Giunchiglia and B. Selman, editors. San Francisco, CA, Morgan Kaufmann Publishers. 2000.
 +
# Pinto SF, Martins JP. Ontologies: how can they be built? Knowledge Inform Syst 2004;6:441–64
 +
# Kuziemsky CE, Lau F. A four stage approach for ontology-based health information system design. Artificial Intelligence in Medicine, 2010;50(3):133-148.
 +
# Gruninger M, Fox M. Methodology for the Design and Evaluation of Ontologies. In: Proceedings of the Workshop on Basic Ontological Issues in Knowledge Sharing, IJCAI-95, Montreal. 1995.
 +
# Uschold, M. and Gruninger, M. Ontologies: Principles, Methods and Applications. Knowledge Engineering Review 1996;11(2).
  
  
Line 34: Line 93:
  
 
[[Category:BMI512-FALL-11]]
 
 +
 +
Submitted by Travis Gamble
 +
 +
[[Category:BMI512-FALL-12]]

Latest revision as of 20:26, 26 November 2012

Applied ontology (a subfield of ontology) applies ontological principles to build knowledge frameworks for specific domains such as medicine, biology, and geography. In recent years the development of ontologies has been moving from the realm of artificial-intelligence laboratories to the desktops of domain experts, and ontologies have become common on the World Wide Web.

Ontology

The main motivations for building ontologies are to make propositions with precise meaning and to enable computers to automate data processing. ICD, SNOMED, and MeSH often serve as examples, but the principles of building those terminologies are not often explored. This article will review the main themes in Applied Ontology, a book by Katherine Munn and Barry Smith, which provides an overview of the philosophies and principles involved in the tasks of knowledge development and curation.

Why would someone want to develop an ontology? Some of the reasons are:

     •	To share common understanding of the structure of information among people or software agents
     •	To enable reuse of domain knowledge
     •	To make domain assumptions explicit
     •	To separate domain knowledge from the operational knowledge
     •	To analyze domain knowledge [3]

Sharing common understanding of the structure of information among people or software agents is one of the more common goals in developing ontologies.[4] For example, suppose several different Web sites contain medical information or provide medical e-commerce services. If these Web sites share and publish the same underlying ontology of the terms they all use, then computer agents can extract and aggregate information from these different sites. The agents can use this aggregated information to answer user queries or as input data to other applications.[3]
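As a rough illustration of the aggregation scenario, the sketch below assumes that two sites annotate their records with identifiers from the same shared ontology. The term identifiers, site data, and the function name `aggregate` are hypothetical and serve only to show why a shared vocabulary makes merging straightforward.

```python
from collections import defaultdict

# Hypothetical shared ontology: term identifier -> preferred label.
SHARED_TERMS = {"T001": "aspirin", "T002": "ibuprofen"}

# Hypothetical records published by two sites, annotated with shared identifiers.
site_a = [{"term": "T001", "price": 4.50}, {"term": "T002", "price": 6.00}]
site_b = [{"term": "T001", "price": 3.95}]


def aggregate(*sites):
    """Group prices from all sites by shared term identifier."""
    merged = defaultdict(list)
    for records in sites:
        for record in records:
            if record["term"] in SHARED_TERMS:   # skip terms outside the ontology
                merged[record["term"]].append(record["price"])
    return dict(merged)


offers = aggregate(site_a, site_b)
# Answer a simple user query: the lowest price per product label.
cheapest = {SHARED_TERMS[t]: min(prices) for t, prices in offers.items()}
print(cheapest)
```

Because both sites use the same identifiers, the agent needs no site-specific mapping step before combining the records.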

Enabling reuse of domain knowledge was one of the driving forces behind the recent surge in ontology research. For example, models for many different domains need to represent the notion of time. This representation includes the notions of time intervals, points in time, relative measures of time, and so on. If one group of researchers develops such an ontology in detail, others can simply reuse it for their domains.[3]

Making explicit domain assumptions underlying an implementation makes it possible to change these assumptions easily if our knowledge about the domain changes. Hard-coding assumptions about the world in programming-language code makes these assumptions not only hard to find and understand but also hard to change, in particular for someone without programming expertise. In addition, explicit specifications of domain knowledge are useful for new users who must learn what terms in the domain mean.[3]

Separating the domain knowledge from the operational knowledge is another common use of ontologies. We can describe a task of configuring a product from its components according to a required specification and implement a program that does this configuration independent of the products and components themselves.[5]
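The configuration example can be sketched as follows, assuming the domain knowledge is a declarative catalog and the operational knowledge is a generic search over it. The catalog contents, the budget constraint, and the function name `configure` are hypothetical; the cited work uses a description-logic approach rather than this brute-force enumeration.

```python
from itertools import product

# Domain knowledge: a declarative catalog (hypothetical example data).
CATALOG = {
    "cpu": [{"name": "cpu-basic", "cost": 100}, {"name": "cpu-fast", "cost": 250}],
    "memory": [{"name": "8GB", "cost": 40}, {"name": "16GB", "cost": 80}],
}


def configure(catalog, max_cost):
    """Operational knowledge: pick one component per slot within a budget.

    Nothing here refers to CPUs or memory; the function relies only on the
    catalog's structure (slot name -> list of components with a cost).
    """
    slots = sorted(catalog)
    results = []
    for combo in product(*(catalog[s] for s in slots)):
        total = sum(component["cost"] for component in combo)
        if total <= max_cost:
            chosen = {s: c["name"] for s, c in zip(slots, combo)}
            results.append((chosen, total))
    return results


configs = configure(CATALOG, max_cost=200)
for chosen, total in configs:
    print(chosen, total)
```

Swapping in a catalog for a different product line would require no change to `configure`, which is the separation the paragraph describes.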

Analyzing domain knowledge is possible once a declarative specification of the terms is available. Formal analysis of terms is extremely valuable both when attempting to reuse existing ontologies and when extending them.[6]

Developing the ontology

Some ontology-design ideas originated from the literature on object-oriented design. However, ontology development is different from designing classes and relations in object-oriented programming. Object-oriented programming centers primarily on methods on classes: a programmer makes design decisions based on the operational properties of a class, whereas an ontology designer makes these decisions based on the structural properties of a class.[3]

Pinto and Martins discuss generic ontology development stages: specification, conceptualization, formalization, implementation, and maintenance. Kuziemsky expanded upon these concepts for ontology-based health information system design. He described a participatory design (PD) method, used not only to design a product but also to ensure its usability and utility by engaging end users in the design process, and a grounded theory (GT) methodology, a research method that uses a systematic set of procedures to develop an inductively derived theory about a phenomenon. Kuziemsky describes a hybrid GT-PD approach that draws upon the strengths of both. PD provides a means of user engagement to obtain a rich perspective on clinical practice, and it requires domain experts. GT codes data into concepts and categories based upon an understanding of the data. The methodological thoroughness of the GT-PD approach enables the ontology concepts and problem-solving approaches to be traced back to the source data. That traceability provides realistic validation of the ontology and potentially promotes sharing of the ontology in different settings.[7][8]

There is no single “correct” way or methodology for developing ontologies. The methodology described here uses an iterative approach. The best solution almost always depends on the application you have in mind and the extensions you anticipate. Concepts in the ontology should be close to the objects (physical or logical) and relationships in your domain of interest.[3]

Step 1. Determine the domain and scope of the ontology

We suggest starting the development of an ontology by defining its domain and scope. One way to determine the scope is to sketch a list of questions that a knowledge base built on the ontology should be able to answer, known as competency questions. To define the domain and scope, answer several basic questions:

     •	What is the domain that the ontology will cover?
     •	For what are we going to use the ontology?
     •	For what types of questions should the information in the ontology provide answers?
     •	Who will use and maintain the ontology? [9]

Step 2. Consider reusing existing ontologies

It is almost always worth considering what someone else has done and checking if we can refine and extend existing sources for our particular domain and task. Reusing existing ontologies may be a requirement if our system needs to interact with other applications that have already committed to particular ontologies or controlled vocabularies.

Step 3. Enumerate important terms in the ontology

It is useful to write down a list of all terms we would like either to make statements about or to explain to a user. What are the terms we would like to talk about? What properties do those terms have? What would we like to say about those terms?[3]

Step 4. Define the classes and the class hierarchy

There are several possible approaches in developing a class hierarchy.[10] A top-down development process starts with the definition of the most general concepts in the domain and subsequent specialization of the concepts. A bottom-up development process starts with the definition of the most specific classes, the leaves of the hierarchy, with subsequent grouping of these classes into more general concepts. A combination development process is a combination of the top-down and bottom-up approaches: we define the more salient concepts first and then generalize and specialize them appropriately.[3]
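A class hierarchy of the kind described in Step 4 can be sketched as a table of is_a links. The example below assumes single inheritance for brevity, and the class names are hypothetical. A top-down process would add "Thing" and "Disease" first, while a bottom-up process would start from "Pneumonia" and group upward; either way the resulting table is the same.

```python
# Hypothetical is_a links: child class -> parent class (single inheritance assumed).
IS_A = {
    "Disease": "Thing",
    "InfectiousDisease": "Disease",
    "Pneumonia": "InfectiousDisease",
    "Drug": "Thing",
}


def ancestors(cls, is_a=IS_A):
    """Walk is_a links upward and return the superclasses, nearest first."""
    chain = []
    while cls in is_a:
        cls = is_a[cls]
        chain.append(cls)
    return chain


def is_subclass_of(cls, candidate, is_a=IS_A):
    """True if candidate is cls itself or one of its superclasses."""
    return cls == candidate or candidate in ancestors(cls, is_a)


print(ancestors("Pneumonia"))
print(is_subclass_of("Pneumonia", "Disease"))
print(is_subclass_of("Drug", "Disease"))
```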

Step 5. Define the properties of classes (slots)

The classes alone will not provide enough information to answer the competency questions from Step 1. Once we have defined some of the classes, we must describe the internal structure of concepts.

Step 6. Define the facets of the slots

Slots can have different facets describing the value type, allowed values, the number of values (cardinality), and other features of the values the slot can take. Slot cardinality defines how many values a slot can have. The value-type facet describes what types of values can fill the slot. Here is a list of the more common value types:

     •	String is the simplest value type and is used for slots such as name: the value is a simple string.
     •	Number (sometimes the more specific value types Float and Integer are used) describes slots with numeric values.
     •	Boolean slots are simple yes–no flags.
     •	Enumerated slots specify a list of specific allowed values for the slot.
     •	Instance-type slots allow definition of relationships between individuals.[3]
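The facets described in Step 6 can be sketched as fields on a slot definition with a validation method. The `Slot` class, its field names, and the sample slots below are hypothetical and do not reproduce the data model of any particular ontology editor.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Slot:
    """A slot with value-type, allowed-values and cardinality facets."""
    name: str
    value_type: type
    min_card: int = 0
    max_card: Optional[int] = None     # None means no upper bound
    allowed: Optional[Tuple] = None    # enumerated values, if any

    def validate(self, values):
        """Return a list of human-readable facet violations."""
        problems = []
        if len(values) < self.min_card:
            problems.append(f"{self.name}: needs at least {self.min_card} value(s)")
        if self.max_card is not None and len(values) > self.max_card:
            problems.append(f"{self.name}: allows at most {self.max_card} value(s)")
        for v in values:
            if not isinstance(v, self.value_type):
                problems.append(f"{self.name}: {v!r} is not {self.value_type.__name__}")
            elif self.allowed is not None and v not in self.allowed:
                problems.append(f"{self.name}: {v!r} is not an allowed value")
        return problems


# A String slot with cardinality exactly 1, and an Enumerated slot.
name = Slot("name", str, min_card=1, max_card=1)
route = Slot("route", str, allowed=("oral", "topical"))

print(name.validate(["Aspirin"]))
print(name.validate([]))
print(route.validate(["nasal"]))
```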

Step 7. Create instances

The last step is creating individual instances of classes in the hierarchy. Defining an individual instance of a class requires (1) choosing a class, (2) creating an individual instance of that class, and (3) filling in the slot values.[3]
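The three actions named in Step 7 can be sketched in a few lines, assuming each class simply lists the slots an instance must fill. The class table, the function name `create_instance`, and the sample drug are hypothetical.

```python
# Hypothetical class definitions: class name -> required slot names.
CLASSES = {"Drug": ["name", "route"]}


def create_instance(class_name, **slot_values):
    """(1) Choose a class, (2) create an instance, (3) fill in slot values."""
    if class_name not in CLASSES:
        raise KeyError(f"unknown class: {class_name}")
    missing = [s for s in CLASSES[class_name] if s not in slot_values]
    if missing:
        raise ValueError(f"missing slot values: {missing}")
    return {"class": class_name, "slots": dict(slot_values)}


aspirin = create_instance("Drug", name="Aspirin", route="oral")
print(aspirin)
```

Rejecting an instance with unfilled required slots is a design choice made for this sketch; some ontology tools allow incomplete instances.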

Use Case

The goal of specifying these relations is to establish the most general language possible with which to perform set operations or searches on the data, such as first-order logic (FOL). FOL is composed of individual terms, predicates, logical connectives, and quantifiers. RxNorm, for example, has a set of terms related to medications and the relationships between them. So, we can search for:

SBDF has_tradename(SCDF has_ingredient (x))

to find the semantic branded drug forms of the semantic clinical drug forms containing the ingredient x. This representation is useful not only for finding the ingredients of a given drug, but also for moving seamlessly between the different forms and representations a drug or drug order might take on.
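The query above composes two relations: find every SCDF that has ingredient x, then follow has_tradename from those SCDFs to their SBDFs. The sketch below shows that composition over a handful of toy triples. The identifiers are made up and only mimic the shape of RxNorm relationships; they are not RxNorm data, and the helper names are hypothetical.

```python
# Toy (subject, relation, object) triples with made-up identifiers.
TRIPLES = [
    ("scdf:ingredientX-oral-tablet", "has_ingredient", "in:ingredientX"),
    ("scdf:ingredientX-oral-solution", "has_ingredient", "in:ingredientX"),
    ("scdf:ingredientY-oral-tablet", "has_ingredient", "in:ingredientY"),
    ("scdf:ingredientX-oral-tablet", "has_tradename", "sbdf:BrandA-oral-tablet"),
    ("scdf:ingredientY-oral-tablet", "has_tradename", "sbdf:BrandB-oral-tablet"),
]


def subjects(relation, obj):
    """All subjects s such that relation(s, obj) holds."""
    return {s for s, r, o in TRIPLES if r == relation and o == obj}


def objects(subject, relation):
    """All objects o such that relation(subject, o) holds."""
    return {o for s, r, o in TRIPLES if s == subject and r == relation}


def branded_forms_containing(ingredient):
    """SBDFs b for which some SCDF s has the ingredient and has_tradename(s, b)."""
    result = set()
    for scdf in subjects("has_ingredient", ingredient):
        result |= objects(scdf, "has_tradename")
    return result


print(branded_forms_containing("in:ingredientX"))
```

In FOL terms, the function returns every b for which there exists an s with has_ingredient(s, x) and has_tradename(s, b).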

Conclusion

Ontologies are similar to applications of natural language processing (NLP) in that we develop both to discover, structure, and automate the knowledge we have accumulated so far. The main difference is in approach: ontologies specify knowledge, whereas NLP discovers it, usually through the statistical interplay of words in a corpus. While ontologies concern themselves with the broader task of representing entities in the real world, they still play a key role in enabling the development of semantically rich user interfaces in clinical information systems. SNOMED, ICD-9, RxNorm, and similar terminologies represent initial steps toward making the processing of large volumes of clinical data feasible.

A well-designed ontology can address the lack of a unifying reference terminology by providing an adequate link between the data models and the terminology models. A well-designed ontology is easily extensible, localizable, and maintainable. After an initial version of the ontology has been defined, it can be evaluated and debugged by using it in applications or problem-solving methods, by discussing it with experts in the field, or both. Design, and therefore evaluation, is an iterative process that runs throughout the ontology lifecycle, in both management and support activities. The purpose of ontology evaluation should be to assess a given ontology against a particular application criterion, in order to determine which ontology best fits a given purpose.

References

  1. Munn K, Smith B. Applied Ontology: An Introduction. Ontos Verlag; 2008.
  2. Cimino JJ. Desiderata for controlled medical vocabularies in the twenty-first century. Methods of Information in Medicine. 1998;37(4):394–403.
  3. Noy NF, McGuinness DL. Ontology Development 101: A Guide to Creating Your First Ontology. Stanford Knowledge Systems Laboratory Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880; 2001.
  4. Gruber TR. A translation approach to portable ontology specifications. Stanford University, Computer Science Department, Knowledge Systems Laboratory; 1993.
  5. McGuinness DL, Wright J. Conceptual Modeling for Configuration: A Description Logic-based Approach. Artificial Intelligence for Engineering Design, Analysis, and Manufacturing (special issue on Configuration). 1998.
  6. McGuinness DL, Fikes R, Rice J, Wilder S. An Environment for Merging and Testing Large Ontologies. In: Cohn AG, Giunchiglia F, Selman B, editors. Principles of Knowledge Representation and Reasoning: Proceedings of the Seventh International Conference (KR2000). San Francisco, CA: Morgan Kaufmann Publishers; 2000.
  7. Pinto SF, Martins JP. Ontologies: how can they be built? Knowledge and Information Systems. 2004;6:441–64.
  8. Kuziemsky CE, Lau F. A four stage approach for ontology-based health information system design. Artificial Intelligence in Medicine. 2010;50(3):133–148.
  9. Gruninger M, Fox M. Methodology for the Design and Evaluation of Ontologies. In: Proceedings of the Workshop on Basic Ontological Issues in Knowledge Sharing, IJCAI-95, Montreal. 1995.
  10. Uschold M, Gruninger M. Ontologies: Principles, Methods and Applications. Knowledge Engineering Review. 1996;11(2).


Submitted by Nathan Bahr

Submitted by Travis Gamble