Language Design

In this section, we design an ontology language that satisfies the requirements of our application and agrees with the high-level design. The section does not cover formal syntax and semantics, and thus does not deliver a complete specification. However, the description here lends itself easily to a formal specification. We start with a small description language. This language is then extended to a generic frame-based description language. Subsequently, we add conceptualization components to this language. In a fourth iteration, we add a modularity construct. Finally, we discuss some possible enhancements.

A Description Language

The purpose of this language is to denote what exists: entities and relations among entities. We limit our attention to unary and binary predicates. Thus the logical constants are those of entities and relations. Description sentences in the language consist of entity descriptions and of unary or binary relations among entities.


john;
mary;

These sentences assert that the entities referenced by the constants john and mary exist.

Unary predicates are written in prefix form:


Girl mary;

Binary predicates are written in infix form or in an arrow form:


john Loves mary;
Loves john -> mary;
Loves mary <- john;

There are also logical variables in the base language. A variable is denoted by the variable specifier var: the expression var symbol refers to the variable named symbol. The following unsaturated expression illustrates its use.


var x Loves mary;

We have described the atomic formulas in the language. We are leaving compositional formulas for negation, conjunction, disjunction and implication for a subsequent discussion.
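The atomic formulas above can be summarized by a small normalizer. The following Python sketch is ours and not part of the language design: the names Atom and parse_atom and the "?x" notation for variables are assumptions made for illustration. It handles only bare symbols (no quoted strings) and shows that the infix form and the two arrow forms denote the same binary fact.

```python
from typing import NamedTuple, Optional, Tuple


class Atom(NamedTuple):
    predicate: Optional[str]  # None for a bare existence assertion such as "john;"
    args: Tuple[str, ...]     # one or two terms; variables are written "?name"


def parse_atom(sentence: str) -> Atom:
    """Normalize one atomic sentence of the base language into an Atom."""
    tokens = sentence.strip().rstrip(";").split()
    # Fold "var x" into the single term "?x" (our notation, not the language's).
    terms = []
    it = iter(tokens)
    for tok in it:
        terms.append("?" + next(it) if tok == "var" else tok)
    if len(terms) == 1:                       # john;
        return Atom(None, (terms[0],))
    if len(terms) == 2:                       # Girl mary;
        return Atom(terms[0], (terms[1],))
    if len(terms) == 3:                       # john Loves mary;
        left, pred, right = terms
        return Atom(pred, (left, right))
    if len(terms) == 4 and terms[2] == "->":  # Loves john -> mary;
        return Atom(terms[0], (terms[1], terms[3]))
    if len(terms) == 4 and terms[2] == "<-":  # Loves mary <- john;
        return Atom(terms[0], (terms[3], terms[1]))
    raise ValueError(f"not an atomic sentence: {sentence!r}")


# The three spellings of the binary predicate normalize to the same Atom.
same = {parse_atom(s) for s in ("john Loves mary;",
                                "Loves john -> mary;",
                                "Loves mary <- john;")}
print(len(same))
print(parse_atom("var x Loves mary;"))
```

Keeping one normal form for all surface spellings means that later stages (frames, categories, relations) only need to deal with a predicate and its ordered arguments.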

A Frame-Based Description Language

The basic description language will now be extended to a frame-based one.

The frames in the language are simply scopes which enclose frames and atomic description language formulas. Frames provide a convenient way to gather descriptions by decomposing the symbol space. In the language, a frame is referenced by a constant symbol or a variable. The contents of a frame are described by a collection of formulas in an enclosing scope.

The following is a frame declaration:


frame girls [
  jane;
  cindy;
] 

A frame also allows its components to be referenced. The expression girls.jane refers to jane within the girls frame. Now let us describe John's emotional life in a new way.

john loves girls.jane;

Frames can also contain other frames:


frame To-Fly [
  var flier;
  frame Physical [
    moves flier;
  ]
  frame Mind [
    similar flier -> "Salvador Dali";
  ]
] 

Outside the frame, this is equivalent to:


similar var To-Fly.Mind.flier -> "Salvador Dali";
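Frames as nested scopes can be sketched in a few lines of Python. The class Frame and the function resolve below are hypothetical names of ours. The sketch assumes that a symbol not declared in a frame is looked up in the enclosing frames, which is one way to read the equivalence above: the variable flier is declared in To-Fly but is reachable through To-Fly.Mind.

```python
class Frame:
    """A named scope holding plain entities and nested frames."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.members = {}  # symbol -> nested Frame, or the symbol itself for an entity

    def declare(self, symbol, frame=False):
        self.members[symbol] = Frame(symbol, self) if frame else symbol
        return self.members[symbol]

    def path(self):
        """Dotted path from the top level; the unnamed root contributes nothing."""
        parts, f = [], self
        while f.parent is not None:
            parts.append(f.name)
            f = f.parent
        return ".".join(reversed(parts))

    def find_owner(self, symbol):
        """Innermost frame, starting here and moving outwards, that declares symbol."""
        f = self
        while f is not None:
            if symbol in f.members:
                return f
            f = f.parent
        raise KeyError(symbol)


def resolve(root, dotted):
    """Return the canonical dotted path of the declaration that `dotted` refers to."""
    *scopes, symbol = dotted.split(".")
    frame = root
    for scope in scopes:              # descend through the named frames
        frame = frame.members[scope]
    owner = frame.find_owner(symbol)  # then fall back to enclosing frames
    prefix = owner.path()
    return f"{prefix}.{symbol}" if prefix else symbol


root = Frame("<top>")
girls = root.declare("girls", frame=True)
girls.declare("jane")
girls.declare("cindy")

to_fly = root.declare("To-Fly", frame=True)
to_fly.declare("flier")
to_fly.declare("Physical", frame=True)
to_fly.declare("Mind", frame=True)

print(resolve(root, "girls.jane"))
print(resolve(root, "To-Fly.Mind.flier"))
```

Under these assumptions, both To-Fly.flier and To-Fly.Mind.flier resolve to the single declaration of flier in To-Fly.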

The default scoping rules are similar to those of object-oriented languages. Expressions declared within a scope are visible to the scopes it encloses. Of course, a self-reference construct is required. We conveniently define this to be a special constant called "self" whose interpretation depends on the scope.


frame Container [
  frame element;
  self contains element;
] 

Here, we note two features of the language. First, a frame does not need to be fully defined upon declaration. Second, self refers to the enclosing frame. When scopes are nested more deeply, inner scopes may need to refer to enclosing scopes. We accomplish this by cascading selfs.


frame Animal [
  frame Cute [
    bird1;
    bird2;
    rabbit1;
    self flees-from self.self.Predator;
    frame Predator [
      kitty1;
      kitty2;
    ]
  ]
  frame Predator [
    tiger1;
    lion1;
    eats self -> Animal.Cute;
  ]
] 
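The cascading of selfs in the Animal example can be made concrete with a short Python sketch. The function resolve_self is a hypothetical helper of ours. It assumes that the first self denotes the frame containing the expression and that each further self moves one frame outwards.

```python
def resolve_self(current_frame, expression):
    """Expand the leading self components of `expression`.

    current_frame is the dotted path of the frame that contains the expression.
    """
    here = current_frame.split(".")
    parts = expression.split(".")
    depth = 0
    while depth < len(parts) and parts[depth] == "self":
        depth += 1
    if depth == 0:
        return expression         # no self: leave the reference untouched
    climb = depth - 1             # the first self is the current frame itself
    if climb >= len(here):
        raise ValueError(f"{expression!r} climbs above the outermost frame")
    base = here[:len(here) - climb]
    return ".".join(base + parts[depth:])


# Inside frame Cute of frame Animal:
print(resolve_self("Animal.Cute", "self"))
print(resolve_self("Animal.Cute", "self.self.Predator"))
```

Under this reading, self.self.Predator written inside Cute denotes the Predator frame declared directly in Animal, not the Predator frame nested inside Cute, which would be written self.Predator.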

Minimal Ontology Support

We shall first introduce categories and relations into this language. Categories and relations are frames with special semantics.

Categories

The following is an example category declaration.


category genre;

Declarations do not need definitions to be effective. Note also that declaration order is not significant, unlike in some programming languages such as C++.

A special relation among categories is the subcategory relation. It is a binary relation from the subcategory to the supercategory. In the ontology, a subcategory means only a subset. Assume that the interpretation of a category is a set of entities; the interpretation of a subcategory then consists of a subset of that set. In other words, subcategories may be overlapping and non-exhaustive. Many ontology languages provide explicit support for stating whether the subcategories form an exhaustive partition of a category or whether they are disjoint. In our language, we avoid such constructs and keep to the most general description, which assumes nothing about whether subcategories overlap or completely define the category. In the following example we declare pop to be a subcategory of genre.


category pop;
subcat pop->genre;

At this stage, we introduce something that was not obvious before. The subcat relation, like a category, is in fact a frame. Internally, subcat is perhaps interpreted as a frame subcat-impl. Thus the following declarations are valid.


category rock;
category electronica;
subcat [
  self rock -> genre;
  self electronica -> genre;
]

Indeed, we could even have declared rock and electronica as opaque categories, as in the following:


category [
  rock;
  electronica;
]

Let us now add another genre to this ontology.


category industrial [
  subcat self -> rock;
  subcat self -> electronica;
]

As a corollary of our general subcategory definition, the language supports multiple inheritance. Industrial music is a subcategory of both the rock and electronica genres. Implicit in this definition is that


subcat industrial -> genre;

This need not worry us though. The semantics of subcategory in our language convey this fact without explicit definition. The following material implication holds, which formalizes our subset semantics.

For all categories A, B and C, if A is a subcategory of B and B is a subcategory of C then A is a subcategory of C.

A second theorem states that within a subcategory frame, all that is true within the supercategory frame is true.

For all categories A and B, if A is a subcategory of B then every fact that is in B's frame is true in A's frame.
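The two statements above can be illustrated with a small Python sketch over the genre example. The function names are ours, and the facts attached to genre and rock are hypothetical placeholders that do not appear in the ontology above. The first function computes the transitive closure of the declared subcat edges. The second collects the facts that hold in a category's frame by inheritance.

```python
def supercategories(subcat, category):
    """All categories reachable from `category` through declared subcat edges."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in subcat.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def facts_in(category, subcat, facts):
    """Facts stated in the frame of `category` or inherited from its supercategories."""
    result = set(facts.get(category, ()))
    for parent in supercategories(subcat, category):
        result |= set(facts.get(parent, ()))
    return result


# Declared subcat edges from the examples above.
subcat = {
    "pop": {"genre"},
    "rock": {"genre"},
    "electronica": {"genre"},
    "industrial": {"rock", "electronica"},
}

# Hypothetical facts, used only to show inheritance.
facts = {"genre": {"has-audience"}, "rock": {"uses-guitar"}}

print(sorted(supercategories(subcat, "industrial")))
print(sorted(facts_in("industrial", subcat, facts)))
```

The edge from industrial to genre is never declared, yet genre appears among the supercategories of industrial, which is the implicit subcat industrial -> genre mentioned earlier.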

Another relation among categories is the part-of relation. Part-of is a binary relation from category A to category B, meaning that instances of A are components within the structure of instances of B. Note that there are severe metaphysical mistakes in this definition, and it is thus only partially valid. In our definition, the relation does not generally mean that every instance of A is part of an instance of B; we do mean, however, that every instance of B has a part which is an instance of A. Another visible complication is the arity of the part-of relation. The inverse relation "contains" from category B to category A indicates that instances of B contain instances of A. Arity is treated as a frame variable. The following brief example depicts the structural ontology of a body.


category body [
  category arm;
  category leg;
  arm part-of self [ arity = 2 ];
  self contains leg [ arity = 2 ];
]

The arity = 2 expression should not cause confusion. We assume here that numbers are interpreted internally by the ontology language, which may be considered generally useful. We do not currently address arithmetic expressions, an internal ontology of numbers, or algebra in general. The "=" is just another relation.
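As an illustration of part-of and its inverse, the Python sketch below stores both directions in one normal form and checks an instance against it. The function names, the default arity of 1, and the reading of arity as the exact number of parts of that category are assumptions of ours. The text above only states that arity is a frame variable.

```python
def normalize(subject, relation, obj, arity=1):
    """Return (whole, part, arity) for either direction of the relation."""
    if relation == "part-of":
        return (obj, subject, arity)
    if relation == "contains":
        return (subject, obj, arity)
    raise ValueError(f"not a part-whole relation: {relation!r}")


# The two sentences of the body frame, with self replaced by body.
body_structure = [
    normalize("arm", "part-of", "body", arity=2),
    normalize("body", "contains", "leg", arity=2),
]


def conforms(whole, part_counts, structure):
    """Check that an instance of `whole` has the declared number of each part."""
    return all(part_counts.get(part, 0) == n
               for w, part, n in structure if w == whole)


print(body_structure)
print(conforms("body", {"arm": 2, "leg": 2}, body_structure))
print(conforms("body", {"arm": 2, "leg": 1}, body_structure))
```

The check runs from the whole to its parts only, in line with the definition above: every instance of body must have its arms and legs, while nothing is required of arms that are not attached to a body.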

Relations

The subcategory and mereological relations are relations among categories. That is certainly not our sole ontological commitment. There are also relations that are described explicitly. The following is an example of a relation so declared.


relation loves;

This is equivalent to the relations in the base language. Nevertheless, the elegance of relations in an ontology language is due to the fact that a relation is a frame. The arguments of a relation are the variable entities arg1 and arg2, which are contained within the frame of the relation.


relation eats [
  arg1 subcat Animal;
  arg2 subcat Animal;
]

This means that the arguments of eats are constrained to subcategories of Animal. The definitions of relation frames are not limited to constraints on arguments.


relation similar [
  category var common-feature;
  var similarity;
]

This relation may be interpreted as a complex relation that declares arg1 is similar to arg2 in some aspect, and with a given degree of similarity. The following example demonstrates a use of this relation:


cat similar dog [
  category canine-teeth;
  category a-lot;
]

This would mean that cat is similar to dog, to a large degree, in the common feature of canine teeth.

Note that a relation applies to any frame or atomic entity.
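A relation frame with additional frame variables, such as similar, can be pictured as a record whose slots are filled when the relation is used. The Python sketch below is only an illustration: the class Relation and its apply method are hypothetical, and we assume that the entries given at the point of use are matched with the declared frame variables in declaration order, which the description above does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    name: str
    variables: tuple = ()  # frame variables declared besides arg1 and arg2

    def apply(self, arg1, arg2, *values):
        """Build one instance of the relation as a dictionary of bound slots."""
        if len(values) > len(self.variables):
            raise ValueError("more values than declared frame variables")
        fact = {"relation": self.name, "arg1": arg1, "arg2": arg2}
        # Match values with frame variables by declaration order (an assumption).
        fact.update(zip(self.variables, values))
        return fact


loves = Relation("loves")
similar = Relation("similar", ("common-feature", "similarity"))

print(loves.apply("john", "mary"))
print(similar.apply("cat", "dog", "canine-teeth", "a-lot"))
```

Frame variables that receive no value are simply absent from the result, which corresponds to an unsaturated use of the relation.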

Instances

Declaring an entity to be an instance of a certain category or relation is the last step in bringing classification capability to an ontology language. There is quite a difference from other object-centered systems. In fact, the use of a relation given in the previous section is an example of an instance of a relation.

Philosophically speaking, categories and relations are universals, while an entity is a particular. The entities in the basic language are typeless: no concept is associated with them, and we simply do not give any information on what form they have. On the other hand, we would like to use the machinery of the ontological extensions in classifying entities.

Declaring instances puts actual data into our ontology. Here is an example of categorization using the music-related ontology of the preceding sections.


-- a category for music band, using
-- previously defined categories

category band [
  var plays self -> genre;
  relation recorded [
    arg1 subcat band;
    arg2 subcat album;
  ]
]

-- some plain entities

faith-no-more;
nine-inch-nails;

-- let's say what their categories are

band faith-no-more [
  plays self -> rock;
]

band nine-inch-nails [
  self plays industrial;
  album broken;
  recorded self -> broken;
]
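Once instances are declared, the subset semantics of subcat supports simple classification queries. The Python sketch below restates the data of the example as dictionaries. The function names are ours, and the inference that a band playing industrial also counts as playing rock is our reading of the subcategory semantics, not a rule stated above.

```python
# Declared subcat edges and instance data from the examples above.
subcat = {
    "pop": {"genre"},
    "rock": {"genre"},
    "electronica": {"genre"},
    "industrial": {"rock", "electronica"},
}
instance_of = {"faith-no-more": "band", "nine-inch-nails": "band", "broken": "album"}
plays = {"faith-no-more": "rock", "nine-inch-nails": "industrial"}


def falls_under(category, target):
    """True if category is target or a direct or indirect subcategory of it."""
    if category == target:
        return True
    return any(falls_under(parent, target) for parent in subcat.get(category, ()))


def bands_playing(genre):
    """Bands whose declared genre falls under the requested genre."""
    return sorted(entity for entity, played in plays.items()
                  if instance_of.get(entity) == "band" and falls_under(played, genre))


print(bands_playing("rock"))
print(bands_playing("electronica"))
```

The recursion in falls_under terminates as long as the declared subcat edges contain no cycle, which holds for the genre example.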

Modularity

Modularity in a language facilitates ontology re-use. We define a package construct for modularity: the ontology keyword names a package, and import includes another one. A typical use of packages is to include a more general ontology in order to define a more specific one. Consider the following things ontology:


ontology things;

category thing;
category abstract-thing [
  subcat self -> thing;
]
category concrete-thing [
  subcat self -> thing;
]

Now let us have a look at the plants ontology:


ontology plants;
import things;

category plant [
  subcat self -> concrete-thing;
]

category tree [
  subcat self -> plant;
]
category flower [
  subcat self -> plant;
]
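The effect of import on symbol visibility can be sketched as follows. The dictionary layout and the function names are hypothetical, and we assume that imports are transitive, which the description above leaves open. The sketch lists the categories visible in an ontology and reports supercategories that are referenced but not visible.

```python
ontologies = {
    "things": {
        "imports": [],
        "categories": ["thing", "abstract-thing", "concrete-thing"],
        "subcat": {"abstract-thing": ["thing"], "concrete-thing": ["thing"]},
    },
    "plants": {
        "imports": ["things"],
        "categories": ["plant", "tree", "flower"],
        "subcat": {"plant": ["concrete-thing"], "tree": ["plant"], "flower": ["plant"]},
    },
}


def visible_categories(name, table, seen=None):
    """Categories declared in ontology `name` or in anything it imports."""
    seen = set() if seen is None else seen
    if name in seen:                 # guard against circular imports
        return set()
    seen.add(name)
    visible = set(table[name]["categories"])
    for imported in table[name]["imports"]:
        visible |= visible_categories(imported, table, seen)
    return visible


def unresolved(name, table):
    """Supercategories referenced in `name` that are not visible there."""
    visible = visible_categories(name, table)
    return {parent
            for parents in table[name]["subcat"].values()
            for parent in parents
            if parent not in visible}


print(sorted(visible_categories("plants", ontologies)))
print(unresolved("plants", ontologies))
```

Without the import of things, the reference to concrete-thing in the plants ontology would be reported as unresolved.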

Possible Enhancements

The language we have is now quite capable of expressing ontologies for a class of repositories. Still, there is room for many extensions. In the base language, we may consider adding compositional expressions with negation, conjunction, disjunction and implication. We might also add quantification. However, this would turn the language into predicate logic, and computational problems would appear. On the other hand, such capability might be essential for defining axioms in the language. In fact, frame-based languages are usually derivatives of first-order logic, and thus we might find this desirable. Being able to infer facts rather than defining them manually would be especially desirable. Note, however, that there is already a lot of inference for classification, which is our primary aim.

A second set of enhancements might come from more ontological commitment. In particular, part-whole relations can be given a more serious treatment, along with a facility to distinguish sortal and non-sortal properties. While it is not clear how sortals and non-sortals could be made different kinds of properties in this language, the identity of individuals may be addressed better. For example, there can be an identity relation which defines the properties of a person that are essential for that person's identity. The difficulty is that this would be a k-ary undirected relation, which is not supported by the base language.

Another improvement may be made to the simple ontology inclusion system which we have presented. There can be a symbol renaming mechanism that allows fine-grained control over imported packages. A symbol export mechanism can also be used to control the interface symbols of a package.