Annotation classes

An annotation class is a unique, lexically sorted combination of GO terms from a single ontology. For example, the terms GO:0000001
and GO:0000147 together constitute a biological process class (BPclass). Molecular function classes (MFclass) and cellular component
classes (CCclass) are defined in the same way. In addition to BPclass, MFclass, and CCclass, we define a GO annotation class (GOclass)
as a unique combination of one BPclass, one MFclass, and one CCclass. Each protein and protein family is assigned to one GOclass,
one BPclass, one MFclass, and one CCclass according to the GO terms it is annotated with.
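The definition above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the classes: the names make_class, make_go_class, and GOClass are hypothetical, and the MF and CC terms in the example are placeholders (the root terms of those ontologies).

```python
from typing import Iterable, NamedTuple, Tuple

# An annotation class is modelled as a sorted tuple of unique GO term ids.
AnnotationClass = Tuple[str, ...]


class GOClass(NamedTuple):
    """One BPclass, one MFclass, and one CCclass taken together."""
    bp: AnnotationClass
    mf: AnnotationClass
    cc: AnnotationClass


def make_class(terms: Iterable[str]) -> AnnotationClass:
    """Return the unique, lexically sorted combination of the given terms."""
    return tuple(sorted(set(terms)))


def make_go_class(bp_terms: Iterable[str],
                  mf_terms: Iterable[str],
                  cc_terms: Iterable[str]) -> GOClass:
    """Combine the per-ontology classes of one protein into a GOclass."""
    return GOClass(make_class(bp_terms), make_class(mf_terms), make_class(cc_terms))


# The input order and duplicates do not matter: both calls give the same BPclass.
bp_class = make_class(["GO:0000147", "GO:0000001"])
same_bp_class = make_class(["GO:0000001", "GO:0000147", "GO:0000001"])

# Placeholder MF and CC annotations, for illustration only.
go_class = make_go_class(["GO:0000147", "GO:0000001"], ["GO:0003674"], ["GO:0005575"])
```

Because the terms are sorted and de-duplicated, two proteins with the same set of GO terms in an ontology always end up in the same class, regardless of the order in which the terms were annotated.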

Each annotation class is given a unique numerical id that is stable across releases. Ids of annotation classes
that become obsolete in a release are not reused. New annotation classes receive an id that is numerically larger than the largest id
in the previous release.
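One way to obtain ids with these properties is to keep a registry of every class that has ever received an id. The following is a minimal sketch under that assumption; the function assign_ids and the registry layout are hypothetical and do not describe the actual release procedure.

```python
from typing import Dict, Iterable, Tuple

AnnotationClass = Tuple[str, ...]


def assign_ids(registry: Dict[AnnotationClass, int],
               current_classes: Iterable[AnnotationClass]) -> Dict[AnnotationClass, int]:
    """Return an updated registry that also covers the classes of the new release.

    The registry keeps obsolete classes, so their ids can never be handed out
    again, and existing classes keep the id they already have.
    """
    updated = dict(registry)
    next_id = max(updated.values(), default=0) + 1
    # Sort so that the assignment is reproducible for a given release.
    for cls in sorted(set(current_classes)):
        if cls not in updated:
            updated[cls] = next_id
            next_id += 1
    return updated


release_1 = assign_ids({}, [("GO:0000001",), ("GO:0000001", "GO:0000147")])

# In the second release one class has become obsolete and a new one appears.
release_2 = assign_ids(release_1, [("GO:0000001",), ("GO:0000147",)])
```

In this sketch the obsolete class ("GO:0000001", "GO:0000147") stays in the registry with its old id, and the new class receives the next larger id.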


We developed a hierarchical network of annotation classes in which each class is represented as a node. In this graph, two classes c1 and c2
are connected by an edge if the following two conditions are satisfied:

1. every GO term of c1 is also contained in c2, and
2. c2 contains exactly one term more than c1.

In this case, the edge is directed from c1 to c2. In the resulting network, annotation classes of length 1 are the source nodes, and the leaves
are the annotation classes that are not contained in any class that is one term larger. These leaf classes are called superclasses.
From an ontological perspective, larger classes can be seen as specializations of smaller classes, and the edges can be
labeled as is-a relationships. Superclasses are therefore the most specific classes.
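The network and its superclasses can be sketched as follows. The sketch assumes that an edge from c1 to c2 exists when c1 is contained in c2 and c2 is exactly one term larger, which is how the description of the leaves reads; the function names build_network and find_superclasses are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

AnnotationClass = Tuple[str, ...]


def build_network(classes: Iterable[AnnotationClass]) -> Dict[AnnotationClass, List[AnnotationClass]]:
    """Map every class c1 to the classes c2 that contain it and are one term larger."""
    class_set = set(classes)
    edges: Dict[AnnotationClass, List[AnnotationClass]] = {c: [] for c in class_set}
    for c2 in class_set:
        # Dropping one term from a sorted tuple yields a sorted tuple again,
        # so each candidate c1 can be looked up directly.
        for i in range(len(c2)):
            c1 = c2[:i] + c2[i + 1:]
            if c1 in class_set:
                edges[c1].append(c2)
    return edges


def find_superclasses(edges: Dict[AnnotationClass, List[AnnotationClass]]) -> Set[AnnotationClass]:
    """Return the leaves: classes without an outgoing edge."""
    return {c for c, targets in edges.items() if not targets}


bp_classes = [
    ("GO:0000001",),
    ("GO:0000147",),
    ("GO:0000001", "GO:0000147"),
    ("GO:0000002",),
]
network = build_network(bp_classes)
superclasses = find_superclasses(network)
```

In this example the two-term class is a superclass, and so is ("GO:0000002",), because no larger class in the input contains it. A class of length 1 can therefore be a source node and a leaf at the same time.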

When a protein or protein family is compared to a list of proteins or families, the results table contains the comparisons to the GOclasses that map to at least one
entry in the list. Each GOclass in turn consists of one BPclass, one CCclass, and one MFclass. If the comparison is restricted to superclasses, the result set is
confined to GOclasses that contain at least one superclass.
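The restriction to superclasses amounts to a filter over the GOclasses in the result set. The sketch below assumes that a GOclass "contains" a superclass when at least one of its BPclass, MFclass, or CCclass is a superclass; the function name restrict_to_superclasses and the example data are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

AnnotationClass = Tuple[str, ...]
# A GOclass is modelled here as a (BPclass, MFclass, CCclass) triple.
GOClassTriple = Tuple[AnnotationClass, AnnotationClass, AnnotationClass]


def restrict_to_superclasses(go_classes: Iterable[GOClassTriple],
                             superclasses: Set[AnnotationClass]) -> List[GOClassTriple]:
    """Keep only the GOclasses in which at least one component is a superclass."""
    return [g for g in go_classes if any(part in superclasses for part in g)]


bp_pair = ("GO:0000001", "GO:0000147")
bp_single = ("GO:0000001",)
mf = ("GO:0003674",)  # placeholder MFclass
cc = ("GO:0005575",)  # placeholder CCclass

# Toy input: only the two-term BPclass is treated as a superclass.
known_superclasses = {bp_pair}
result_set = restrict_to_superclasses([(bp_pair, mf, cc), (bp_single, mf, cc)],
                                      known_superclasses)
```

Only the first GOclass is kept, because its BPclass is the only component in the example that is a superclass.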


