Max-Planck-Institut für Informatik

FunSimMat - Functional Similarity Matrix

Query options

FunSimMat offers several types of queries, which are available through the web front-end and the XML-RPC server.
Each query type is described in more detail below. Some options are common to most
of the queries:


Semantic similarity search

This query option measures the semantic similarity of the concepts represented by Gene Ontology (GO) terms.
Enter a space- or tab-delimited list of the GO terms to compare into the text box of the query form,
e.g. GO:0000001 GO:0004567. The results table contains the all-against-all comparison of the GO terms based on four
similarity measures: simRel, Lin, Resnik, and Jiang & Conrath. See the Results page for more information on the
results and the Scores page for more information on the scores.
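
The input format described above can be checked before a query is submitted. The following is a minimal sketch, not part of FunSimMat: the function names are hypothetical, and both the removal of duplicates and the inclusion of self-comparisons in the all-against-all list are assumptions.

```python
import re
from itertools import product

# GO accessions have the form "GO:" followed by seven digits.
GO_ID = re.compile(r"^GO:\d{7}$")


def parse_go_terms(text):
    """Split a space- or tab-delimited string into validated GO accessions.

    Duplicates are dropped while the input order is kept (an assumption,
    the server may treat duplicates differently).
    """
    terms = []
    for token in text.split():  # split() handles spaces, tabs and newlines
        if not GO_ID.match(token):
            raise ValueError(f"not a GO accession: {token!r}")
        if token not in terms:
            terms.append(token)
    return terms


def all_against_all(terms):
    """Return every ordered pair of terms, including self-comparisons."""
    return list(product(terms, repeat=2))


terms = parse_go_terms("GO:0000001\tGO:0004567")
for left, right in all_against_all(terms):
    print(left, right)
```

Validating the list locally catches malformed accessions early, before they reach the query form or the XML-RPC server.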

Comparing one protein / protein family with a list of proteins / protein families

This query option compares one protein, protein family, or disease with a given list of proteins,
protein families, or diseases. There are several ways to specify this list. First, a list of accessions may
be entered into a text field. Second, a file with accession numbers may be uploaded. The file should contain only the
accession numbers, separated by spaces or tabs. Third, entering an OMIM accession selects all proteins annotated in
UniProtKB with this OMIM entry. Fourth, an arbitrary taxon may be entered in the text field. The taxon must be given as its
NCBI Taxonomy accession number. Fifth, a pre-defined taxon can be selected from the drop-down list. Sixth, the
query protein, protein family, or disease can be compared to the whole database. The results table contains several scores.
The BPscore measures the similarity of the biological processes annotated to the two proteins, protein families, or diseases.
Likewise, the MFscore and the CCscore measure the similarity of the molecular functions and the cellular components,
respectively. The funSim scores are computed from the BPscore and the MFscore. The funSimAll scores combine all three
scores (BPscore, MFscore, and CCscore) and measure the overall functional similarity of the two proteins, protein
families, or diseases.
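
This page does not state how the individual scores are combined. As a hedged illustration only, the sketch below assumes a mean-of-squares combination in which each score is first divided by its maximum possible value; the function names, the parameter names, and the default maximum of 1.0 are assumptions. The Scores page remains the authoritative definition.

```python
def fun_sim(bp_score, mf_score, max_bp=1.0, max_mf=1.0):
    """Combine BPscore and MFscore (assumed form: mean of squared, normalized scores)."""
    return 0.5 * ((bp_score / max_bp) ** 2 + (mf_score / max_mf) ** 2)


def fun_sim_all(bp_score, mf_score, cc_score, max_bp=1.0, max_mf=1.0, max_cc=1.0):
    """Combine all three scores in the same assumed way."""
    total = (
        (bp_score / max_bp) ** 2
        + (mf_score / max_mf) ** 2
        + (cc_score / max_cc) ** 2
    )
    return total / 3.0


# Made-up input values, for illustration only.
print(fun_sim(0.5, 0.5))           # 0.25
print(fun_sim_all(1.0, 1.0, 1.0))  # 1.0
```

Under this assumed form, squaring means that a combined score is high only when the individual scores are high.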

Comparing a list of GO terms with a list of proteins / protein families

This option allows for defining a functional profile and finding similar proteins, protein families, or diseases. The
functional profile is defined by a list of GO terms, which are entered as a space- or tab-delimited list into
the text field. Then the ontology has to be selected: biological process, molecular function, or cellular component. It
is not possible to define a mixed profile of these three types.
There are several ways to specify the list of proteins or protein families to compare to. First, entering
an OMIM accession selects all proteins annotated in UniProtKB with this OMIM entry. Second, an arbitrary taxon may
be entered in the text field. The taxon must be given as its NCBI Taxonomy accession number. Third, a pre-defined
taxon can be selected from the drop-down list. Fourth, the functional profile can be compared to the whole database.
Depending on the selected GO term type, BPscores, MFscores, or CCscores are computed.
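
The single-ontology restriction can be checked before a profile is submitted. The following is a minimal sketch under stated assumptions: `build_profile`, `SCORE_BY_ONTOLOGY`, and the `term_ontology` lookup table are hypothetical names, and a real check would need an ontology lookup built from a GO release.

```python
# Which score each ontology yields, as described above.
SCORE_BY_ONTOLOGY = {
    "biological_process": "BPscore",
    "molecular_function": "MFscore",
    "cellular_component": "CCscore",
}


def build_profile(go_text, ontology, term_ontology):
    """Build a functional profile and reject mixed-ontology input.

    term_ontology is a hypothetical dict mapping a GO accession to the
    name of the ontology it belongs to.
    """
    if ontology not in SCORE_BY_ONTOLOGY:
        raise ValueError(f"unknown ontology: {ontology!r}")
    terms = go_text.split()  # space- or tab-delimited list
    if not terms:
        raise ValueError("the profile must contain at least one GO term")
    foreign = [t for t in terms if term_ontology.get(t) != ontology]
    if foreign:
        raise ValueError(f"terms not in {ontology}: {foreign}")
    return {"ontology": ontology, "terms": terms, "score": SCORE_BY_ONTOLOGY[ontology]}


# Tiny illustrative lookup table; a real one would cover the whole ontology.
TOY_LOOKUP = {
    "GO:0008150": "biological_process",
    "GO:0006915": "biological_process",
    "GO:0003674": "molecular_function",
}

profile = build_profile("GO:0008150\tGO:0006915", "biological_process", TOY_LOOKUP)
print(profile["score"])  # BPscore
```

Passing a molecular function term together with biological process terms raises an error instead of silently producing a mixed profile.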

Disease Candidate Prioritization

This option allows for prioritizing candidate disease proteins with respect to an OMIM entry of interest. The list of
candidates can be defined in several ways. First, a list of UniProt accessions may be entered into a text field. Second, a
file with accession numbers may be uploaded. Third, the input disease can be compared to all human proteins in the database.
Depending on the selected GO term type, BPscores, MFscores, or CCscores are computed.
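
Once scores are available, prioritization amounts to ordering the candidates by their similarity to the disease. The sketch below is illustrative only: the function names are hypothetical, the accessions and score values are made-up placeholders, and it assumes that a higher score means a better candidate.

```python
def read_accessions(text):
    """Split space- or tab-delimited accessions, dropping duplicates in order."""
    seen = []
    for token in text.split():
        if token not in seen:
            seen.append(token)
    return seen


def rank_candidates(candidates, scores):
    """Sort candidates by descending score; ties are broken by accession.

    Candidates without a score are skipped (an assumption).
    """
    scored = [(acc, scores[acc]) for acc in candidates if acc in scores]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Placeholder accessions and scores, not real query results.
candidates = read_accessions("CAND_A\tCAND_B CAND_C")
toy_scores = {"CAND_A": 0.31, "CAND_B": 0.87, "CAND_C": 0.87}

for accession, score in rank_candidates(candidates, toy_scores):
    print(accession, score)
```

The same ordering can be applied to whichever score type (BPscore, MFscore, or CCscore) was selected for the query.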