D3.1 (Month 6)

Domain architecture (DA) for a first set of proteins with reliable annotations (reliably annotated genes)

Goals

STOCKHOLM will have developed a protocol to obtain several different sets of Domain Architectures (DA) for a set of proteins, using either publicly available databases of domain assignments or domain detection algorithms applied in-house. These algorithms will be applied to a first set of proteins with reliable Gene Ontology (GO) annotations obtained from SwissProt, MIPS or other sources. The protocol, as well as the DA and GO annotations, will be available from the website.
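As an illustration only, the step from per-domain assignments to a domain architecture can be sketched as follows. The sketch assumes that a DA is simply the list of a protein's domains ordered by start position, and that assignments arrive as (protein, domain, start, end) records; the function name, record layout and toy identifiers are hypothetical and are not taken from the protocol itself.

```python
from collections import defaultdict


def domain_architectures(assignments):
    """Return {protein_id: tuple of domain ids ordered by start position}.

    `assignments` is an iterable of (protein_id, domain_id, start, end)
    tuples; this record layout is an assumption made for the example.
    """
    hits_by_protein = defaultdict(list)
    for protein_id, domain_id, start, end in assignments:
        hits_by_protein[protein_id].append((start, end, domain_id))
    # Sort each protein's hits by position, then keep only the domain labels.
    return {
        protein_id: tuple(domain_id for _, _, domain_id in sorted(hits))
        for protein_id, hits in hits_by_protein.items()
    }


# Hypothetical toy input, not real domain assignments.
toy_assignments = [
    ("protein1", "domY", 120, 200),
    ("protein1", "domX", 5, 110),
    ("protein2", "domX", 1, 95),
]
architectures = domain_architectures(toy_assignments)
```

Running the same function on assignments from different databases or detection methods would give the "several different sets" of DAs for the same proteins, which could then be compared side by side.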

SBC have also used the domain annotations to assign functions to eukaryotic and prokaryotic genes. This has been the basis for studies on the evolution of functional changes (Björklund, manuscript).

In the first reporting period BioAlma investigated the following scheme. Given a (multi-domain) protein family, the proteins in this group (set A) are composed of a number of domains, some of which they share with one set of proteins (set B) and others with another set (set C). If we now analyze the literature corresponding to these sets, we will find:

- things that are shared between all three of them (non-specific features)
- things that are only shared between sets A and B (potentially specific for the domains that sets A and B share)
- things that are only shared between sets A and C (potentially specific for the domains that sets A and C share)

This way it is possible to separate what corresponds to a set of proteins from what specifically refers to a domain.
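The three cases of this scheme map directly onto set operations. The following is a minimal sketch, assuming the literature for each protein set has already been reduced to a set of terms; how those terms are extracted is not described here, and the function name and toy terms are hypothetical.

```python
def partition_terms(terms_a, terms_b, terms_c):
    """Split literature terms according to which protein sets share them."""
    return {
        # Found for all three sets: non-specific features.
        "non_specific": terms_a & terms_b & terms_c,
        # Shared by A and B but absent from C: candidates for the A/B domains.
        "specific_ab": (terms_a & terms_b) - terms_c,
        # Shared by A and C but absent from B: candidates for the A/C domains.
        "specific_ac": (terms_a & terms_c) - terms_b,
    }


# Hypothetical toy terms, not real literature-mining output.
terms_a = {"binding", "kinase", "membrane"}
terms_b = {"binding", "kinase", "nucleus"}
terms_c = {"binding", "membrane", "transport"}
partition = partition_terms(terms_a, terms_b, terms_c)
```

With these toy sets, "binding" is shared by all three, "kinase" only by A and B, and "membrane" only by A and C.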

This deliverable has been finalized and used in a publication (Ekman et al., in press). A web server implementing the algorithm has been made publicly available at: http://sbcweb.pdc.kth.se/cgi-bin/diaek/domsearch.cgi

The Meta-DP server predicts the domain(s) of a query protein sequence. It uses ten different domain prediction methods plus a consensus prediction and is available at: http://meta-dp.bioinformatics.buffalo.edu/


Arne Elofsson
Last modified: Thu Mar 17 21:41:16 CET 2005