Coordinator | UNIVERSITY OF HAIFA
Organization address: Mount Carmel, Abba Khoushi Blvd. |
Coordinator nationality | Israel [IL] |
Total cost | €100,000 |
EC contribution | €100,000 |
Programme | FP7-PEOPLE
Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013) |
Call code | FP7-PEOPLE-2007-4-3-IRG |
Funding Scheme | MC-IRG |
Start year | 2008 |
Period (year-month-day) | 2008-05-01 - 2012-11-13 |
# | ||||
---|---|---|---|---|
1 | UNIVERSITY OF HAIFA, Mount Carmel, Abba Khoushi Blvd. | IL (HAIFA) | coordinator | 0.00 |
'Finding the structural neighbors of a protein is an essential task in computational structural biology. In cases where there is no detectable sequence similarity, the identification of structural neighbors offers a powerful approach to predicting structure and function. State-of-the-art protein structural searches find structural neighbors by comparing a protein to all proteins in their database. As this is an expensive computation that depends directly on the size of the database, such searches consider only a representative subset of the Protein Data Bank (PDB). The PDB is growing dramatically to include many structures of uncharacterized function solved by high-throughput methods developed by the Structural Genomics (SG) initiative. Characterizing these structures, and addressing questions raised by the improved coverage of structure space, mandates better structure search tools. I propose to develop a search tool for protein structure space that is analogous to web search tools such as Google. The system is designed to be fast and interactive, to cover the whole data set, and to have a clear and simple interface so that it can serve as a navigation interface to structure space. For this, I propose adapting a data structure called an inverted index, which is used for fast retrieval in web search. Protein structures are described as strings of letters based on a structural alphabet, and placed in an inverted index. Then, given a query structure and its string description, one can quickly retrieve a short list of candidate structural neighbors. Similar to web search, I suggest using query expansion to improve the retrieval performance. The proposal also includes an application of structural search for comparing the structural novelty of contributions from different SG centers. This project will enable me to establish a new Computational Biology research team at my host institution: it preeminently fulfills the goals of the Work Program.'
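The inverted-index idea in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: the choice of overlapping three-letter words as index terms, the function names, and the toy structure ids and strings are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def kmers(letters, k=3):
    """Return the set of overlapping k-letter words in a structural-alphabet string."""
    return {letters[i:i + k] for i in range(len(letters) - k + 1)}


def build_index(structures, k=3):
    """Map each k-letter word to the set of structure ids that contain it."""
    index = defaultdict(set)
    for sid, letters in structures.items():
        for word in kmers(letters, k):
            index[word].add(sid)
    return index


def search(index, query, k=3, top=5):
    """Rank structures by the number of distinct words they share with the query."""
    hits = Counter()
    for word in kmers(query, k):
        for sid in index.get(word, ()):
            hits[sid] += 1
    return hits.most_common(top)


# Toy data: ids and strings are invented for illustration only.
structures = {
    "protA": "AABBCCDD",
    "protB": "AABBCCEE",
    "protC": "FFGGHHII",
}
index = build_index(structures)
results = search(index, "AABBCC")  # protA and protB each share 4 words; protC shares none
```

Only the index entries for the query's own words are visited, so the cost of a lookup depends on how many structures share those words rather than on the size of the whole database.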
Repositories such as the Protein Data Bank store vast amounts of molecular data. An effective large-scale protein search system can sift through this information to find structural solutions for the biotech industry.
The EU-funded 'Searching protein structure space' (SPSS) project addressed this need by applying a 'filter-and-refine' paradigm to large datasets. A fast filter method first reduces the dataset to a short list of promising candidates; for retrieval of structurally similar proteins to be both fast and accurate, the filter method needs to be indexable. In the refine stage, structural alignment heuristics were then applied to the most promising candidates.
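The filter-and-refine paradigm can be sketched in a few lines of Python. Both scoring functions below are stand-ins chosen for illustration, not the project's methods: `cheap_distance` compares only the letters used by two strings, and `costly_similarity` uses `difflib.SequenceMatcher` in place of a real structural alignment. The ids and strings are invented.

```python
from difflib import SequenceMatcher


def cheap_distance(a, b):
    """Cheap filter stand-in: count letters used by one string but not the other."""
    return len(set(a) ^ set(b))


def costly_similarity(a, b):
    """Costly refine stand-in for a structural alignment score (1.0 = identical)."""
    return SequenceMatcher(None, a, b).ratio()


def filter_and_refine(query, database, keep=3):
    """Score the whole database cheaply, then rescore only the `keep` best survivors."""
    shortlist = sorted(database, key=lambda sid: cheap_distance(query, database[sid]))[:keep]
    rescored = [(sid, costly_similarity(query, database[sid])) for sid in shortlist]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy data: ids and strings are invented for illustration only.
database = {
    "protA": "AABBCCDD",
    "protB": "AABBCCEE",
    "protC": "FFGGHHII",
    "protD": "DDCCBBAA",
}
ranked = filter_and_refine("AABBCCDD", database, keep=3)
```

In this toy run the cheap filter cannot tell `protA` from the reversed `protD`, and it is the costly refine step that pushes `protD` to the bottom of the shortlist. The expensive comparison is run on three entries instead of four; the saving grows with database size because `keep` stays fixed.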
Project members designed and validated the novel FragBag filter method, in which proteins are represented in terms of their backbone fragments.
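A bag-of-fragments representation of this kind can be sketched as follows. The sketch assumes that every backbone fragment has already been assigned to the index of its closest fragment in a fixed library, a step that is omitted here. The library size of 4, the toy assignments, and the use of cosine similarity as the comparison measure are illustrative assumptions.

```python
import math


def bag_of_fragments(assignments, library_size):
    """Count how often each library fragment occurs; the vector length is fixed."""
    counts = [0] * library_size
    for idx in assignments:
        counts[idx] += 1
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


LIBRARY_SIZE = 4  # hypothetical; a real fragment library would be larger

# Toy fragment assignments along three invented backbones.
prot_a = bag_of_fragments([0, 0, 1, 2, 2, 2], LIBRARY_SIZE)  # counts [2, 1, 3, 0]
prot_b = bag_of_fragments([2, 0, 2, 1, 0, 2], LIBRARY_SIZE)  # same counts, different order
prot_c = bag_of_fragments([3, 3, 3, 1], LIBRARY_SIZE)        # counts [0, 1, 0, 3]

sim_ab = cosine_similarity(prot_a, prot_b)
sim_ac = cosine_similarity(prot_a, prot_c)
```

Because the vector only records counts, `prot_a` and `prot_b` are indistinguishable even though their fragments occur in a different order. That loss of ordering is the price of a fixed-size representation that is cheap to compare.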
Receiver operating characteristic (ROC) curve analysis validated the accuracy of the FragBag method. Comparisons with other filter methods such as SGM, STRUCTAL and PRIDE showed that FragBag performed better in terms of speed and accuracy. Moreover, because a structure is represented as a collection of sub-structures, the method can also be applied to structure prediction.
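As an illustration of how ROC analysis scores a filter method, the sketch below computes the area under the ROC curve (AUC) directly from its pairwise definition. The scores and labels are invented and are not results from the project; a label of 1 marks a true structural neighbor.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (neighbor, non-neighbor) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented filter scores for five database entries and their true labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [1, 1, 0, 1, 0]
auc = roc_auc(scores, labels)  # 5 of the 6 pairs are ranked correctly, so AUC = 5/6
```

An AUC of 1.0 means every true neighbor outscores every non-neighbor, while 0.5 corresponds to random ranking.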
Because each structure is represented as a fixed-size vector, proteins can be mapped into a 3D spatial representation. By mapping the proteins in this way, the SPSS scientists studied their spatial distribution as well as their structural and functional diversity, and revealed previously undescribed fundamental properties. A key finding was that functional diversity varies considerably across structure space.
SPSS members have delivered an innovative tool for studying and describing protein structures in spatial dimensions. In the future, researchers can use this tool to study protein evolution and structure-function relationships.