Project: Helmholtz Network for Bioinformatics (HNB)
Motivation
Apart from existing bioinformatics software, many new tools are being developed by prominent research groups at several locations in Germany. Traditionally, this software and these databases are accessed through independent web servers hosted by individual institutions. This situation is unattractive for biologists, biochemists and other potential users, and unacceptable for beginners. For example, to analyse a protein sequence one has to visit several sites, work out which programs the task requires, learn to use them with all their options and parameters, and be able to interpret the results. This is not trivial.
Purpose of the project
The aim of the Helmholtz Network for Bioinformatics (HNB) is to unify the German bioinformatics scene and to provide its wide spectrum of methods through a common, user-friendly interface. The HNB user benefits from the fact that, for a given task, the appropriate software acts in concert to compute high-quality results.
Contribution of MIPS technology to the project
As a member of the HNB, MIPS currently provides the following software packages to the project:
BioRS is a powerful and fast retrieval system offering state-of-the-art integration of heterogeneous databanks with in-house proprietary databases. BioRS enables researchers to work with hundreds of biological databases (e.g. sequence, structure, literature) through easy-to-use WWW interfaces.
-
The Database Proxy is a CORBA server that provides access to different databases through a generic interface. It is built on top of BioRS, a database indexing and retrieval system, so developers need not spend time writing a separate parser for each flat-file format. Whenever databases are added or updated at the MIPS BioRS installation, they become available through the Database Proxy immediately, without any change or addition of code. The server is intended as a data source for third-party applications; it is a developer-to-developer facility. Anybody who wants or needs to fetch data from biological databases can do so through the interfaces and methods defined in the server's CORBA IDL. In the HNB project, DBResolver of the BS model uses the Database Proxy for its database needs. The IDL file and additional developer documentation are available for download from the project's website.
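The idea of a single generic interface in front of many databases can be illustrated with a small Python sketch. It does not reproduce the server's actual CORBA IDL: the names Entry, DatabaseProxy, InMemoryProxy, list_databases, fetch and describe, as well as the demo database names and identifiers, are hypothetical and chosen only for illustration. An in-memory stand-in replaces the remote server so that the example runs without a CORBA ORB.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Entry:
    """Uniform record handed to clients, whatever the source database."""
    database: str
    entry_id: str
    fields: Dict[str, str]


class DatabaseProxy(Protocol):
    """Hypothetical client-side view of a generic database interface."""

    def list_databases(self) -> List[str]: ...

    def fetch(self, database: str, entry_id: str) -> Entry: ...


class InMemoryProxy:
    """Stand-in for the remote server (assumption: no real CORBA calls)."""

    def __init__(self, data: Dict[str, Dict[str, Dict[str, str]]]) -> None:
        self._data = data

    def list_databases(self) -> List[str]:
        return sorted(self._data)

    def fetch(self, database: str, entry_id: str) -> Entry:
        try:
            fields = self._data[database][entry_id]
        except KeyError:
            raise LookupError(f"{database}:{entry_id} not found") from None
        return Entry(database, entry_id, dict(fields))


def describe(proxy: DatabaseProxy, database: str, entry_id: str) -> str:
    """Client code depends only on the interface, not on a flat-file format."""
    entry = proxy.fetch(database, entry_id)
    description = entry.fields.get("description", "")
    return f"{entry.database}:{entry.entry_id} {description}".strip()


# Made-up demo content; a real installation would expose its own databases.
proxy = InMemoryProxy({
    "demo_sequences": {"SEQ0001": {"description": "hypothetical demo protein"}},
    "demo_literature": {"LIT0001": {"description": "hypothetical demo article"}},
})

print(proxy.list_databases())
print(describe(proxy, "demo_sequences", "SEQ0001"))
```

Because describe() only relies on the interface, adding a further database to the backing store requires no change in client code, which is the property the paragraph above attributes to the Database Proxy.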
-
miniPEDANT displays primary information such as a statistical evaluation of protein properties, the best BLAST hits, multiple alignments to homologous sequences, the presence of sequence domains, domain patterns, etc. In addition, secondary structures and transmembrane segments are predicted, and a 3D structure is assigned to the sequence wherever possible. Integration of specific in-house databases, such as the Yeast Functional Catalogue, expands the scope of miniPEDANT and makes it one of the most complete and easy-to-use sequence analysis tools available on the web.
To avoid time-consuming on-the-fly calculations, a new fast sequence similarity search algorithm is implemented in the CORBA-based miniPEDANT framework. MIPS databases (currently PEDANT; others will be added soon) are scanned for entries that are identical or highly similar to the query sequence, and the result is displayed immediately. The CPU time for an on-the-fly miniPEDANT analysis is also drastically reduced by replacing the typical BLAST search against large databases with our new algorithm. In most cases the results are identical to the best BLAST hits.
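The text does not describe how the fast search works internally. As a purely illustrative stand-in, the following Python sketch uses a common technique for finding identical or near-identical sequences without an alignment: a precomputed k-mer index. This is an assumption, not the miniPEDANT algorithm; the function names, the word length k=3, the 0.8 threshold and the toy sequences are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def kmers(sequence: str, k: int) -> Set[str]:
    """All distinct words of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_index(entries: Dict[str, str], k: int = 3) -> Dict[str, Set[str]]:
    """Map every k-mer to the IDs of the entries that contain it (built once)."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for entry_id, sequence in entries.items():
        for word in kmers(sequence, k):
            index[word].add(entry_id)
    return index


def find_similar(query: str, index: Dict[str, Set[str]], k: int = 3,
                 min_shared: float = 0.8) -> List[Tuple[str, float]]:
    """Return (entry_id, fraction of query k-mers found), best hit first."""
    words = kmers(query, k)
    if not words:
        return []
    counts: Dict[str, int] = defaultdict(int)
    for word in words:
        for entry_id in index.get(word, ()):
            counts[entry_id] += 1
    hits = [(entry_id, n / len(words)) for entry_id, n in counts.items()
            if n / len(words) >= min_shared]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Toy database with made-up sequences.
entries = {
    "e1": "MKTAYIAKQRQISFVK",
    "e2": "MKTAYIAKQRQISFVR",  # differs from e1 in the last residue
    "e3": "GGGGSGGGGSGGGGS",   # unrelated
}
index = build_index(entries)
hits = find_similar("MKTAYIAKQRQISFVK", index)
print(hits)
```

The index is built once, so each query costs only a handful of dictionary lookups rather than a scan of the whole database. Note that the score is the fraction of query words found in an entry, so it is not symmetric and is only a rough proxy for sequence identity.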
-
The Pedant Sequence Analysis Tool is a sophisticated and powerful bioinformatics suite for analysing DNA and protein sequences. Unlike single-gene sequence analysis tools, Pedant allows fully automated annotation of individual sequences as well as complete genomes. The Pedant system supports highly complex queries and delivers the results in clear, well-structured graphical interfaces.
-
MIPS Functional Catalogue
The automatic functional catalogue assignment is an automated method that assigns functional properties to proteins based on their sequence similarity to manually annotated proteins. Proteins from A. thaliana, yeast and T. acidophilum were annotated manually using the MIPS Functional Catalogue. A new protein automatically receives the same functional catalogue entry as a manually annotated protein if PSI-BLAST reports a direct sequence similarity between the two. This database contains the automatic functional assignments for a non-redundant protein database derived from Pir, Swissprot, Swissnew, Trcdsembl and Trcdsemblnew, containing approximately 630,000 proteins.
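The transfer rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MIPS implementation: the similarity hits are taken as given (in the real system they come from PSI-BLAST), the function name assign_categories and the category labels are hypothetical, and taking the union of categories over all matched proteins is an assumption, since the text does not say how several hits are combined.

```python
from typing import Dict, Iterable, Set


def assign_categories(hits: Dict[str, Iterable[str]],
                      manual: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Transfer catalogue entries from manually annotated proteins.

    hits   maps each new protein to the proteins it is similar to.
    manual maps each manually annotated protein to its catalogue entries.
    Proteins with no annotated hit receive no assignment.
    """
    assigned: Dict[str, Set[str]] = {}
    for protein, matched in hits.items():
        categories: Set[str] = set()
        for matched_id in matched:
            # Hits to proteins without manual annotation contribute nothing.
            categories |= manual.get(matched_id, set())
        if categories:
            assigned[protein] = categories
    return assigned


# Made-up identifiers and category labels, for illustration only.
manual = {"ann1": {"cat_A"}, "ann2": {"cat_B", "cat_C"}}
hits = {"new1": ["ann1", "ann2"], "new2": ["unannotated"], "new3": []}
result = assign_categories(hits, manual)
print(result)
```

Here only new1 receives an assignment, because new2 matches a protein without manual annotation and new3 has no hits at all.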
