Characterization of novel proteins based on known protein structures

W A Koppensteiner, P Lackner, M Wiederstein, M J Sippl

Research output: Contribution to journal, article


The genome sciences face the challenge of characterizing the structure and function of a vast number of novel genes. Sequence search techniques are used to infer functional and structural information from similarities to experimentally characterized genes or proteins. The persistent goal is to refine these techniques and to develop alternative and complementary methods to increase the range of reliable inference. Here, we focus on the structural and functional assignments that can be inferred from the known three-dimensional structures of proteins. The study uses all structures in the Protein Data Bank that were known by the end of 1997. The protein structures released in 1998 were then characterized in terms of functional and structural similarity to the previously known structures, yielding an estimate of the maximum amount of information on novel protein sequences that can be obtained from inference techniques. The 147 globular proteins corresponding to 196 domains released in 1998 have no clear sequence similarity to previously known structures. However, 75% of the domains have extensive structural similarity to previously known folds and, most importantly, in two out of three cases similarity in structure coincides with related function. In view of this analysis, full utilization of existing structure databases would provide information for many new targets even if the relationship is not accessible from sequence information alone. Currently, the most sophisticated techniques detect on the order of one-third of these relationships.

Original language: English
Pages (from-to): 1139-1152
Number of pages: 14
Journal: Journal of Molecular Biology
Issue number: 4
Publication status: Published - 3 Mar 2000

Bibliographical note

Copyright 2000 Academic Press.

Fields of Science and Technology Classification 2012

  • 106 Biology


Keywords

  • Amino Acid Sequence
  • Bacterial Proteins/chemistry
  • Carrier Proteins/chemistry
  • Desulfovibrio vulgaris
  • Flavoproteins
  • Models, Chemical
  • Models, Molecular
  • Molecular Sequence Data
  • Protein Conformation
  • Sequence Homology, Amino Acid
