Computational structure biology
Protein structure modeling
The main interest of our group is the development of methods and algorithms for molecular modeling and simulation of three-dimensional protein structures and their interactions. A major limitation of structure-based methods in biomedical research is the limited availability of experimentally determined protein structures. Predicting the 3D structure of a protein from its amino acid sequence remains a fundamental scientific problem and is considered one of the grand challenges in computational biology. Comparative (homology) modeling, which uses experimentally determined structures of related protein family members as templates, is currently the most accurate and reliable approach to modeling the structure of a protein of interest. Template-based modeling techniques exploit the evolutionary relationship between a target protein and templates with known experimental structures, based on the observation that evolutionarily related sequences generally have similar 3D structures.
The SWISS-MODEL expert system developed by our group is a fully automated web-based workbench that greatly facilitates the computation of protein structure homology models. With more than 300’000 registered users and about two models calculated per minute, SWISS-MODEL is one of the most widely used modeling servers worldwide.
Model quality estimation
Ultimately, the quality of a model determines its usefulness for biomedical applications such as planning mutagenesis experiments for functional analyses or studying protein-ligand interactions, e.g. in structure-based drug design. Estimating the expected quality of a predicted structural model is therefore crucial in structure prediction. Individual models may contain considerable errors, especially when the sequence identity between target and template is low. To identify such inaccuracies, scoring functions have been developed that analyze different structural features of a protein model in order to derive a quality estimate. To this end, we have introduced the composite scoring function QMEAN, which consists of four statistical potential terms and two components describing the agreement between predicted and observed secondary structure and solvent accessibility. QMEANBrane extends the approach to membrane protein structures, which play crucial roles in many biological processes and are important drug targets. Recently, we have developed an approach for dynamically combining the knowledge-based statistical potentials of QMEAN with distance constraints derived from homologous template structures (QMEANDisCo). This method significantly increases the accuracy of the local per-residue quality estimates at a relatively small computational cost, as demonstrated by the results of the community-wide CAMEO and CASP XIII experiments.
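As a rough illustration of what a composite scoring function looks like, the sketch below combines four statistical potential terms and two agreement terms into a single score by a weighted sum. It is a minimal sketch only: the class name, the function name, the weighted-sum form as written here, and all weights are assumptions made for illustration and are not the published QMEAN parameters or implementation.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ModelTerms:
    """Six per-model terms (hypothetical container, for illustration only)."""

    # Four statistical potential terms, in an arbitrary but fixed order.
    potentials: Tuple[float, float, float, float]
    # Agreement between predicted and observed secondary structure.
    ss_agreement: float
    # Agreement between predicted and observed solvent accessibility.
    acc_agreement: float


def composite_score(terms: ModelTerms, weights: Sequence[float], bias: float = 0.0) -> float:
    """Weighted sum of the six terms plus a constant offset.

    The weights are placeholders; in practice they would be fitted
    against models of known quality.
    """
    values = (*terms.potentials, terms.ss_agreement, terms.acc_agreement)
    if len(weights) != len(values):
        raise ValueError("expected one weight per term (six in total)")
    return bias + sum(w * v for w, v in zip(weights, values))
```

A call such as `composite_score(ModelTerms((1.0, 2.0, 3.0, 4.0), 0.5, 0.5), (1, 1, 1, 1, 2, 2))` adds the four potential terms and the two doubly weighted agreement terms into one number per model.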
CASP and CAMEO: Critical assessment of structure prediction methods
Methods for structure modeling and prediction have made substantial progress over the last decades, but still fall short of the accuracy of high-resolution experimental structures. Assessing the quality of blind predictions against experimental reference structures allows benchmarking the state of the art in structure prediction and identifying areas that need further development. The Critical Assessment of Structure Prediction (CASP) experiment has assessed progress in the field of protein structure modeling for the last 25 years. To this end, about 100 blind prediction targets are carefully evaluated by human experts every two years. The “Continuous Automated Model EvaluatiOn” (CAMEO) project aims to provide a fully automated blind assessment for prediction servers based on the weekly pre-released sequences of the Protein Data Bank (PDB). To operate continuously without human intervention, CAMEO requires novel scoring methods such as lDDT that are robust against domain movements. CAMEO currently assesses predictions of three-dimensional structures, residue-residue contact predictions, and model quality estimation.
Comparison of a predicted two-domain protein structure model (colored according to lDDT score) with its reference structure (shown in gray). (A) shows the full-length model with the first domain superposed onto the target. For graphical illustration, (B) shows the two domains of the prediction separated and superposed individually onto the target structure. (Bioinformatics, 2013, 29:2722-2728).
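The robustness of lDDT against domain movements follows from the fact that it compares local inter-atomic distances instead of relying on a global superposition. The sketch below is a simplified, Cα-only version written for illustration: it assumes one coordinate per residue, a 15 Å inclusion radius, and tolerance thresholds of 0.5, 1, 2 and 4 Å, and it omits the stereochemistry checks and the handling of missing or symmetry-equivalent atoms that a full implementation needs. The function name `lddt_ca` is hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def lddt_ca(
    model: Sequence[Point],
    reference: Sequence[Point],
    inclusion_radius: float = 15.0,
    thresholds: Sequence[float] = (0.5, 1.0, 2.0, 4.0),
) -> float:
    """Simplified superposition-free score on one atom per residue.

    Only residue pairs closer than `inclusion_radius` in the reference
    are considered. A pair counts as preserved for a given threshold if
    its distance in the model deviates by less than that threshold.
    The result is the preserved fraction averaged over all thresholds.
    """
    if len(model) != len(reference):
        raise ValueError("model and reference must have the same number of residues")

    preserved = 0
    total = 0
    for i, j in combinations(range(len(reference)), 2):
        d_ref = math.dist(reference[i], reference[j])
        if d_ref >= inclusion_radius:
            continue  # long-range pair, e.g. across two distant domains
        diff = abs(d_ref - math.dist(model[i], model[j]))
        total += len(thresholds)
        preserved += sum(diff < t for t in thresholds)

    return preserved / total if total else 0.0


# Toy two-domain example: the second domain is displaced as a rigid body.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
             (100.0, 0.0, 0.0), (103.8, 0.0, 0.0), (107.6, 0.0, 0.0)]
model = reference[:3] + [(x, y + 50.0, z) for x, y, z in reference[3:]]
score = lddt_ca(model, reference)
```

In this toy case every inter-domain pair lies outside the inclusion radius in the reference, so the rigid displacement of the second domain does not lower the score, whereas a score based on a single global superposition would penalize it.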
Variant interpretation: From protein engineering to drug resistance
Understanding the phenotypic effect of genetic variants is one of the major challenges in molecular biology. Three-dimensional models of proteins are valuable tools for interpreting amino acid variations, whether in protein engineering, in vitro evolution experiments, the analysis of disease-related mutations, or the study of variations leading to drug resistance.
For example, studying the zinc-selective inhibition of the promiscuous bacterial amide hydrolase DapE highlights the implications of metal heterogeneity for the evolution of this novel antibiotic drug target and for drug design against it (Marc Creus). Engineering of the enzyme N-oligosaccharyltransferase PglB made it possible to improve the efficiency of in vivo synthesis (in E. coli) of novel, well-characterized immunogenic polysaccharide/protein complexes for use in vaccines (collaboration with GlycoVaxyn AG, Schlieren, and EMPA).
Drug resistance in the bacterial pathogen Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), is a threat to human health. Experimental analysis of all drug resistance variants identified at the genetic level is typically not feasible in a reasonable time because the bacterium grows slowly and is difficult to handle in the laboratory. A computational approach for assessing the resistance potential of newly discovered sequence variants would address this problem. In collaboration with the TB Research Unit (PI: Sébastien Gagneux) at the Swiss Tropical and Public Health Institute, we are collecting and analyzing resistance-associated variants in their structural context. Our hypothesis is that the mechanisms of resistance in TB can be characterized using protein structure information, and that this information can then be used to predict the drug resistance potential of new variants and to generate hypotheses about the molecular mechanism of resistance.