We are interested in:

Protein Structure Prediction
Protein Design
    Protein design refers to the effort to design new protein molecules with a desired 3D structure and function. It is the reverse of protein structure prediction, so solving it relies heavily on how well we understand the principles of protein folding (Figure 2).

    Figure 2. Protein design is a reverse procedure of protein structure prediction.

    We have successfully designed a number of new protein sequences using a physics-based atomic force field, with the lowest free-energy state searched by Monte Carlo simulation followed by sequence-based clustering. In 62% of cases, I-TASSER folds the designed sequence with an RMSD below 2 Angstroms, even though the I-TASSER force field differs significantly from the one used in the design. Figure 3 shows three representative examples of the target protein structure and the I-TASSER model of the designed sequence.

    Figure 3. I-TASSER models of designed sequences (red) versus crystal structures of target proteins (green)
    for the calcium-binding domain of Calx (3E9TA), odorant binding protein (2ERBA), and peptidyl-tRNA
    hydrolase (1WN2A). The sequence identities between the designed and target sequences are all below 30%.
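
    The Monte Carlo search over sequences mentioned above can be illustrated with a minimal Metropolis sketch. This is not the force field or protocol used in the work described here: the energy function `toy_energy`, the burial pattern, and all parameter values are hypothetical stand-ins chosen only to show the accept/reject logic.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")  # assumed grouping, for illustration only


def toy_energy(sequence, buried):
    """Hypothetical stand-in for a force field (lower is better).

    Rewards hydrophobic residues at buried positions and polar
    residues at exposed positions. It is not a physical energy.
    """
    energy = 0.0
    for residue, is_buried in zip(sequence, buried):
        matches = (residue in HYDROPHOBIC) == is_buried
        energy += -1.0 if matches else 1.0
    return energy


def metropolis_design(buried, steps=5000, temperature=0.5, seed=0):
    """Search sequence space with single-residue Metropolis moves."""
    rng = random.Random(seed)
    sequence = [rng.choice(AMINO_ACIDS) for _ in buried]
    energy = toy_energy(sequence, buried)
    best_seq, best_energy = list(sequence), energy

    for _ in range(steps):
        # Propose a random point mutation.
        pos = rng.randrange(len(sequence))
        old_residue = sequence[pos]
        sequence[pos] = rng.choice(AMINO_ACIDS)
        new_energy = toy_energy(sequence, buried)
        delta = new_energy - energy

        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_seq, best_energy = list(sequence), energy
        else:
            sequence[pos] = old_residue  # reject: restore the old residue

    return "".join(best_seq), best_energy


if __name__ == "__main__":
    burial_pattern = [True, False] * 10  # hypothetical 20-residue target
    designed, score = metropolis_design(burial_pattern)
    print(designed, score)
```

    In a real design run the toy energy would be replaced by an atomic-level force field evaluated on the target backbone, and the low-energy sequences collected along the trajectory would then be clustered.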

    More recently, we proposed a new protocol, EvoDesign, which uses evolutionary profiles to guide the folding refinement of new designs, with biological functions introduced through protein-interface binding profiles and interactions. The protocol has been used to design functional XIAP (X-linked Inhibitor of Apoptosis Protein) BIR3 domains that bind Smac peptides but do not inhibit caspase-9 proteolytic activity in vitro, demonstrating the potential to alter apoptosis pathways through computational protein design (Figure 4).

    Figure 4. Sequence and structure of two XIAPs designed by EvoDesign, which bind
    Smac peptides but do not inhibit caspase-9 proteolytic activity in vitro.

Protein Function Prediction
    Given the amino acid sequence, can we tell what the protein molecule does in living cells? We have developed COFACTOR for protein function prediction, based on the sequence-to-structure-to-function paradigm. From the amino acid sequence, 3D structures are first constructed by I-TASSER. The functional insights (including enzyme classification, gene ontology, and ligand binding specificity) are then deduced by the local and global comparison of the structural models with proteins of known functions (Figure 5).

    Figure 5. Protein function annotation based on the sequence-to-structure-to-function paradigm. The right
    panel shows the functional homologs identified by global (a) and local (b) matches of I-TASSER models.

    COFACTOR was tested in the community-wide CASP9 experiment as "I-TASSER_FUNCTION" in the Server section and as "ZHANG" in the Human section. The two groups ranked in the first two positions by both Z-score and the Matthews correlation coefficient (MCC) when compared with the experimental data (Figure 6).

    Figure 6. Mean MCC Z-scores of the best ten groups in the Function Prediction in CASP9.
    (The picture was taken from the presentation by the CASP9 assessor Dr. T Schwede).
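
    For readers unfamiliar with the two metrics, the sketch below computes the MCC from a binary confusion matrix and converts a list of per-group scores into Z-scores. The function names and the example numbers are illustrative assumptions, not the CASP9 assessment code or data.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def z_scores(values):
    """Z-score of each value relative to the population mean and std."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    return [(v - mean) / std if std > 0 else 0.0 for v in values]


if __name__ == "__main__":
    # Hypothetical counts for one target, e.g. residues labeled
    # binding / non-binding by a predictor versus experiment.
    print(matthews_corrcoef(tp=8, tn=85, fp=4, fn=3))

    # Hypothetical MCC values for several groups on the same target.
    print(z_scores([0.62, 0.55, 0.40, 0.38, 0.21]))
```

    The MCC ranges from -1 (total disagreement) to 1 (perfect prediction) and remains informative when the two classes are very unbalanced.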

SNP Mutation and Genetic Disease
    Mutation and evolution in the human genome occur mainly through single nucleotide polymorphisms (SNPs), i.e., replacements of a single nucleotide in the DNA sequence. Although many SNPs have no effect on human health, some can result in abnormal folding and function of proteins and in serious human diseases. Studies have shown that more than 6,000 human diseases are due to SNP mutations, and nearly all human cancers are caused by gene mutations, some inherited and some occurring during cell division (Figure 7).

    Figure 7. Many human diseases are caused by single nucleotide polymorphisms (SNPs).

    We have recently studied the impact of SNP mutations on protein folding stability and found that the SNP-induced free-energy changes (i.e., ddG, Figure 8A), calculated from protein structure prediction, are closely correlated with experimental measurements. This demonstrates the feasibility of using low-resolution structure prediction to examine the effect of gene mutations (Figure 8B). In another study, we investigated the impact of SNP mutations on the stability of protein-protein interactions (PPI). We found that interface structural profiles, collected from homologous PPI interfaces, can be used to accurately calibrate the changes in protein-protein binding affinity caused by SNP mutations (Figure 8C).

    Figure 8. Modeling the impact of SNP mutations on protein folding and protein-protein interactions. (A) Definition of
    stability change upon mutation in a two-state model. (B) Impact of protein structure prediction on stability change
    calculations. (C) Binding free-energy changes calculated by interface profile versus mutagenesis experimental data.
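
    As a worked illustration of the quantities in Figure 8, the sketch below computes ddG in a two-state model and the Pearson correlation between predicted and measured values. The sign convention (ddG = dG of the mutant minus dG of the wild type, with dG the folding free energy) is an assumption, since conventions differ between studies, and the numbers are invented for the example.

```python
import math


def ddg(dg_wild_type, dg_mutant):
    """Stability change upon mutation in a two-state model.

    Assumed convention: dG is the folding free energy (kcal/mol),
    so a positive ddG means the mutation destabilizes the fold.
    """
    return dg_mutant - dg_wild_type


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical folding free energies (kcal/mol).
    print(ddg(dg_wild_type=-5.0, dg_mutant=-3.5))

    # Hypothetical predicted versus measured ddG for five mutations.
    predicted = [1.2, 0.3, -0.4, 2.1, 0.8]
    measured = [1.5, 0.1, -0.2, 1.8, 1.1]
    print(pearson(predicted, measured))
```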

    We are now working on the use of protein structure modeling techniques to predict which mutations would and which would not cause human diseases (in particular cancer). One goal is to deduce the quantitative relation between SNP mutations and specific human diseases, which should significantly enhance the impact of protein structure prediction on human disease diagnosis and treatment studies.

Modeling of Protein-Protein Interactions
G Protein-Coupled Receptor and Ligand-Receptor Interactions
Ligand Screening and Structure-Based Drug Design
    In terms of the lock-and-key metaphor, drug design is essentially a procedure to find an appropriate compound molecule (the key) which can match well with the active site pocket of the target protein (the lock). Therefore, an important step of structure-based rational drug design is to use the experimental or predicted 3D structure of the target protein to screen compound databases with the purpose of identifying appropriate drugs which can inhibit or activate the protein (Figure 13).

    Figure 13. A successful example of structure-based drug design by Bugg et al. in the 1990s: designing a molecule
    that inhibits the enzyme purine nucleoside phosphorylase (PNP). PNP normally takes up individual nucleosides (a)
    and cleaves the purine from the sugar, giving rise to a free purine base and a phosphorylated sugar (b).
    A tightly fitting compound blocks the binding pocket and therefore inhibits the activity of the PNP enzyme (c).

    We recently developed a composite approach for druglike compound identification, which combines structure-based virtual screening with quantitative structure-activity relationship (QSAR) modeling. When applying the approach to the epidermal growth factor receptor (EGFR), an important target protein associated with brain, lung, bladder and colon tumors, we found that two compounds (2 and 21) have significant EGFR-inhibitory activities (Figure 14). The experimental assay to test the ability of the compounds to inhibit the receptor proteins is in progress.

    Figure 14. Binding structure of two compounds screened from the ZINC library which have inhibitory
    activity on the epidermal growth factor receptor (EGFR), an important tumor target protein.

yangzhanglabumich.edu | (734) 647-1549 | 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218