Protein Design


We developed a protein design algorithm that selects the final designed sequences from clusters of low-free-energy decoy sequences.


Test set
The test proteins are listed below. For each one, the designed sequences and their I-TASSER predicted 3D models can be browsed.

1GUTA	2CMPA	3G36A	3FILA	1OAIA	2VPBA	2V1QA	1KQ1A	2P5KA	1TUKA	2O9SA	1UTGA	1V5IB
2B97A	2QCPX	2CVIA	3G21A	2J8BA	2D3DA	3FEAA	2ZXYA	2GPIA	2FTRA	1IUJA	1MG4A	2PV2A
1VQSA	3IV4A	3CTGA	1NZ0A	3E9TA	1O7IA	3H7HA	1WN2A	2F01A	1DBWA	2ERBA	1EAQA	1OH0A
1VZIA	2VZCA	1ZHVA	1JF8A	3EBTA	2PR7A	1QHQA	2O1QA	2WLVA	2ANXA	3FH2A	2V0UA	3EF8A	

Sequence clustering program
CLUSEQ is a program to cluster a set of amino acid sequences based on their BLOSUM62 similarity.
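How CLUSEQ scores a pair of sequences is not spelled out on this page. One plausible reading, given that it works on sequences of equal length, is an ungapped, position-by-position sum of BLOSUM62 substitution scores. The Python sketch below illustrates that reading only. The function names are hypothetical, and the matrix is a four-residue excerpt (A, I, L, V) rather than the full table that CLUSEQ reads from "blosum62.txt".

```python
# Excerpt of BLOSUM62 covering only A, I, L and V (an illustration, not the
# full 20 x 20 matrix). Each unordered residue pair is stored once.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4, ("I", "I"): 4, ("L", "L"): 4, ("V", "V"): 4,
    ("A", "I"): -1, ("A", "L"): -1, ("A", "V"): 0,
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 1,
}


def pair_score(a, b):
    """Return the substitution score for residues a and b (symmetric lookup)."""
    if (a, b) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(a, b)]
    return BLOSUM62_EXCERPT[(b, a)]


def similarity(seq1, seq2):
    """Sum the per-position scores of two equal-length sequences (no gaps).

    Assumption: this ungapped sum is only one possible definition of
    "BLOSUM62 similarity"; CLUSEQ's own scoring may differ.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    print(similarity("AILV", "AILV"))  # identical sequences
    print(similarity("AILV", "VLIA"))  # same residues, reversed order
```

Under this definition a sequence always scores highest against itself, so a clustering step could group sequences whose pairwise score exceeds some threshold. Any such threshold is an assumption here, not a documented CLUSEQ parameter.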
  INSTALLATION (Linux)
  1. cd to the directory where you want to install the program.
  2. Download and unpack "cluseq.tar.gz" into the installation directory.
  3. In source file "cseqlongset/Amino1CharSeq.cpp" (line 6), change the path to input file "blosum62.txt"
     so that it points to the location you choose for "blosum62.txt" in your file system.
  4. Run the "build" script from the installation directory.
  5. You are ready to run CLUSEQ.
 
  USAGE
  cluseq <path_to_the_input_file>
  - The input file contains the set of amino acid sequences to cluster. 
    The sequences are all of the same length and occupy consecutive lines in the file. 
    Each sequence lies on a single line and is immediately followed by a line terminator.
  - The output is printed to the screen. First it lists the cluster centers (tags), then it lists
    the entire clusters. The clusters are expressed in terms of sequence indices in the input file. 
    Both the sequence indices and the cluster indices start at 0. Cluster 0 is the largest cluster. 
    The first index of each cluster represents the tag of the cluster.                               
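The input and output conventions above can be sketched in Python as follows. This is a minimal illustration, not part of CLUSEQ: the helper names `read_sequences` and `describe_clusters` are hypothetical, and because the exact text layout of the output is not shown here, the clusters are assumed to be already parsed into lists of 0-based sequence indices, largest cluster first, with the tag as the first index of each list.

```python
def read_sequences(text):
    """Split input-file content into sequences and check the stated rules:
    one sequence per line, consecutive lines, equal lengths, and a line
    terminator after every sequence (including the last one)."""
    sequences = text.splitlines()
    if not sequences:
        raise ValueError("input contains no sequences")
    if not text.endswith(("\n", "\r")):
        raise ValueError("last sequence is not followed by a line terminator")
    length = len(sequences[0])
    for index, seq in enumerate(sequences):
        if not seq:
            raise ValueError(f"line {index} is empty")
        if len(seq) != length:
            raise ValueError(f"sequence {index} has length {len(seq)}, expected {length}")
    return sequences


def describe_clusters(sequences, clusters):
    """Map cluster members from sequence indices back to the sequences.

    `clusters` is assumed to be a list of index lists; the first index in
    each list is taken as the cluster tag (center).
    """
    report = []
    for cluster_index, members in enumerate(clusters):
        report.append({
            "cluster": cluster_index,
            "tag": sequences[members[0]],
            "members": [sequences[i] for i in members],
        })
    return report


if __name__ == "__main__":
    seqs = read_sequences("AILV\nAILA\nVVVV\n")
    # Hypothetical clustering result: cluster 0 is the largest, tag listed first.
    for entry in describe_clusters(seqs, [[1, 0], [2]]):
        print(entry["cluster"], entry["tag"], entry["members"])
```

Checking the input up front is useful because the program expects sequences of identical length, and resolving indices back to sequences makes the printed clusters easier to inspect.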


Reference:
A. Bazzoli, A. Tettamanzi, Y. Zhang (2011). Computational protein design and large-scale assessment by I-TASSER structure assembly simulations. Journal of Molecular Biology 407, 764–776.
