Zhang Lab Service System Discussion Board - New forum at https://zhanglab.dcmb.med.umich.edu/forum/

This forum is intended for questions and discussion regarding the service systems developed in the Zhang Lab.

ResPRE-training set

Hi,
On the ResPRE website, you provide the list of targets in the training set. I'm wondering where I can download the true contact maps for those targets.
Thanks

BSP-SLIM

Hi Everyone

I ran molecular docking with the BSP-SLIM server on my protein, which was predicted with I-TASSER COACH, and I uploaded my ligand in SDF format. The docking score is "nan", but when I analyze the result, there is an interaction between the ligand and the protein. What is the problem? I have another docking run with a docking score of 4.704.

BioLip: Why is the ligand_id for 5JER III?

Can anybody explain why the ligand_id for 5JER is 'III'? It appears that 'III' is used by the PDB to represent a different ligand (see https://www.rcsb.org/ligand/III). The actual ligand in 5JER is 'Rotavirus NSP1 peptide', which contains 19 amino acids.

I tried to look up information for 'III' but the following URL returns a '500 internal server error':
https://zhanglab.ccmb.med.umich.edu/BioLiP/qsearch.cgi?lig3=III

Thanks.

Server error for BioLip

It appears that the following URL returns '500 internal server error'. Could someone have a look?
https://zhanglab.ccmb.med.umich.edu/BioLiP/qsearch.cgi?lig3=III

Weird MM-align behaviour with two monomers

Hi,

I am running into an issue with MM-align. I have two monomers, saved in PDB format (attached), that I want to superpose. For a reason I don't understand, MM-align tries to align two chains of model01.pdb. I can see that the sequence is duplicated in upper case, with a space in between. However, model01 has only one chain, as it is a monomer.

Steps to reproduce:

./MMalign model01.pdb model02.pdb

Generates the following output:

**************************************************************************
* MM-align *
* Aligning protein complex structure by Dynamic Programming *
* Comments on the program, please email to: zhng@umich.edu *
**************************************************************************

Protein 1:model01.pd Size= 261
Protein 2:model02.pd Size= 258 (TM-score is normalized by 258)

Aligned length= 257, RMSD= 1.57, TM-score=0.94660, ID=0.996

-------- rotation matrix to rotate Chain-1 to Chain-2 ------
i t(i) u(i,1) u(i,2) u(i,3)
1 32.9895352759 -0.1835116774 0.2813259282 -0.9419018985
2 25.2317022225 -0.1682269668 0.9350513192 0.3120556330
3 12.4437071360 0.9685159534 0.2157191520 -0.1242662284

(":" denotes the residue pairs of distance < 5.0 Angstrom)
ppyslfeawakpvqpfaiwpgvwyvgtenlssvllttpqghilidagldasapqirrniealgfrmadiryiansharldqaggiarlkawsgarviashanaeqmarggkedfalgdalpfppvtvdmeaqdgqqwhlggvtlaaiftpghlpgatswkvtladgktliyadslatpgyplinnrnyptlvedirrsfarleaqqvdiflankgerfglmdkmarkargennafidkaglaryvaqsraafekqlaaqra- PPYSLFEAWAKPVQPFAIWPGVWYVGTENLSSVLLTTPQGHILIDAGLDASAPQIRRNIEALGFRMADIRYIANSHARLDQAGGIARLKAWSGARVIASHANAEQMARGGKEDFALGDALPFPPVTVDMEAQDGQQWHLGGVTLAAIFTPGHLPGATSWKVTLADGKTLIYADSLATPGYPLINNRNYPTLVEDIRRSFARLEAQQVDIFLANKGERFGLMDKMARKARGENNAFIDKAGLARYVAQSRAAFEKQLAAQRA
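
One way to check what an aligner could be seeing in model01.pdb is to count chains directly from the file. The sketch below is not part of MM-align. It assumes standard fixed-column PDB format, where column 22 of an ATOM record holds the chain ID and TER records mark chain ends. The function name `summarize_chains` is hypothetical.

```python
from collections import OrderedDict


def summarize_chains(pdb_lines):
    """Count CA atoms per chain ID and the number of TER records.

    Only the first model is read: scanning stops at the first ENDMDL.
    """
    ca_per_chain = OrderedDict()
    ter_count = 0
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "TER":
            ter_count += 1
        elif record == "ATOM" and line[12:16].strip() == "CA":
            chain_id = line[21:22]  # column 22 (1-based) is the chain ID
            ca_per_chain[chain_id] = ca_per_chain.get(chain_id, 0) + 1
        elif record == "ENDMDL":
            break
    return ca_per_chain, ter_count
```

Calling it as `summarize_chains(open("model01.pdb"))` returns the CA count per chain ID and the TER count. If either differs from what a monomer should have (one chain ID, at most one TER), the input file may be worth inspecting before the aligner itself.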

Verifying sequence-based domain prediction with I-TASSER

I work mostly on DNA genomic sequences, and have limited experience with and knowledge of protein structure and homology modeling. The purpose of this message to the forum is to request feedback and advice regarding my proposed use of I-TASSER as a solution to my research problem - from both practical and theoretical points of view. Thank you!

A. In one "test" species, I have ~800 domains predicted by HMMER3 against a profile HMM from Pfam.
But across ~50 species, the total number of predicted domain sequences is ~17K (lengths range from ~17 aa to 103 aa, with the majority in the 35-45 aa range).
This is my full dataset.

B. Predicted domain sequences show large variations in score, length and sequence composition
- consequently, my domain-based phylogenetic tree has too many branches with zero bootstrap support.
I need some independent way to verify sequence-based predictions.

C. In the RCSB PDB database, there are 7 PDB entries that contain my domain of interest. I trimmed these PDB files down to domain-only coordinates. Then, using UCSF Chimera, I could see that even though pairwise sequence identity is as low as 17%, the 3D overlap seems good.

D. Therefore, I want to use protein structure to segregate my sequence-based predictions into 4 categories: full-length, partial-length, degraded and false positives, or something along those lines.

From the I-TASSER tutorial (https://www.youtube.com/watch?v=quF4dqLGKFM), my understanding is that the longest / most computationally intensive step is structure assembly via Monte Carlo simulation.

With that as background info, these are my questions for your feedback, please:

1. Since my ~ 17K sequences are domain-only sequences, rather than full-length protein sequences, will it make homology modeling quicker and more practically feasible than multi-day runs for each query, i.e. domain sequence?

Different results for Local and Online when using ANGLOR

Excuse me, I input the same sequences into ANGLOR on my local PC and on the online web page, and I get different results. I would like to know whether the versions are different or whether something is wrong. Also, when I run the script locally, there is sometimes a warning:

"[blastpgp] WARNING: posPurgeMatches: Due to purging near identical sequences, only the query is used to construct the position-specific score matrix".

How can I fix this, or can I just ignore it?

Thank you!

InterProScan XML

Hi everyone,

I used the COFACTOR, COACH and I-TASSER tools. I need to output the results for my predicted protein in InterProScan XML format.

ThreaDomEx is stalling on results display

Hi:

I can't get my results from ThreaDomEx runs that have completed. The output page stalls at 13%.

thank you!

- Amy

Differences in results between standalone and server

Hi,

I have a sequence of about 150 residues that I've run on a local installation of I-TASSER 5.1 and also on the online server. I do not get the same model results between local and server. When I run TMalign to quantify differences between model1.pdb for the two runs, I get a TM-score of 0.89785. While this still indicates strong similarity, my initial expectation was that the two runs would have identical results.

My downloaded libraries are a few weeks old - could not having the most up-to-date, matching libraries result in differences between local vs server?

Thanks in advance for your time!!!

-Hollister

FG-MD energy minimization parameters

Dear all,

I am using your very useful FG-MD tool to refine some structures obtained by a homology modeling approach.

However, to interpret the results correctly, I need some details on the parameters used in the simulations (force field, number of steps, time intervals, etc.), and I noticed that these are not provided by the web interface.

The simulations were performed on the models identified by the codes S9829, S9849, S9867, S9882, S9907, S9921 and S10012.
Is there a way to access these parameters?

I look forward to hearing from you.

Yours faithfully,

Alberto Toffano

Combining PDB 2 COACH ligands

I have a model I created in I-TASSER of a cysteine-rich zinc finger (C1) type domain. I know these domains contain 2 zincs. I ran the model through the COACH server and it found both sites. However, I do not know how to download a PDB file containing both zincs or, if that is not possible, how to combine them later.
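
A minimal sketch of combining the two sites by hand. It assumes each file downloaded from COACH contains the same protein coordinates in the same frame plus one ligand written as HETATM records; this is an assumption about the file layout, not something confirmed by the server documentation. The function name `merge_ligands` is hypothetical.

```python
def merge_ligands(first_pdb_lines, second_pdb_lines):
    """Return the first structure plus the HETATM records of the second.

    Atom serial numbers are not renumbered, so duplicates are possible;
    most viewers tolerate this, but stricter tools may not.
    """
    # Keep everything from the first file except its final END line.
    merged = [ln.rstrip("\n") for ln in first_pdb_lines
              if ln.strip() != "END"]
    # Take only the ligand (HETATM) records from the second file.
    extra = [ln.rstrip("\n") for ln in second_pdb_lines
             if ln.startswith("HETATM")]
    return merged + extra + ["END"]
```

Writing the returned lines to a new file, one per line, gives a single PDB file with the protein and both zinc ions, provided the two inputs really share one coordinate frame.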

BioLip Annotation File Header

Hi,

I downloaded the BioLip Annotation file 'BioLiP_2013-03-6.txt'. However, there is no header information.

I can see that there are twenty columns. I can guess what some columns are but not all of them.

Could someone provide information on what these 20 columns are?

Thanks a lot.

Bingwen Lu
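
Until the column descriptions are confirmed, it can help to view the file one column at a time. A minimal sketch, assuming the annotation file is tab-delimited text (an assumption; the post only states that there are twenty columns). The function name `preview_columns` is hypothetical.

```python
import csv


def preview_columns(handle, n_rows=3, delimiter="\t"):
    """Return a dict mapping 1-based column index to the first n_rows values."""
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    columns = {}
    for row_number, row in enumerate(reader):
        if row_number >= n_rows:
            break
        for index, value in enumerate(row, start=1):
            columns.setdefault(index, []).append(value)
    return columns
```

For example, `preview_columns(open("BioLiP_2013-03-6.txt"))` would give the first three values of every column, and `len(...)` of the result shows how many columns the chosen delimiter actually produces.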

Error running I-TASSER on Win 10 UBUNTU app (Final message; only 3 threading programs have output, please check threading)

Hello,
I tried unsuccessfully to run I-TASSER5.1 locally in the Ubuntu app on Windows 10; the run log is below:

atajera@LAPTOP-2STV4UM6:/mnt/c/I-Tasser/Suite/I-TASSER5.1$ I-TASSERmod/runI-TASSER.pl -libdir /mnt/c/I-Tasser/Library -seqname PmRBP1 -datadir /mnt/c/I-Tasser/Proteins/seq.fasta

Your setting for running I-TASSER is:
-pkgdir = /mnt/c/I-Tasser/Suite/I-TASSER5.1
-libdir = /mnt/c/I-Tasser/Library
-java_home = /usr
-seqname = PmRBP1
-datadir = /mnt/c/I-Tasser/Proteins
-outdir = /mnt/c/I-Tasser/Proteins
-runstyle = serial
-homoflag = real
-idcut = 1
-ntemp = 20
-nmodel = 5
-light = false
-hours = 50
-LBS = false
-EC = false
-GO = false

1. make seq.txt and rmsinp
Your protein contains 984 residues:
> PmRBP1
MFFFKGQTRFRNIEQFFFFFFFIKNLRTTNGENVLEDIKGKDGNLDFYSLDYNSKKANKL
KYNREKKIYIKNMNFLGEKGNIRNIDNVQSEDVVVAHSSNDISFENLKGSNIGKNLSNHV
YIKKDSLYNSNENNNIKGEINNEKNYIAKSFIHNEEEAYNIRKSLYDKMDQHAVFNPFID
MEIDFIDLQYFKDILDLIPGNTSYSKYYNEEFKKIIDEYSDILSNLVKTCITEKMELIKL
EHEIKYPKKDSMEKKLEEKSIELESKRELYHNKLNNYYKNIKPKMDEVRNKGHALLQESY
CTENCSTYIAKYDDLVEKILLDIKNYGNKGHVVLEKSINDFSFLDMILQYSNQQNNDMRE
SINTLQLLGEEIKEISEIYLINSTLINDLTNFFLEIKKIKEPIDSKQFTEKLKTLIRNSC
LFRHIHKFNEPITQIYETKVTSSNKLFTSVIEKLQNETESLIKTLLDDLEFHEIRKNSEE
ITNYVKNMYDKNKELYDSMIKGLDENINLKIDKLDEFLSYQYYINDDIVNYNDFISLKYK
HIYIHLKAECEKDLRERKLLPNNVLKAKLFLEIIIKIKADEKIISESFDTAKKFYEKIKD
LKKEFEEAYVEFEEVKNEINKMKNIDDNRDKNIEEIKDKLEHITKKQNNLKEVAALKDKG
NVEITASDELLAIIPNKNENEFKNKIDKTRDDMNSLFKSLNNDNINRLIESTEEFVNQKK
NISFEDMASEVIENHLQDIRNKFDKINFISDDKMKELHDKMKEQVTIAENIKKEIVKKEI
ENLQKELAKFFSTFNSEQQELLISLEEYKREGEKIQKYRDSLLKRENEYFSSVHVDTNDL
EKTKNQAELDKLQDTFAKKKDEISRKINNIKDLINTANPHLNFYSFVEKYFNINEDKKGV
ENIKALKDKINNVDMNKQVLELENDFKRKTDALENDIISIKSITKKLISLSTLNKNINQC
DENNGAAELLKNNAKRLREEVEKE
2.1 run Psi-blast
2.2 Predict secondary structure with PSSpred...
2.3 Predict solvent accessibility...
2.4 run pairmod
2.4.1 Use all templates
2.4.2 running pair ................

Multidomain protein server?

I made models of 2 domains present in my protein using I-TASSER. I now need a model that contains both domains as well as the unstructured loop connecting them. I know I-TASSER is not optimized for this. What is the best method for creating a multi-domain model? Should I use AIDA to improve the model afterwards? Is there something else that would be better? Please note that I have a Mac and no knowledge of the command line, so my options are very limited.

TM-score/TM-align between two proteins of identical sequence

Hello,
I am not sure how to calculate the TM-score between two proteins of identical sequence. The TM-score server seems to insert gaps into one of the sequences, which misaligns the proteins and produces
RMSD of the common residues= 20.260

TM-align does a more drastic re-alignment of the sequences and produces
Aligned length= 301, RMSD= 3.09

Independent verification of the RMSD with Swiss-pdb viewer gives 18.42 Angstrom over all residues, which I believe is the correct number. I have no means of verifying the TM-score, but I assume something is not working correctly here. In particular, the TM-score server should not attempt to align the sequences.
I would be grateful for any help.

Best wishes
Andreas

Output:
*****************************************************************************
* TM-SCORE *
* A scoring function to assess the similarity of protein structures *
* Based on statistics: *
* 0.0 < TM-score < 0.17, random structural similarity *
* 0.5 < TM-score < 1.00, in about the same fold *
* Reference: Yang Zhang and Jeffrey Skolnick, Proteins 2004 57: 702-710 *
* For comments, please email to: zhng@umich.edu *
*****************************************************************************

Structure1: A950606 Length= 460
Structure2: B950606 Length= 460 (by which all scores are normalized)
Number of residues in common= 452
RMSD of the common residues= 20.260

TM-score = 0.3555 (d0= 7.67)
MaxSub-score= 0.1396 (d0= 3.50)
GDT-TS-score= 0.1913 %(d<1)=0.1000 %(d<2)=0.1261 %(d<4)=0.2087 %(d<8)=0.3304
GDT-HA-score= 0.1207 %(d<0.5)=0.0478 %(d<1)=0.1000 %(d<2)=0.1261 %(d<4)=0.2087

-------- rotation matrix to rotate Chain-1 to Chain-2 ------
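
For two structures with identical sequences the residue correspondence is fixed, so the score can be checked by hand once per-residue CA distances are known. The sketch below uses the TM-score definition from the reference in the banner (Zhang and Skolnick, 2004): d0 = 1.24 (L - 15)^(1/3) - 1.8 and TM = (1/L) * sum of 1 / (1 + (d_i/d0)^2). It scores one given superposition only. The real program searches for the superposition that maximizes the sum, so this gives a lower bound, not a re-implementation. The function names are hypothetical, and L is assumed to be well above 15.

```python
def tm_d0(length):
    """Distance scale d0 (Angstrom) for a reference of `length` residues."""
    return 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8


def tm_score_for_superposition(distances, length):
    """TM-score of one fixed superposition from per-residue CA distances.

    `distances` holds one distance per residue pair in common;
    `length` is the reference length used for normalization.
    """
    d0 = tm_d0(length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / length


print(round(tm_d0(460), 2))  # 7.67, the d0 shown in the output above
```

With 452 residues in common and a reference length of 460, the highest reachable value is 452/460, which is what `tm_score_for_superposition([0.0] * 452, 460)` returns.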

How to export 3D gif file in I-TASSER?

Hello, I tried to export the protein 3D GIF files by right-clicking, but I failed... A new window appeared saying 'need some information from you', and then nothing happened. How can I export these files? I can find one GIF picture in the tarball folder, but it cannot rotate.

BioLip relational logic

Hello..
Is it possible for you to send me the entity-relationship diagram of the Biolip database?

COACH algorithm

Can someone tell me what the COACH algorithm is? I need it for my study. Thanks.

low rank model with slightly better c-score

I was trying to model the PH domain of my target protein using I-TASSER. I have a situation where one of the lower-ranking models has a slightly higher C-score (see image). Since I am very new to modeling, I do not know which model to use moving forward. In addition, if I should move forward with the lower-ranking model, I do not know how to obtain the estimated TM-score and RMSD, since they are not listed below the model.

Biolip architecture

Hello,
I need information about the BioLip database architecture (tables etc.), but there is nothing about it in your paper. Where can I find it?

Why is proline in a beta strand in the predicted structure?

Hi, I submitted a sequence for QUARK structure prediction. I noticed that a proline residue is in the middle of a beta strand. I learned that proline disrupts beta strands (sheets) and helices, so why is proline predicted to form a beta strand here? Could you explain that? Thanks!

BSP Slim shows ligand docking in the model PDB but only the ligand in the poses.sdf

Hi:

My BSP Slim results often show ligand docking in the model PDB, but only the ligand appears in the poses.sdf when viewed in PyMOL.

Is this usual? thank you, Amy

Also, xdrawchem shows a blank screen when I load poses.sdf, but other types of .sdf files are displayed.

iTasser - how to define "restraint" parameters i.e. use of a specific PDB template for modelling?

Hello,

We are attempting to use a specific PDB for the threading/modelling.

In the optional arguments it appears that one is able to use "-restaint1/2/3/4"; however, when run with this argument, the program does not pick it up and instead uses only the other arguments (i.e. homoflag, idcut, ntemp etc.)

Could anyone please advise how to use the "restraint" argument.

Thank you kindly,
JN

BioLip

Hello,
I need information about the BioLip database architecture (tables, relations between tables), but there is nothing about it in the original paper. Where can I find this information?

How to Increase C-score? and accuracy on local I-Tasser 5.1

Hi,
I have tried the same sequence on both the local I-Tasser5.1 and the web I-Tasser. I noticed that the C-score of model 1 for the same sequence differs somewhat between the web and local runs. For example, on the web I-Tasser I might get -1.74, while the local I-Tasser5.1 might give a C-score of -2.7. Is there any setting I can change to increase the C-score on the local I-Tasser5.1? The setting I use for the local I-Tasser 5.1 is -light -hours 20, which gives the structure closest to the web I-Tasser result. If I use the default setting of 50 hours, model 1 looks very different and has a very low C-score, while model 2 is similar to the web I-Tasser result but with a low C-score (-3.6). Thank you for taking the time to read this message.

Kind Regards,
Heng Ku

the interaction of transmembrane and its substrate

I have a question about modeling the interaction between a transmembrane protein and its substrate, which is a drug. How can I model the drug passing through its specific transmembrane protein, including the interactions with the active site and the atomic bonds? Is there any recommended tool or software for this?

ERROR! Cannot find (PSI-)BLAST library at /data3/yzz/machine/nr/

I downloaded the file from the website https://zhanglab.ccmb.med.umich.edu/PSSpred/ and followed all the instructions. But there was an error when I tried to run the script PSSpred.pl (I changed $db to: $db="/data3/yzz/machine/nr/"; #where NR sequence database is).
I guess it may be caused by the file named nr.tar.gz on the website https://zhanglab.ccmb.med.umich.edu/PSSpred/, because when I open a file in the folder, the text is: Warning! This is NOT an NR database.This is a fake NR database generated with swissprot release 2017_06. It is compatible with PSSpred program. But it is known to generate worse result than the original nr database.
So what can I do to fix the issue?
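
Before editing the script further, it may be worth confirming what is actually in the directory that `$db` points to. A minimal sketch, assuming a legacy formatted protein BLAST database, which normally consists of `.phr`, `.pin` and `.psq` files, sometimes tied together by a `.pal` alias file. Which of these PSSpred.pl actually tests for is not stated in the post, so treat this as a diagnostic only. The helper name `find_blast_db_files` is hypothetical.

```python
import os

# File suffixes used by formatted protein BLAST databases.
PROTEIN_DB_SUFFIXES = (".phr", ".pin", ".psq", ".pal")


def find_blast_db_files(db_dir, db_name="nr"):
    """List files in db_dir that look like parts of a protein BLAST database."""
    if not os.path.isdir(db_dir):
        return []
    return sorted(
        name for name in os.listdir(db_dir)
        if name.startswith(db_name) and name.endswith(PROTEIN_DB_SUFFIXES)
    )
```

An empty list from `find_blast_db_files("/data3/yzz/machine/nr/")` would suggest that nr.tar.gz has not been unpacked into that directory, or that the unpacked files sit one directory level deeper than the path given in `$db`.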

Protein structure prediction

What should we do in protein structure prediction via homology modeling if the sequence identity between the target and template sequences is above 30%?

Issue with I-TASSER: wMuster "Illegal division by zero"

Hello,

I am trying to run I-TASSER locally. Everything works well until it gets to the threading portion with wMuster:

horto@LAPTOP-MPAF31S5:/mnt/c/Users/horto/Downloads/I-TASSER/I-TASSER5.1$ $pkgdir/mnt/c/Users/horto/Downloads/I-TASSER/I-TASSER5.1/I-TASSERmod/runI-TASSER.pl -libdir /mnt/c/Users/horto/Downloads/I-TASSER/ITLIB/ -seqname P03126 -datadir /mnt/c/Users/horto/Downloads/I-TASSER/ITDATA -ntemp 1 -nmodel 1

Your setting for running I-TASSER is:
-pkgdir = /mnt/c/Users/horto/Downloads/I-TASSER/I-TASSER5.1
-libdir = /mnt/c/Users/horto/Downloads/I-TASSER/ITLIB
-java_home = /usr
-seqname = P03126
-datadir = /mnt/c/Users/horto/Downloads/I-TASSER/ITDATA
-outdir = /mnt/c/Users/horto/Downloads/I-TASSER/ITDATA
-runstyle = serial
-homoflag = real
-idcut = 1
-ntemp = 1
-nmodel = 1
-light = false
-hours = 50
-LBS = false
-EC = false
-GO = false

1. make seq.txt and rmsinp
Your protein contains 158 residues:
> P03126
MHQKRTAMFQDPQERPRKLPQLCTELQTTIHDIILECVYCKQQLLRREVYDFAFRDLCIV
YRDGNPYAVCDKCLKFYSKISEYRHYCYSLYGTTLEQQYNKPLCDLLIRCINCQKPLCPE
EKQRHLDKKQRFHNIRGRWTGRCMSCCRSSRTRRETQL
2.1 run Psi-blast
2.2 Secondary structure prediction was done before.
2.3 Predict solvent accessibility...
2.4 run pairmod
2.4.1 Use all templates
2.4.2 running pair ................
FORTRAN STOP
30000 6379375 total lib str & residues
number of observations 15.84338 1186542.
pair done
3.1 do threading
start serial threading PPAS
/tmp/horto/ITP03126
/mnt/c/Users/horto/Downloads/I-TASSER/ITDATA/PPAS_P03126
hostname: LAPTOP-MPAF31S5
starting time: Wed Apr 24 15:57:36 DST 2019
pwd: /tmp/horto/ITP03126
running zalign .....
ending time: Wed Apr 24 16:04:14 DST 2019

start serial threading dPPAS
/tmp/horto/ITP03126
/mnt/c/Users/horto/Downloads/I-TASSER/ITDATA/dPPAS_P03126
hostname: LAPTOP-MPAF31S5
starting time: Wed Apr 24 16:04:16 DST 2019
pwd: /tmp/horto/ITP03126
