Hi,
On the ResPRE website, you provide the list of targets in the training set. Where can I download the true contact maps for those targets?
Thanks
Hi Everyone
I ran molecular docking on the BSP-SLIM server using a protein model predicted with I-TASSER/COACH, and I uploaded my ligand in SDF format. The docking score is reported as nan, but when I analyze the result there is an interaction between the ligand and the protein. What is the problem? I have another docking run for which the docking score is 4.704.
Can anybody explain why the ligand_id for 5JER is called 'III'? It appears that 'III' is used by PDB to represent a different ligand (see https://www.rcsb.org/ligand/III). The actual ligand on 5JER is 'Rotavirus NSP1 peptide', which contains 19 amino acids.
I tried to look up information for 'III' but the following URL returns a '500 internal server error':
https://zhanglab.ccmb.med.umich.edu/BioLiP/qsearch.cgi?lig3=III
Thanks.
It appears that the following URL returns '500 internal server error'. Could someone have a look?
https://zhanglab.ccmb.med.umich.edu/BioLiP/qsearch.cgi?lig3=III
Hi,
I am running into an issue with MM-align. I have two monomers, saved in PDB format (attached), that I want to superpose. For a reason I don't understand, MM-align tries to align two chains of model01.pdb. I can see that the sequence is duplicated in UPPER case, with a space in between. However, model01 only has one chain, as it is a monomer.
Steps to reproduce:
./MMalign model01.pdb model02.pdb
Generates the following output:
**************************************************************************
* MM-align *
* Aligning protein complex structure by Dynamic Programming *
* Comments on the program, please email to: zhng@umich.edu *
**************************************************************************
Protein 1:model01.pd Size= 261
Protein 2:model02.pd Size= 258 (TM-score is normalized by 258)
Aligned length= 257, RMSD= 1.57, TM-score=0.94660, ID=0.996
-------- rotation matrix to rotate Chain-1 to Chain-2 ------
i t(i) u(i,1) u(i,2) u(i,3)
1 32.9895352759 -0.1835116774 0.2813259282 -0.9419018985
2 25.2317022225 -0.1682269668 0.9350513192 0.3120556330
3 12.4437071360 0.9685159534 0.2157191520 -0.1242662284
(":" denotes the residue pairs of distance < 5.0 Angstrom)
ppyslfeawakpvqpfaiwpgvwyvgtenlssvllttpqghilidagldasapqirrniealgfrmadiryiansharldqaggiarlkawsgarviashanaeqmarggkedfalgdalpfppvtvdmeaqdgqqwhlggvtlaaiftpghlpgatswkvtladgktliyadslatpgyplinnrnyptlvedirrsfarleaqqvdiflankgerfglmdkmarkargennafidkaglaryvaqsraafekqlaaqra- PPYSLFEAWAKPVQPFAIWPGVWYVGTENLSSVLLTTPQGHILIDAGLDASAPQIRRNIEALGFRMADIRYIANSHARLDQAGGIARLKAWSGARVIASHANAEQMARGGKEDFALGDALPFPPVTVDMEAQDGQQWHLGGVTLAAIFTPGHLPGATSWKVTLADGKTLIYADSLATPGYPLINNRNYPTLVEDIRRSFARLEAQQVDIFLANKGERFGLMDKMARKARGENNAFIDKAGLARYVAQSRAAFEKQLAAQRA
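One way to check how many chains a PDB file actually declares is to count the chain identifiers and TER records in it. The following is a minimal sketch, not part of MM-align; the function name `summarize_chains` is hypothetical, and it assumes standard fixed-column PDB records (chain ID in column 22, atom name in columns 13-16).

```python
def summarize_chains(pdb_lines):
    """Count CA atoms per chain ID and the number of TER records.

    Assumes fixed-column PDB records; only the first MODEL is read.
    """
    chains = {}      # chain ID -> number of CA atoms, in order of appearance
    ter_count = 0
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "TER":
            ter_count += 1
        elif record == "ATOM" and line[12:16].strip() == "CA":
            chain_id = line[21]          # column 22 of the PDB record
            chains[chain_id] = chains.get(chain_id, 0) + 1
        elif record == "ENDMDL":
            break                        # ignore any further models
    return chains, ter_count


# Usage (file name taken from the post above):
# with open("model01.pdb") as handle:
#     print(summarize_chains(handle))
```

If more than one chain ID or an unexpected TER record shows up in the middle of the file, that may be worth reporting along with the MM-align output.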
I work mostly on DNA genomic sequences, and have limited experience with and knowledge of protein structure and homology modeling. The purpose of this message to the forum is to request feedback and advice regarding my proposed use of I-TASSER as a solution to my research problem - from both practical and theoretical points of view. Thank you!
A. In one "test" species, I have ~800 domains predicted by HMMER3, to a profile HMM from Pfam.
But across ~ 50 species, the total number of predicted domain sequences is ~ 17K. (length range is ~17aa - 103aa, with majority in 35-45aa range).
This is my full dataset.
B. Predicted domain sequences show big variations in score, length and sequence composition
- consequently, my domain-based phylogenetic tree has too many branches with zero bootstrap support.
I need some independent way to verify sequence-based predictions.
C. In the RCSB PDB database, there are 7 PDB entries that contain my domain of interest. I trimmed these PDB files down to domain-only coordinates. Then, using UCSF Chimera, I could see that even though pairwise sequence identity is as low as 17%, the 3D overlap seems good.
D. Therefore, I want to use protein structure to segregate my sequence-based predictions into 4 categories - full-length, partial-length, degraded and false positives, or something along those lines.
From the I-TASSER tutorial (https://www.youtube.com/watch?v=quF4dqLGKFM), my understanding is that the longest / most computationally intensive step is structure assembly via Monte Carlo simulation.
With that as background info, these are my questions for your feedback, please:
1. Since my ~ 17K sequences are domain-only sequences, rather than full-length protein sequences, will it make homology modeling quicker and more practically feasible than multi-day runs for each query, i.e. domain sequence?
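The trimming step described in item C (cutting a PDB file down to domain-only coordinates) can be sketched as follows. This is only an illustration under stated assumptions: the function name `trim_pdb` is hypothetical, the domain is assumed to be one contiguous residue range on a single chain, and the input is assumed to use standard fixed-column PDB records.

```python
def trim_pdb(pdb_lines, chain_id, start, end):
    """Keep ATOM/HETATM records of one chain within [start, end] (residue numbers)."""
    kept = []
    for raw in pdb_lines:
        line = raw.rstrip("\n")
        if not line.startswith(("ATOM", "HETATM")):
            continue                      # skip headers, TER, CONECT, etc.
        same_chain = line[21] == chain_id         # column 22: chain ID
        resseq = int(line[22:26])                 # columns 23-26: residue number
        if same_chain and start <= resseq <= end:
            kept.append(line)
    kept.append("END")
    return kept


# Usage (file name, chain and residue range are placeholders):
# with open("entry.pdb") as handle:
#     domain = trim_pdb(handle, "A", 120, 160)
# with open("entry_domain.pdb", "w") as out:
#     out.write("\n".join(domain) + "\n")
```

Residues with insertion codes fall inside the range by residue number only, so entries that use them may need a manual check.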
Excuse me, I submitted the same sequences to ANGLOR on my local PC and on the online web page, and I got different results. Is the version different, or is something wrong? Also, when I run the script locally, there is sometimes a warning:
"[blastpgp] WARNING: posPurgeMatches: Due to purging near identical sequences, only the query is used to construct the position-specific score matrix".
How can I fix this, or can I just ignore it?
Thank you!
Hi everyone,
I used the COFACTOR, COACH and I-TASSER tools. I need to output the results for my predicted protein in InterProScan XML format. Is there a way to do this?
Hi:
I can't get my results from ThreaDomEx runs that have completed. The output page stalls at 13%.
thank you!
- Amy
Hi,
I have a sequence of about 150 residues that I've run on a local installation of I-TASSER 5.1 and also on the online server. I do not get the same model results between local and server. When I run TMalign to quantify differences between model1.pdb for the two runs, I get a TM-score of 0.89785. While this still indicates strong similarity, my initial expectation was that the two runs would have identical results.
My downloaded libraries are a few weeks old - could not having the most up-to-date, matching libraries result in differences between local vs server?
Thanks in advance for your time!!!
-Hollister
Dear all,
I am using your very useful FG-MD tool to refine some structures obtained by a homology modeling approach.
However, in order to interpret the results correctly, I need some details on the parameters used in the simulations (force field, number of steps, time intervals, etc.), and I noticed that these are not provided by the web interface.
The simulation was performed on the models identified by codes: S9829, S9849, S9867, S9882, S9907, S9921 and S10012.
Is there a way to access these parameters?
I look forward to hearing from you.
Yours faithfully,
Alberto Toffano
I have a model I created in I-TASSER of a cysteine-rich zinc finger (C1) type domain. I know these domains contain 2 zincs. I ran the model through the COACH server and it found both sites. I do not know, however, how to download a PDB containing both zincs or, if that is not possible, how to combine them later.
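If the two zinc sites are only available as separate complex PDB files, one possible way to combine them is to take the protein from one file and append the HETATM records from both. The sketch below is an assumption-heavy illustration, not a COACH feature: the function name `merge_ligands` is hypothetical, and it assumes both files contain the same model in the same coordinate frame, which should be checked before trusting the result.

```python
def merge_ligands(complex_a, complex_b):
    """Protein ATOM records from complex_a plus HETATM records from both inputs."""
    a = [line.rstrip("\n") for line in complex_a]
    b = [line.rstrip("\n") for line in complex_b]
    protein = [line for line in a if line.startswith("ATOM")]
    merged_het = []
    seen_coords = set()
    for line in a + b:
        if not line.startswith("HETATM"):
            continue
        coords = line[30:54]              # x, y, z columns of the record
        if coords in seen_coords:
            continue                      # same atom present in both files
        seen_coords.add(coords)
        merged_het.append(line)
    return protein + ["TER"] + merged_het + ["END"]


# Usage (file names are placeholders):
# with open("site1_complex.pdb") as fa, open("site2_complex.pdb") as fb:
#     merged = merge_ligands(fa, fb)
# with open("model_with_both_zn.pdb", "w") as out:
#     out.write("\n".join(merged) + "\n")
```

Atom serial numbers and ligand residue numbers are copied unchanged, so they may collide between the two zincs; most viewers tolerate this, but renumbering may be needed for other programs.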
Hi,
I downloaded the BioLip Annotation file 'BioLiP_2013-03-6.txt'. However, there is no header information.
I can see that there are twenty columns. I can guess what some columns are but not all of them.
Could someone provide information on what these 20 columns are?
Thanks a lot.
Bingwen Lu
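While waiting for the official column descriptions, the first few rows can be printed column by column to make guessing easier. This is a minimal sketch: the function name `preview_columns` is hypothetical, and it assumes the annotation file is tab-delimited, which is not confirmed in the post.

```python
import csv


def preview_columns(handle, n_rows=3, delimiter="\t"):
    """Return, for each column, the values found in the first n_rows rows."""
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    rows = [row for _, row in zip(range(n_rows), reader)]
    width = max((len(row) for row in rows), default=0)
    # Pad short rows with empty strings so every column has n_rows entries.
    return [[row[i] if i < len(row) else "" for row in rows] for i in range(width)]


# Usage (file name taken from the post above):
# with open("BioLiP_2013-03-6.txt", newline="") as handle:
#     for index, values in enumerate(preview_columns(handle), start=1):
#         print(index, values)
```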
Hello,
I tried to run I-TASSER 5.1 locally in the Ubuntu app on Windows 10, but the run did not succeed. The run log is below:
atajera@LAPTOP-2STV4UM6:/mnt/c/I-Tasser/Suite/I-TASSER5.1$ I-TASSERmod/runI-TASSER.pl -libdir /mnt/c/I-Tasser/Library -seqname PmRBP1 -datadir /mnt/c/I-Tasser/Proteins/seq.fasta
Your setting for running I-TASSER is:
-pkgdir = /mnt/c/I-Tasser/Suite/I-TASSER5.1
-libdir = /mnt/c/I-Tasser/Library
-java_home = /usr
-seqname = PmRBP1
-datadir = /mnt/c/I-Tasser/Proteins
-outdir = /mnt/c/I-Tasser/Proteins
-runstyle = serial
-homoflag = real
-idcut = 1
-ntemp = 20
-nmodel = 5
-light = false
-hours = 50
-LBS = false
-EC = false
-GO = false
1. make seq.txt and rmsinp
Your protein contains 984 residues:
> PmRBP1
MFFFKGQTRFRNIEQFFFFFFFIKNLRTTNGENVLEDIKGKDGNLDFYSLDYNSKKANKL
KYNREKKIYIKNMNFLGEKGNIRNIDNVQSEDVVVAHSSNDISFENLKGSNIGKNLSNHV
YIKKDSLYNSNENNNIKGEINNEKNYIAKSFIHNEEEAYNIRKSLYDKMDQHAVFNPFID
MEIDFIDLQYFKDILDLIPGNTSYSKYYNEEFKKIIDEYSDILSNLVKTCITEKMELIKL
EHEIKYPKKDSMEKKLEEKSIELESKRELYHNKLNNYYKNIKPKMDEVRNKGHALLQESY
CTENCSTYIAKYDDLVEKILLDIKNYGNKGHVVLEKSINDFSFLDMILQYSNQQNNDMRE
SINTLQLLGEEIKEISEIYLINSTLINDLTNFFLEIKKIKEPIDSKQFTEKLKTLIRNSC
LFRHIHKFNEPITQIYETKVTSSNKLFTSVIEKLQNETESLIKTLLDDLEFHEIRKNSEE
ITNYVKNMYDKNKELYDSMIKGLDENINLKIDKLDEFLSYQYYINDDIVNYNDFISLKYK
HIYIHLKAECEKDLRERKLLPNNVLKAKLFLEIIIKIKADEKIISESFDTAKKFYEKIKD
LKKEFEEAYVEFEEVKNEINKMKNIDDNRDKNIEEIKDKLEHITKKQNNLKEVAALKDKG
NVEITASDELLAIIPNKNENEFKNKIDKTRDDMNSLFKSLNNDNINRLIESTEEFVNQKK
NISFEDMASEVIENHLQDIRNKFDKINFISDDKMKELHDKMKEQVTIAENIKKEIVKKEI
ENLQKELAKFFSTFNSEQQELLISLEEYKREGEKIQKYRDSLLKRENEYFSSVHVDTNDL
EKTKNQAELDKLQDTFAKKKDEISRKINNIKDLINTANPHLNFYSFVEKYFNINEDKKGV
ENIKALKDKINNVDMNKQVLELENDFKRKTDALENDIISIKSITKKLISLSTLNKNINQC
DENNGAAELLKNNAKRLREEVEKE
2.1 run Psi-blast
2.2 Predict secondary structure with PSSpred...
2.3 Predict solvent accessibility...
2.4 run pairmod
2.4.1 Use all templates
2.4.2 running pair ................
I made models of 2 domains present in my protein using I-TASSER. I now need a model that contains both domains as well as the unstructured loop connecting them. I know I-TASSER is not optimized for this. What is the best method for creating a multi-domain model? Should I use AIDA to improve the model afterwards? Is there something else that would work better? Please note that I have a Mac and no knowledge of the command line, so my options are very limited.
Hello,
I am not sure how to calculate the TM-score between two proteins with identical sequences. The TM-score server seems to insert gaps into one of the sequences, so it misaligns the proteins and produces
RMSD of the common residues= 20.260
TM-align does a more drastic re-alignment of the sequences and produces
Aligned length= 301, RMSD= 3.09
Independent verification of the RMSD with Swiss-PdbViewer gives 18.42 Angstrom over all residues, which I believe is the correct number. I have no means of verifying the TM-score, but I assume something is not working correctly here. In particular, the TM-score server should not attempt an alignment of the sequences.
I would be grateful for any help.
Best wishes
Andreas
Output:
*****************************************************************************
* TM-SCORE *
* A scoring function to assess the similarity of protein structures *
* Based on statistics: *
* 0.0 < TM-score < 0.17, random structural similarity *
* 0.5 < TM-score < 1.00, in about the same fold *
* Reference: Yang Zhang and Jeffrey Skolnick, Proteins 2004 57: 702-710 *
* For comments, please email to: zhng@umich.edu *
*****************************************************************************
Structure1: A950606 Length= 460
Structure2: B950606 Length= 460 (by which all scores are normalized)
Number of residues in common= 452
RMSD of the common residues= 20.260
TM-score = 0.3555 (d0= 7.67)
MaxSub-score= 0.1396 (d0= 3.50)
GDT-TS-score= 0.1913 %(d<1)=0.1000 %(d<2)=0.1261 %(d<4)=0.2087 %(d<8)=0.3304
GDT-HA-score= 0.1207 %(d<0.5)=0.0478 %(d<1)=0.1000 %(d<2)=0.1261 %(d<4)=0.2087
-------- rotation matrix to rotate Chain-1 to Chain-2 ------
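For an independent check of the numbers above, the TM-score definition from the cited reference (Zhang and Skolnick, Proteins 2004) can be evaluated by hand once the per-residue distances after superposition are known. The sketch below only evaluates the formula for one given superposition; the real program additionally searches for the superposition that maximizes the score. The function names `d0` and `tm_score` are illustrative, and the distance list is something the reader would have to supply.

```python
def d0(length):
    """Distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8 for a chain of L residues."""
    if length <= 15:
        raise ValueError("this sketch only covers chains longer than 15 residues")
    return 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, norm_length):
    """TM-score of one superposition.

    distances: CA-CA distances (Angstrom) of the residue pairs in common.
    norm_length: length of the structure used for normalization.
    """
    scale = d0(norm_length)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / norm_length


print(round(d0(460), 2))  # 7.67, matching "d0= 7.67" in the output above
```

Because the sum runs only over the residues in common (452 here) while the normalization uses the full length (460), missing residues lower the score even if every matched pair superposes perfectly.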
Hello, I tried to export the protein 3D GIF files by right-clicking, but it failed. A new window appeared saying 'need some information from you', and then nothing happened. How can I export these files? I can find one GIF picture in the tarball folder, but it does not rotate.
Hello..
Is it possible for you to send me the entity-relationship diagram of the BioLiP database?
Can someone tell me what the COACH algorithm is? I need it for my study. Thanks.
I was trying to model the PH domain of my target protein using I-TASSER. I have a situation where one of the lower-ranking models has a slightly higher C-score (see image). Since this is the case, and I am very new to modeling, I do not know which model to use moving forward. In addition, if I should move forward with the lower-ranking model, I do not know how to figure out the estimated TM-score and RMSD, since they are not listed below the model.
Hello,
I need information about the BioLiP database architecture (tables etc.), but there is nothing about it in your paper. Where can I find it?
Hi, I submitted a sequence for QUARK protein structure prediction. I noticed that a proline residue is in the middle of a beta strand. I learned that proline disrupts beta strands (sheets) and helices, so why is proline predicted to be part of a beta strand here? Could you explain that? Thanks!
Hi:
My BSP-SLIM results often show the ligand docked in the model PDB, but only the ligand alone appears in poses.sdf when viewed in PyMOL.
Is this usual? thank you, Amy
Also, XDrawChem shows a blank screen when I load poses.sdf, but other types of .sdf files are displayed.
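One way to see what poses.sdf actually contains is to split it into records and read the atom count of each one. This is a minimal sketch under assumptions: the function names `split_sdf` and `atom_count` are hypothetical, records are assumed to be separated by `$$$$` lines, and each record is assumed to use the V2000 molfile layout, where the fourth line starts with the atom count.

```python
def split_sdf(text):
    """Split SDF text into records (lists of lines) at '$$$$' delimiters."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    if any(line.strip() for line in current):
        records.append(current)          # last record without a delimiter
    return records


def atom_count(record):
    """Atom count from the V2000 counts line (line 4, first three columns)."""
    return int(record[3][:3])


# Usage (file name taken from the post above):
# with open("poses.sdf") as handle:
#     for i, rec in enumerate(split_sdf(handle.read()), start=1):
#         print(i, rec[0].strip(), atom_count(rec))
```

Atom counts that match the ligand alone would be consistent with the file holding only ligand poses, with the protein kept in the separate model PDB.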
Hello,
We are attempting to use a specific PDB for the threading/modelling.
In the optional arguments it appears that one can use "-restaint1/2/3/4"; however, when we run with this argument, the program does not accept it and instead uses only the other arguments (i.e. homoflag, idcut, ntemp, etc.).
Could anyone please advise how to use the "restraint" argument?
Thank you kindly,
JN
Hello,
I need information about the BioLiP database architecture (tables, relations between tables), but there is nothing about it in the original paper. Where can I find this information?
Hi,
I have tried the same sequence on both local I-TASSER 5.1 and the web I-TASSER server. I noticed that the C-score of model 1 for the same sequence differs somewhat between the two. For example, the web server might give -1.74 while local I-TASSER 5.1 gives -2.7. Is there any setting I can change to increase the C-score in local I-TASSER 5.1? My local setting is -light -hours 20, which gives the structure closest to the web result. If I use the default setting of 50 hours, model 1 looks very different and has a very low C-score, while model 2 is similar to the web result but also has a low C-score (-3.6). Thank you for taking the time to read this message.
Kind Regards,
Heng Ku
I have a question about designing the interaction between a transmembrane protein and its substrate, which is a drug. How can I model the drug passing through its specific transmembrane protein, including the interacting active site and atomic bonds? Is there a recommended tool or software for this?
I have downloaded the file from the website https://zhanglab.ccmb.med.umich.edu/PSSpred/ and followed the instructions there. However, there was an error when I tried to run the script PSSpred.pl (I changed $db to: $db="/data3/yzz/machine/nr/"; #where NR sequence database is).
I guess it may be caused by the file named nr.tar.gz on the website https://zhanglab.ccmb.med.umich.edu/PSSpred/, because when I open a file in the folder, the text is: Warning! This is NOT an NR database. This is a fake NR database generated with swissprot release 2017_06. It is compatible with PSSpred program. But it is known to generate worse result than the original nr database.
So what can I do to fix the issue?
What should we do in protein structure prediction via homology modeling if the sequence identity between the target and template sequences is above 30%?
Hello,
I am trying to run I-TASSER locally. Everything works well until it gets to the threading portion with wMuster:
horto@LAPTOP-MPAF31S5:/mnt/c/Users/horto/Downloads/I-TASSER/I-TASSER5.1$ $pkgdir/mnt/c/Users/horto/Downloads/I-TASSER/I-TASSER5.1/I-TASSERmod/runI-TASSER.pl -libdir /mnt/c/Users/horto/Downloads/I-TASSER/ITLIB/ -seqname P03126 -datadir /mnt/c/Users/horto/Downloads/I-TASSER/ITDATA -ntemp 1 -nmodel 1
Your setting for running I-TASSER is:
-pkgdir = /mnt/c/Users/horto/Downloads/I-TASSER/I-TASSER5.1
-libdir = /mnt/c/Users/horto/Downloads/I-TASSER/ITLIB
-java_home = /usr
-seqname = P03126
-datadir = /mnt/c/Users/horto/Downloads/I-TASSER/ITDATA
-outdir = /mnt/c/Users/horto/Downloads/I-TASSER/ITDATA
-runstyle = serial
-homoflag = real
-idcut = 1
-ntemp = 1
-nmodel = 1
-light = false
-hours = 50
-LBS = false
-EC = false
-GO = false
1. make seq.txt and rmsinp
Your protein contains 158 residues:
> P03126
MHQKRTAMFQDPQERPRKLPQLCTELQTTIHDIILECVYCKQQLLRREVYDFAFRDLCIV
YRDGNPYAVCDKCLKFYSKISEYRHYCYSLYGTTLEQQYNKPLCDLLIRCINCQKPLCPE
EKQRHLDKKQRFHNIRGRWTGRCMSCCRSSRTRRETQL
2.1 run Psi-blast
2.2 Secondary structure prediction was done before.
2.3 Predict solvent accessibility...
2.4 run pairmod
2.4.1 Use all templates
2.4.2 running pair ................
FORTRAN STOP
30000 6379375 total lib str & residues
number of observations 15.84338 1186542.
pair done
3.1 do threading
start serial threading PPAS
/tmp/horto/ITP03126
/mnt/c/Users/horto/Downloads/I-TASSER/ITDATA/PPAS_P03126
hostname: LAPTOP-MPAF31S5
starting time: Wed Apr 24 15:57:36 DST 2019
pwd: /tmp/horto/ITP03126
running zalign .....
ending time: Wed Apr 24 16:04:14 DST 2019
start serial threading dPPAS
/tmp/horto/ITP03126
/mnt/c/Users/horto/Downloads/I-TASSER/ITDATA/dPPAS_P03126
hostname: LAPTOP-MPAF31S5
starting time: Wed Apr 24 16:04:16 DST 2019
pwd: /tmp/horto/ITP03126