SEGMER for protein sub-structure prediction
version 1.0 (64bit)
written by Sitao Wu and Yang Zhang

INSTALL THE SOFTWARE (64bit in Linux):

1. Uncompress SEGMER_source.tar.bz2 to a directory like xxxx/SEGMER_source.

2. Change the path settings in
   xxxx/SEGMER_source/library/bin/psipred24/runpsipred:

     dbname=xxxx/SEGMER_source/library/data/nr/nr.filter
     ncbiname=xxxx/SEGMER_source/library/bin/blast/bin
     execname=xxxx/SEGMER_source/library/bin/psipred24/bin
     dataname=xxxx/SEGMER_source/library/bin/psipred24/data

3. Change the settings in xxxx/SEGMER_source/SEGMER/SEGMER.pl:

     $user="username";                            # Your user name
     $data_dir_ori="xxxx/SEGMER_source/example";  # predictions are saved in
                                                  # "xxxx/SEGMER_source/example/$s"
     $libdir="xxxx/SEGMER_source/library";        # SEGMER library, which can be downloaded
                                                  # from http://zhanglab.dcmb.med.umich.edu/library/
     $run="benchmark" (filter out templates with sequence identity > 30%)
       or "real" (use all templates).

4. Make the complete library:

   Download the PDB file from
     http://zhanglab.dcmb.med.umich.edu/library/PDB.tar.bz2
   and decompress PDB.tar.bz2 to "xxxx/SEGMER_source/library/PDB".

   Download the MTX file from
     http://zhanglab.dcmb.med.umich.edu/library/MTX.tar.bz2
   and decompress MTX.tar.bz2 to "xxxx/SEGMER_source/library/MTX".

   Download the DEP file from
     http://zhanglab.dcmb.med.umich.edu/library/DEP.tar.bz2
   and decompress DEP.tar.bz2 to "xxxx/SEGMER_source/library/DEP".

   Download the summary1 file from
     http://zhanglab.dcmb.med.umich.edu/library/summary1.tar.bz2
   and decompress summary1.tar.bz2 to "xxxx/SEGMER_source/library/summary".

   Download the summary2 file from
     http://zhanglab.dcmb.med.umich.edu/library/summary2.tar.bz2
   and decompress summary2.tar.bz2 to "xxxx/SEGMER_source/library/summary".

   Download the nr database from
     ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz
   and decompress nr.gz to "xxxx/SEGMER_source/library/nr".
   Then go to the "xxxx/SEGMER_source/library/nr" directory and type
     "xxxx/SEGMER_source/library/bin/psipred24/bin/pfilt nr > nr.filter"
   Finally, use
     "xxxx/SEGMER_source/library/bin/blast/bin/formatdb -i nr.filter -o T"
   to create blastable nr files.

HOW TO RUN THE SOFTWARE:

Suppose the sequence file is saved in xxxx/SEGMER_source/example/e01/seq.txt.
First change the current directory to xxxx/SEGMER_source/SEGMER. Then type:

  "./SEGMER.pl e01"

After the prediction is finished, the threading results are saved in the
following files ($pdb is the target name, "e01" in this example):

  xxxx/SEGMER_source/example/$pdb/init.QQQr:
      whole-chain threading results by SEGMER
  xxxx/SEGMER_source/example/$pdb/init.YYY2:
      segmental threading results by SEGMER (2 continuous segments)
  xxxx/SEGMER_source/example/$pdb/init.YYY3:
      segmental threading results by SEGMER (3 continuous segments)
  xxxx/SEGMER_source/example/$pdb/init.YYY4:
      segmental threading results by SEGMER (4 continuous segments)
  xxxx/SEGMER_source/example/$pdb/init.YYYS2:
      segmental threading results by SEGMER (2 discontinuous segments)
  xxxx/SEGMER_source/example/$pdb/init.YYYS3:
      segmental threading results by SEGMER (3 discontinuous segments)
  xxxx/SEGMER_source/example/$pdb/init.YYYS4:
      segmental threading results by SEGMER (4 discontinuous segments)

Format of init.QQQr or init.YYY*
(see xxxx/SEGMER_source/example/e01/init.QQQr or init.YYY*):

1. The 1st line: number of templates, protein length.

2. For each template, the 1st line:
     for init.QQQr: Length_of_alignment, Z-score, rank, template name,
                    sequence identity, coverage
     for init.YYY*: Length_of_alignment, Z-score, rank, template name,
                    segment no. and template no. (see the following example).
                    For instance, "CHU_5_2" means the 2nd template of the
                    5th segment.
   For each template, the remaining lines before "TER" are the threading
   results (the last two columns are the aligned residue no. and residue
   names in the templates).

All questions to wusitao2000@gmail.com
03/10/2010
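
The init.QQQr / init.YYY* layout described above could be read with a short
script along the following lines. This is only a rough sketch and not part of
SEGMER: it assumes that fields are separated by whitespace, that every template
block ends with a line whose first field is "TER", and that the segment label
has the form shown in the "CHU_5_2" example. The function names
(parse_segment_label, split_init_file) are hypothetical; check them against
the example files in xxxx/SEGMER_source/example/e01 before relying on them.

```python
from typing import Dict, List, Tuple


def parse_segment_label(label: str) -> Tuple[str, int, int]:
    """Split a label such as "CHU_5_2" into (prefix, segment_no, template_no).

    Assumption: the last two underscore-separated parts are integers.
    """
    prefix, segment_no, template_no = label.rsplit("_", 2)
    return prefix, int(segment_no), int(template_no)


def split_init_file(text: str) -> Tuple[int, int, List[Dict[str, list]]]:
    """Split the text of an init.* file into its header and template blocks.

    Returns (number_of_templates, protein_length, blocks). Each block is a
    dict with the whitespace-split template header line under "header" and
    the whitespace-split threading lines (everything before "TER") under
    "rows". Fields are kept as strings; no column meaning is assumed beyond
    what the README states.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_templates, protein_length = (int(x) for x in lines[0].split()[:2])

    blocks: List[Dict[str, list]] = []
    current = None
    for line in lines[1:]:
        fields = line.split()
        if fields[0] == "TER":
            # End of the current template block.
            if current is not None:
                blocks.append(current)
            current = None
        elif current is None:
            # First line after the file header or after a "TER" line.
            current = {"header": fields, "rows": []}
        else:
            current["rows"].append(fields)
    return n_templates, protein_length, blocks
```

For example, parse_segment_label("CHU_5_2") returns ("CHU", 5, 2), that is,
the 2nd template of the 5th segment. According to the format description, the
last two entries of each item in "rows" are the aligned residue no. and the
residue name in the template.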