We used the DUD-E database as a benchmark dataset to evaluate the LS-align algorithm for virtual screening. DUD-E contains 102 protein targets, each with an average of 224 active ligands and 50 challenging decoys per active ligand. The 102 targets span diverse categories: 26 kinases, 15 proteases, 11 nuclear receptors, 5 GPCRs, 2 ion channels, 2 cytochrome P450s, 36 other enzymes, and 5 miscellaneous proteins. More details on the DUD-E database can be found at http://dude.docking.org/.
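As a rough sanity check on the figures above, the short Python sketch below sums the listed category counts and estimates the overall scale of the benchmark. The totals are approximations derived only from the stated averages (224 actives per target, 50 decoys per active), not exact counts from the database, and the variable names are illustrative.

```python
# Category counts as listed in the description above.
categories = {
    "kinases": 26,
    "proteases": 15,
    "nuclear receptors": 11,
    "GPCRs": 5,
    "ion channels": 2,
    "cytochrome P450s": 2,
    "other enzymes": 36,
    "miscellaneous": 5,
}

# The category counts should add up to the stated number of targets.
n_targets = sum(categories.values())

# Averages quoted in the text; the true per-target numbers vary.
AVG_ACTIVES_PER_TARGET = 224
DECOYS_PER_ACTIVE = 50

# Order-of-magnitude estimates, before any duplicate removal.
approx_actives = n_targets * AVG_ACTIVES_PER_TARGET
approx_decoys = approx_actives * DECOYS_PER_ACTIVE

print(f"targets: {n_targets}")
print(f"approximate actives: {approx_actives}")
print(f"approximate decoys: {approx_decoys}")
```

Under these averages, the benchmark holds on the order of twenty thousand actives and roughly a million decoys.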
In this study, to avoid bias in the test, we removed duplicate entries so that each ligand has only one conformer in the database. The final dataset without duplicates can be downloaded from https://zhanglab.ccmb.med.umich.edu/LS-align/dataset.tar.gz.
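The duplicate-removal idea can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used to build the dataset: it assumes duplicates are identified by a shared ligand identifier and that the first conformer encountered is the one kept, neither of which is stated in the text. The function name and the ligand IDs are hypothetical.

```python
def keep_one_conformer(entries):
    """Return the entries with only the first conformer kept per ligand ID.

    `entries` is an iterable of (ligand_id, conformer) pairs; input order
    is preserved for the entries that are kept.
    """
    seen = set()
    unique = []
    for ligand_id, conformer in entries:
        if ligand_id not in seen:
            seen.add(ligand_id)
            unique.append((ligand_id, conformer))
    return unique


# Hypothetical example: LIG001 appears with two conformers.
example = [
    ("LIG001", "conf_a"),
    ("LIG002", "conf_a"),
    ("LIG001", "conf_b"),
]
deduplicated = keep_one_conformer(example)
print(deduplicated)
```

Keeping the first occurrence is only one possible policy; another criterion, such as the lowest-energy conformer, would need extra information per entry.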
yangzhanglab@umich.edu | (734) 647-1549 | 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218