I've written a couple of docking/virtual screening workflows using SMINA, a freely available tool DOI. There are a number of alternatives, and it is very difficult to get good comparisons between them, which is why the Grand Challenges are useful.
The Grand Challenge 3 (GC3) is a blinded prediction challenge for the computational chemistry community, with components addressing pose-prediction, affinity ranking, and free energy calculations. GC3 is based on six different protein targets, Cathepsin S and five different kinases, and is separated into five subchallenges, some of which involve multiple protein targets. Only Cathepsin S is associated with cocrystal structures, so the kinase components of this challenge focus on affinity ranking and/or free energy predictions. Three of the datasets, Cathepsin S, JAK2 and TIE2, include a free energy prediction component.
This is an ideal opportunity to test novel algorithms on a carefully curated dataset.
Computational practices often employ a number of computational algorithms and dataset preparation steps for meaningful results. D3R will provide a forum for deposition, dissemination and discussion of such workflows through this website. Workflows will represent methods used successfully in the blinded challenges and methods donated by our pharmaceutical and/or academic collaborators. GitHub.
In Stage 1 (September 1 - October 1), you submit your predicted poses for the 24 ligands, in a coordinate system aligned with the S04-bound Cathepsin S structure provided in the inputs, along with your predicted affinities, or affinity rankings, for all 136 compounds and/or your predicted absolute or relative binding affinities (in kcal/mol) for the free energy subset of 33 compounds. When Stage 1 closes, we will release the crystallographic poses of the 24 ligands.
In Stage 2 (October 1 - December 1), you submit your predictions of the affinity rankings of all 136 compounds and/or absolute or relative binding affinities (in kcal/mol) for the free energy subset of 33 compounds.
In the previous workflow I described docking a set of ligands with known activity into a target protein. In this workflow we will be using a set of ligands from the ZINC dataset to search for novel ligands. Once the ligands are docked, the workflow moves on to finding vendors and selecting subsets for purchase.
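For anyone wanting to script the docking step, here is a minimal sketch of how a SMINA command line might be assembled from Python. The file names and the helper `build_smina_command` are hypothetical and not taken from the workflow itself. The options used are standard smina options: `-r`, `-l`, `-o`, `--autobox_ligand`, `--autobox_add`, `--exhaustiveness`, `--num_modes` and `--seed`. The sketch only builds and prints the command; it does not run smina.

```python
import shlex


def build_smina_command(receptor, ligands, reference_ligand, out_path,
                        exhaustiveness=8, num_modes=9, autobox_add=4.0,
                        seed=None):
    """Return the argv list for a smina docking run (nothing is executed).

    The search box is derived from a reference ligand via --autobox_ligand,
    padded by autobox_add Angstroms on each side.
    """
    cmd = [
        "smina",
        "-r", receptor,                        # prepared target protein
        "-l", ligands,                         # ligands to dock (e.g. a ZINC subset)
        "--autobox_ligand", reference_ligand,  # defines the binding-site box
        "--autobox_add", str(autobox_add),
        "--exhaustiveness", str(exhaustiveness),
        "--num_modes", str(num_modes),
        "-o", out_path,                        # docked poses and scores
    ]
    if seed is not None:
        cmd += ["--seed", str(seed)]           # fixed seed for repeatable runs
    return cmd


# Hypothetical file names, for illustration only.
command = build_smina_command(
    "receptor.pdb", "zinc_subset.sdf", "crystal_ligand.sdf", "docked.sdf",
    seed=42,
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where smina is installed.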
SeeSAR 6.1 has been released. Looking at the release notes, there are a couple of useful additions.
- Multiple protein alignment. Since version 5.6 it has been possible to load and work with multiple proteins. So far this feature could only be utilized with pre-aligned structures. Now you can do the 3D alignment in SeeSAR itself. The alignment is based on and optimized according to the superposition of related active sites. Therefore, once you have selected a binding site, just one click is all that is needed to superpose all related binding sites at once. Note that the superposition is limited to highly homologous proteins (>90% sequence identity).
- SeeSAR/StarDrop interface. We have implemented a new function that greatly improves the interaction between the two software packages. Using the option in the molecule table toolbar, you may now transfer all (or the subset of favorite) molecules directly to StarDrop, which is launched automatically. This interface is supported in StarDrop starting with the recently released StarDrop version 6.4, and StarDrop now analogously supports launching and submitting data to SeeSAR. So it is in fact possible to transfer data back and forth and exploit maximum synergy to make the best of both worlds. Note that this feature may require a few adjustments in your SeeSAR settings to become fully functional.
- Shortcut to copy protein ligands. Usually among the first tasks after loading proteins is to copy the related protein ligands to the molecules table for further processing (docking, editing, re-scaffolding, etc.). Especially with multiple proteins this turned out to be a quite cumbersome procedure. Therefore we have implemented a shortcut function in the toolbar of the proteins tab to copy all protein ligands at once to the molecules table. Note that this function will copy all ligands irrespective of their position in relation to the common binding site that is used in the context of the molecules table. So some of the copied ligands may lie well outside the common binding site.
SeeSAR is a software tool for interactive, visual compound prioritization as well as compound evolution. Structure-based design work ideally supports a multi-parameter optimization to maximize the likelihood of success, rather than affinity alone. Having the relevant parameters at hand in combination with real-time visual computer assistance in 3D is one of the strengths of SeeSAR.
Whilst high-throughput screening (HTS) has been the starting point for many successful drug discovery programs, the cost of screening, the lack of access to a large diverse sample collection, or the low throughput of the primary assay may preclude HTS as a starting point, and identification of a smaller selection of compounds with a higher probability of being a hit may be desired. Directed or Virtual screening is a computational technique used in drug discovery research designed to identify potential hits for evaluation in primary assays. It involves the rapid in silico assessment of large libraries of chemical structures in order to identify those structures that are most likely to be active against a drug target. The in silico screen can be based on known ligand similarity or based on docking ligands into the desired binding site.
I've updated the description to give more information about preparing the target protein.
StarDrop 6.4 now links prepared 3D docking and alignment models with data visualisation, 2D SAR analyses and predictive models in a single interface.
Computational chemists can make their validated 3D models available to their colleagues via StarDrop’s Pose Generation Interface, which is compatible with software from major computational chemistry providers, including:
- FlexX™ – BioSolveIT
- Gold™ – Cambridge Crystallographic Data Centre
- MOE™ – Chemical Computing Group
- AutoDock Vina – The Scripps Research Institute
- POSIT™ – OpenEye Scientific
- …extendable to other third party applications.
The Pose Generation Interface communicates with a Pose Generation Server, on which computational chemists can easily publish their validated docking or 3D alignment models. These are made instantly available for StarDrop users to submit their compounds and the resulting poses, protein structures and scores are returned directly to StarDrop for visualisation and analysis.
The Pose Generation Server can be installed wherever you run your 3D modelling software, supporting Linux, Windows® and Mac®.
There are more details in the poster presented at the Spring ACS 2017.
The all new SeeSAR 6 provides you with a completely redesigned and now fully customizable GUI. You can choose between different bright and dark themes and GUI layouts so that you can optimally adapt SeeSAR for different use cases.
The new design is more streamlined and customizable. Instead of having 8 different kinds of buttons in different regions of the application, we now have just a main menu top left and a toolbar top right. The main menu changes depending on the mode of use (editing, site definition, ...), while the toolbar stays the same throughout. This way you are never overwhelmed with choices, but are only presented with options that you may need. Depending on your current use case, you may also want to change the overall layout (many molecules ⇒ tables to the left; many properties ⇒ tables below to make use of the whole width; 2 monitors ⇒ tables docked out) and/or the overall appearance (bright theme for presentations; dark theme for desktop work; we have also integrated a color blindness mode just in case).
In order to give you a jump start when you begin working with SeeSAR (both as a newcomer, as well as a seasoned user of the old GUI design), we have introduced an in-application help facility in this new version. First of all, upon starting the tool for the first time or after a long break in use, SeeSAR offers you a short introductory slide show, reminding you of a few basics that can make life a lot easier. But you can also now request help from within the application with a click on the lifesaver button. The help window then shows you – context dependent – explanations on the mode in which you are currently working or on the functions that you are trying to use so you can leave the help window open, consulting it when you need it. Of course you may also navigate between help pages in the help window and from there access online resources such as tutorial videos.
There is also a free webinar: introduction to SeeSAR 6.0
The generation of multiple conformations is an important step in a number of operations from input to ab initio calculations to providing input files for docking studies. A recent paper compared seven freely available conformer ensemble generators: Balloon (two different algorithms), the RDKit standard conformer ensemble generator, the Experimental-Torsion basic Knowledge Distance Geometry (ETKDG) algorithm, Confab, Frog2 and Multiconf-DOCK DOI, and also provided a dataset of ligand conformations taken from the PDB.
A recent twitter discussion involving Greg Landrum and David Koes prompted Greg to publish a blog post describing conformation generation within RDKit. The post compares using distance geometry to select diverse conformations versus an approach that combines the distance geometry approach with experimental torsion-angle preferences obtained from small-molecule crystallographic data (ETKDG). He also looks at the impact of force-field minimisation.
A really interesting read with code provided.
This is a new release of DOCK with updated scoring functions, including the new pharmacophore matching similarity score and a completely revamped Descriptor Score that allows different combinations of various scoring functions to be used simultaneously. From the readme:
NEW IN DOCK 6.8
New Scoring Functions: Two new scoring functions were added: pharmacophore score and descriptor score [see the manual].
Pharmacophore: Calculates the pharmacophore overlap between a candidate and a reference molecule. In addition, the python wrapper, mol2bild.py, can be employed to visualize the pharmacophores.
Descriptor Score: Descriptor Score was completely overhauled. Presently, descriptor score is now a wrapper that allows multiple scoring functions to be employed simultaneously. Please refer to the manual for the complete list of scoring functions supported by descriptor score.
Four similarity-based scoring functions were added as part of descriptor score: pharmacophore score, Tanimoto score, Hungarian matching similarity score, and volume overlap score. Hungarian matching similarity score, Tanimoto score and volume overlap score, new to DOCK, can only be called using descriptor score.
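As a reminder of what a Tanimoto score measures, here is a minimal sketch using sets of fingerprint "on" bits. This is the textbook definition (shared bits divided by total distinct bits), not DOCK's implementation. The fingerprints and candidate names below are made up for illustration, and how DOCK builds its fingerprints is described in its manual rather than here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two collections of fingerprint on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


# Hypothetical fingerprints: a reference ligand and three candidates.
reference = {1, 4, 7, 9, 15}
candidates = {
    "cand_1": {1, 4, 7, 9, 15},  # identical to the reference
    "cand_2": {1, 4, 8, 20},     # shares two bits with the reference
    "cand_3": {2, 3},            # shares nothing
}

# Rank candidates from most to least similar to the reference.
ranked = sorted(
    candidates,
    key=lambda name: tanimoto(reference, candidates[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(tanimoto(reference, candidates[name]), 3))
```

A score of 1.0 means the two fingerprints are identical and 0.0 means they share no bits, which is why it is convenient for ranking candidates against a reference molecule.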
Internal Energy Scoring Function Optimization of the internal energy scoring function code and the addition of two parameters:
a) The code has been optimized for calculating the repulsive VDW term. "internal_energy_rep_exp", when set to its default value of 12, results in a significant speedup in certain cases. Values other than 12 are computed as in previous versions of DOCK.
b) The term "internal_energy_cutoff" has been added such that all conformers with an internal energy greater than the cutoff are pruned.
c) The addition of the term "pruning_conformer_score_scaling_factor", a divisor of the pruning_conformer_score_cutoff, ensures that pruning becomes more stringent as flexibly-grown molecules proceed layer-by-layer.
Miscellaneous DOCK now supports builds using Intel compilers with either MPICH2 or Intel MPI parallelism.
A new version of SeeSAR is now available for download.
Version 5.5 includes several new features and has undergone some tweaks under the hood to improve speed.
From the release notes:-
2D browsing featuring in-view molecule properties
To further enhance the 2D browsing, we have added an illustration of the molecules' key properties in the form of a radar plot. A thumbnail of the plot is embedded in each of the 2D molecule pictures, providing a quick overview. It enlarges upon mouse-over and provides access to the configuration dialog. Add or remove property-axes, optionally fine-tune the scales and set 'desired' value ranges. A hit or miss of the latter is indicated by green or red dots on the corners of the color-coded characteristic shape of the molecule on the plot (the greener the better).
Detecting novel/unoccupied binding sites
Now SeeSAR can search your protein for unoccupied pockets based on the world-renowned DoGSite-Algorithm. You may then select these to become the binding site, within which to generate poses and calculate binding affinities for your molecules. The new binding site definition feature lets you either use a selected molecule from the table (based on a 6.5Å shell around it, as before) or will detect and visualize empty pockets for you to select instead.
Multiple reference molecules
The reference molecule in SeeSAR always stays in view even when you select other entries from the molecule tables. Now, however, you are able to set - and keep in view - as many reference molecules as you like. Either set them individually - in the selected molecule menu (as before) - or mark several as favorites and set them all as references at once, via the new menu button below the table.
Multiple core replacements with just one click
With the new multiple solutions button for ReCore in the molecule editor, brainstorming new scaffold ideas has become even easier. You can now generate 10 new alternative core replacements at once. The new molecules are saved directly to the table so that you can immediately see their estimated binding affinity and view all structures in 2D at a glance.
MolSoft have announced the release of ICM version 3.8-5.
- Generate a 2D Interaction Diagram of a ligand with the binding pocket. The image is annotated with hydrogen bonds and interacting residues.
- 3D ligand editor is a powerful tool for the interactive design of new lead compounds in 3D
- Support for the Macromolecular Transmission Format (MMTF)
- Support for Mac retina display
- Add docking restraints by selecting atoms in the receptor
- Updates to protein modelling, bioinformatics and cheminformatics
An overview of the BioExcel Project.
BioExcel is based on improving three aspects of biomolecular research. Firstly, improving the performance and scalability of the most commonly used software, such as GROMACS (www.gromacs.org), HADDOCK (www.haddocking.org) and CPMD (www.cpmd.org), to take advantage of next-gen HPC systems and the expected increase in the amount of data produced. It’s also important to improve how easy it is for users to access and use these types of software. Not all researchers have experience in efficiently handling data and software. BioExcel aims to provide customisable workflow environments, which will allow relatively novice HPC/HTC users to take advantage of the analysis software provided in ways that suit their specific research. In addition to this, hands-on training and public webinars are already underway, aiming to teach researchers best practices and how to best utilise the software and resources available.
OpenEye have announced the release of OEDocking v3.2. This is an upgrade that adds new features and fixes several bugs.
- POSIT is now integrated with the OEDocking suite. A POSIT license is still required to run POSIT.
- POSIT's clash detection algorithm has been enhanced. It now ignores clashes with the protein on a per atom basis if the crystallographic ligand makes these same clashes.
- OpenMPI version 1.6 is now supported on all platforms. The -mpinp and -mpihostfile flags are now used to run FRED, HYBRID and POSIT in MPI mode. These new flags replace the oempirun script.
- PDB2RECEPTOR now incorporates all of MAKEPOSERECEPTOR functionality. MAKEPOSERECEPTOR is no longer supported.
- PDB2RECEPTOR now supports the identification and selection of a desired ligand in a protein-ligand complex.
- MAKERECEPTOR, PDB2RECEPTOR, APOPDB2RECEPTOR, DOCKINGREPORT, RECEPTORSETUP and RECEPTORTOOLBOX now all accept either a POSIT or FRED2 license.
- Support for Mac OS X 10.9 and 10.10 was added.
- Mac OS X 10.6 and 10.7 are no longer supported.
Chemical Computing Group have just released an update to MOE. Version 2015.10 includes:
- Generate docked poses using FFT followed by all atom minimization
- Define receptor and ligand sites to focus docking
- Automatically detect antibody CDR sites
Integrated Alignment, Consensus and Superposition in the Sequence Editor
- Manipulate multimeric protein sequences using split side-by-side Sequence Editor panes
- Use dendrograms to visualize pairwise similarity, identity and RMSD relationships
- Select residues based on plotted values using resizable sequence editor plots
Distributed Pharmacophore Searching
- Run pharmacophore searches on a cluster directly from MOE GUI
- Perform fast corporate database searches
- Access multiple databases stored on a central server
Covalent Docking and Electron Density Docking
- Use reaction-based organic transformations to perform covalent docking
- Minimize ligand strain energy while maximizing ligand fit to electron density
- Run docking through an enhanced streamlined scenario-based interface
Extended Hückel Descriptors and pKa Model
- Compute molecular properties such as logP, logS and molar refractivity
- Determine populations of ligand protonation states at a given pH
- Calculate the pKa and pKb of small molecules
13C NMR Analysis
- Apply QM conformation refinement to calculate 13C NMR shielding
- Convert computed shieldings and predict 13C NMR chemical shifts
- Compare computed chemical shifts to experimental shifts for structure determination
I'll write a review in the New Year.
OpenEye have announced the release of OpenEye Toolkits v2015.October. These libraries include the usual support for C++, Python, C# and Java.
- FastROCS TK was added to the OpenEye toolkits collection
- Molecule reading performance improvement in OEChem TK
- The capabilities of the OEBio-Fragment Network have been expanded
- 213 new ring templates have been added to the OEChem TK built-in ring dictionary
In particular note the 2015.Oct release is the last to support Mac OSX 10.8 so time to upgrade if you have not already done so.
Grand Challenge 2015: Prediction of ligand poses, and affinity rankings, for the protein targets HSP90 and MAP4K4. Stage 1 predictions are due November 16, 2015; Stage 2 predictions are due February 1, 2016. For details, and to sign up and participate, please see https://drugdesigndata.org/about/grand-challenge-2015
SAMPL5: Prediction of aqueous host-guest binding free energies and, optionally, enthalpies for three host-guest series. A series of aqueous-organic partition coefficients may also be added in the next several weeks. Predictions are due February 1, 2016. For details, and to participate, please see https://drugdesigndata.org/about/sampl5.
These challenges are organized by the Drug Design Data Resource, which is based at UC San Diego and supported by a grant (U01GM111528) from the NIH's National Institute of General Medical Sciences. They are made possible by generous donations of data, pre-publication, from AbbVie, Genentech, the CSAR initiative at U. Michigan, and Professors Lyle Isaacs (U. Maryland) and Bruce Gibb (Tulane U.).
BioSolveIT has just announced the release of SeeSAR 3.0.
This update of SeeSAR qualifies as major release 3, since it covers two milestones in its development. So far every SeeSAR session has started from scratch. The only way to retain molecules was to save them to file and re-load them again in a subsequent session. Needless to say that loading meant recalculating all Hyde-scores again...
Project files Starting with Version 3.0, SeeSAR allows you to store all session data in a project file. This includes the protein, ligands loaded from file and new (edited) ligands. Resuming your work on a project is now as easy as double-clicking on the project-file. As a result, everything just got a hell of a lot faster! Whilst calculating Hyde-scores for say 1000 compounds took around half an hour (depending on your hardware), loading the same information from a project file now takes only a few seconds. Note that you can also generate a project file on the command line, allowing you to outsource the calculation of Hyde-scores to a different machine. This enhancement is also a great way to exchange data and ideas with a colleague! Simply store your SeeSAR session as a project file in a commonly accessible location (e.g. a network drive). Your colleague can take a look with just a double-click.
Hyde update Hyde is quite sensitive with regards to the precise geometry of a binding pose. Even the tiniest difference in a pose can distort an anyway stretched hydrogen bond just so much that it is not recognized anymore - thereby leaving you with a huge desolvation penalty for such atoms, without the gain from the h-bond. This "sharpness" of Hyde is its greatest strength (for example by highlighting real activity cliffs), but also its greatest weakness (especially if the structure has flaws or is of low resolution). In order to minimize such troubles, we optimize each pose before the Hyde affinity assessment. We improved this optimization significantly. It is now fully flexible and with sharper clash criteria, making it suitable for docked poses as well as edited compounds. All of this as efficient as before, just perfect for interactive use.
There is a review of an earlier version of SeeSAR here
MOE2014.0901 Update is now available. MOE is a fully integrated molecular modelling and drug discovery software package.
MOE 2014.0901 updates:
- Option for AMBER residue name
- Append/prepend multiple residue sequence specified by single-letter names
Builder:
- Added H’s inherit color if there is a consistent coloring in the residue
sddesc: New -smi:p option causes field headers to be written to the output ASCII file
- MOE_SVL_RUNPATH now properly honored
- Combinatorial Builder now honors different attachment point locations on the same R-group
- Database Save As one entry per file mode now properly generates unique filenames
- Dock Template Forcing batch file now correctly generated
- Saved views in .moe files now properly restored
- Auto-save when Database Viewer display attributes are changed can now be disabled to prevent changes to the database file modification date when only the display is changed and not the database content
- SVL function Deprotonate now works properly
- Various MOE Project and Project Database Update bugs
- Various minor bug fixes
There are reviews of MOE available here
Moe:- Molecular modeling
Moe Update (Jan 2009):- Molecular modeling
Review of MOE (2009.10 release):- Molecular modeling
Moe Update (December 2010.10 release):- Molecular modeling
Moe Update (December 2011 release):- Molecular modeling
Moe Update (December 2012 release):- Molecular modeling
SeeSAR, an interactive tool from BioSolveIT for designing/improving ligands for drug discovery, has recently been updated. Looking through the updates, it is clear they have been very receptive to user feedback.
- Solution filter. Finding interesting solutions in larger sets of compounds has become much easier in SeeSAR. Compounds can now be filtered based on any available property, allowing you to easily trim down the compound set to the most interesting subset. As before, you can browse through and sort the remaining table entries to further refine your selection. Properties can be those generated by SeeSAR (such as the Hyde affinity assessment, the torsional strain, TPSA, logP, ...) or alternatively properties loaded from an SD file.
- Joined poses. You can now join the ligands found in the protein structure, compounds loaded from file and compounds newly generated within the SeeSAR editor into one “super” table, with quick-links back to the previous view of only those from a certain origin. This allows you to see in one table the bound ligand as a reference, the project compounds and your last round of designs. You can then select your favorites from the entire table and export these (for example for an upcoming team meeting).
- Defining the protein. SeeSAR decomposes the contents of a PDB file into chains, small molecules, waters and ions. Until now, users had to accept SeeSAR's default assignments, which is fine in the majority of cases. However, there is no rule without exception, e.g., the peptide inhibitor which is mistaken as a short chain, the small molecule which is actually a co-factor, or the solvent molecule that should be ignored. With this update, SeeSAR allows you to change these default assignments to better handle these exceptional cases, allowing you to categorize a short chain as a bound ligand, or re-assign a co-factor as a permanent part of the protein. You can also eliminate protein elements altogether.
- SMILES, PDB and MOL file support. SeeSAR now comes with additional molecule readers that broaden the scope of the application. Aside from standard 3D molecule file formats (SDF and mol2), SeeSAR now supports 1D and 2D file formats as well as reading small molecules from PDB format. If no 3D coordinates are given, SeeSAR will calculate a clash-free, low energy conformation on the fly: with the SeeSAR positioning function you can then place such input molecules in the active site of interest. Amongst other things, this feature facilitates the importing of molecules straight from your favourite chemical drawing program and assessing such structures in the context of your protein of interest.
DOCK 6 is written in C++ and is functionally separated into independent components, allowing a high degree of program flexibility. Accessory programs are written in C and Fortran 77. Source code for all programs is provided. Read the FAQ for details of installation under MacOSX.
Allen, W. J.; Balius, T. E.; Mukherjee, S.; Brozell, S. R.; Moustakas, D. T.; Lang, P. T.; Case, D. A.; Kuntz, I. D.; Rizzo, R. C. DOCK 6: Impact of New Features and Current Docking Performance. J. Comput. Chem. Submitted.
OpenEye have announced the release of POSIT v3.1, the component of the OEDocking suite devoted to pose prediction.
This update includes:
- The HYBRID and FRED algorithms have been incorporated into POSIT; the appropriate method is determined by analyzing the ligand to pose against the input receptors.
- Multiprocessing has been enabled through the use of MPI, to speed calculations.
- POSIT now supports a list of receptor files or .lst file as input. This overcomes command-line limitations for the number of receptors that can be used simultaneously.
- Added a MEDIOCRE result rating for results between 33% and 50% probability.
- Command line parameters have been simplified and updated to be compatible with the OEDocking Suite of tools.
POSIT is designed for the posing problem in lead optimization, i.e. how best to leverage project information from previous protein-ligand structures to predict the pose of a new ligand. It does this by assessing the similarity of the new ligand to known bound structures. Performance degrades as similarity decreases and so at some point it is worth searching more exhaustively.
SeeSAR 1.5 has been released. SeeSAR is intended as an interactive tool for designing/improving ligands for drug discovery.
The latest release covers two major topics: 1. A series of features that make the editing more swift and easier. To this end they introduced hot-keys, context menus and drawing a bond by drag&drop. 2. Often times people use SeeSAR for visual inspection e.g. after docking. Now normally you'll have multiple poses per compound. For a better overview the Table now allows you to collapse all poses to just one line per compound.
Furthermore you can set a bookmark to indicate what you like and export only the ones on the wish list.
There is a review of SeeSAR here.
I just got this email
Thank you for your collaboration in helping us to test the beta version of the FORECASTER Suite 2014. From your feedback and bug reports, we have now released the final version of the Suite. The files were updated and posted on the download page. Please send us any bugs that you might have not yet reported.
The FITTED docking tool was initially developed as a suite of three programs: SMART (used to prepare the small molecules for docking), PROCESS (used to prepare the protein files for docking) and the docking program FITTED. More recently, these three programs together with several others have been integrated into a single package, namely FORECASTER.
More information can be found here http://fitted.ca/download.html#forecaster
DOT is a software package for docking macromolecules, including proteins, DNA, and RNA. DOT performs a systematic, rigid-body search of one molecule translated and rotated about a second molecule. The intermolecular energies for all configurations generated by this search are calculated as the sum of electrostatic and van der Waals energies. These energy terms are evaluated as correlation functions, which are computed efficiently with Fast Fourier Transforms. In a typical run, energies for about 108 billion configurations of two molecules can be calculated in a few hours on a few desktop workstations working in parallel.
Roberts, Victoria A. and Thompson, Elaine E. and Pique, Michael E. and Perez, Martin S. and Ten Eyck, L. F., (2013) "DOT2: Macromolecular docking with improved biophysical models" Journal of Computational Chemistry, Volume 34, Issue 20, pages 1743-1758, 30 July 2013 DOI:
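To illustrate why evaluating energies as correlation functions with Fast Fourier Transforms is efficient, here is a minimal one-dimensional sketch. It is a toy under stated assumptions, not DOT's method: DOT works on 3D grids with rotations and both electrostatic and van der Waals terms, whereas the "potential" and "charge" arrays below are invented and only circular translations along a single axis are scanned. The FFT result is checked against the direct sum.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlate_fft(a, b):
    """Circular correlation c[k] = sum_i a[i] * b[(i + k) % n], via FFT."""
    n = len(a)
    if n == 0 or n != len(b) or n & (n - 1):
        raise ValueError("inputs must share a power-of-two length")
    fa = fft([complex(x) for x in a])
    fb = fft([complex(x) for x in b])
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    return [v.real / n for v in fft(product, inverse=True)]


def correlate_direct(a, b):
    """Same correlation by explicit summation, O(n^2), used as a check."""
    n = len(a)
    return [sum(a[i] * b[(i + k) % n] for i in range(n)) for k in range(n)]


# Hypothetical 1D grids: a receptor potential and the charges of a rigid probe.
potential = [0.0, -1.0, -2.0, -1.0, 0.0, 0.5, 0.0, 0.0]
charges = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Energy of the probe at every translation, computed in one pass.
energies = correlate_fft(charges, potential)
best_shift = min(range(len(energies)), key=energies.__getitem__)
print("best shift:", best_shift, "energy:", round(energies[best_shift], 6))
```

The direct sum costs on the order of n² operations for n grid points, while the FFT route costs on the order of n log n, which is what makes an exhaustive translational scan affordable on large grids.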
Now that I have my new MacPro I thought it might be interesting to try out a couple of the software packages that I’ve previously reviewed. ForgeV10 allows the scientist to use Cresset’s proprietary electrostatic and physicochemical fields to align, score and compare diverse molecules. It allows the user to build field based pharmacophores to understand structure activity and then use the template to undertake a virtual screen to identify novel scaffolds. I’ve previously reviewed ForgeV10 and FieldAlign, as it was formerly known, so I’m going to focus on the support for multiple processors and a few of the new features.
There is a compilation of software reviews here
A recent publication “High Performance in silico Virtual Drug Screening on Many-Core Processors” DOI describes porting BUDE (Bristol University Docking Engine) to OpenCL.
Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single NVIDIA GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from NVIDIA and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets.
BUDE is now one of the fastest HPC applications ever developed and nicely demonstrates the portability of OpenCL across different architectures.
There is a list of GPU accelerated applications here.
I was at the Cresset UGM last week and had a chance to hear more about BlazeGPU. The original CPU application Blaze uses the shape and electrostatic character of known ligands to rapidly search large chemical collections for molecules with similar properties. The latest version BlazeGPU runs at 40 times the speed of the CPU version of Blaze but loses nothing in accuracy. At a fraction of the hardware cost, BlazeGPU delivers the same effective, ligand based virtual screening as Blaze, based on the shape and electrostatic nature of molecules.
BlazeGPU is written in OpenCL and OpenCL libraries are available from NVidia and AMD for their graphics cards, but also from Intel for the CPU and for their new Xeon Phi coprocessor cards. BlazeGPU is currently designed only to run on the GPU - for CPU-only clusters the original code is just as fast, and on a machine with a reasonably fast GPU or two the CPU tends to run flat out just feeding data to the graphics card, so there's not that much gain running on the CPU as well as the GPU.
Currently the conformer generation still runs on the CPU, but they are looking at the possibility of porting that to OpenCL as well in the future.
The relative performance is shown in the plot below. It is worth noting that these are relatively inexpensive graphics cards that you can pick up on Amazon or eBay for a few hundred pounds. Also note that on a $2.10/hour GPU instance on Amazon EC2 you can process 2m conformations.
There are more examples of GPU science here.
OpenEye has announced the release of OEDocking v3.0.1. This is a bug fix release to the FRED, HYBRID, and POSIT programs. Of note, the report generated by both FRED and HYBRID has been significantly improved with this release.
- The program dockreport has been renamed to DOCKING_REPORT
NEW FEATURES AND IMPROVEMENTS
- The formatting of the DOCKING_REPORT has been significantly improved and now includes:
- Added a protein interaction fingerprint
- Polar Surface Area (PSA)
- Improved the geometry detection for hydrogen bond protein constraints in FRED and HYBRID. These constraints should now be tighter.
- Stereo isomer detection in POSIT was not handling bridgeheads properly, which caused some non-stereo molecules to be identified as stereo isomers.
- Fixed a bug in FRED and HYBRID where clash detection between hydrogen bonding groups was occasionally too strict.
DOCK is a suite of programs for molecular docking. In version 6.6 two new scoring functions are available: Grid-based footprint scoring and SASA-based scoring.
The MultiGrid Footprint Score calculates the pair-wise interaction energies over multiple grids. Important receptor residues are initially identified with a reference ligand, and individual grids are generated to model such residues.
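To make the footprint idea a little more concrete, here is a minimal sketch of comparing two per-residue interaction energy "footprints". It assumes the comparison is a plain Euclidean distance over the residues picked out by the reference ligand; that metric, the function name, the residue labels and the energy values are all my own illustrative assumptions, not DOCK's actual implementation or output.

```python
import math


def footprint_distance(reference: dict, candidate: dict) -> float:
    """Euclidean distance between two per-residue energy footprints.

    Each footprint maps a residue label to an interaction energy.
    Only residues present in the reference footprint are compared;
    a residue missing from the candidate is treated as 0.0 (assumption).
    """
    residues = sorted(reference)
    return math.sqrt(
        sum((reference[r] - candidate.get(r, 0.0)) ** 2 for r in residues)
    )


# Made-up energies for three hypothetical key residues.
reference_footprint = {"ASP86": -4.0, "LYS33": -2.5, "PHE80": -1.0}
candidate_footprint = {"ASP86": -3.0, "LYS33": -2.5, "PHE80": -3.0}

print(footprint_distance(reference_footprint, candidate_footprint))
```

A smaller distance means the candidate pose reproduces the reference ligand's interaction pattern more closely.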
The SASA score calculates the percent exposure of a ligand, and the percentage of the hydrophobic portion of a ligand and the receptor that are buried in the pocket.
In addition, a symmetry corrected RMSD (Hungarian matching) method was added to facilitate pose reproduction studies.
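For anyone wondering what a symmetry corrected RMSD looks like in practice, the sketch below matches atoms between two poses with the Hungarian algorithm before computing the RMSD, so that swapping equivalent atoms is not penalised. It is a simplified illustration and not DOCK's code: atoms are matched by element only (an assumption on my part), bonding is ignored, and both poses are assumed to be in the same coordinate frame.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment for a square cost matrix; returns the column chosen for each row."""
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (n + 1)   # column potentials
    p = [0] * (n + 1)     # p[j] = row currently matched to column j (1-based)
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def symmetry_corrected_rmsd(ref, pose):
    """ref and pose are lists of (element, (x, y, z)) for the same molecule."""
    if sorted(e for e, _ in ref) != sorted(e for e, _ in pose):
        raise ValueError("both poses must contain the same atoms")
    total = 0.0
    for element in {e for e, _ in ref}:
        a = [xyz for e, xyz in ref if e == element]
        b = [xyz for e, xyz in pose if e == element]
        # Squared distance between every pair of same-element atoms.
        cost = [[sum((s - t) ** 2 for s, t in zip(x, y)) for y in b] for x in a]
        total += sum(cost[i][j] for i, j in enumerate(hungarian(cost)))
    return math.sqrt(total / len(ref))


# Toy example: the two equivalent oxygens are listed in swapped order.
ref = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0)), ("O", (-1.0, 0.0, 0.0))]
pose = [("C", (0.0, 0.0, 0.0)), ("O", (-1.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0))]
print(symmetry_corrected_rmsd(ref, pose))
```

In this toy case an index-by-index RMSD would be about 1.63 even though the two poses are identical, whereas the matched RMSD is zero. Because the matching ignores connectivity, the value is best read as a lower bound on the true symmetry corrected RMSD.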
Full information on what is new in DOCK 6.6
CLC bio is pleased to announce a new release of Molegro Virtual Docker, an integrated platform for computational drug design available for Windows, Linux, and Mac OS X. Molegro Virtual Docker offers high-quality protein-ligand docking based on novel optimization techniques combined with a user interface experience focusing on usability and productivity.
New features in version 5.5:
A new 'Energy Maps' tool provides volumetric visualization of protein force fields. This makes it possible to understand why a compound interacts with a given receptor, and may provide insights on how to improve the binding.
We also added a new execution mode in the Docking Wizard: 'Run Docking in Multiple Processes'. This makes it possible to run medium sized jobs on a local machine, while utilizing multiple CPU cores and even multiple GPU graphics cards. For large jobs on multiple machines, Molegro Virtual Grid should still be used.
The ray-tracer has been improved to more closely match the live 3D view output. This makes it possible to create high resolution renderings of the 3D view.
Cresset have announced the formal release of sparkV10, the replacement for FieldStere.
- Updated molecular mechanics force field that uses a single analogue nitrogen atom and updates the field patterns for many functional groups including aromatic halides
- Added capability to read protein excluded volumes from pdb files
- Added new cluster algorithms for clustering of results
- Added option to edit reference molecules in the molecular editor
- Added capability to manage columns in the results table
- New optional module for scoring results using StarDrop models, this does not require access to a StarDrop server, simply place StarDrop model files in a directory and they automatically get used if you have the right license. The standard ADMET models that Optibrium have created are supplied but it works equally well with any models created by StarDrop.
- Added fragment import option in database generator
- Added capability to rescore all results against a 3D QSAR model using Forge or Torch
- Added capability to search databases for a particular fragment or substructure
- Added option to delete entire clusters from results
- Added depth cue to 3D window
- Added a GUI interface for selecting a portion of a molecule and writing command line arguments
- Cleaner GUI with improved buttons
Users should note:-
SparkV10 completely replaces Cresset’s previous “FieldStere” application. If FieldStere is currently installed then it is recommended to uninstall the binary to avoid confusion over which application should be used to open FieldStere project files.
This is a review of ForgeV10, the latest offering from Cresset. Whilst it is a new product, those familiar with FieldAlign and FieldTemplater will recognise much of the functionality. ForgeV10 allows the scientist to use Cresset’s proprietary electrostatic and physicochemical fields to align, score and compare diverse molecules. It allows the user to build field based pharmacophores to understand structure activity and then use the template to undertake a virtual screen to identify novel scaffolds.
There is a compilation of software reviews here.
SZMAP uses semi-continuum Poisson-Boltzmann electrostatics to map variations in solvent properties in a protein binding site. It identifies key waters, shows their interactions, compares them to the corresponding ligand atoms, and determines whether neighboring waters aid or hinder binding. The newly released tool GAMEPLAN suggests ways to modify ligand chemistry based on this understanding of water structure in the immediate environment of the ligand.
- The Water Orientation VIDA Extension has been completely rewritten to be easier to use and more feature-rich, making it simple to find key waters and understand their interactions. Each water site can be labeled by its energy, van der Waals energy, and degree of order. The 3D representation shows whether a site is disordered, an acceptor, a donor, or both. Individual waters can be exported for use elsewhere. The other extensions have also been improved.
- A new command-line program called GAMEPLAN has been released. GAMEPLAN runs several quick SZMAP calculations and analyzes the results to examine how the existing ligand chemistry aligns with the pocket environment. It also produces hypotheses of ligand modifications to improve its affinity, based on the energetics of the water environment directly adjacent to the ligand.
- SZMAP output has been simplified: sections are clearly identified, the water orientation data is less obtrusive, and an updated set of grids is produced (neutral difference free energy, van der Waals, order, and mask). The Watercolor VIDA Extension now sets contour levels to emphasize significant results.
- The speed of SZMAP stabilization calculations for both grids and arbitrary coordinates has been increased. Results from an existing apo protein calculation can be re-used, speeding up calculations for a series of compounds and/or poses in a single binding site. The speed of stabilization calculations is improved by avoiding extra calculations on the isolated ligand.
- It is now easy to produce SZMAP results for just the region in the apo pocket where water has been displaced by the bound ligand, clarifying the analysis of water in the apo protein.
- The programs SZMAP and GAMEPLAN will check to make sure input files contain partial charges and explicit hydrogens to avoid wasting time on meaningless calculations when the input is incorrect.
- Protein preparation is easier because PCH (which adds partial charges to molecules and separates protein from ligand) now provides more control over the process and can work around structures that contain unsupported elements. PCH can now split out waters into a separate file.
POSIT - Ligand guided pose prediction
POSIT is designed to use bound ligand information to improve pose prediction. Using a combination of OpenEye approaches, including structure generation, shape alignment and flexible fitting, it produces a predicted pose whose accuracy depends on similarity measures to known ligand poses. As such, it produces a reliability estimate for each predicted pose.
The optimizer has been enhanced to produce better aligned structures in certain cases.
A memory leak in the optimizer was fixed, so POSIT should now properly handle large streams of molecules. The -mcs flag is now turned off by default. In some cases, the mcs was taking far too long for no real benefit in pose prediction.
FITTED is a suite of programs to dock flexible ligands into flexible proteins. This software relies on a genetic algorithm to account for flexibility of the two molecules and location of water molecules, and on a novel application of a switching function to retain or displace water molecules and to form potential covalent bonds (covalent docking) with the protein side-chains.
The Suite includes many new features and implementations:
- FITTED is a suite of programs (FITTED, PREPARE, ProCESS and SMART)
- JAVA GUI for easy keyword file editing and docking
- Fully automated and flexible protein docking program
- Automated covalent docking
- Automatic protein preparation from pdb to mol2
- Multi-mol2 support for docking and ligand processing
- Uses an evolutionary algorithm
- Semi-flexible protein docking with flexible waters
- Has the ability to consider water molecules displaceable
- Keyword files are simpler than ever
- Support for Windows, Linux 32 and 64 bits, Mac OSX
OpenEye has announced the release of OEDocking v3.0.0. OEDocking is a suite of well-validated molecular docking applications (FRED, HYBRID, POSIT) and their associated workflows. This release features the official introduction of HYBRID, as well as a major upgrade to FRED.
- POSIT - Ligand guided pose prediction
- FRED - Fast exhaustive docking
- HYBRID - Ligand guided docking
I was reading the announcements of new products from OpenEye and I thought I should update the listings.
AFITT from OpenEye is the only software to offer a fully automatic ligand fitting process that optimizes a real-space fit to density while keeping conformational strain to a minimum. It capitalizes on a combination of core technologies that OpenEye has developed, specifically conformer generation, shape potential, high quality small molecule structure minimization, and visualization. The key step, after finding the appropriate conformers and aligning them to density, is the implementation of a refinement that combines force field and shape potentials, via a series of adiabatic optimizations. The AFITT distribution includes both a GUI and a collection of command-line applications.
BROOD is a software application designed to help project teams in drug discovery explore chemical and property space around their hit or lead molecule. BROOD generates analogs of the lead by replacing selected fragments in the molecule with fragments that have similar shape and electrostatics, yet with selectively modified molecular properties. BROOD fragment searching has multiple applications, including lead-hopping, side-chain enumeration, patent breaking, fragment merging, property manipulation, and patent protection by SAR expansion.
FILTER is a very fast molecular filtering and selection application. It uses a combination of physical property calculations and functional group knowledge to remove undesirable compounds before they enter experimental or virtual screening. Undesirable properties may include: toxic functionalities, a high likelihood of binding covalently with the target protein, interfering with the experimental assay, and/or a low probability of oral bioavailability.
QUACPAC provides pKa and tautomer enumeration in order to get correct protonation states. It also offers multiple partial charge models (including MMFF94, AM1-BCC, and AMBER) that cover a range of speed and quality in order to allow appropriate charging for every end use. QUACPAC's approach to tautomeric enumeration is to provide multiple tautomeric states rather than one "correct" tautomer. Subsequent downstream processes are then used to identify the appropriate tautomeric form.
SZYBKI optimizes molecular structures with the Merck Molecular Force Field, either with or without solvent effect, to yield quality 3D molecular structures for use as input to other programs. Since the chemistry of molecular interactions is a matter of shape and electrostatics, it is impossible to consider either without reasonable 3D molecular structures. SZYBKI also refines portions of a protein structure and optimizes ligands within a protein active site, making it useful in conjunction with docking programs.
I just heard about a platform - FORECASTER - that includes programs for drug discovery and process chemistry. These include:
- FITTED, a docking program
- PREPARE, PROCESS and SMART, programs that can prepare protein and ligand files automatically
- CONVERT, a program that converts 2D molecules to energy-minimized 3D molecules (adds hydrogens, generates tautomers and protomers)
- SELECT, a program that computes compound similarity, extracts focused highly diverse libraries or identifies analogues
- REDUCE, a program that filters using descriptors and functional groups
- REACT, a program that performs combinatorial chemistry in silico from user-defined chemical schemes
- IMPACTS, a site-of-metabolism prediction program (CYP 450)
- ACE, a program that predicts the stereochemical outcome of reactions
All the programs are integrated into a new web-based graphical interface that allows complete automation of the different workflows.
You can read more details here, Integrating Medicinal Chemistry, Organic/Combinatorial Chemistry, and Computational Chemistry for the Discovery of Selective Estrogen Receptor Modulators with Forecaster, a Novel Platform for Drug Discovery
CCG have announced the release of MOE 2011.10. This includes a new license manager compatible with Lion.
Some of the new and enhanced features in MOE include:
Non-Bonded Interaction Visualization Model
- Visualize halogen bonds, H-bonds, CH-X, proton-π for interactive modeling
- Calculate strengths using Extended Hückel Model
- Display strengths and interactions in 2D Ligand Interaction Diagrams
Sequence Editor Redesign
- Wrapped view, zoom, chain name/tag, etc.
- Synchronized coloring (% identity, similarity, Clustal X, RMSD)
- Cut and paste for loop grafting, inserting linkers, filling gaps, etc.
Combinatorial Build in Pocket
- Add R-groups to one or more attachment points in 3D pocket
- Apply 2D and 3D filters, refine in (flexible) pocket and score
- Use Builder to scan fragments for interactive ligand optimization
Analysis of Solvent in Binding
- Calculate within minutes a solvent binding free energy map using 3D-RISM
- Calculate water, salt and hydrophobe solvation densities in complex or apo receptor
- Diagnose how well alternate groups take advantage of water upon binding
Macromolecular System Preparation
- Correct common problems in protein structures automatically
- Browse alternate conformations, cap termini, build missing loops
- Optimize hydrogen bond network by flipping residues and adjusting states
GPCR Family Database and Alignment Tools
- Identify and annotate transmembrane regions of GPCRs
- Add alignment constraints to improve GPCR sequence alignments
- Augment a database of GPCR crystal structures with in-house data
Molegro is pleased to announce a new major release of Molegro Virtual Docker, an integrated platform for computational drug design available for Windows, Linux, and Mac OS X. Molegro Virtual Docker offers high-quality protein-ligand docking based on novel optimization techniques combined with a user interface experience focusing on usability and productivity.
Major new features in version 5.0:
- GPU-accelerated docking on CUDA supported hardware, making it possible to screen drug-like compounds up to 30 times faster than using conventional CPU-based methods. The GPU implementation builds upon and extends the research described in the paper "GPU-Accelerated High-Accuracy Molecular Docking using Guided Differential Evolution" (http://dl.acm.org/citation.cfm?id=2001576.2001818).
- The new 2D Ligand Map provides an easy way to inspect and visualize protein-ligand interactions.
For more information, or to download a trial version, please visit our company website at: http://www.molegro.com
Molegro Virtual Grid creates an infrastructure for distributing docking runs on multiple machines. By simply installing the MVG agent on a computer, its resources can be used transparently by the grid controller. Virtual Grid support is built into Molegro Virtual Docker: for instance, to dock a library of compounds against a receptor, simply set up a compound data source, and select 'start job on Virtual Grid' in the Docking Wizard. Molegro Virtual Grid is multi-core aware and can be installed on any platform: Linux, Windows, and Mac. The machines in the grid do not need to run the same operating system. Now added to the alphabetical listing.
The performance gains were very impressive. What was equally striking was the efficiency gain as measured by electricity usage: it looks like several thousand pounds will be saved for every million-compound docking run.
He also showed the portability of OpenCL code, allowing efficient use of both the GPU and CPU.
He has a report on “The GPU Computing Revolution” available online.
If you would like to learn more, Apple have an OpenCL section in the Developer library, Simon’s website is an invaluable resource, and there are a couple of recommended books (links to Amazon).