There is an interesting thread on the Open Source Antibiotics GitHub repository.
Jan Jensen is looking at docking and scoring molecules into Mur Ligase C (MurC). In preparation for looking at a genetic algorithm to score docked poses they have docked 260K ligands using Glide. All details are in a Google CoLab Notebook. https://github.com/opensourceantibiotics/murligase/issues/46
As with all the work on OSA everything is in the public domain.
If you want an opportunity to test out your docking algorithm, scoring function or binding affinity prediction tool this looks like a great opportunity.
The latest in the Open Source Tools for Chemistry workshops was a fantastic session using GNINA for docking.
GNINA 1.0 by David Koes. The use of docking to predict ligand binding to a receptor is now well established. This workshop covers docking and structure-based virtual screening, with an introduction to the theory followed by practical examples. The slides for the Molecular Docking with GNINA 1.0 workshop are here and the Jupyter Notebooks are here.
The videos for the first two workshops sponsored by LiverpoolChiroChem are now on the CICAG YouTube channel.
21 April Chemical Structure validation/standardisation (Greg Landrum). Possibly the most important step in model or database building is data curation; this workshop will deal with chemical structure validation and standardisation. https://youtu.be/eWTApNX8dJQ.
The next workshops are
27 May GNINA 1.0 (https://chemrxiv.org/articles/preprint/GNINA10MolecularDockingwithDeep_Learning/13578140) (David Koes). The use of docking to predict ligand binding to a receptor is now well established; this workshop will cover docking and structure-based virtual screening, with an introduction to the theory followed by practical examples.
24 June Advanced DataWarrior (http://www.openmolecules.org/datawarrior/) (Isabelle Giraud). The previous, very popular introductory workshop brought DataWarrior to a new, wider audience. This workshop will highlight advanced features, macros and other topics that were brought up by users, so feel free to submit requests.
Registration is free and you will be sent login details at a later date.
Here are two variations of a Jupyter Notebook to help with docking experiments. The first version runs locally and requires the user to install RDKit, OpenBabel, SMINA and py3Dmol. The second version can be run using Google CoLab, so all you require is a web browser.
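As a rough illustration of the kind of call such a notebook has to make, here is a minimal sketch that assembles a SMINA command line from Python. The file names and the helper function are placeholders of my own and are not taken from the notebooks. The command is only built and printed, not executed, so SMINA does not need to be installed to try it.

```python
import shlex


def build_smina_command(receptor, ligands, reference_ligand, output,
                        exhaustiveness=8, num_modes=9):
    """Return the argument list for a single SMINA docking run.

    The search box is placed around a reference ligand with --autobox_ligand.
    All file names passed in are hypothetical placeholders.
    """
    return [
        "smina",
        "--receptor", receptor,
        "--ligand", ligands,
        "--autobox_ligand", reference_ligand,
        "--out", output,
        "--exhaustiveness", str(exhaustiveness),
        "--num_modes", str(num_modes),
    ]


cmd = build_smina_command(
    "receptor.pdb", "ligands.sdf", "reference_ligand.sdf", "docked.sdf"
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

To actually run the docking you would pass `cmd` to `subprocess.run`, assuming the `smina` executable is on your path.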
The fpocket suite of programs is a very fast open source protein pocket detection algorithm based on Voronoi tessellation. The platform is suited for the scientific community willing to develop new scoring functions and extract pocket descriptors on a large scale level.
What's new compared to fpocket 2.0 (old sourceforge repo)
- is now able to consider explicit pockets when you want to calculate properties for a known binding site
- cli changed a bit
- pocket flexibility using temperature factors is better considered (less very flexible pockets on very solvent exposed areas)
- druggability score has been reoptimized vs original paper. Yields now slightly better results than the original implementation.
- compiler bug on newer compilers fixed
- can now read Gromacs XTC, netcdf and dcd trajectories
- can also read prmtop topologies
- if topology provided, interaction energy grids can be calculated for transient pockets and channels (experimental)
The GitHub page https://github.com/Discngine/fpocket contains detailed instructions for installation. This project is licensed under the MIT License
I recently wrote a review of Flare Version 2, a recent extension to the Cresset portfolio that introduces Electrostatic Complementarity (EC), i.e. a comparison of electrostatics on both the small molecule ligand and the target protein. In addition, Flare version 2 includes a new Python API that allows users to automate tasks by scripting and to integrate with other Python packages such as the RDKit cheminformatics toolkit, Python modules for graphing and statistics (NumPy, SciPy, Matplotlib), and Jupyter notebooks. It is this aspect of Flare that is the subject of this review.
Cresset provide a variety of software packages to support small molecule design, built on the foundation of their XED force field. When I first reviewed several Cresset products (FieldView, FieldAlign and Forge) the force field was only applicable to small molecules. However, it has been constantly developed and can now be applied to proteins.
Flare Version 2 is a recent extension to the portfolio with the introduction of Electrostatic Complementarity (EC), i.e. a comparison of electrostatics on both the small molecule ligand and the target protein DOI.
Electrostatic interactions between small molecules and their respective receptors are essential for molecular recognition and are also key contributors to the binding free energy. Assessing the electrostatic match of protein-ligand complexes therefore provides important insights into why ligands bind and what can be changed to improve binding. Ideally, ligand and protein electrostatic potentials at the protein-ligand interaction interface should maximize their complementarity while minimizing desolvation penalties.
In addition, Flare version 2 includes a new Python API that allows users to automate tasks by scripting and to integrate with other Python packages such as the RDKit cheminformatics toolkit, Python modules for graphing and statistics (NumPy, SciPy, Matplotlib), and Jupyter notebooks.
Flare gives access to a very powerful set of tools designed to aid ligand binding, docking, electrostatic modelling and WaterSwap, all within a well thought-out interface. The storyboard feature also allows the user to store snapshots of progress and coupled with the log acts like a notebook.
You can read the full review here.
A little while back I described a docking workflow including a rescoring script for Vortex, so I thought it might be useful to include this on a separate page.
Recently, machine-learning scoring functions trained on protein-ligand complexes have shown significant promise. One example is RF-Score-VS, trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets DOI.
Our results show RF-Score-VS can substantially improve virtual screening performance: RF-Score-VS top 1% provides 55.6% hit rate, whereas that of Vina only 16.2% (for smaller percent the difference is even more encouraging: RF-Score-VS top 0.1% achieves 88.6% hit rate for 27.5% using Vina). In addition, RF-Score-VS provides much better prediction of measured binding affinity than Vina (Pearson correlation of 0.56 and −0.18, respectively). Lastly, we test RF-Score-VS on an independent test set from the DEKOIS benchmark and observed comparable results.
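To make the hit-rate figures above a little more concrete, here is a minimal sketch of how a "top x%" hit rate can be computed from a ranked list. The toy scores and activity labels are made up purely for illustration, and the function assumes that a higher score is better (Vina scores are more favourable when more negative, so they would need their sign flipped first). The last two lines simply divide the quoted hit rates to show the fold improvement.

```python
def hit_rate_at_top(scored, fraction):
    """Fraction of actives among the top-scoring `fraction` of molecules.

    `scored` is a list of (score, is_active) pairs; higher score = better.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    top = ranked[:n_top]
    return sum(1 for _, is_active in top if is_active) / n_top


# Made-up toy data: ten molecules, three of them active.
toy = [
    (0.95, True), (0.90, False), (0.80, True), (0.70, False), (0.60, False),
    (0.50, True), (0.40, False), (0.30, False), (0.20, False), (0.10, False),
]
toy_rate = hit_rate_at_top(toy, 0.2)  # hit rate in the top 20%

# Fold improvement implied by the hit rates quoted in the abstract.
fold_top_1_percent = 55.6 / 16.2
fold_top_01_percent = 88.6 / 27.5
print(toy_rate, round(fold_top_1_percent, 1), round(fold_top_01_percent, 1))
```

On the quoted numbers, RF-Score-VS gives a little over three times the hit rate of Vina at both cut-offs.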
Binaries for RF-Score-VS are available at https://github.com/oddt/rfscorevs_binary.
DOCK 6.9 has been released.
This is a release of the new ligand searching method DOCK_DN: De Novo design using fragment-based assembly. De novo design can be used to explore vast areas of chemical space in computational lead discovery. DOCK_DN is an iterative fragment growth method, in which new molecules are built using rules for allowable connections based on known molecules.
For full information on what is new in DOCK 6.9
I've written a couple of tutorials on docking here and here that have been popular pages.
The tools used for docking are regularly updated, so the D3R Grand Challenge 4, a new blinded prediction challenge for protein-ligand poses and affinities, is an invaluable data point for comparing the current state of play.
The Grand Challenge 4 (GC4) will open on September 4, with the following submission deadlines:
- Stage 1a, cross-docking challenge: October 4
- Stage 1b, self-docking challenge: October 19
- Stage 2, affinity ranking and free energies: December 4
Challenge components will include:
- Affinity ranking of ~450 Cathepsin S inhibitors drawn from the same large dataset used in GC3
- Affinity ranking of ~150 beta secretase 1 (BACE) inhibitors
- Pose prediction of 20 BACE inhibitors
- Free energy prediction challenges suitable for alchemical free energy methods, for both Cathepsin and BACE
Full details will be published on the Drug Design Data Resource site.
BioSolveIT have announced significant changes and improvements in SeeSAR, resulting in another major release, version 8. The biggest change is that they now provide full protein visualization support. While the focus of the tool is for the most part still on the defined binding site, you can now see the whole protein in all its glory! As always, a major update means that HYDE scores must be re-calculated to stay in line with the changes made in the underlying structures. We certainly believe that these enhancements are well worth it:
- improved alignment
- full protein support in the sequence view
- search&find specific amino acids, waters or other protein components
- all protein visualization controls bundled
- enhanced pharmacophore handling
- fragment growing for covalent binders
For details see: https://www.biosolveit.de/SeeSAR/changes.html
They also have two new tools:
REALSpaceNavigator is the world's largest, ultra-fast searchable chemical space developed in collaboration with Enamine Ltd. It comprises roughly 3.8 billion compounds today, which will be delivered on demand in less than 4 weeks with an exceptional success rate of 80% and above.
PepSee is a software tool for interactive, visual compound prioritization as well as the design of next-generation peptide therapeutics. Peptide design ideally supports a multi-parameter optimization to maximize the likelihood of success. PepSee visualizes the relevant parameters at hand, side by side with the sequence data. Color-coded display stimulates SAR exploration. The main features of PepSee comprise:
- comfortable sequence & data import (from Excel, FASTA, PLN, Text, even PDF)
- automated as well as manual sequence alignment
- various data coloring and plotting options
- organizing and annotating your compounds
- interactive design of novel peptides
A new version of SeeSAR is available (7.3). This update includes:
- Easy mode switching: from the molecules table to the editor or the inspirator and back in just one click...
- Automated workflows: in the settings you can now decide about which calculations should happen automatically
- Menus re-organized: buttons are grouped for better overview and almost all table entries obtained a convenient context menu, simply right-click to give it a try
- Excel export: this is one of the rather hidden Easter Eggs. Besides SDF you may save tables now as XLSX (including the 2D depiction)
- Saved settings: user settings (the layout, background color, etc.) are now saved separately from project settings (filters and visualization features)
SeeSAR has been updated.
Get fresh inspiration from this huge update of SeeSAR! We realized, on the one hand, that the functionality of the editor was growing and growing, making it more and more complicated to use. On the other hand, access to the full functionality of ReCore demands a different kind of user interface. So we "took the bull by the horns" and, akin to the editor, created the new Inspirator which you can use to do:
- Core replacement This feature is the same but with a much improved UI. You are able to directly select and visualize the bonds that will be clipped to carve out a core fragment for replacement. The clipped bonds now remain in place (even while you define sphere constraints) up until you define a new query. Also the display of results is much enhanced, as you can see the new core fragments highlighted in 2D as well as in 3D. For reference, your query molecule stays visible as well.
- Fragment linking and merging You may of course launch the Inspirator with more than just one molecule. In this case, you can define bonds to clip on different molecules, thereby requesting linker fragments that will connect the remaining pieces. Note that it is not mandatory to clip a terminal part of each molecule to create the query, you may replace a core part in one and connect it to another fragment at the same time.
- Fragment growing This was possibly the most frequently requested functionality in ReCore: Cut just one bond and grow onto this bond using a fragment library of typical side chains. In this way, you can, for example, reach out to nearby subpockets. The new growing algorithm can very quickly scan through a (for now) ready-made library of typical fragments. You may of course define sphere constraints at the same time in order to target particular locations in the bi
You can download SeeSAR here and use it for free for 7 days.
I see that SeeSAR now supports a parallelized 'real' fragment growing.
SeeSAR is a software tool for interactive, visual compound prioritisation as well as compound evolution. Structure-based design work ideally supports a multi-parameter optimization to maximise the likelihood of success, rather than affinity alone. Having the relevant parameters at hand in combination with real-time visual computer assistance in 3D is one of the strengths of SeeSAR. Stimulating exploration with SeeSAR, we have embarked on pursuing a new cheminformatics compute paradigm of "Propose & Validate".
A little while back I mentioned BioConda. You can read more details in this publication "Bioconda: A sustainable and comprehensive software distribution for the life sciences", DOI. Conda is a platform- and language-independent package manager that sports easy distribution, installation and version management of software.
The conda package manager has recently made installing software a vastly more streamlined process. Conda is a combination of other package managers you may have encountered, such as pip, CPAN, CRAN, Bioconductor, apt-get, and homebrew. Conda is both language- and OS-agnostic, and can be used to install C/C++, Fortran, Go, R, Python, Java, etc.
The bioconda channel is a Conda channel providing bioinformatics related packages for Linux and Mac OS. Looking through the packages, it is clear that it already contains a number of chemistry packages. These include (updated 24 November 2017):
- Autodock Vina
Bioconda offers a collection of over 3100 software tools, which are continuously maintained, updated, and extended by a growing global community of more than 330 contributors. Rather than try to duplicate this effort for a "Chemconda" it seems more efficient to encourage chemists to contribute to Bioconda. If you do package a chemistry application for Bioconda please let me know and I'll publicise it on my blog and add it to the list above. To start things rolling I've added PubChem.py to Bioconda and I've written a page describing how to create a bioconda recipe.
LigandScout has been updated.
Even more efficient and intuitive:
- New buttons for ligand-based modeling
- New features for MD trajectory analysis
- New data management capabilities
- New interactive charts
The LigandScout software suite comprises the most user friendly molecular design tools available to chemists and modelers worldwide. The platform seamlessly integrates computational technology for designing, filtering, searching and prioritizing molecules for synthesis and biological assessment.
There is a review of LigandScout here
SeeSAR 7.1 has just been released.
- 3D-pharmacophores to identify compounds of interest By now, you can run so many bulk actions, your solution list will grow by the minute. This made it just the right time to implement another filter option to help you keep track! Pharmacophore filters can now be defined using so-called sphere constraints. You can apply these pharmacophores at lightning speed to the molecule table and drill down solutions sets quickly and effectively with queries such as: Which molecules have an acceptor at this position? Filter out molecules that occupy this position! Give me all molecules with an aromatic moiety here!
- Linking and merging fragments with the integrated ReCore It is now possible to enter the 3D molecule editor with more than just one molecule. Among other things, this facilitates linking and merging operations with ReCore. Simply select the atoms you seek to replace (eg the terminal atoms of two fragment binders that should be linked). A click on the ReCore button retrieves for you in seconds fragments that link the two binders, leaving them as closely as possible in place.
- Better measure and label-options (with adjustable font size) Partially hidden labels in 3D won't bother you anymore! Instead, the simple labels for showing distances, angles, and so on have been upgraded to the more advanced labels which we have always used for Hyde and more recently for displaying torsion information. These labels are movable (simply click and drag) and are always at the front. Plus, as a bonus feature, you may now also adjust the font size on the labels in the appearance settings menu in the toolbar.
- Parallel high-throughput docking A lot of users have been waiting for this feature! Now bulk docking can be carried out with just one click. Plus we have parallelized the docking calculation so that it now uses all processors on your computer, providing you with solutions swiftly — just as your hardware permits.
- Multiple selections for bulk actions Frankly speaking we previously "abused" the favorite icon for making selections to initiate bulk actions. This itself undermined the point of being able to mark molecules as favorites. Now you may conveniently select multiple molecules using the new check box feature, and initiate bulk actions on the basis of this selection. For your convenience we also added a few functions to make working with these selections even more effective (un/check all, invert, ...). Just as for the favorites, shift + left mouse click works on the check boxes as a multiple un/check feature.
I've written a couple of docking/virtual screening workflows using SMINA, a freely available tool DOI. There are a number of other alternatives and it is very difficult to get good comparisons which is why the Grand Challenges are useful.
The Grand Challenge 3 (GC3) is a blinded prediction challenge for the computational chemistry community, with components addressing pose-prediction, affinity ranking, and free energy calculations. GC3 is based on six different protein targets, Cathepsin S and five different kinases, and is separated into five subchallenges, some of which involve multiple protein targets. Only Cathepsin S is associated with cocrystal structures, so the kinase components of this challenge focus on affinity ranking and/or free energy predictions. Three of the datasets, Cathepsin S, JAK2 and TIE2, include a free energy prediction component.
This is an ideal opportunity to test novel algorithms on a carefully curated dataset.
Computational practices often employ a number of computational algorithms and dataset preparation steps for meaningful results. D3R will provide a forum for deposition, dissemination and discussion of such workflows through this website. Workflows will represent methods used successfully in the blinded challenges and methods donated by our pharmaceutical and/or academic collaborators. GitHub.
Stage 1 (September 1 - October 1) asks for your predicted poses for the 24 ligands, in a coordinate system aligned with the S04-bound Cathepsin S structure provided in the inputs, together with your predicted affinities, or affinity rankings, for all 136 compounds and/or your predicted absolute or relative binding affinities (in kcal/mol) for the free energy subset of 33 compounds. When Stage 1 closes, we will release the crystallographic poses of the 24 ligands.
Stage 2 (October 1 - December 1) asks for your predictions of the affinity rankings of all 136 compounds and/or absolute or relative binding affinities (in kcal/mol) for the free energy subset of 33 compounds.
In the previous workflow I described docking a set of ligands with known activity into a target protein. In this workflow we will be using a set of ligands from the ZINC dataset to search for novel ligands. Once docked, the workflow moves on to finding vendors and selecting subsets for purchase.
SeeSAR 6.1 has been released. Looking at the release notes, there are a couple of useful additions.
- Multiple protein alignment. Since version 5.6 it has been possible to load and work with multiple proteins. So far this feature could only be utilized with pre-aligned structures. Now you can do the 3D alignment in SeeSAR itself. The alignment is based on and optimized according to the superposition of related active sites. Therefore, once you have selected a binding site, just one click is all that is needed to superpose all related binding sites at once. Note that the superposition is limited to highly homologous proteins (>90% sequence identity).
- SeeSAR/StarDrop interface. We have implemented a new function that greatly improves the interaction between the two software packages. Using the option in the molecule table toolbar, you may now transfer all (or the subset of favorite) molecules directly to StarDrop, which is launched automatically. This interface is supported in StarDrop starting with the recently released StarDrop version 6.4, and StarDrop now analogously supports launching and submitting data to SeeSAR. So it is in fact possible to transfer data back and forth, exploiting maximum synergy to make the best of both worlds. Note that this feature may require a few adjustments in your SeeSAR settings to become fully functional.
- Shortcut to copy protein ligands. Usually among the first tasks after loading proteins is to copy the related protein ligands to the molecules table for further processing (docking, editing, re-scaffolding, etc.). Especially with multiple proteins this turned out to be a quite cumbersome procedure. Therefore we have implemented a shortcut function in the toolbar of the proteins tab to copy all protein ligands at once to the molecules table. Note that this function will copy all ligands irrespective of their position in relation to the common binding site that is used in the context of the molecules table. So some of the copied ligands may lie well outside the common binding site.
SeeSAR is a software tool for interactive, visual compound prioritization as well as compound evolution. Structure-based design work ideally supports a multi-parameter optimization to maximize the likelihood of success, rather than affinity alone. Having the relevant parameters at hand in combination with real-time visual computer assistance in 3D is one of the strengths of SeeSAR.
Whilst high-throughput screening (HTS) has been the starting point for many successful drug discovery programs, the cost of screening, the lack of access to a large diverse sample collection, or the low throughput of the primary assay may preclude HTS as a starting point, and identification of a smaller selection of compounds with a higher probability of being a hit may be desired. Directed or virtual screening is a computational technique used in drug discovery research to identify potential hits for evaluation in primary assays. It involves the rapid in silico assessment of large libraries of chemical structures in order to identify those structures that are most likely to be active against a drug target. The in silico screen can be based on known ligand similarity or on docking ligands into the desired binding site.
I've updated the description to give more information about preparing the target protein.
StarDrop 6.4 now links prepared 3D docking and alignment models with data visualisation, 2D SAR analyses and predictive models in a single interface.
Computational chemists can make their validated 3D models available to their colleagues via StarDrop’s Pose Generation Interface, which is compatible with software from major computational chemistry providers, including:
- FlexX™ – BioSolveIT
- Gold™ – Cambridge Crystallographic Data Centre
- MOE™ – Chemical Computing Group
- AutoDock Vina – The Scripps Research Institute
- POSIT™ – OpenEye Scientific
- …extendable to other third party applications.
The Pose Generation Interface communicates with a Pose Generation Server, on which computational chemists can easily publish their validated docking or 3D alignment models. These are made instantly available for StarDrop users to submit their compounds and the resulting poses, protein structures and scores are returned directly to StarDrop for visualisation and analysis.
The Pose Generation Server can be installed wherever you run your 3D modelling software, supporting Linux, Windows® and Mac®
There are more details in the poster presented at the Spring ACS 2017.
The all new SeeSAR 6 provides you with a completely redesigned and now fully customizable GUI. You can choose between different bright and dark themes and GUI layouts so that you can optimally adapt SeeSAR for different use cases.
The new design is more streamlined and customizable. Instead of having 8 different kinds of buttons in different regions of the application, we now have just a main menu top left and a toolbar top right. The main menu changes depending on the mode of use (editing, site definition, ...), while the toolbar stays the same throughout. This way you are never overwhelmed with choices, but are only presented with options that you may need. Depending on your current use case, you may also want to change the overall layout (many molecules ⇒ tables to the left; many properties ⇒ tables below to make use of the whole width; 2 monitors ⇒ tables docked out) and/or the overall appearance (bright theme for presentations; dark theme for desktop work; we have also integrated a color blindness mode just in case).
In order to give you a jump start when you begin working with SeeSAR (both as a newcomer, as well as a seasoned user of the old GUI design), we have introduced an in-application help facility in this new version. First of all, upon starting the tool for the first time or after a long break in use, SeeSAR offers you a short introductory slide show, reminding you of a few basics that can make life a lot easier. But you can also now request help from within the application with a click on the lifesaver button. The help window then shows you – context dependent – explanations on the mode in which you are currently working or on the functions that you are trying to use so you can leave the help window open, consulting it when you need it. Of course you may also navigate between help pages in the help window and from there access online resources such as tutorial videos.
There is also a free webinar: introduction to SeeSAR 6.0
The generation of multiple conformations is an important step in a number of operations from input to ab initio calculations to providing input files for docking studies. A recent paper compared seven freely available conformer ensemble generators: Balloon (two different algorithms), the RDKit standard conformer ensemble generator, the Experimental-Torsion basic Knowledge Distance Geometry (ETKDG) algorithm, Confab, Frog2 and Multiconf-DOCK DOI, and also provided a dataset of ligand conformations taken from the PDB.
A recent twitter discussion involving Greg Landrum and David Koes prompted Greg to publish a blog post describing conformation generation within RDKit. The post compares using distance geometry to select diverse conformations versus an approach that combines the distance geometry approach with experimental torsion-angle preferences obtained from small-molecule crystallographic data (ETKDG). He also looks at the impact of force-field minimisation.
A really interesting read with code provided.
This is a new release of DOCK with updated scoring functions, including the new pharmacophore matching similarity score and a completely revamped Descriptor Score that allows different combinations of scoring functions to be used simultaneously. From the readme:
NEW IN DOCK 6.8
New Scoring Functions: Two new scoring functions were added: pharmacophore score and descriptor score [see the manual].
Pharmacophore: Calculates the pharmacophore overlap between a candidate and a reference molecule. In addition, the python wrapper, mol2bild.py, can be employed to visualize the pharmacophores.
Descriptor Score: Descriptor Score was completely overhauled. Presently, descriptor score is now a wrapper that allows multiple scoring functions to be employed simultaneously. Please refer to the manual for the complete list of scoring functions supported by descriptor score.
Four similarity-based scoring functions were added as part of descriptor score: pharmacophore score, Tanimoto score, Hungarian matching similarity score, and volume overlap score. Hungarian matching similarity score, Tanimoto score and volume overlap score, new to DOCK, can only be called using descriptor score.
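For anyone unfamiliar with the Tanimoto score mentioned above, here is a minimal sketch of the general idea using plain Python sets. This is the textbook set-based definition (shared features divided by total distinct features) and is not DOCK's implementation. The "fingerprints" below are hypothetical sets of feature indices that I made up for illustration.

```python
def tanimoto(features_a, features_b):
    """Tanimoto similarity between two collections of feature identifiers.

    Returns |A and B| / |A or B|, or 0.0 when both collections are empty.
    """
    a, b = set(features_a), set(features_b)
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


# Hypothetical feature indices for a candidate and a reference molecule.
candidate = {1, 2, 3}
reference = {2, 3, 4}
similarity = tanimoto(candidate, reference)  # 2 shared / 4 distinct
print(similarity)
```

A value of 1.0 means the two feature sets are identical and 0.0 means they share nothing.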
Internal Energy Scoring Function Optimization of the internal energy scoring function code and the addition of two parameters:
a) The code has been optimized for calculating the repulsive VDW term. "internal_energy_rep_exp", when set to its default value of 12, results in a significant speedup in certain cases. Values other than 12 are computed as in previous versions of DOCK.
b) The term "internal_energy_cutoff" has been added such that all conformers with an internal energy greater than the cutoff are pruned.
c) The addition of the term "pruning_conformer_score_scaling_factor", a divisor of the pruning_conformer_score_cutoff, ensures that pruning becomes more stringent as flexibly-grown molecules proceed layer-by-layer.
Miscellaneous DOCK now supports builds using Intel compilers with either MPICH2 or Intel MPI parallelism.
A new version of SeeSAR is now available for download.
Version 5.5 includes several new features and has undergone some tweaks under the hood to improve speed.
From the release notes:-
2D browsing featuring in-view molecule properties
To further enhance the 2D browsing, we have added an illustration of the molecules' key properties in the form of a radar plot. A thumbnail of the plot is embedded in each of the 2D molecule pictures, providing a quick overview. It enlarges upon mouse-over and provides access to the configuration dialog. Add or remove property-axes, optionally fine-tune the scales and set 'desired' value ranges. A hit or miss of the latter is indicated by green or red dots on the corners of the color-coded characteristic shape of the molecule on the plot (the greener the better).
Detecting novel/unoccupied binding sites
Now SeeSAR can search your protein for unoccupied pockets based on the world-renowned DoGSite algorithm. You may then select these to become the binding site, within which to generate poses and calculate binding affinities for your molecules. The new binding site definition feature lets you either use a selected molecule from the table (based on a 6.5 Å shell around it, as before) or will detect and visualize empty pockets for you to select instead.
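The idea of defining a binding site as a shell around a bound molecule is easy to sketch in a few lines of Python. This is only my own illustration of the geometric idea, not SeeSAR's implementation: it keeps every protein atom that lies within a given radius (6.5 Å by default) of any ligand atom. The coordinates are invented for the example.

```python
import math


def atoms_within_shell(protein_atoms, ligand_atoms, radius=6.5):
    """Return the protein atom coordinates within `radius` of any ligand atom.

    Atoms are plain (x, y, z) tuples in angstroms.
    """
    return [
        atom
        for atom in protein_atoms
        if any(math.dist(atom, lig) <= radius for lig in ligand_atoms)
    ]


# Invented coordinates: one ligand atom at the origin, three protein atoms.
ligand = [(0.0, 0.0, 0.0)]
protein = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 0.0, 6.0)]
site = atoms_within_shell(protein, ligand)
print(site)
```

For a real structure you would select whole residues rather than individual atoms, and a spatial index would be faster than this brute-force loop.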
Multiple reference molecules
The reference molecule in SeeSAR always stays in view even when you select other entries from the molecule tables. Now, however, you are able to set - and keep in view - as many reference molecules as you like. Either set them individually - in the selected molecule menu (as before) - or mark several as favorites and set them all as references at once, via the new menu button below the table.
Multiple core replacements with just one click
With the new multiple solutions button for ReCore in the molecule editor, brainstorming new scaffold ideas became yet easier. You can now generate 10 new alternative core replacements at once. The new molecules are saved directly to the table so that you can immediately see their estimated binding affinity and view all structures in 2D at a glance.
MolSoft have announced the release of ICM version 3.8-5.
- Generate a 2D Interaction Diagram of a ligand with the binding pocket. The image is annotated with hydrogen bonds and interacting residues.
- 3D ligand editor is a powerful tool for the interactive design of new lead compounds in 3D
- Support for the Macromolecular Transmission Format (MMTF)
- Support for Mac retina display
- Add docking restraints by selecting atoms in the receptor
- Updates to protein modelling, bioinformatics and cheminformatics
An overview of the BioExcel Project.
BioExcel is based on improving three aspects of biomolecular research. Firstly, improving the performance and scalability of the most commonly used software, such as GROMACS (www.gromacs.org), HADDOCK (www.haddocking.org) and CPMD (www.cpmd.org), to take advantage of next-gen HPC systems and the expected increase in the amount of data produced. It’s also important to improve how easy it is for users to access and use these types of software. Not all researchers have experience in efficiently handling data and software. BioExcel aims to provide customisable workflow environments, which will allow relatively novice HPC/HTC users to take advantage of the analysis software provided in ways that suit their specific research. In addition to this, hands-on training and public webinars are already underway, aiming to teach researchers best practices and how to best utilise the software and resources available.
OpenEye have announced the release of OEDocking v3.2. This is an upgrade that adds new features and fixes several bugs.
- POSIT is now integrated with the OEDocking suite. A POSIT license is still required to run POSIT.
- POSIT's clash detection algorithm has been enhanced. It now ignores clashes with the protein on a per atom basis if the crystallographic ligand makes these same clashes.
- OpenMPI version 1.6 is now supported on all platforms. The -mpinp and -mpihostfile flags are now used to run FRED, HYBRID and POSIT in MPI mode. These new flags replace the oempirun script.
- PDB2RECEPTOR now incorporates all of MAKEPOSERECEPTOR functionality. MAKEPOSERECEPTOR is no longer supported.
- PDB2RECEPTOR now supports the identification and selection of a desired ligand in a protein-ligand complex.
- MAKERECEPTOR, PDB2RECEPTOR, APOPDB2RECEPTOR, DOCKINGREPORT, RECEPTORSETUP and RECEPTORTOOLBOX now all accept either a POSIT or FRED2 license.
- Support for Mac OS X 10.9 and 10.10 was added.
- Mac OS X 10.6 and 10.7 are no longer supported.
Chemical Computing Group have just released an update to MOE; version 2015.10 includes:-
- Generate docked poses using FFT followed by all atom minimization
- Define receptor and ligand sites to focus docking
- Automatically detect antibody CDR sites
Integrated Alignment, Consensus and Superposition in the Sequence Editor
- Manipulate multimeric protein sequences using split side-by-side Sequence Editor panes
- Use dendrograms to visualize pairwise similarity, identity and RMSD relationships
- Select residues based on plotted values using resizable sequence editor plots
Distributed Pharmacophore Searching
- Run pharmacophore searches on a cluster directly from MOE GUI
- Perform fast corporate database searches
- Access multiple databases stored on a central server
Covalent Docking and Electron Density Docking
- Use reaction-based organic transformations to perform covalent docking
- Minimize ligand strain energy while maximizing ligand fit to electron density
- Run docking through an enhanced streamlined scenario-based interface
Extended Hückel Descriptors and pKa Model
- Compute molecular properties such as logP, logS and molar refractivity
- Determine populations of ligand protonation states at a given pH
- Calculate the pKa and pKb of small molecules
13C NMR Analysis
- Apply QM conformation refinement to calculate 13C NMR shielding
- Convert computed shieldings and predict 13C NMR chemical shifts
- Compare computed chemical shifts to experimental shifts for structure determination
I'll write a review in the New Year.
OpenEye have announced the release of OpenEye Toolkits v2015.October. These libraries include the usual support for C++, Python, C# and Java.
- FastROCS TK was added to the OpenEye toolkits collection
- Molecule reading performance improvement in OEChem TK
- The capabilities of the OEBio-Fragment Network have been expanded
- 213 new ring templates have been added to the OEChem TK built-in ring dictionary
In particular note the 2015.Oct release is the last to support Mac OSX 10.8 so time to upgrade if you have not already done so.
Grand Challenge 2015: Prediction of ligand poses, and affinity rankings, for the protein targets HSP90 and MAP4K4. Stage 1 predictions are due November 16, 2015; Stage 2 predictions are due February 1, 2016. For details, and to sign up and participate, please see https://drugdesigndata.org/about/grand-challenge-2015
SAMPL5: Prediction of aqueous host-guest binding free energies and, optionally, enthalpies for three host-guest series. A series of aqueous-organic partition coefficients may also be added in the next several weeks. Predictions are due February 1, 2016. For details, and to participate, please see https://drugdesigndata.org/about/sampl5.
These challenges are organized by the Drug Design Data Resource, which is based at UC San Diego and supported by a grant (U01GM111528) from the NIH's National Institute of General Medical Sciences. They are made possible by generous donations of data, pre-publication, from AbbVie, Genentech, the CSAR initiative at U. Michigan, and Professors Lyle Isaacs (U. Maryland) and Bruce Gibb (Tulane U.)
BioSolveIT has just announced the release of SeeSAR 3.0.
This update of SeeSAR qualifies as major release 3, since it covers two milestones in its development. So far every SeeSAR session has started from scratch. The only way to retain molecules was to save them to file and re-load them again in a subsequent session. Needless to say that loading meant recalculating all Hyde-scores again...
Project files
Starting with Version 3.0, SeeSAR allows you to store all session data in a project file. This includes the protein, ligands loaded from file and new (edited) ligands. Resuming your work on a project is now as easy as double-clicking on the project-file. As a result, everything just got a hell of a lot faster! Whilst calculating Hyde-scores for say 1000 compounds took around half an hour (depending on your hardware), loading the same information from a project file now takes only a few seconds. Note that you can also generate a project file on the command line, allowing you to outsource the calculation of Hyde-scores to a different machine. This enhancement is also a great way to exchange data and ideas with a colleague! Simply store your SeeSAR session as a project file in a commonly accessible location (e.g. a network drive). Your colleague can take a look with just a double-click.
Hyde update
Hyde is quite sensitive with regards to the precise geometry of a binding pose. Even the tiniest difference in a pose can distort an anyway stretched hydrogen bond just so much that it is not recognized anymore - thereby leaving you with a huge desolvation penalty for such atoms, without the gain from the h-bond. This "sharpness" of Hyde is its greatest strength (for example by highlighting real activity cliffs), but also its greatest weakness (especially if the structure has flaws or is of low resolution). In order to minimize such troubles, we optimize each pose before the Hyde affinity assessment. We improved this optimization significantly. It is now fully flexible and with sharper clash criteria, making it suitable for docked poses as well as edited compounds. All of this is as efficient as before, just perfect for interactive use.
There is a review of an earlier version of SeeSAR here
MOE2014.0901 Update is now available. MOE is a fully integrated molecular modelling and drug discovery software package.
MOE 2014.0901 updates:
- Option for AMBER residue name
- Append/prepend multiple residue sequence specified by single-letter names
Builder:
- Added H’s inherit color if there is a consistent coloring in the residue
sddesc: New -smi:p option causes field headers to be written to the output ASCII file
- MOESVLRUNPATH now properly honored
- Combinatorial Builder now honors different attachment point locations on the same R-group
- Database Save As one entry per file mode now properly generates unique filenames
- Dock Template Forcing batch file now correctly generated
- Saved views in .moe files now properly restored
- Auto-save when Database Viewer display attributes are changed can now be disabled to prevent changes to the database file modification date when only the display is changed and not the database content
- SVL function Deprotonate now works properly
- Various MOE Project and Project Database Update bugs
- Various minor bug fixes
There are reviews of MOE available here
Moe:- Molecular modeling
Moe Update (Jan 2009):- Molecular modeling
Review of MOE (2009.10 release):- Molecular modeling
Moe Update (December 2010.10 release):- Molecular modeling
Moe Update (December 2011 release):- Molecular modeling
Moe Update (December 2012 release):- Molecular modeling
SeeSAR is an interactive tool from BioSolveIT for designing/improving ligands for drug discovery, and it has recently been updated. Looking through the updates it is clear they have been very receptive to user feedback.
- Solution filter: Finding interesting solutions in larger sets of compounds has become much easier in SeeSAR. Compounds can now be filtered based on any available property – allowing you to easily trim down the compound set to the most interesting subset. As before, you can browse through and sort the remaining table entries to further refine your selection. Properties can be those generated by SeeSAR (such as the Hyde affinity assessment, the torsional strain, TPSA, logP, ...) or alternatively properties loaded from an SD file.
- Joined poses: You can now join the ligands found in the protein structure, compounds loaded from file and compounds newly generated within the SeeSAR editor into one “super” table, with quick-links to the previous view of only those from a certain origin. This allows you to see in one table the bound ligand as a reference, the project compounds and your last round of designs. You can then select your favorites from the entire table and export these (for example for an upcoming team meeting).
- Defining the protein: SeeSAR decomposes the contents of a PDB file into chains, small molecules, waters and ions. Until now, users had to accept SeeSAR's default assignments, which is fine in the majority of cases. However, there is no rule without exception, e.g., the peptide inhibitor which is mistaken as a short chain, the small molecule which is actually a co-factor, or the solvent molecule that should be ignored. With this update, SeeSAR allows you to change these default assignments to better handle these exceptional cases, allowing you to categorize a short chain as a bound ligand, or re-assign a co-factor as a permanent part of the protein. You can also eliminate protein elements altogether.
- SMILES, PDB and MOL file support: SeeSAR now comes with additional molecule readers that broaden the scope of the application. Aside from standard 3D molecule file formats (SDF and mol2), SeeSAR now supports 1D and 2D file formats as well as reading small molecules from PDB format. If no 3D coordinates are given, SeeSAR will calculate a clash-free, low energy conformation on the fly: with the SeeSAR positioning function you can then place such input molecules in the active site of interest. Amongst other things, this feature facilitates the importing of molecules straight from your favourite chemical drawing program and assessing such structures in the context of your protein of interest.
DOCK 6 is written in C++ and is functionally separated into independent components, allowing a high degree of program flexibility. Accessory programs are written in C and Fortran 77. Source code for all programs is provided. Read the FAQ for details of installation under MacOSX.
Allen, W. J.; Balius, T. E.; Mukherjee, S.; Brozell, S. R.; Moustakas, D. T.; Lang, P. T.; Case, D. A.; Kuntz, I. D.; Rizzo, R. C. DOCK 6: Impact of New Features and Current Docking Performance. J. Comput. Chem. Submitted.
OpenEye have announced the release of POSIT v3.1, the component of the OEDocking suite devoted to pose prediction.
This update includes:
- The HYBRID and FRED algorithms have been incorporated into POSIT, the appropriate method is determined by analyzing the ligand to pose against the input receptors.
- Multiprocessing has been enabled through the use of MPI, to speed calculations.
- POSIT now supports a list of receptor files or a .lst file as input. This overcomes command-line limitations for the number of receptors that can be used simultaneously.
- Added a MEDIOCRE result rating for results between 33% and 50% probability.
- Command line parameters have been simplified and updated to be compatible with the OEDocking Suite of tools.
POSIT is designed for the posing problem in lead optimization, i.e. how best to leverage project information from previous protein-ligand structures to predict the pose of a new ligand. It does this by assessing the similarity of the new ligand to known bound structures. Performance degrades as similarity decreases and so at some point it is worth searching more exhaustively.
SeeSAR 1.5 has been released. SeeSAR is intended as an interactive tool for designing/improving ligands for drug discovery.
The latest release covers two major topics: 1. A series of features that make the editing more swift and easier. To this end they introduced hot-keys, context menus and drawing a bond by drag&drop. 2. Often times people use SeeSAR for visual inspection e.g. after docking. Now normally you'll have multiple poses per compound. For a better overview the Table now allows you to collapse all poses to just one line per compound.
Furthermore you can set a bookmark to indicate what you like and export only the ones on the wish list.
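The "one line per compound" view amounts to grouping poses by compound and showing a single representative. A minimal Python sketch of that idea follows. The release notes do not say which pose SeeSAR shows on the collapsed line, so picking the lowest score is an assumption made here, and the record layout and the name `best_pose_per_compound` are hypothetical.

```python
def best_pose_per_compound(poses):
    """Collapse a list of pose records to one record per compound.

    Assumption: a lower score is better, so the lowest-scoring pose is kept.
    """
    best = {}
    for pose in poses:
        name = pose["compound"]
        if name not in best or pose["score"] < best[name]["score"]:
            best[name] = pose
    return list(best.values())

# Hypothetical docking output: several poses per compound
poses = [
    {"compound": "cpd-1", "pose": 1, "score": -7.2},
    {"compound": "cpd-1", "pose": 2, "score": -8.1},
    {"compound": "cpd-2", "pose": 1, "score": -6.5},
]
for row in best_pose_per_compound(poses):
    print(row["compound"], row["pose"], row["score"])
```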
There is a review of SeeSAR here.
I just got this email
Thank you for your collaboration in helping us to test the beta version of the FORECASTER Suite 2014. From your feedback and bug reports, we have now released the final version of the Suite. The files were updated and posted on the download page. Please send us any bugs that you might have not yet reported.
The FITTED docking tool was initially developed as a suite of three programs: SMART (used to prepare the small molecules for docking), PROCESS (used to prepare the protein files for docking) and the docking program FITTED. More recently, these three programs together with several others have been integrated into a single package, namely FORECASTER.
More information can be found here http://fitted.ca/download.html#forecaster
DOT is a software package for docking macromolecules, including proteins, DNA, and RNA. DOT performs a systematic, rigid-body search of one molecule translated and rotated about a second molecule. The intermolecular energies for all configurations generated by this search are calculated as the sum of electrostatic and van der Waals energies. These energy terms are evaluated as correlation functions, which are computed efficiently with Fast Fourier Transforms. In a typical run, energies for about 108 billion configurations of two molecules can be calculated in a few hours on a few desktop workstations working in parallel.
Roberts, Victoria A. and Thompson, Elaine E. and Pique, Michael E. and Perez, Martin S. and Ten Eyck, L. F., (2013) "DOT2: Macromolecular docking with improved biophysical models" Journal of Computational Chemistry, Volume 34, Issue 20, pages 1743-1758, 30 July 2013 DOI:
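The idea of evaluating energies as correlation functions with FFTs, as described for DOT above, can be sketched in NumPy. This is a minimal illustration, not DOT's code: it assumes both molecules are already mapped onto the same periodic grid, uses a single made-up grid term rather than separate electrostatic and van der Waals terms, and covers only the translational scan for one orientation (a full search would repeat it for every rotation). The function names are hypothetical.

```python
import itertools
import numpy as np

def translational_scores(receptor_grid, ligand_grid):
    """scores[t] = sum_x receptor_grid[x] * ligand_grid[x - t], with wrap-around."""
    R = np.fft.fftn(receptor_grid)
    L = np.fft.fftn(ligand_grid)
    # Correlation theorem: one inverse FFT yields the score of every translation.
    return np.fft.ifftn(R * np.conj(L)).real

def brute_force_scores(receptor_grid, ligand_grid):
    """Same quantity computed one translation at a time (for checking only)."""
    scores = np.empty(receptor_grid.shape)
    for t in itertools.product(*(range(n) for n in receptor_grid.shape)):
        shifted = np.roll(ligand_grid, t, axis=(0, 1, 2))
        scores[t] = np.sum(receptor_grid * shifted)
    return scores

# Random toy grids; real grids would hold potentials and atom properties.
rng = np.random.default_rng(0)
receptor = rng.normal(size=(4, 4, 4))
ligand = rng.normal(size=(4, 4, 4))

fast = translational_scores(receptor, ligand)
slow = brute_force_scores(receptor, ligand)
print(np.allclose(fast, slow))

# Translation with the lowest (most favourable) score in this toy example
best = tuple(int(i) for i in np.unravel_index(np.argmin(fast), fast.shape))
print(best)
```

For a grid of N points the brute-force scan costs on the order of N² operations, while the FFT route costs on the order of N log N, which is what makes scanning billions of configurations practical.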
Now that I have my new MacPro I thought it might be interesting to try out a couple of the software packages that I’ve previously reviewed. ForgeV10 allows the scientist to use Cresset’s proprietary electrostatic and physicochemical fields to align, score and compare diverse molecules. It allows the user to build field based pharmacophores to understand structure activity and then use the template to undertake a virtual screen to identify novel scaffolds. I’ve previously reviewed ForgeV10 (formerly known as FieldAlign), so I’m going to focus on the support for multiple processors and a few of the new features.
There is a compilation of software reviews here
A recent publication “High Performance in silico Virtual Drug Screening on Many-Core Processors” DOI describes porting BUDE (Bristol University Docking Engine) to OpenCL.
Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single NVIDIA GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from NVIDIA and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets.
BUDE is now one of the fastest HPC applications ever developed and nicely demonstrates the portability of OpenCL across different architectures.
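As a quick sanity check on the quoted figures, the sustained rate and the fraction of peak together imply the peak throughput the authors assumed for the card. The helper below is just that arithmetic; the function name is made up for illustration.

```python
def implied_peak(sustained, fraction_of_peak):
    """Peak throughput implied by a sustained rate and its fraction of peak."""
    return sustained / fraction_of_peak

# 1.43 TFLOP/s sustained at 46% of peak
print(round(implied_peak(1.43, 0.46), 2))  # 3.11
```

So the quoted numbers correspond to a peak of roughly 3.1 TFLOP/s for the GTX 680.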
There is a list of GPU accelerated applications here.
I was at the Cresset UGM last week and had a chance to hear more about BlazeGPU. The original CPU application Blaze uses the shape and electrostatic character of known ligands to rapidly search large chemical collections for molecules with similar properties. The latest version BlazeGPU runs at 40 times the speed of the CPU version of Blaze but loses nothing in accuracy. At a fraction of the hardware cost, BlazeGPU delivers the same effective, ligand based virtual screening as Blaze, based on the shape and electrostatic nature of molecules.
BlazeGPU is written in OpenCL and OpenCL libraries are available from NVidia and AMD for their graphics cards, but also from Intel for the CPU and for their new Xeon Phi coprocessor cards. BlazeGPU is currently designed only to run on the GPU - for CPU-only clusters the original code is just as fast, and on a machine with a reasonably fast GPU or two the CPU tends to run flat out just feeding data to the graphics card, so there's not that much gain running on the CPU as well as the GPU.
Currently the conformer generation still runs on the CPU, but they are looking at the possibility of porting that to OpenCL as well in the future.
The relative performance is shown in the plot below. It is worth noting that these are relatively inexpensive graphics cards that you can pick up on Amazon or eBay for a few hundred pounds. Also note that for a $2.10/hour GPU instance on Amazon EC2 you can process 2m conformations.
There are more examples of GPU science here.