Macs in Chemistry

Insanely Great Science

NMR Tables and more

 

I've just been sent details of a few useful applications.

NMR Tables

This app displays a simple interactive table of properties for the NMR active isotopes. Properties displayed are element and isotope symbol, nuclear spin, natural abundance, gyromagnetic ratio, electric quadrupole moment, Larmor frequency, and dipolar coupling. Table can be sorted on any column. The magnetic field strength can be adjusted to determine the Larmor frequency. Dipolar couplings can be adjusted for different isotopes and internuclear distances. Additional filtering for even and odd atomic number isotopes, and for nuclear spin values.
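As a rough illustration of the two field- and distance-dependent quantities the table reports, the sketch below computes a Larmor frequency and a dipolar coupling constant from the standard textbook expressions ν = γB0/2π and d = -(μ0/4π)γ1γ2ħ/(2πr³). It is not taken from the NMR Tables app; the function names, the rounded gyromagnetic ratios and the sign convention are assumptions made for this example.

```python
import math

MU0_OVER_4PI = 1e-7        # vacuum permeability / 4*pi, in T m / A
HBAR = 1.054571817e-34     # reduced Planck constant, in J s

# Gyromagnetic ratios in rad s^-1 T^-1 (rounded literature values, assumed here)
GAMMA = {
    "1H": 267.522e6,
    "13C": 67.2828e6,
}


def larmor_frequency_hz(gamma, b0_tesla):
    """Magnitude of the Larmor frequency in Hz at field b0_tesla."""
    return abs(gamma) * b0_tesla / (2 * math.pi)


def dipolar_coupling_hz(gamma_i, gamma_s, r_metres):
    """Dipolar coupling constant in Hz for two spins r_metres apart."""
    return -MU0_OVER_4PI * gamma_i * gamma_s * HBAR / (2 * math.pi * r_metres ** 3)


# Example: 1H at 11.7433 T, and a 1H-13C pair 1.09 angstrom apart
proton_mhz = larmor_frequency_hz(GAMMA["1H"], 11.7433) / 1e6
ch_khz = dipolar_coupling_hz(GAMMA["1H"], GAMMA["13C"], 1.09e-10) / 1e3
print(f"1H Larmor frequency: {proton_mhz:.1f} MHz")
print(f"1H-13C dipolar coupling: {ch_khz:.1f} kHz")
```

Because the coupling scales as 1/r³, doubling the internuclear distance reduces it eightfold, which is why an adjustable distance is useful in such a table.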

RMN

RMN is a multi-dimensional signal processing application capable of handling uniformly sampled signals in an arbitrary number of dimensions. Provides a number of signal processing operations on real, complex, or multi-channel signals, such as Fourier Transform, apodization, data filling, interactive phase corrections, complex conjugate, 2D affine (translate, shear, rotate, scale) transformations. Signals can be added, subtracted, or multiplied. Two-dimensional signals can be displayed as intensity, contour, or stacked plots. RMN imports most NMR datasets from Bruker, Tecmag, JEOL, Spinsight, Varian/Agilent, and JCAMP (XYDATA only). Additionally, RMN imports image formats jpg and png, and audio format wav.
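To make two of the listed operations concrete, here is a minimal pure-Python sketch of exponential apodization followed by a discrete Fourier transform on a synthetic decaying signal. It does not use RMN or reproduce its implementation; the function names, the synthetic signal and the naive O(N²) transform are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def exponential_apodization(signal, dwell_time, line_broadening_hz):
    """Multiply each point by exp(-pi * LB * t), a common line-broadening window."""
    return [
        value * math.exp(-math.pi * line_broadening_hz * index * dwell_time)
        for index, value in enumerate(signal)
    ]


def dft(signal):
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(signal)
    return [
        sum(value * cmath.exp(-2j * math.pi * k * m / n) for m, value in enumerate(signal))
        for k in range(n)
    ]


# Synthetic complex signal: one decaying oscillation placed exactly on bin 5
n_points = 64
dwell_time = 1e-3                              # seconds between samples
frequency_hz = 5 / (n_points * dwell_time)     # 78.125 Hz
fid = [
    cmath.exp(2j * math.pi * frequency_hz * k * dwell_time) * math.exp(-k * dwell_time / 0.05)
    for k in range(n_points)
]

spectrum = dft(exponential_apodization(fid, dwell_time, line_broadening_hz=5.0))
peak_bin = max(range(n_points), key=lambda k: abs(spectrum[k]))
print("Peak found in bin", peak_bin)
```

Apodization trades resolution for signal-to-noise: a larger line broadening damps the noisy tail of the signal more strongly but widens the resulting peak.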

PhySyCalc

How is PhySyCalc different from other calculators? It allows you to include unit symbols in your calculations, obtaining the answer in the desired unit without those extra unit conversion steps. On top of this great simplification, PhySyCalc knows practically every fundamental physical constant. It even knows physical properties for elements and isotopes in the periodic table. This allows you to get numerical answers in the desired unit in a fraction of the time you'd spend on a conventional calculator. PhySyCalc is quick to learn and easy to use. Can't remember a unit symbol? PhySyCalc helps you find and append commonly used units onto a number. PhySyCalc uses a natural infix notation for calculations. This means you can enter and read through the entire expression in full before calculating the result, helping you quickly identify and fix any input errors.



What's Inside the 2019 Mac Pro? Complete Disassembly and Analysis

 

A more detailed disassembly.

Update: PCIe Expansion Cards that work in Mac Pro

So you can install an NVIDIA card, and it will work if you boot Windows via Boot Camp. Now we need drivers for macOS.

Adding additional storage

Internal Storage Options for Mac Pro – PROMISE Pegasus R4i RAID Storage MPX Module and Pegasus J2i to conveniently store and speedily access up to 32TB of data over internal buses. Available now at Apple.com and will be available soon through the Promise global network of distributors and value-added resellers. *Additional Capacities Coming Soon.

Video from AppleInsider

With the Sonnet M.2 4x4 PCIe card installed in your Mac Pro® tower; Windows® or Linux® PC desktop or server; or Thunderbolt™ to PCIe card expansion system with an available x16 PCIe slot, you can use it for instant-access media storage or as a high-performance scratch disk. Mac® users can even install macOS® on one SSD to create an incredibly fast boot drive and create a RAID 0 set with the other installed SSDs without loss of performance. However you use it, this card’s performance is impressive.

These drives will not be encrypted by the Apple T2 chip.

9to5Mac


Bluetooth Keyboards stopped connecting

 

A couple of days ago the keyboards for two of my machines stopped connecting correctly via Bluetooth. I swapped out the batteries but still no joy. I connected a wired keyboard and that works fine.

In more detail

When I switch the keyboard on it appears in the Bluetooth preferences and displays a message asking for a passcode; however, typing the numbers does not appear to register.

Screenshot 2019-12-21 at 19.49.35

If I open up the Bluetooth preferences I can see the keyboard, and if I click the "connect" button a dialog appears again but will not respond to typing in the numbers.

Screenshot 2019-12-21 at 19.50.28

I have a Bluetooth trackpad which connects fine, as do my AirPods. The fact that this has seemingly happened to two keyboards attached to two different machines is perplexing. I spent an hour at the Apple Genius Bar but no joy: the keyboards would not connect to a machine in the Apple Store either, which I guess means the issue is with both keyboards.

I have tried deleting the Bluetooth preferences, restarting, and taking out the batteries for a day…

Any other suggestions?


iFixit 2019 Mac Pro Teardown: ‘A Masterclass in Repairability’

 

Just in time for Christmas, iFixit have done a teardown of the new Mac Pro.

The new Mac Pro is a Fixmas miracle: beautiful, amazingly well put together, and a masterclass in repairability.

It also looks like it is eminently upgradeable, with details of how to upgrade RAM appearing online.

Upgrading the SSD is problematic since the modules are tied to the Apple T2 security chip, so it would presumably require Apple or an authorised retailer.


International Fortran Standards

 

I see that Fortran has now joined Twitter (@fortranlang), a little late to the party but welcome.

A few more links that might be of interest.

Fortran Standard Library, International Fortran Standards, Fortran Proposals.

This repository contains proposals for the Fortran Standard Committee in the Issues section. The idea for this repository is to act as a public facing discussion tool to collaborate with the user community to gather proposals for the Fortran language and systematically track all discussions for each proposal.

All added to the Fortran on a Mac page.


OpenEye Applications v2019.Nov release

 

OpenEye is pleased to announce the release of OpenEye Applications v2019.Nov

HIGHLIGHTS

  • SPRUCE, a new application for preparing biomolecular structures for modeling applications, is now available in the OpenEye applications bundle.
  • SZMAP now provides a simpler but enhanced workflow, using the newly released SPRUCE technology for structure preparation.
  • SMIRNOFF, a small molecule force field from the Open Force Field Initiative, is now integrated into SZYBKI.

This is the last release to support macOS 10.12. Full support for macOS 10.15 will be added in the next release.



DIRAC19 released

 

DIRAC : Program for Atomic and Molecular Direct Iterative Relativistic All-electron Calculations

The DIRAC program computes molecular properties using relativistic quantum chemical methods. It is named after P.A.M. Dirac, the father of relativistic electronic structure theory.

  • EOMCC - core excitation and ionization energies via core-valence separation using projectors in RELCC (Avijit Shee, Andre Gomes, Marta Lopez Vidal). Manual: see keywords under "*CCPROJ"
  • Python interface of DIRAC with Openfermion (Bruno Senjean) to perform relativistic quantum chemistry calculations simulated on a quantum computer.
  • Nuclear Spin-Rotation tensors. Contributors: I. Agustin Aucar and Trond Saue. Reference: I. A. Aucar, S. S. Gómez, M. C. Ruiz de Azúa, and C. G. Giribet, Theoretical study of the nuclear spin-molecular rotation coupling for relativistic electrons and non-relativistic nuclei. J. Chem. Phys. 136 (2012) 204119. Manual: ".SPIN-R" Tutorial: Nuclear spin-rotation constants.
  • Nuclear Magnetic-Quadrupole-Moment interaction constant in KRCI (Malaya K. Nayak). Reference: T. Fleig, M. K. Nayak and M. G. Kozlov, TaN, a molecular system for probing P,T-violating hadron physics. Phys. Rev. A 93 (2016) 012505.

Mac/Unix install

DIRAC is configured using CMake, typically via the setup script, and subsequently compiled using make (or gmake). The setup script is a useful front-end to CMake.



Fortran Courses

 

I have limited Fortran experience but the Fortran on a Mac page is always one of the most popular pages on the site. I was recently talking to someone about training courses and they mentioned two that might be of interest. I've added them to the Fortran on a Mac page.

Fortran Course run at ETH

FORTRAN 95 is a modern programming language that is specifically designed for scientific and engineering applications. This course gives an introduction to programming in this language, and is suitable for students who have only minimal programming experience, for example with MATLAB scripts. The focus will be on Fortran 95, but differences to Fortran 77 will also be covered for those working with already-existing codes.

Fortran course run by NAG

This two day practical hands-on workshop is aimed at Fortran programmers who want to write modern code, or to modernize existing codes, to make it more readable and maintainable by encouraging good software engineering practices. This workshop will also present how to integrate tools and techniques for Fortran codes to help you develop sustainable software for your scientific and academic research particularly in a collaborative environment. Overall, the aim is to make you a better and more productive computational scientist by improving your applied computer science skills that are directly relevant to computational science.


MyChem cheminformatics extension for MySQL and MariaDB

 

After being dormant for a while this project seems to have come back to life.

Mychem is a chemoinformatics extension for MySQL and MariaDB released under the GNU GPL license. It provides a set of functions that permits to handle chemical data within the database. These functions permit to search, analyze and convert chemical data. It is based on Open Babel.

A complete documentation for Mychem is available online and will give a good overview of its capabilities.

https://mychem.github.io/docs/

I'd be interested to hear if anyone has installed it under Mac OS X.



A Vortex script to calculate the Blood-Brain Barrier (BBB) SCORE

 

A recent publication described "The Blood–Brain Barrier (BBB) Score" (DOI), a scoring function to determine the likelihood of a molecule being brain penetrant.

Since I'm often asked about improving CNS penetration it seemed useful to implement the algorithm in a Vortex script.

There are more details and a download link here.

The data for over 1000 examples are provided in the supplementary information, including both CNS penetrant and non-penetrant compounds. The plot below compares the data from the supplementary information (SuppInf_BBBscore) with the data calculated by this implementation in the Vortex script. Whilst overall there is good agreement, there appear to be a few outliers. On closer investigation many of the differences appear to be due to differences in the calculated TPSA. Since both implementations use the same ChemAxon software, it is possible that an updated version (I used version 19.8.0) has resulted in the differences.

BBBscore


Updated Amsterdam Modeling Suite 2019.3

 

An update to Amsterdam Modeling Suite (version 2019.3) has been released.

We thank all our external developers and collaborators across the globe for their ongoing support and efforts. Together we keep building on the user-friendly AMS platform, with the powerful compute engines tied together by the central AMS driver, the graphical interface, and a Python workflow environment. AMS2019.3 provides useful productivity enhancements for many applications and research projects.

Structure and Reactivity

  • Microkinetics calculations from the GUI using Filot’s MKMCXX program
  • Reaction path & TS search with CI-NEB with all engines
  • Apply non-isotropic external stress to periodic systems
  • Quick validation of transition states
  • Energy decomposition analysis and ETS-NOCV with real unrestricted fragments

Molecular Dynamics

  • Accelerate MD runs with Temperature Replica Exchange
  • Non-equilibrium molecular dynamics (NEMD) for thermal conductivity
  • New analysis tools: radial distribution functions, histograms, temperature profiles
  • Faster visualization of trajectories for large systems

Faster Simulations

  • Automatic double parallelization speeds up certain calculations by orders of magnitude
  • Much faster periodic DFTB (including GFN1-xTB), analytic stress tensors
  • Run calculations in the Amazon Web Services (AWS) cloud directly from ADFJobs
  • Fast vibrationally resolved electronic spectra and resonance Raman

New Accurate Methods

  • Double hybrid and MP2 energy calculations for molecules with hundreds of atoms
  • Latest Grimme D4 dispersion corrections
  • Implicit solvation model GBSA for molecular DFTB and GFN-xTB

New Interfaces

  • New graphical user interface to VASP
  • Python scripting with COSMO-RS
  • Efficient pipe interface for AMS with external codes

The Python scripting is pleasing to see; Python is becoming the lingua franca and is probably an essential component of any software package. There is more detail on the Python support here.

The latest update works under Catalina. However, there are still some minor issues (e.g. a VTK bug when maximizing windows); these should be sorted soon.


Ensemble learning in Cheminformatics

 

Yet another invaluable post on cheminformatics and machine learning: Python package for Ensemble learning #Chemoinformatics #Scikit learn.

Ensemble learning sometimes outperforms a single model, so it is useful to try the method. Fortunately, we can now use ensemble learning very easily with a Python package named ‘mlens’.

Install using PIP

pip install mlens

ML-Ensemble (mlens) is an open-source high performance ensemble learning package written in Python, code is available on GitHub https://github.com/flennerhag/mlens.

ML-Ensemble combines a Scikit-learn high-level API with a low-level computational graph framework to build memory efficient, maximally parallelized ensemble networks in as few lines of codes as possible.
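The claim that an ensemble can outperform any single model can be illustrated with a toy majority vote. The sketch below deliberately avoids the mlens API (which also supports more elaborate schemes such as stacking); the hard-coded predictions and the helper names are invented for this example and do not come from the linked post.

```python
from collections import Counter


def majority_vote(per_model_predictions):
    """Combine several models' label predictions sample by sample."""
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*per_model_predictions)
    ]


def accuracy(predicted, truth):
    """Fraction of samples where the prediction matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical binary labels and three imperfect models whose errors do not overlap
truth = [1, 0, 1, 1, 0, 1]
model_a = [0, 0, 1, 1, 0, 1]   # wrong on sample 0
model_b = [1, 0, 0, 1, 0, 1]   # wrong on sample 2
model_c = [1, 0, 1, 1, 1, 1]   # wrong on sample 4

ensemble = majority_vote([model_a, model_b, model_c])
for name, predictions in [("A", model_a), ("B", model_b), ("C", model_c), ("vote", ensemble)]:
    print(name, round(accuracy(predictions, truth), 3))
```

The vote only helps here because the three models make their mistakes on different samples; models that fail on the same samples gain nothing from being combined.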



Poll results: How do you pronounce zsh?

 

A week ago I posted a poll asking how to pronounce zsh, and the results are in.

zshPollresults

Well, the winner is Zeeshell (pronounced like seashell), however this is clearly not unanimous.

Several readers also pointed out this thread on StackExchange, "What are the practical differences between Bash and Zsh?", which contains lots of useful information.

This book might also be useful: Moving to zsh.


Schrödinger Software Release 2019-4

 

A nice video showing the new/updated features in the latest release https://www.schrodinger.com/platform.

Not yet certified for Catalina.


Mol*: fast, interactive 3D visualisation of biomolecules in your browser

 

The Protein Data Bank in Europe (PDBe) has just introduced a new viewer for biomolecules. Mol* is a new 3D molecular viewer developed in collaboration between RCSB PDB, PDBe, and CEITEC, and is available from RCSB PDB and PDBe pages.

On the PDBe pages, Mol* has now replaced the LiteMol viewer in all instances, including on the PDBe search and entry pages e.g. pdbe.org/5lnk/3d. Access Mol* from any 3D View tab for Structure Summary pages at RCSB.org. Both the LiteMol and NGL viewers at PDBe and RCSB PDB, respectively, will no longer be actively developed.

Here is the structure of Aldehyde Oxidase PDB ID 4uhw.

AO_Molviewer


Poll: How do you pronounce zsh?

 

The poll results are here.

As people migrate to Catalina there is an option to update your default shell.

zsh is the new default shell for new users (bash is the default shell in macOS Mojave and earlier), so if you are upgrading you may want to change your default shell to zsh.

Paul Falstad wrote the first version of Zsh in 1990 while a student at Princeton University. The name zsh derives from the name of Yale professor Zhong Shao (then a teaching assistant at Princeton University); Paul Falstad regarded Shao's login-id, "zsh", as a good name for a shell.

You might also like to look at oh-my-zsh

A delightful community-driven (with 1,300+ contributors) framework for managing your zsh configuration. Includes 200+ optional plugins (rails, git, OSX, hub, capistrano, brew, ant, php, python, etc), over 140 themes to spice up your morning, and an auto-update tool that makes it easy to keep up with the latest updates from the community.

It was pretty obvious how to pronounce "Bash" (the shell's name is an acronym for Bourne-again shell, a pun on the name of the Bourne shell that it replaced), but what about "zsh"?

This book might also be useful: Moving to zsh.


Chemical Information and Computer Applications Group

 

RSC-Group-Logo-Chemical-Information-and-Computer-Applications

My Royal Society of Chemistry annual subscription letter just arrived. This is a good time to think about Interest Group membership. If you are a member of the RSC you can join several interest groups for free; disappointingly, however, many people fail to take advantage of this opportunity.

Interest Groups are scientific networks run by members for their community. Each group is themed around a specific area or application of the chemical sciences. They organise an annual series of events to cater for both their members and the wider scientific community. These events vary from: multi-day conferences and workshops to training events.

The Chemical Information and Computer Applications Group (CICAG) is one such group (Group number 86), you can find out more on the CICAG website.

The storage, retrieval, analysis and preservation of chemical information and data are of critical importance for research, development and education in the chemical sciences. All chemists, and everybody else who works with chemical substances, need tools and techniques for handling chemical information.

CICAG works to:

  • Support users of chemical information, data and computer applications and advance excellence in the chemical sciences;
  • Inform RSC members and others of the latest developments in these rapidly evolving areas;
  • Promote the wider recognition of excellence in chemical information and computer applications at this level.

This year CICAG has been involved in a workshop on Computational Tools for Drug Discovery, the 2nd Conference on Artificial Intelligence in Chemistry, and the 20 Years of Rule of Five Meeting. In addition, CICAG has supported the Sheffield Cheminformatics Meeting, Dial a Molecule, the Artificial Intelligence and Augmented Intelligence for Automated Investigations for Scientific Discovery network and the In Silico Toxicology Networking Meeting, as well as other outreach activities.

RSC CICAG publish a regular Newsletter which keeps members in touch with the Group's activities and includes articles, reviews of interest, news and events.

If you would like to join the group you can do so by adding interest group 86 to the last page of the renewal form and returning it to the RSC or you can make a request to join a group via email (membership@rsc.org) or telephone (01223 432141).


CSFP - A New Molecular Fingerprint

 

An interesting paper in JCIM, Connected Subgraph Fingerprints: Representing Molecules Using Exhaustive Subgraph Enumeration DOI.

The very popular ECFP fingerprint enumerates all circular substructures, i.e. substructures with a central atom and a spherical extension around them. However, not all chemically reasonable substructures are circular. They could be shaped as paths, cycles or any other irregular form and consequently cannot be represented as single features in ECFPs. To overcome these limitations, we developed a novel algorithm named CONSENS systematically enumerating all connected substructures within given size limits. CONSENS is the central element of a novel fingerprint named CSFP - Connected Subgraph Fingerprint. CSFPs are not only richer in represented substructures, furthermore they allow fine-grained control of the chemical model encoded.

ConsensLib is a header-only C++ library for efficient enumeration of connected induced subgraphs. The CONSENS (Connected Subgraph Enumeration Strategy) algorithm enumerates all node sets that form a connected subgraph of a given query graph. It is available on GitHub https://github.com/rareylab/ConsensLib.
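To show what "enumerating all node sets that form a connected subgraph" means in practice, here is a small Python sketch. It is not the CONSENS algorithm and not the ConsensLib API; it is a simple layer-by-layer growth with de-duplication, and the function name and adjacency-dictionary input are assumptions made for this illustration.

```python
def connected_subgraphs(adjacency, max_size):
    """Return all node sets (as frozensets) of up to max_size nodes
    that induce a connected subgraph of the given graph."""
    found = set()
    # Every single node is a connected subgraph of size 1
    frontier = {frozenset([node]) for node in adjacency}
    while frontier:
        found |= frontier
        grown = set()
        for nodes in frontier:
            if len(nodes) >= max_size:
                continue
            # Adding any neighbouring node keeps the node set connected
            neighbours = set().union(*(adjacency[n] for n in nodes)) - nodes
            for neighbour in neighbours:
                grown.add(nodes | {neighbour})
        # Storing frozensets in a set removes duplicates reached by different routes
        frontier = grown - found
    return found


# Heavy-atom graphs: a three-atom chain (e.g. C-C-O) and a three-membered ring
chain = {0: {1}, 1: {0, 2}, 2: {1}}
ring = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}

for name, graph in [("chain", chain), ("ring", ring)]:
    subgraphs = connected_subgraphs(graph, max_size=3)
    print(name, sorted(sorted(s) for s in subgraphs))
```

The chain gives six node sets and the ring seven, the extra one being the pair of end atoms that are bonded only in the ring. This naive approach builds the same set many times before discarding duplicates, which is the kind of overhead a dedicated enumeration strategy aims to avoid.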


Modin for distributed Pandas calculations

 

Modin is a library designed to accelerate Pandas by automatically distributing the computation across all of the system’s available CPU cores. Modin uses Ray to provide an effortless way to speed up your pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing pandas code. Even using the DataFrame constructor is identical. Modin is a DataFrame designed for datasets from 1MB to 1TB+

It can be installed using PIP

pip install modin

If you don't have Ray or Dask installed, you will need to install Modin with one of the targets:

pip install modin[ray] # Install Modin dependencies and Ray to run on Ray
pip install modin[dask] # Install Modin dependencies and Dask to run on Dask
pip install modin[all] # Install all of the above

Currently, Modin depends on pandas version 0.23.4.

I've added Modin to the Open Source Data Science Python Libraries.


Custom Accessibility Keyboard Panels

 

Anyone who is interested in AppleScript or Automator workflows on a Mac will have heard of Sal Soghoian. Sal joined Apple Inc. in January 1997 to serve as the Product Manager of Automation Technologies. These technologies include AppleScript, Services, the Terminal, Apple Configurator and Automator, among others. Since he left Apple he has continued to develop automation tools and describe them on his website User Automation.

This website is dedicated to informing individuals about the tools at their disposal that can be used by them to control the devices they engage with and rely upon every day. I hope you find this information useful. — Sal Soghoian

In a recent video Sal describes how to use an accessibility feature hidden deep within macOS and turn an iPad into a completely customizable control panel for a Mac.

Full details are here https://userautomation.com/article/blog0003.html.

userautomation

It requires a Luna Display dongle since Apple’s Sidecar technology won’t work because it doesn’t allow touch input on the iPad when it’s acting as a Mac screen.


CellPAINT

 

Just saw an interesting article "CellPAINT: Interactive Illustration of Dynamic Mesoscale Cellular Environments" DOI.

Integrative computational modeling is currently the method of choice for studying the detailed mesoscale molecular structure of cellular environments. However, current methods are highly compute intensive and require extensive and diverse domain knowledge. We have developed an interactive mesoscale illustration method, cellPAINT, that allows non-expert users to create mesoscale models that integrate a variety of biological data. CellPAINT uses the approach of popular digital painting software, providing users with a palette of “brushes” to paint molecules and infrastructure into a mesoscale scene, and coloring tools and visual filters to customize the rendering. CellPAINT also incorporates a variety of mesoscale properties, such as an interactive temperature slider that controls diffusive motion and interaction of proteins within membranes. The current release allows creation of scenes with an HIV virion, blood plasma, and a simplified T-cell.

cellpaint

The software can be downloaded from Sourceforge


macOS Installation

 

Just had news of an update to a popular book macOS Installation, useful reading if you have to admin multiple Macs.

With the introduction of macOS High Sierra and Secure Boot on the iMac Pro, Apple has profoundly changed the workflows for installing and deploying macOS on a large scale. In addition new technologies and services like MDM, DEP and VPP need to be configured and used correctly. This book explains all the different terms, services and technologies and suggests workflows for Administrators to deploy and manage macOS in education, business, enterprise and other organisations in this new “post-imaging” world.



OpenEye Toolkits v2019.Oct

 

OpenEye have announced the release of OpenEye Toolkits v2019.Oct. These libraries include the usual support for C++, C#, Java, and Python.

HIGHLIGHTS

  • Spruce TK, a new toolkit for preparing biomolecular structures for modeling applications, is now available in both C++ and Python APIs. Full details of the methodology are here.
  • OEChem TK now provides improved substructure search capability, allowing users to search tens of millions of molecules in seconds.
  • SMIRNOFF, a small molecule force field from the Open Force Field Initiative, is now integrated into OEFF TK. The force field can handle almost all pharmaceutically relevant chemical space

Full details are in the release notes.

This is the last release to support macOS 10.12.


Mobile Science Updated

 

I've just finished adding a few more entries to the Mobile Science Site; there are now over 250 entries covering all areas of science.

The most popular seem to be:

MobSci


Applescript tools

 

macosxautomation released a couple of upgrades. FileManagerLib 2.3.2 now has commands for dealing with date properties, while RegexAndStuffLib 1.0.6 accepts lists of strings for its regex change and regex batch commands.

In addition, an update to Script-Debugger has been released. This release also introduces a series of changes to improve compatibility with macOS Catalina (10.15).


StarDrop, version 6.6 released

 

Optibrium have just released StarDrop version 6.6. This update includes:

pKa prediction - A new model included in the ADME QSAR module. Existing ADME QSAR users can upgrade free of charge. Details of this were presented by Peter Hunt and the webinar can be accessed here.

stardropPka

SeeSAR™ modules - An extended suite of SeeSAR™ modules to support structure-based design:

  • View – Visualise protein-ligand interactions in 3D.
  • Affinity – Analyse your ligand’s affinity with visual atomic contributions and torsion angle heat maps.
  • Pose – Generate compound poses for virtual screening and interactive 3D design.


Visualizing 3D molecular structures using an augmented reality app

 

Another workflow has been published "Visualizing 3D molecular structures using an augmented reality app" https://chemrxiv.org/articles/Visualizing3DMolecularStructuresUsinganAugmentedRealityApp/10031888

I've added it to the How to Page.




RDKit 2019_09_1 (Q3 2019) Release

 

A new version of RDKit has been released https://github.com/rdkit/rdkit/releases/tag/Release201909_1.

Highlights:

  • The substructure matching code is now about 30% faster. This also improves the speed of reaction matching and the FMCS code.
  • A minimal JavaScript wrapper has been added as part of the core release.
  • It's now possible to get information about why molecule sanitization failed.
  • A flexible new molecular hashing scheme has been added.

There are however a number of backward incompatible changes detailed in the documents.

Also the old MolHash code should be considered deprecated. This release introduces a more flexible alternative.

Binaries have been uploaded to anaconda.org (https://anaconda.org/rdkit). The available conda binaries for this release are:

  • Linux 64bit: python 3.6, 3.7
  • Mac OS 64bit: python 3.6, 3.7
  • Windows 64bit: python 3.6, 3.7

Some things that will be finished over the next couple of days:

  • The conda build scripts will be updated to reflect the new version
  • The homebrew script


A handy guide to financial support for open source

 

I've previously written about my thoughts on the sustainability of scientific software and tried to help publicise by compiling a listing of open-source cheminformatics toolkits and Open Source Python Data Science Libraries.

I've just come across this document that may be of interest: A handy guide to financial support for open source.

This document aims to provide an exhaustive list of all the ways that people get paid for open source work. Hopefully, projects and contributors will find this helpful in figuring out the best options for them.

Well worth a read and please share.


Chemistry & Elements

 

As I mentioned previously the United Nations General Assembly during its 74th Plenary Meeting proclaimed 2019 as the International Year of the Periodic Table of Chemical Elements (IYPT 2019) on 20 December 2017. The IYPT website gives details of events and you can find out more by looking for the hashtag #IYPT2019.

As you might expect this has led to the development of a number of Periodic Table apps and I've highlighted a couple of mobile apps previously.

I've just been told of another app.

Chemistry & Elements is available on the AppStore.

Convenient interactive Mendeleev's Periodic table. Tap a chemical element in the table to find more information about it. The calculator of molar masses. Enter a chemical compound correctly and it will show molar masses and percentages of elements. The table of solubility of substances is added in the app as well as Acid strength chart. Now your textbooks become waste!

All these tables and charts are available in the app for free:

  • Periodic table
  • Offline access to information about chemical elements
  • Solubility table
  • Molar mass calculator
  • Electronegativity of elements
  • Molecular weights of organic substances
  • Reactivity series
  • Acid strengths chart
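As an illustration of what a molar mass calculator of this kind does, the sketch below parses a simple formula and reports the molar mass and the mass percentage of each element. It is not the app's code; the function names are invented, only a handful of rounded atomic masses are included, and brackets, hydrates and charges are deliberately not handled.

```python
import re

# Rounded standard atomic masses in g/mol (a small illustrative subset)
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "Na": 22.990, "Cl": 35.45}


def parse_formula(formula):
    """Turn a flat formula such as 'C2H6O' into {'C': 2, 'H': 6, 'O': 1}."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"no atomic mass stored for element {symbol!r}")
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


def molar_mass(formula):
    """Molar mass in g/mol for a flat formula."""
    return sum(ATOMIC_MASS[s] * n for s, n in parse_formula(formula).items())


def mass_percentages(formula):
    """Mass percentage of each element in the compound."""
    total = molar_mass(formula)
    return {s: 100 * ATOMIC_MASS[s] * n / total for s, n in parse_formula(formula).items()}


for formula in ("H2O", "C2H6O", "NaCl"):
    percentages = ", ".join(f"{s} {p:.1f}%" for s, p in mass_percentages(formula).items())
    print(f"{formula}: {molar_mass(formula):.3f} g/mol ({percentages})")
```

Rejecting anything the parser does not understand, rather than guessing, mirrors the app's requirement that the compound be entered correctly.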



Getting ready for Catalina

 

Whilst there are many sites that track the compatibility of common desktop applications, it is often difficult to find information about scientific applications. Based on the number of page views on the lists for Mojave, High Sierra, Sierra, El Capitan and Yosemite, such a list is apparently a useful resource.

I'll start compiling a list over the weekend but I thought I'd mention a couple of things. One of the key features is that 32-bit applications will no longer be supported, Apple has a page describing the reasons for moving to 64-bit.

Apple's transition to 64-bit technology is now complete. Starting with macOS Catalina, 32-bit apps are no longer compatible with macOS. If you have a 32-bit app, please check with the app developer for a 64-bit version.

The easiest way to check for 32-bit apps is to click on the Apple icon (top left of screen) and select "About this Mac", then click the system report button. Then select "Applications" and details of 32/64-bit are in the rightmost column.

GetReadyCatalina

When you attempt to open a 32-bit app, you will see an alert that the app is not optimized for your Mac, or that the developer needs to update it to work with this version of macOS. There may also be drivers that need to be updated.

TechRadar has also been tracking installation problems.

As ever, feel free to send me any information on scientific applications under Catalina and I'll add them to the list.


ModelAR added to Mobile Science site

 

ModelAR is a powerful 3D modeling tool for students looking to practice organic chemistry. You can explore chemical structures by creating a molecule on the workspace and quickly toggle to pop into AR. This Augmented Reality feature allows you to interact with virtual molecules in real space. ModelAR brings chemistry to life.


BBEdit 13 Now Available

 

Everyone's favourite text editor has been updated: BBEdit 13 Now Available! (October 3, 2019)

This major release offers over a hundred feature additions, changes, and refinements, including Pattern Playgrounds, the Grep Cheat Sheet, enhanced Dark Mode support, and readiness for macOS Catalina. Read all about it!

BBEdit 13 is a paid upgrade (US$29.99 or US$39.99) for all licensed customers with BBEdit 12.6.7 or earlier.

BBEdit 13 is a free upgrade for all licensed BBEdit 12 customers who purchased their license on or after May 1, 2019.

If you have been using BBEdit 11 or 12 in Free Mode, BBEdit 13.0 will restart your evaluation period. You'll have access to all of BBEdit 13's features for 30 days, after which BBEdit will return to feature-limited operation.


Comments

ChemDraw and Microsoft Office

 

There has been a Twitter discussion on round-trip editing between ChemDraw and Office 365, with a little confusion between the desktop and web versions.

Philip Skinner summarised things for the current versions.

To clarify. Copy/paste of editable structures into MS Office works on PC and Mac. Copy/paste of editable structures into MS Office Online does not work on PC nor on Mac

The same is true for pretty much all chemical drawing packages.


Comments

Fortran on a Mac

 

I've done another update to the Fortran on a Mac page.

Added a number of open-source comp chem packages.

Many thanks to Sebastian Ehlert for highlighting dftd4.


Comments

Installing RDKit using Homebrew

 

I just saw this message on the RDKit users message board, which offers a method to install RDKit using Homebrew. I use Anaconda to install RDKit, so I've not tested it.

Recently, I updated the brew install recipe for rdkit on Mac. The biggest change is that boost and boost-python's versions were pinned down, so that the brew install recipe should be much more reproducible than before. Here is a fail-safe way to install rdkit with it (with Python wrappers, and InChI support):

I've added the instructions to the Cheminformatics on a Mac page as an alternative to using Anaconda to install RDKit.

The RDKit is an open source toolkit for cheminformatics, 2D and 3D molecular operations, descriptor generation for machine learning, etc.



Comments

Wolfram|Alpha updated

 

Everyone's favourite mobile scientific search app has been updated.

Wolfram|Alpha has been updated

Use the power of Wolfram's computational intelligence to answer your questions. Building on 30+ years of development led by Stephen Wolfram, Wolfram|Alpha is the world's definitive source for instant expert knowledge and computation.

There are many more apps on the Mobile Science Site; simply use the "Search" box.



Comments

A complete guide to K-means clustering algorithm

 

A little while back I compared different Options for Clustering large datasets of Molecules.

Clustering is an invaluable cheminformatics technique for subdividing a typically large compound collection into small groups of similar compounds. One of the advantages is that once clustered you can store the cluster identifiers and then refer to them later, which is particularly valuable when dealing with very large datasets. This is often used in the analysis of high-throughput screening results, or the analysis of virtual screening or docking studies.

One popular (and quick) technique for clustering is to use K-means clustering. I just came across this very useful explanation of K-means clustering, well worth a read.

A complete guide to K-means clustering algorithm.
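To make the idea concrete, here is a minimal pure-Python sketch of the standard K-means loop (Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its members, and repeat until the assignments stop changing. The function names and the tuple-of-floats input are assumptions for illustration only; for real compound collections the points would be descriptor vectors and you would normally use an optimised library implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm on a list of tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
    return centroids, labels
```

Calling `kmeans(points, k=2)` on a list of 2D tuples returns the final centroids and one cluster label per point. The result depends on the starting centroids, which is why the seed is exposed as a parameter.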


Comments

Benchmark set for relative free energy calculations.

 

Free energy perturbation (FEP) is a method that is used in computational chemistry for computing free energy differences from molecular dynamics or Metropolis Monte Carlo simulations and used in a wide variety of applications.

FEP calculations have been used for studying host–guest binding energetics, pKa predictions, solvent effects on reactions, and enzymatic reactions. Other applications include the virtual screening of ligands in drug discovery, as well as in silico mutagenesis studies. For the study of reactions it is often necessary to involve a quantum-mechanical (QM) representation of the reaction center because the molecular mechanics (MM) force fields used for FEP simulations can't handle breaking bonds.
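The free energy difference at the heart of FEP is usually written as the Zwanzig exponential average, ΔF(A→B) = −kT ln⟨exp(−(U_B − U_A)/kT)⟩, where the average runs over configurations sampled from state A. The sketch below evaluates that estimator for a list of energy differences; the function name and the plain-list input are assumptions for illustration, and production FEP codes use more robust estimators and many intermediate states.

```python
import math


def zwanzig_free_energy(delta_u, kT):
    """Exponential-averaging estimate of F_B - F_A.

    delta_u: values of U_B(x) - U_A(x) for configurations x sampled from state A.
    kT: thermal energy in the same units as delta_u.
    """
    # Shift by the minimum before exponentiating to limit overflow/underflow.
    shift = min(delta_u)
    mean_boltzmann = sum(math.exp(-(du - shift) / kT) for du in delta_u) / len(delta_u)
    return shift - kT * math.log(mean_boltzmann)
```

With energies in kcal/mol, kT is roughly 0.593 at 298 K. Because of the exponential weighting, the estimate is dominated by the lowest energy differences in the sample and is never larger than the plain mean of `delta_u`.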

In recent years this technique has gained popularity for predicting binding interactions in drug discovery, and it is great that Merck have made a benchmark dataset available on GitHub https://github.com/MCompChem/fep-benchmark. Details are described here: DOI. Eight different targets are included, together with 200 ligands.

CDK8, c-Met, Eg5, Hif2a, PFKFB3, SHP2, SYK, TNKS2


Comments

SimplyFortran

 

SimplyFortran 3.5 has been released.

This release brings major upgrades to the overall package on Windows and macOS, and adds numerous minor features and bug fixes to the development environment. The GNU Fortran compiler has been updated to version 9.1.0 on Windows and macOS, which incorporates asynchronous input and output for Fortran code on these platforms. Simply Fortran's compiler on Windows handles asynchronous operations using native threading unlike the default configuration, simplifying the build process and eliminating possible external dependencies in compiled Fortran code. The integrated development environment now includes the option to manually trigger autocomplete rather than automatically popping up autocompletion lists. This minor change makes typing considerably more efficient in many cases when autocompletion is unnecessary. The ability to export a simple CMake instruction file has been added to the development environment's "Export" submenu. A possible crash due to an invalid memory access when creating a Makefile has been corrected.

I've included this on the Fortran on a Mac page and used the opportunity to update the page. I'm not a big Fortran user so if anyone knows of anything that could be included please drop me a line.


Comments

Durrant Lab Software

 

A reader recently pointed out BlendMol, part of a suite of software tools developed by the Jacob Durrant Lab.

BlendMol is a Blender plugin that can easily import VMD 'Visualization State' and PyMOL 'Session' files. BlendMol empowers scientific researchers and artists by marrying molecular visualization and industry-standard rendering techniques. The plugin works seamlessly with popular analysis programs (i.e., VMD/PyMOL). Users can import into Blender the very molecular representations they set up in VMD/PyMOL.

This looks like a very interesting open-source project available on GitHub. Looking at the software page https://durrantlab.pitt.edu/durrant-lab-software/, I see there are a number of other interesting packages as well.

Dimorphite-DL adds hydrogen atoms to molecular representations, as appropriate for a user-specified pH range. It is a fast, accurate, accessible, and modular open-source program for enumerating small-molecule ionization states.

Gypsum-DL is a free, open-source program that converts 1D and 2D small-molecule representations (SMILES strings or flat SDF files) into 3D models. It outputs models with alternate ionization, tautomeric, chiral, cis/trans isomeric, and ring-conformational states.

PCAViz is an open-source Python/JavaScript toolkit for sharing and visualizing MD trajectories via a web browser. To encourage use, an easy-to-install PCAViz-powered WordPress plugin enables ‘plug-and-play’ trajectory visualization.

Scoria is a Python package for manipulating three dimensional molecular data. Unlike similar packages, Scoria is written in pure Python and so requires no dependencies or installation. One can incorporate the Scoria source code directly into their own programs. But Scoria is not designed to compete with other similar packages. Rather, it complements them. Our package leverages others (e.g., NumPy, SciPy, MDAnalysis), if present, to speed and extend its own functionality.

Looks like a great resource.


Comments

Crowdfunding software development

 

Some time ago I wrote a piece on my thoughts on scientific software development. I got a lot of very positive feedback, and one of the comments, about not knowing which cheminformatics toolkits are available, led me to create a page on open source toolkits. However, this did not really address the underlying problem of how to fund specialist scientific software.

This is why I was intrigued to hear about Andrew Dalke's efforts to crowdfund the development of open source cheminformatics software.

This is an experiment to see if a crowdfunding consortium can be used to fund the matched molecular pair program “mmpdb”. The deadline to join is 1 February 2020!

The project is mmpdb; the initial work was described in an article in JCIM, "mmpdb: An Open-Source Matched Molecular Pair Platform for Large Multiproperty Data Sets" DOI.

Here we present mmpdb, an open-source matched molecular pair (MMP) platform to create, compile, store, retrieve, and use MMP rules. mmpdb is suitable for the large data sets typically found in pharmaceutical and agrochemical companies and provides new algorithms for fragment canonicalization and stereochemistry handling. The platform is written in Python and based on the RDKit toolkit.

Go over to the project page http://mmpdb.dalkescientific.com to find out more and if you can contribute please do, and also please share the link. He will be talking at the RDKit UGM #rdkitugm2019 and the presentation will probably be online later.


Comments

Mac backup options

 

We now have so much of our digital life on a hard drive including photos, music, emails from friends and family. In addition it is always worth having a current backup prior to a major OS upgrade, and with macOS Catalina on the horizon now would be a good time to review.

This page reviews backup options including external or network hard drives, and cloud storage, in addition to the software tools available.

Backup Options for Mac.



Comments

BBEdit Update

 

Everybody's favourite text editor BBEdit has been updated.

BBEdit 12.6.7 contains fixes for reported issues.

BBEdit 12.6.7 does not add any new features. (It doesn't take away any, either.)

Made a change to ask the OS-provided print panel to place the page attribute controls (orientation, scaling, paper size) in the panel proper, rather than hiding them behind the "Page Attributes" section in the popup menu.

Plus several bug fixes


Comments

Python and CUDA

 

After my last post on Macs and CUDA I was sent a link to CuPy, a library supported by NVIDIA that allows you to run CUDA code easily in Python using NumPy arrays as input.

CuPy's interface is highly compatible with NumPy; in most cases it can be used as a drop-in replacement. All you need to do is just replace numpy with cupy in your Python code. It supports various methods, indexing, data types, broadcasting and more.

To install

pip install cupy

Note

The latest version of cuDNN and NCCL libraries are included in binary packages (wheels). For the source package, you will need to install cuDNN/NCCL before installing CuPy, if you want to use it.

Or you can install versions specific to the particular CUDA environment. Full details are on GitHub https://github.com/cupy/cupy.
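As a small illustration of the drop-in idea, the sketch below writes the array code once against a module alias `xp`, which is CuPy when a usable GPU is present and NumPy otherwise. The alias, the helper names and the fallback logic are assumptions for illustration, not something taken from the CuPy documentation.

```python
import numpy as np

try:
    import cupy as cp
    cp.zeros(1)  # fails here if no usable CUDA device is present
    xp = cp
except Exception:
    xp = np  # fall back to the CPU so the sketch still runs


def normalised_rows(matrix):
    """Scale each row of a 2D array to unit Euclidean length."""
    arr = xp.asarray(matrix, dtype=xp.float64)
    norms = xp.linalg.norm(arr, axis=1, keepdims=True)
    return arr / norms


def to_host(arr):
    """Return a NumPy array whether arr lives on the GPU or the CPU."""
    return cp.asnumpy(arr) if xp is not np else np.asarray(arr)
```

`to_host(normalised_rows([[3.0, 4.0], [0.0, 2.0]]))` gives the same NumPy result on either backend. Copying back to the host is the step to keep an eye on, since transfers between GPU and CPU memory can easily outweigh the gain on small arrays.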

Comments

Macs and CUDA

 

One of the highlights for me at the recent 2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry in Cambridge was the work of Adrian Roitberg and Olexandr Isayev et al on Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning DOI.

Here we train a general-purpose neural network potential (ANI-1ccx) that approaches CCSD(T)/CBS accuracy on benchmarks for reaction thermochemistry, isomerization, and drug-like molecular torsions. This is achieved by training a network to DFT data then using transfer learning techniques to retrain on a dataset of gold standard QM calculations (CCSD(T)/CBS) that optimally spans chemical space. The resulting potential is broadly applicable to materials science, biology, and chemistry, and billions of times faster than CCSD(T)/CBS calculations.

The presentation was really compelling and looks like an example where AI can be truly transformational. The good news is the code is all freely available on GitHub https://github.com/isayev/ASE_ANI; the bad news is that it "Works only under Ubuntu variants of Linux with a NVIDIA GPU" and the Python binaries are built for Python 3.6 and CUDA 9.2.

In the past I would have stopped there, but with the increasing number of external GPUs and an NVIDIA CUDA Installation Guide for Mac OS X, I'm wondering if there might be a path forward. I'd be very interested to hear about experiences with external GPUs with NVIDIA graphics cards and using the CUDA toolkit on a Mac.

Update

Olexandr emailed me to mention they have a pure Python version, https://github.com/aiqm/torchani. This will run on a Mac; however, there is no GPU acceleration.

TorchANI is a pytorch implementation of ANI. It is currently under alpha release, which means, the API is not stable yet. If you find a bug of TorchANI, or have some feature request, feel free to open an issue on GitHub, or send us a pull requests

Also stumbled across the paper

Ab-Initio Solution of the Many-Electron Schrödinger Equation with Deep Neural Networks https://arxiv.org/abs/1909.02487 (arXiv)


Comments

2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry

 

The 2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry is now over: two intensive days of presentations and posters. Many thanks to all who took part and made it such a successful event.

Special mention to the Poster prize winners.

P17 by Jenke Scheen of the University of Edinburgh Entitled: "Improving the accuracy of alchemical free energy methods by learning correction terms for binding energy estimates"

P6 by Adam Green of the University of Leeds Entitled: "Activity-directed discovery of inhibitors of the p53/MDM2 interaction: towards autonomous functional molecule discovery"

P3 by Ya Chen of the University of Hamburg Entitled: "NP-Scout: machine learning approach for the identification of natural products and natural product-like compounds in large molecular databases"

If you want to browse through the Twitter feeds search for the #AIChem19 hashtag.

Many of the presentations are now available in pdf format on the meeting website.

We are already thinking about a possible 3rd meeting, and any feedback would be much appreciated.

Comments

Introductory video for iNMR

 

Giuseppe Balacco has posted an introductory video that shows many basic operations applied to a simple 1-H spectrum using iNMR.

If you have any comments or suggestions I'm sure Giuseppe would be delighted to see them posted on the YouTube page.

iNMR is the software of your dreams: elegant yet affordable, straightforward yet complete, tightly integrated with the OS, well tested and fast. When your spectra are beautifully reproduced in full screen size and they respond immediately to your commands, that is the ultimate NMR experience! iNMR can do all the things you expect from a traditional NMR program (and ten times more), plus the things you would expect from a genuine Mac or Win application. The clean interface is the secret to the high user satisfaction and productivity. iNMR is being continuously updated and tailored to the needs of the customers. One-to-one support and tailored programming are included.

Comments

Determining the Amino Acids in a collection of peptides

 

I've recently become interested in comparing the amino-acid composition of peptides, to allow comparison of cyclic versus linear peptides, or brain-penetrant versus non-penetrant. I had a look around but could not find any tools that did this; in particular, I wanted to include any non-proteinogenic amino acids.

This tutorial provides a means to analyse many thousands of peptides using Vortex.
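The tutorial itself uses Vortex, but the underlying calculation can be sketched in a few lines of Python. The example below assumes each peptide is already available as a list of residue codes, so non-proteinogenic residues such as Orn or Aib are just additional labels; the function names and input format are hypothetical and are not taken from the tutorial.

```python
from collections import Counter


def composition(peptide):
    """Fractional residue composition of one peptide (a list of residue codes)."""
    total = len(peptide)
    if total == 0:
        return {}  # nothing to count for an empty peptide
    counts = Counter(peptide)
    return {residue: n / total for residue, n in counts.items()}


def pooled_composition(peptides):
    """Composition over a whole collection, weighting every residue equally."""
    counts = Counter()
    for peptide in peptides:
        counts.update(peptide)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: n / total for residue, n in counts.items()}
```

Running `pooled_composition` separately on, say, the cyclic and the linear peptides gives two dictionaries that can be compared residue by residue.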

Comments

OraRdkitCart an Oracle data cartridge

 

OraRdkitCart is an Oracle data cartridge/extensible index to allow substructure and similarity searching using SQL queries on tables which contain indexed chemical structures.

It uses a Java RMI server and RDKit wrappers for chemical structure handling.

The cartridge has been tested on Oracle 12C and Oracle 18C. It would be expected to run on Oracle 19C, but has not yet been tested.

Full details on GitHub https://jones-gareth.github.io/OraRdKitCart/index.html

Comments

Greg Landrum's ACS talk on RDKit

 

I've created a page of open source cheminformatics toolkits here.




Comments

MOE 2019.0102 released

 

An update to MOE has been released by Chemical Computing Group. This is a minor update but is recommended for all users.

However, I'm sure this item will delight many users:

Copy/Paste in popup menus. Copy and Paste menu items have been added to the MOE Popup menu. Copy menu items have been added to the various Entry and Field popups, including in the footer, of the Database Viewer.

Comments

Novartis Open Source tools for Drug Discovery

 

I'm sure most readers of this site are aware of the Open-Source cheminformatics toolkit RDKit that was first developed in Novartis. However I wonder how many are aware of the other Open-Source tools that Novartis have supported.

You can read more about them here

The Novartis Institutes for BioMedical Research (NIBR) is pioneering new informatics tools for drug discovery. We believe in the power of open-sourced, global collaboration for the greater good. Join us to help patients worldwide.

They are available on GitHub here.

They include Habitat, an object management system; OntoBrowser, a tool to manage ontologies and controlled terminologies; YAP, an extensible parallel framework written in Python using OpenMPI libraries; and GridVar, a jQuery plugin that visualises multi-dimensional datasets as layers organised in a row-column format.

Comments

SeeSAR updated

 

BioSolveIT have announced an update to SeeSAR

This is more or less a "silent release" that includes numerous improvements to SeeSAR behind the scenes and also two new features for the user. Perhaps you won't notice most of these changes but in case you have stumbled on these things in the past, here are a few examples: We optimized the internal database so that molecules can be moved much more quickly from one data table to another, we fixed a small bug in the ReCore engine, clashes and torsions are calculated automatically together with HYDE and torsion labels are now also available for bonds in new fragments generated in the inspirator.

PDB export

You may now select any pose in any of the data tables and export it together with the binding-site protein as a complex in PDB format. This feature is a bonus for users who wish to post-process results in MD packages that expect this particular input format.

Extended licenses

Since version 9, the new FlexX is integrated into SeeSAR as the Docking mode and can be used with the SeeSAR license. Since version 9.1, the docking calculation set-up can be exported from SeeSAR for separate processing in the commandline version of FlexX, e.g. on a cluster, which until now required a separate license. Your SeeSAR license is now valid for both the GUI as well as bulk docking using the commandline version of FlexX.

Comments

A parallel Fortran framework for neural networks and deep learning

 

Since Fortran on a Mac is one of the most popular pages on the site, I thought I'd mention this paper I've just come across, submitted to ACM SIGPLAN Fortran Forum.

A parallel Fortran framework for neural networks and deep learning DOI.

This paper describes neural-fortran, a parallel Fortran framework for neural networks and deep learning. It features a simple interface to construct feed-forward neural networks of arbitrary structure and size, several activation functions, and stochastic gradient descent as the default optimization algorithm. Neural-fortran also leverages the Fortran 2018 standard collective subroutines to achieve data-based parallelism on shared- or distributed-memory machines. First, I describe the implementation of neural networks with Fortran derived types, whole-array arithmetic, and collective sum and broadcast operations to achieve parallelism. Second, I demonstrate the use of neural-fortran in an example of recognizing hand-written digits from images. Finally, I evaluate the computational performance in both serial and parallel modes. Ease of use and computational performance are similar to an existing popular machine learning framework, making neural-fortran a viable candidate for further development and use in production.


Comments

iOS adoption prior to iOS 13

 

With the release of iOS 13 on the horizon I thought I'd use the excellent Mixpanel to look at current adoption. iOS 12 has been rapidly taken up and now commands over 90% of iOS use.



Comments

Ten simple rules for writing and sharing computational analyses in Jupyter Notebooks

 

As a regular Jupyter/Python user I found this publication (PLoS Comput Biol 15(7): e1007007) DOI a great reminder of good practice. As Jupyter becomes increasingly popular as a means to share code, data and results, writing the notebook in a manner that helps readers is increasingly important.

This ability to combine executable code and descriptive text in a single document has close ties to Knuth’s notion of “literate programming” and has convinced many researchers to switch to computational notebooks from other programming environments. Jupyter Notebooks in particular have seen widespread adoption: as of December 2018, there were more than 3 million Jupyter Notebooks shared publicly on GitHub (https://www.github.com), many of which document academic research.

There are of course many different ways to share Jupyter notebooks.

Whether you use notebooks to track preliminary analyses, to present polished results to collaborators, as finely tuned pipelines for recurring analyses, or for all of the above, following this advice will help you write and share analyses that are easier to read, run, and explore.

Comments

ORCA 4.2 released

 

ORCA is a general purpose quantum chemistry package that is free of charge for academic users. It has been developed since the late 90s and by now is one of the most heavily used quantum chemistry packages worldwide. It can be downloaded from the Website of the Max Planck Institut fuer Kohlenforschung at www.kofo.mpg.de

Commercial users should contact https://www.faccts.de/orca/

With a strong user base of more than 15.000 registered users in academia worldwide, ORCA is the fastest growing quantum-chemical software package to date.  ORCA provides cutting-edge methods in the fields of density functional theory as well as correlated wave-function based methods

ORCA 4.2 New Features

Local correlation

  • Iterative (T) for open shells
  • Multi-level scheme for open shell systems (all PNO accuracy levels)
  • DLPNO-STEOM-CCSD for closed shells
  • DLPNO-CCSD(T)-F12 for open shells
  • Automatic fragmentation in LED analysis
  • RIJCOSX-LED implementation
  • HF-LD method for efficient dispersion energy calculations

Multi-Reference

  • FIC-CASPT2 implementation including level shift and IP/EA shift.
  • FIC-NEVPT2 unrelaxed densities and natural orbitals.
  • CIPSI/ICE improvements. Can be run now with configurations, individual determinants or CSFs (experimental)
  • FIC-ACPF/AQCC: variants of the FIC-MRCI ansatz
  • Efficient linear response CASSCF
  • Reduced memory requirements in MRCI and CIPSI/ICE

Spectroscopy

  • GIAO EPR calculations (one issue with the SOMF operator still remaining)
  • Improvements to ESD module for fluorescence, phosphorescence, bandshape, lifetime and resonance Raman calculations
  • ESD now includes also the prediction of the Intersystem Crossing non-radiative rates
  • Hyperfine couplings for CASSCF calculations (but not as response)

Excited states

  • Spin-orbit coupling in TD-DFT
  • MECP optimization for TD-DFT
  • Conical Intersection Optimization
  • Range-separated double-hybrids (ωB2PLYP, ωB2GP-PLYP) for TDDFT
  • Numerical and Hellmann-Feynman NACMEs using TD-DFT/CIS
  • DLPNO-STEOM-CCSD for closed shells (also see 'Local correlation')

Solvation

  • CPCM Gaussian Charge Scheme with the scaled-vdW surface and the Solvent Excluded Surface (SES). Available for single point energy calculations and geometry optimizations using the analytical gradient.

SCF/optimizer/semi-empirics/infrastructure etc.

  • Nudged elastic band (NEB) transition states improvements (also works with xTB for initial path)
  • Improved compound method scripting language for workflow improvements
  • Improved ASCII property file
  • Libxc interface allows a far wider range of density functionals to be used
  • Interfaced with Grimme's GFN-xTB and GFN2-xTB
  • Improvement of IRC algorithm
  • Cartesian minimization (L-OPT) for systems with 100.000s of atoms, Minimization of specific elements (incl. H) only, fragment specific optimization treatment (relax all, relax hydrogens, rigid fragment, fixed fragments)

QM/MM and MM

  • First release with ORCA-native MM and QM/MM implementation
  • Automated conversion from NAMD's CHARMM format
  • Automated generation of simple force-field for non-standard molecules
  • Simple definition of active and QM regions
  • Automated inclusion and placement of link-atoms
  • Automated charge-shifts to prevent over-polarization
  • MM and QM/MM work with all kinds of optimizations, NEB / NEB-TS methods, frequency analysis
  • Option for rigid MM water (TIP3P) in MD simulation and optimization

Molecular Dynamics

  • Added a Cartesian minimization command to the MD module, based on L-BFGS and simulated annealing. Works for large systems (> 10'000 atoms) and also with constraints. Offers a flag to only optimize hydrogen atom positions (for crystal structure refinement).
  • The MD module can now write trajectories in DCD file format (in addition to the already implemented XYZ and PDB formats).
  • The thermostat is now able to apply temperature ramps during simulation runs.
  • Added more flexibility to region definition (can now add/remove atoms to/from existing regions).
  • Added two new constraint types which keep centers of mass fixed or keep complete molecules rigid.
  • Ability to store the GBW file every n-th step during MD runs (e.g. for plotting orbitals along the trajectory).
  • Can now set limit for maximum displacement of any atom in a MD step, which can stabilize dynamics with poor initial structures. Runs can be cleanly aborted by "touch EXIT".
  • Better handling/reporting of non-converged SCF during MD runs.
  • Fixed an issue which slowed down molecular dynamics after many steps.
  • Stefan Grimme's xTB method can now be used in the MD module, allowing fast simulations of large systems.

Miscellaneous

  • Compute thermochemical corrections at different temperatures without recomputing the Hessian
  • Fragments can now be defined in the geom block as simple lists
  • Simpler input format for definition of atom lists and fragments, in particular useful for large atom lists
  • basename.trj files are now called basename_trj.xyz

Comments

Illustrate a small Fortran program for creating non-photorealistic illustrations of molecules

 

First let me say I’m not a big Fortran user but any blog posts about Fortran always seem to be very popular, and the Fortran on a Mac page is one of the most frequently read. I've just been sent details of another Fortran program and I've added it to the page.

Illustrate is a small Fortran program for creating non-photorealistic illustrations of molecules, with cartoony colors and outlines, and soft ambient shadows. It is used to create the RCSB Molecule of the month.




Comments

ChemTube3D

 

The upgrade of https://www.chemtube3d.com is now complete. This is an absolutely fabulous resource for chemists with 2400 pages of information. Don't miss the Controls button that controls the animations and try turning your mobile phone for different views.

One of the problems for users of ChemTube3D is finding the content they require because it is so extensive. The new intelligent Search feature is more powerful and easy to use. The new Categories feature groups all the content into a wide range of subsections and provides an alternative way of discovering content. You can reach Categories directly from the new menu structure by clicking on a subject before using the submenu. You can view all the Categories if you perform a Search using the Search box top right. Try searching for “orbital” or “symmetry” or “VSEPR” or a subject of your choice. This should help bring Chemistry Animations under Control.

There is also a ChemTube3D iOS app available for free download.


There is now a brief history of ChemTube3D.

There are more apps for mobile devices on the Mobile Science Site


Comments

An interactive RDKit widget for Jupyter: a first pass

 

This looks like it could be very interesting.

A blog post by Greg Landrum describes a widget for displaying molecules in which you can select atoms, with the selection propagated back to Python in a Jupyter Notebook.

This is basic, but I think it's a decent start towards something that could be really useful. Interested? Have suggestions (ideally accompanied by code!) on how to improve it? If it looks like this is actually likely to be used, I will figure out how to create a standalone nbwidget out of this and create a separate github repo for it.

Looks like a useful tool for selecting bonds for conformational analysis, selecting bonds for creating a Ramachandran plot, selecting groups for bioisosteric replacement……

Sounds like Greg is looking for input.


Comments

Sire

 

Sire is a free, open source, multiscale molecular simulation framework, written to allow computational modellers to quickly prototype and develop new algorithms for molecular simulation and molecular design. Sire is written as a collection of libraries, each of which contains self-contained and robust C++/Python building blocks. These building blocks are vectorised and thread-aware and can be streamed (saved/loaded) to and from a version-controlled and tagged binary format, thereby allowing them to be combined together easily to build custom multi-processor molecular simulation applications.

Sire is available via conda

conda install -c conda-forge -c omnia -c michellab sire

Note that on OS X you will need to run Python scripts with the sire_python interpreter. This is due to an issue with the default Python interpreter that is installed via Conda.

You can also download the binary here

http://siremol.org/pages/binaries.html

And compile from source from GitHub

https://github.com/michellab/Sire

To compile Sire, you need a working C++ compiler with at least C++ 2014 support (gcc >= 5 or clang >= 3.7), cmake (version 3.0.0 or above), a Git client to download the source, and a working internet connection (needed by the Sire compilation scripts to download additional dependencies).


Comments

Jupyter notebook to look at molecular similarity

 

I was recently asked for a tool to compare the similarity of a list of molecules with every other molecule in the list. I suspect there may be commercial tools to do this but for small numbers of compounds it is easy to visualise in a Jupyter notebook using RDKit.

Read more here, MolecularSimilarityNotebook
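The all-against-all comparison can be sketched without any toolkit once each molecule has a fingerprint. The example below assumes fingerprints are given as Python sets of on-bit indices (in practice these would come from RDKit) and uses Tanimoto similarity, a common choice that may differ from the measure used in the notebook; the function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_matrix(fingerprints):
    """All-against-all similarity as a list of rows."""
    n = len(fingerprints)
    # The diagonal is set to 1.0: every molecule is identical to itself.
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = tanimoto(fingerprints[i], fingerprints[j])
            matrix[i][j] = matrix[j][i] = value  # the matrix is symmetric
    return matrix
```

Only the upper triangle is computed, which halves the work. The cost still grows with the square of the number of molecules, which is why this approach suits small lists rather than large collections.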



Comments

Extending Jupyter

 

I'm a great fan of Jupyter notebooks and I'm always looking for ways to get more out of them. I recently came across this blog post, which is packed with useful tips:

99 ways to extend the Jupyter ecosystem

Whenever someone says ‘You can do that with an extension’ in the Jupyter ecosystem, it is often not clear what kind of extension they are talking about. The Jupyter ecosystem is very modular and extensible, so there are lots of ways to extend it. This blog post aims to provide a quick summary of the most common ways to extend Jupyter, and links to help you explore the extension ecosystem.

I've also published some notebooks under Tips and Tutorials, Jupyter notebooks


Comments

Positions and Meetings

 

Andreas Bender circulated this listing around the Cambridge Cheminformatics Network and I thought I'd pass it on.

Positions

UK

Senior Computational Chemist, The University of Manchester: https://www.linkedin.com/jobs/view/1368449905/

Senior Computational Chemist, Sygnature, Nottingham: https://www.linkedin.com/jobs/view/1356257398

Research Associate - Using Gene Expression Data for Compound Safety Assessment, Cambridge University: https://www.ch.cam.ac.uk/job/22351

PostDoc Fellow - AI, knowledge graph & networks in drug discovery, AstraZeneca, Cambridge: https://www.linkedin.com/jobs/view/1330746187

Postdoctoral Data Scientist - AI in Drug Design, The Beatson Institute for Cancer Research, Glasgow: https://jobs.newscientist.com/en-gb/job/1401669675/postdoctoral-data-scientist-ai-in-drug-design/

Bioinformatics Analyst, DIOSVax, Cambridge: https://www.linkedin.com/jobs/view/1349281847/

Computational Postdoctoral Fellow - Cancer Drug Combinations, Sanger Institute, Hinxton: https://www.linkedin.com/jobs/view/1387749574

Europe

Ph.D. candidate (f/m/d) in "Computational method development for bulk and single cell RNAseq of inflammatory skin diseases", TU Muenchen: https://portal.mytum.de/jobs/wissenschaftler/NewsArticle20190701100233

2 open positions: PhD student and Lecturer, focus on AI/machine learning on large-scale data, Uppsala University, Sweden: https://pharmb.io/blog/recruitment-phd-lecturer-2019/

Events

25 July 2019 (webinar), "BioCompute: A Standardized Method to Communicate Bioinformatic Workflow Information and Ease Organizational Burden": https://www.eventbrite.com/e/us-epa-national-center-for-computational-toxicology-communities-of-practice-tickets-64532545581

4 September 2019, 4pm, Cambridge Cheminformatics Network Meeting, Chemistry Department, Cambridge: http://www.c-inf.net

25 September 2019, Chemistry Networks Events: https://www.eventbrite.co.uk/e/chemistry-networks-2019-focus-on-ai-and-machine-learning-tickets-62298582738

27 November 2019 (to be confirmed), "In Silico Toxicology" Meeting, King's College Cambridge (if you are interested in either presenting at or attending this event, please let me know and I will keep you posted; more also in the next newsletter)

2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry

I was just looking through the delegate registrations for the 2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry Meeting, taking place in Cambridge, UK, on 2nd to 3rd September 2019. We now have significantly more registrations than for the first meeting. Participants are coming from 16 different countries, and whilst the UK and US predominate, there are many participants from the rest of Europe and even some from Japan and Korea. There are 90 different organisations represented, and I'm delighted to see there are over 20 student attendees, many from overseas. A number of students are presenting posters, and the lineup of people taking part in the flash poster session can be found here.

Registration is still open for what looks like being another outstanding meeting.

A few people have said they are planning a visit to Cambridge for a holiday around the meeting and have asked for suggestions of things to do. Visit Cambridge is a good place to start.


KiSThelP2019 updated

KiSThelP is a cross-platform free open-source program developed to estimate thermodynamic and kinetic properties from electronic structure data. To date, five computational chemistry software formats are supported (Gaussian, GAMESS, NWChem, ORCA, MOLPRO).

Some key features are:

  • gas-phase molecular thermodynamic properties (offering hindered rotor treatment)
  • thermal equilibrium constants
  • transition state theory rate coefficients (TST, VTST) including one-dimensional tunnelling effects (Wigner and Eckart)
  • RRKM rate constants, for elementary reactions with well-defined barriers.

For information, please visit: http://kisthelp.univ-reims.fr


Autocompletion with deep learning

This looks really interesting

TabNine is an autocompleter that helps you write code faster by adding a deep learning model which significantly improves suggestion quality. You can see videos at the link above.

There has been a lot of hype about deep learning in the past few years. Neural networks are state-of-the-art in many academic domains, and they have been deployed in production for tasks such as autonomous driving, speech synthesis, and adding dog ears to human faces. Yet developer tools have been slow to benefit from these advances

Deep TabNine is trained on around 2 million files from GitHub. During training, its goal is to predict each token given the tokens that come before it. To achieve this goal, it learns complex behaviour, such as type inference in dynamically typed languages.
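
The training objective described above, predicting each token from the tokens that precede it, can be illustrated with a much simpler model. The sketch below is only a toy: it counts which token follows which (a bigram model that looks at a single preceding token), whereas Deep TabNine uses a deep learning model over a much longer context. The tiny corpus and the function names are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(token_sequences):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for tokens in token_sequences:
        for previous, current in zip(tokens, tokens[1:]):
            counts[previous][current] += 1
    return counts


def suggest(counts, previous_token, k=3):
    """Return up to k of the most frequent tokens seen after previous_token."""
    return [token for token, _ in counts[previous_token].most_common(k)]


# Hypothetical, hand-tokenised training corpus.
corpus = [
    ["import", "numpy", "as", "np"],
    ["import", "pandas", "as", "pd"],
    ["import", "numpy", "as", "np"],
]
model = train_bigram_model(corpus)
print(suggest(model, "import"))
```

Even this crude counter reproduces whatever habits are in its training data, good or bad.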

An interesting idea; my only concern is the quality of the code in the training set.

AI in Chemistry bursaries still available

The 2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry meeting is filling up fast; however, there are still 6 bursaries unallocated. The closing date for applications is 15 July. The bursaries are available up to a value of £250, to support registration, travel and accommodation costs for PhD and post-doctoral applicants studying at European academic institutions.

You can find details here https://www.maggichurchouseevents.co.uk/bmcs/AI-2019.htm.

Twitter hashtag - #AIChem19

Raspberry Pi 4 is now on sale

The Raspberry Pi 4 is now available. This is a comprehensive upgrade, touching almost every element of the platform.

Your tiny, dual-display, desktop computer…and robot brains, smart home hub, media centre, networked AI core, factory controller, and much more

  • A 1.5GHz quad-core 64-bit ARM Cortex-A72 CPU (~3× performance)
  • 1GB, 2GB, or 4GB of LPDDR4 SDRAM
  • Full-throughput Gigabit Ethernet
  • Dual-band 802.11ac wireless networking
  • Bluetooth 5.0
  • Two USB 3.0 and two USB 2.0 ports
  • Dual monitor support, at resolutions up to 4K
  • VideoCore VI graphics, supporting OpenGL ES 3.x
  • 4Kp60 hardware decode of HEVC video
  • Complete compatibility with earlier Raspberry Pi products

If you are in Cambridge, UK, you can visit the new Raspberry Pi store upstairs from the Apple Store in the Grand Arcade.


Multi SMILES to Chemdraw files Updated

I recently wrote a script in response to a tweet


A reader wrote in to ask if it might be possible to modify the script to use the identifier in the file containing the SMILES string as shown below.

Ic1ccccc1   ID_1
CC=O    ID_2
CC(O)=O     ID_3
CC(OC(C)=O)=O   ID_4
CC(C)=O     ID_5
CC#N    ID_6
CC(c1ccccc1)=O  ID_7
CC(Br)=O    ID_8
CC(Cl)=O    ID_9

So I've modified the script to allow this. The full details are here, including a link to download the script.
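
The script itself is an AppleScript; purely to illustrate the file format shown above, here is a minimal Python sketch of reading each line into a SMILES string and its identifier. The fallback name for lines with no identifier and the ".cdxml" extension in the printout are assumptions made for this example, not details taken from the script.

```python
def parse_smiles_file(text):
    """Return (smiles, identifier) pairs from whitespace-separated lines."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # handles both tabs and runs of spaces
        if not fields:
            continue  # skip blank lines
        smiles = fields[0]
        # Assumed behaviour: use a numbered name when no identifier is given.
        identifier = fields[1] if len(fields) > 1 else f"structure_{line_number}"
        records.append((smiles, identifier))
    return records


sample = "Ic1ccccc1   ID_1\nCC=O    ID_2\nCC(O)=O\tID_3\n"
for smiles, identifier in parse_smiles_file(sample):
    # Hypothetical output file name built from the identifier.
    print(f"{identifier}.cdxml <- {smiles}")
```

Splitting on any whitespace means it does not matter whether the columns are separated by tabs or by a varying number of spaces, as in the listing above.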

Open Force Field Toolkit update

The Open Force Field Consortium is an industry-funded effort to develop small molecule force fields.

0.4.1 - Bugfix Release. This update fixes several toolkit bugs that have been reported by the community. Details of these bugfixes are provided here. It also refactors how ParameterType and ParameterHandler store their attributes, by introducing ParameterAttribute and IndexedParameterAttribute. These new attribute-handling classes provide a consistent backend which should simplify manipulation of parameters and implementation of new handlers.


Multi SMILES to Chemdraw files

Have you ever wanted to convert a file containing multiple SMILES strings to ChemDraw structures?

This AppleScript might help





Molecular Modeling in the Cloud

I came across this website recently https://www.ks.uiuc.edu/Research/cloud/, describing efforts to harness cloud computational resources.

The use of advanced molecular simulation techniques often comes with additional computational and software requirements. Running molecular simulation and analysis tasks in the Cloud can significantly lower the barriers to use of advanced simulation methods and provides a cost-effective and practical solution for many molecular modeling tasks and for small and moderate size molecular dynamics simulations.

There are detailed instructions for accessing Amazon Web Services EC2, on both GPU and CPU hardware.

We have adapted our molecular modeling applications NAMD, VMD, and associated tools to operate within the Amazon Web Services (AWS) Elastic Compute Cloud (EC2) platform, enabling popular research workflows such as MDFF structure refinement and QwikMD simulation protocols to be run remotely by scientists all over the globe, with no need for investment in local computing resources and a reduced requirement for expertise in high performance computing technologies.

Well worth a browse.


Python leads the 11 top Data Science, Machine Learning platforms

The results of the latest KDnuggets poll, which is in its 20th year, are in. Python is clearly moving to become the dominant platform, with the votes for R slowly declining.


The blog post on KDnuggets gives far more detailed analysis and is well worth reading.


Jupyter notebook to create Wordcloud of tweets

I've often wanted to try creating a word cloud, and when Noel O'Boyle collected together all the tweets from the Sheffield Conf on Chemoinformatics this seemed a good opportunity.

Relive the Sheffield Conf on Chemoinformatics with these #shef2019 tweets I've pulled down from Twitter, link to tweet.

The Jupyter notebook used to create the word cloud is here; it uses the excellent word cloud generator word_cloud. You will need to download the text of the tweets from the link provided in the tweet.
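
A word cloud is essentially a picture of word frequencies, so the counting step can be sketched with the standard library alone. This is an illustrative sketch rather than the code in the notebook; the stop-word list, the regular expressions, and the sample text are all made up for the example.

```python
import re
from collections import Counter

# Small, illustrative stop-word list; a real analysis would use a longer one.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "at", "on", "for", "is"}


def word_frequencies(text):
    """Count words, keeping #hashtags and @mentions and dropping URLs."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    words = re.findall(r"[#@]?[a-z][a-z0-9_']+", text)
    return Counter(word for word in words if word not in STOPWORDS)


# Invented sample text standing in for the downloaded tweets.
sample_tweets = "Great talk on docking at #shef2019 https://t.co/abc #shef2019 docking"
frequencies = word_frequencies(sample_tweets)
print(frequencies.most_common(3))
```

A dictionary of counts like this can be handed to the word_cloud package through its WordCloud.generate_from_frequencies method, or the raw text can be passed to it directly and it will do the counting itself.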

Installing Amber/AmberTools on macOS

I've written a page describing how to install a variety of cheminformatics tools on a Mac using Homebrew and Anaconda, including the tools needed to compile many of the packages.

brew install cdk
brew install chemspot
brew install indigo
brew install inchi
brew install opsin
brew install osra
brew install open-babel

I don't use Amber, but a number of readers have asked me about installation. Rather than go through it here, I would simply refer you to a superb detailed explanation here https://www.ovetande.se/software/amber/install/ambertools19-macos-10-14-4-xcode-10-2-1/.

Installing Amber/AmberTools on macOS has become much easier though is not to be considered completely trivial. Below different methods are presented to install the current version of AmberTools, where the method should be the same for Amber.

Support for attending meetings

Just finished the RSC CICAG committee meeting and one of the discussion items was support for people to attend conferences and meetings. CICAG is committed to supporting attendance at our meetings in as many ways as possible. We always ensure that the venue supports wheelchair access and that any meals accommodate any dietary requirements that have been gathered on registration.

We offer student bursaries to help cover registration, travel, and accommodation if required. These are detailed on the individual conference pages. If you are making a particularly lengthy journey and need additional help, please get in touch and we will see what we can do.

There are also travel grants, both for students and early career scientists, to attend conferences or workshops: https://www.rsc.org/ScienceAndTechnology/Funding/division-travel-grants/index.asp

A recent addition is grants for carers https://www.rsc.org/campaigning-outreach/campaigning/incldiv/grants-for-carers/. Caring responsibilities are wide and varied, but they can sometimes be hard to balance alongside attending conferences.

If you have any other suggestions feel free to email cicagrsc@gmail.com

Upcoming meetings

2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry Website
Twenty Years of the Rule of Five Website.

3D-e-Chem NLeSC project

This looks interesting: the 3D-e-Chem NLeSC project.

This project will develop technologies to improve the integration of ligand and protein data for structure-based prediction of protein-ligand selectivity and polypharmacology.

The project will use KNIME Analytics Platform to integrate the different technologies and datasets.



Mnova 14.0.1 is ready for download.

Just got an email:

We have fixed bugs and are bringing you some new features in version Mnova 14.0.1 to help you with your analytical data. As usual we would like to thank the people who have contributed by giving us great feedback!

Download here

Mnova 14.0.1 is a minor release in which they have fixed several bugs and also implemented new functionality, mainly in NMR, MS and NMRPredict; more details here.


Adding substructure searching to a FileMaker Pro Database

Anyone who has had to store or search a collection of chemical structures rapidly realises that they need a software tool with a little chemical intelligence. Whilst there are a number of commercial databases, they tend to be rather expensive and often require a knowledge of SQL or dedicated IT support. That is fine for large corporations but not suitable for a single chemist or small group. In contrast, FileMaker Pro is a popular desktop database with an easy-to-use interface (there are also server and mobile versions). Unfortunately, whilst it is easy to use, it does not support chemical structure-based searching. Fortunately, FileMaker Pro comes with an easy-to-use scripting interface, and we can create scripts that run command line applications like Open Babel.


This tutorial shows how to add substructure and similarity searching to a FileMaker Pro database. Full details are available here, including a download of the example database.

Molecular Transformer

When the paper "Molecular Transformer - A Model for Uncertainty-Calibrated Chemical Reaction Prediction" https://arxiv.org/abs/1811.02633 first appeared on the arXiv preprint server it generated considerable interest.

Organic synthesis is one of the key stumbling blocks in medicinal chemistry. A necessary yet unsolved step in planning synthesis is solving the forward problem: given reactants and reagents, predict the products. Similar to other work, we treat reaction prediction as a machine translation problem between SMILES strings of reactants-reagents and the products. We show that a multi-head attention Molecular Transformer model outperforms all algorithms in the literature, achieving a top-1 accuracy above 90% on a common benchmark dataset. Our algorithm requires no handcrafted rules, and accurately predicts subtle chemical transformations. Crucially, our model can accurately estimate its own uncertainty, with an uncertainty score that is 89% accurate in terms of classifying whether a prediction is correct. Furthermore, we show that the model is able to handle inputs without reactant-reagent split and including stereochemistry, which makes our method universally applicable.
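
To make the "machine translation" framing a little more concrete: a reaction SMILES is split into a source string (reactants and reagents) and a target string (products), and each is broken into tokens. The sketch below uses a simplified regular expression written for this illustration; it is an assumption, not necessarily the tokenisation used by the authors, and the reaction (acetyl chloride plus ethanol giving ethyl acetate) is just an example.

```python
import re

# Simplified SMILES tokeniser: bracket atoms, two-letter halogens,
# two-digit ring closures, single-letter atoms, digits, then bond and
# branch symbols. Written for illustration only.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[()=#+\-.\\/:@>~*$]"
)


def tokenize(smiles):
    """Split a SMILES string into tokens, failing on unrecognised characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenise {smiles!r}")
    return tokens


reaction = "CC(Cl)=O.OCC>>CC(=O)OCC"
source, target = reaction.split(">>")
print("source:", " ".join(tokenize(source)))
print("target:", " ".join(tokenize(target)))
```

A sequence-to-sequence model is then trained to produce the target token sequence from the source token sequence, in the same way a translation model maps a sentence in one language to another.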

I just noticed that it had been recently updated.

If you are interested in this exciting area of chemistry you might be interested to know the code is available on GitHub and the trained model is available online https://rxn.res.ibm.com.

One of the authors, Alpha Lee, is speaking at the 2nd Artificial Intelligence in Chemistry Meeting #AIChem19, 2nd to 3rd September 2019, Fitzwilliam College, Cambridge, UK. You can register for the meeting here if you would like to hear first hand about this technology.


The full lineup of speakers is here. Also remember there are bursaries available for the meeting.


Openforcefield

The Open Force Field Initiative is an open source, open science, and open data approach to better force fields. All the code is on GitHub and they also provide highly curated datasets.

The idea is to enable molecular mechanics on small and macromolecules jointly using open and freely available software.

A recent blog post from Peter Schmidtke caught my eye.

Recently a few updates of the openforcefield toolkit came out … a game changer, as you’ll see.

The work investigated whether the 768 fragments from the XChem fragment library at Diamond can be parametrised with the new version of Open Force Field (0.4) and how they behave after a simple minimisation.

In short, all fragments technically pass the parametrisation and minimisation step; this was supported by visual inspection.

All the code is on GitHub.



Can we trust Published data?

I posted a poll on Twitter:

Looking at abstracts for the AI in Chemistry Meeting … many mine published data. The quality of the public data is obviously critical for good models. Is this something the AI community should be concerned about or get involved with to improve the quality of the literature?

The results are now in and interestingly despite nearly 2.5K impressions only 28 people voted. Of those that voted the overwhelming majority feel that AI scientists should help to improve the quality of the literature.


The comments associated with the tweet are interesting. Certainly many machine learning models are robust enough to accommodate some poor data, but I think there is a deeper concern.

Elisabeth Bik has regularly flagged questionable publications; unfortunately these are not always detected before their influence has been propagated through the literature.

For a very detailed example look at 5-HTTLPR: A POINTED REVIEW looking at an unusual version of the serotonin transporter gene 5-HTTLPR.

I've heard of many examples of scientists being unable to reproduce literature findings. Usually little happens; however, Amgen were able to reproduce only 6 out of 53 'landmark' studies, and they published their findings.

How many times do scientists assume failure to reproduce published findings is their error?

There have been several studies looking at the possible causes of the failure to reproduce work. In 2011, an evaluation of 246 antibodies used in epigenetic studies found that one-quarter failed tests for specificity, meaning that they often bound to more than one target. Four antibodies were perfectly specific, but to the wrong target (see "Reproducibility crisis: Blame it on the antibodies").

See also "The antibody horror show: an introductory guide for the perplexed" DOI

Colourful as this may appear, the outcomes for the community are uniformly grim, including badly damaged scientific careers, wasted public funding, and contaminated literature.

If you are mining literature data to predict novel drug targets, then caveat emptor.

 

Binder news

If you use Binder to serve your Jupyter notebooks you will be interested in this.

Have a repository full of Jupyter notebooks? With Binder, open those notebooks in an executable environment, making your code immediately reproducible by anyone, anywhere

We flipped the switch on making mybinder.org a federation. This means that there are now two clusters that serve requests for mybinder.org. What changes for you as a user? Hopefully nothing. You will notice that if you visit mybinder.org (or any other link to it) you will be redirected to gke.mybinder.org or ovh.mybinder.org. Beyond that small change everything should keep working as before.

This should mean that Binder becomes more robust and less susceptible to outages. Now that this is in place it should also be possible to add further server resources.

Special Issue "Machine Learning with Python"

I was just sent details of a Special Issue, "Machine Learning with Python", for the journal Information.

We live in this day and age where quintillions of bytes of data are generated and collected every day. Around the globe, researchers and companies are leveraging these vast amounts of data in countless application areas, ranging from drug discovery to improving transportation with self-driving cars. As we all know, Python evolved into the lingua franca of machine learning and artificial intelligence research over the last couple of years. What makes Python particularly attractive for us researchers is that it gives us access to a cohesive set of tools for scientific computing and is easy to teach and learn. Also, as a language that bridges many different technologies and different fields, Python fosters interdisciplinary collaboration. And besides making us more productive in our research, sharing tools we develop in Python has the potential to reach a wide audience and benefit the broader research community.

This special issue is now open for submission.

Foldit

Foldit is a game in which players attempt to fold a protein sequence. Foldit players have a number of tools that allow them to change both the fold and the sequence of a virtual protein. The player's score is calculated from the energy of the virtual protein.

This work has now been published in Nature: "De novo protein design by citizen scientists" DOI.

One hundred forty-six Foldit player designs with sequences unrelated to naturally occurring proteins were encoded in synthetic genes; 56 were found to be expressed and soluble in Escherichia coli, and to adopt stable monomeric folded structures in solution. The diversity of these structures is unprecedented in de novo protein design, representing 20 different folds—including a new fold not observed in natural proteins.

Download and instructions are here https://fold.it/portal/node/2007799.


pro Fit 7.0.14 has just been released

pro Fit 7.0.14 has just been released and is available now at quansoft.com/down.html. This is a maintenance update to QuantumSoft’s product for data and function analysis/plotting and nonlinear curve fitting.

This release improves AppleScript performance and fixes several other bugs. This is a recommended update for all users of pro Fit 7.0.


pro Fit 7 is a Mac OS application for data/function analysis, plotting, and curve fitting. It is used by scientists, engineers and students to analyse their measurements and the mathematical models they use to describe them. Users can define any mathematical function and use it to model their data, finding the function parameters that best describe it. A vast number of tools allow the mathematical and statistical analysis and processing of functions and data sets, and the software is also used to produce aesthetically pleasing graphical representations for books, articles, and any other reports involving plots of data and functions.

There is a listing of Data Analysis Tools for Mac OSX here.


Can we trust published data

A Twitter poll: can we trust published data, and should the AI community be involved?

Looking at abstracts for https://www.maggichurchouseevents.co.uk/bmcs/AI-2019.htm … many mine published data. The quality of the public data is obviously critical for good models.

https://twitter.com/macinchem/status/1135795628901093376

New MacPro Tech Specs

Apple announced the new Mac Pro at WWDC. The page describing the new machine is up and the technical specifications are here.

The intro movie is available here.

Swift for TensorFlow Models

This repository contains TensorFlow models written in Swift.

Swift for TensorFlow is a next-generation platform for machine learning, incorporating the latest research across machine learning, compilers, differentiable programming, systems design, and beyond. This is an early-stage project: it is not feature-complete nor production-ready, but it is ready for pioneers to try in projects, give feedback, and help shape the future!

This is the second public release of Swift for TensorFlow, available across Google Colaboratory, Linux, and macOS.


NextMove open source MolHash

MolHash is a command-line application and programming library for generating hashes from molecular structures. This section gives an overview of each of the most useful hash functions in turn. The user should find it straightforward to add additional hash functions, or tweak the existing ones.

The source code is available on GitHub https://github.com/nextmovesoftware/molhash.

CMake, RDKit and Boost are required.

There are detailed compilation and installation instructions on GitHub, but I got several errors asking where RDKit was, etc.

Fortunately, thanks to Matt, you can now install using conda:

conda install -c mcs07 -c conda-forge molhash

Once installed, you can check that it is working by typing this in the Terminal:

MacPro:username$ molhash -help
usage:  molhash [options] <infile> [<outfile>]
    Use a hyphen for <infile> to read from stdin
options:    
    -a  Process all the molecule (and not just the single largest component)
    -sa Suppress atom stereo
    -sb Suppress bond stereo
    -sh Suppress explicit hydrogens
    -si Suppress isotopes
    -sm Suppress atom maps
    -t  Store titles only
hash type:
    -g   anonymous graph [default]
    -e   element graph
    -s   canonical smiles
    -m   Murcko scaffold
    -mf  molecular formula
    -ab  atom and bond counts
    -dv  degree vector
    -me  mesomer
    -ht  hetatom tautomer
    -hp  hetatom protomer
    -rp  redox-pair
    -ri  regioisomer
    -nq  net charge

An example of usage

 MacPro:username$ echo "c1ccccc1C(=O)Cl" | molhash -mf -
C7H5ClO c1ccc(cc1)C(=O)Cl
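
If you want to drive molhash from a script, something along these lines might be a starting point. It is a sketch under stated assumptions: the flags are copied from the help text above, the command mirrors the example invocation (reading from stdin with a hyphen), and the parser assumes each output line is a hash followed by whitespace and the structure, as in the example output. The function names are invented for illustration, and run_molhash needs molhash on your PATH so it is defined but not called here.

```python
import subprocess

# Hash-type flags copied from the `molhash -help` output above.
HASH_FLAGS = {
    "anonymous graph": "-g",
    "element graph": "-e",
    "canonical smiles": "-s",
    "Murcko scaffold": "-m",
    "molecular formula": "-mf",
}


def build_command(hash_type, infile="-"):
    """Build the argument list; a hyphen tells molhash to read from stdin."""
    return ["molhash", HASH_FLAGS[hash_type], infile]


def parse_output_line(line):
    """Split one output line into (hash, structure).

    Assumes the layout seen in the example output: hash, whitespace, structure.
    """
    hash_value, structure = line.split(None, 1)
    return hash_value, structure.strip()


def run_molhash(smiles, hash_type="molecular formula"):
    """Pipe one SMILES string to molhash (requires molhash to be installed)."""
    result = subprocess.run(
        build_command(hash_type),
        input=smiles + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_output_line(l) for l in result.stdout.splitlines() if l.strip()]


print(build_command("molecular formula"))
print(parse_output_line("C7H5ClO c1ccc(cc1)C(=O)Cl"))
```

Grouping the returned structures by their hash value is then a simple way to find, for example, all molecules in a file that share a molecular formula or scaffold.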

End of the line for Python 2

Just a reminder that support for Python 2.7 will end on 1 January 2020 (there will be no 2.8). All major scientific packages now support Python 3.x and there will be no further updates to the Python 2.x versions.

An increasing number of projects have pledged to drop support for Python 2.7 no later than 2020; these include pandas, RDKit, IPython, Matplotlib, NumPy, SciPy, Biopython, Psi4, scikit-learn, TensorFlow, Jupyter notebook and many more.

Time to update those old scripts and Jupyter notebooks.
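
As a small, hedged example of what updating can involve, the sketch below adds a guard that stops a script early if it is run under Python 2, and shows one of the most common behavioural changes, integer division. The function name is just an illustration, not part of any package.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Raise an error if the interpreter is older than Python 3."""
    if version_info[0] < 3:
        raise RuntimeError("This script needs Python 3; Python 2 is no longer supported.")
    return True


require_python3()

# In Python 3, / is always true division and // is floor division.
# In Python 2, 7 / 2 between two integers silently gave 3.
print(7 / 2)
print(7 // 2)
```

The other change that catches most old scripts is that print is a function in Python 3, so `print "hello"` has to become `print("hello")`.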

3decision for looking at protein structures

At the latest Bio-IT World, 3decision was recognised with an award.

3decision – next generation structural knowledge management solution

Although widely used, rational structure-based drug design (SBDD) techniques are far from being applied to their fullest potential. The major hurdles lie in the inconsistent data persistence and the complexity of analyzing structural data. Moreover, the structural data is often analyzed by domain experts only and their knowledge and experience are not well shared and exposed to other communities. Abbvie has addressed these pitfalls by co-developing a web-based structural knowledge management solution called 3decision. It allows Abbvie to transform a massive amount of data coming from in-house and public 3D structures and sequences, into applicable knowledge for drug discovery projects. The collaborative aspects within SBDD projects are in focus and the user interface allows all types of users to easily generate, test and connect their ideas with each other. The development of 3decision allowed Abbvie to dramatically increase the ROI of SBDD work and protein structure production.

3decision is a web-based collaborative platform for storing, analyzing and sharing protein-ligand structures, sequences, and associated data. It works fine on a Mac, and Google Chrome is the recommended browser.

CCG releases PSILO 2019.02

Chemical Computing Group have announced the 2019 release of PSILO, CCG's Protein Structure Database System. The PSILO 2019.02 version includes a variety of new features and enhancements for viewing and searching records and for aligning and identifying protein active sites. Additional features in PSILO 2019.02 include a streamlined PSILO IT infrastructure, which facilitates deployment, and the ability to perform PSILO searches directly from MOE.

NEW & ENHANCED FEATURES IN PSILO 2019.02

  • Analyze Ligand and Receptor Interaction Patterns using Clustered 3D Environments
  • Perform the Full Range of PSILO Searches from MOE
  • Infer Apo-pockets Using PSILO Family References
  • Include Crystal Contacts in Ligand Interaction Diagrams
  • Streamlined PSILO IT Infrastructure

In which area is Artificial Intelligence likely to most impact Chemistry, the results are in

I ran a poll last week asking "In which area is Artificial Intelligence likely to most impact Chemistry?" And we now have the results.


Whilst Molecular Design was the most popular choice, it was interesting to see that all options were well supported. This suggests that there are opportunities for artificial intelligence to have an impact in many facets of chemistry. I'm delighted to see this, since it was part of the thinking behind the AI in Chemistry meeting, and I think the lineup of speakers will have something for everyone.

2nd RSC-BMCS / RSC-CICAG, Artificial Intelligence in Chemistry, Monday-Tuesday, 2nd to 3rd September 2019. Fitzwilliam College, Cambridge, UK. #AIChem19

Synopsis
Artificial Intelligence is presently experiencing a renaissance in development of new methods and practical applications to ongoing challenges in Chemistry. Following the success of the inaugural “Artificial Intelligence in Chemistry” meeting in 2018, we are pleased to announce that the Biological & Medicinal Chemistry Sector (BMCS) and Chemical Information & Computer Applications Group (CICAG) of the Royal Society of Chemistry are once again organising a conference to present the current efforts in applying these new methods. The meeting will be held over two days and will combine aspects of artificial intelligence and deep machine learning methods to applications in chemistry.

Programme (draft)

Monday, 2nd September
08.30
Registration, refreshments
09.30
Deep learning applied to ligand-based de novo design: a real-life lead optimization case study
Quentin Perron, IKTOS, France
10.00
A Turing test for molecular generators
Jacob Bush, GlaxoSmithKline, UK
10.30
Flash poster presentations
11.00
Refreshments, exhibition and posters
11.30
Presentation title to be confirmed
Keynote: Regina Barzilay, Massachusetts Institute of Technology, USA
12.30
Lunch, exhibition and posters
14.00
Artificial intelligence for predicting molecular Electrostatic Potentials (ESPs): a step towards developing ESP-guided knowledge-based scoring functions
Prakash Rathi, Astex Pharmaceuticals, UK
14.30
Molecular transformer for chemical reaction prediction and uncertainty estimation
Alpha Lee, University of Cambridge, UK
15.00
Drug discovery disrupted - quantum physics meets machine learning
Noor Shaker, GTN, UK
15.30
Refreshments, exhibition and posters
16.00
Application of AI in chemistry: where are we in drug design?
Christian Tyrchan, AstraZeneca, Sweden
16.30
Presentation title to be confirmed
Anthony Nicholls, OpenEye Scientific Software, USA
17.30
Close
18.45
Drinks reception
19.15
Conference dinner

Tuesday, 3rd September
08.30
Refreshments
09.00
Deep generative models for 3D compound design from fragment screens
Fergus Imrie, University of Oxford, UK
09.30
DeeplyTough: learning to structurally compare protein binding sites
Joshua Meyers, BenevolentAI, UK
10.00
Discovery of nanoporous materials for energy applications
Maciej Haranczyk, IMDEA Materials Institute, Spain
10.30
Refreshments, exhibition and posters
11.00
Deep learning for drug discovery
Keynote: David Koes, University of Pittsburgh, USA
12.00
Networking lunch, exhibition and posters
14.00
Presentation title to be confirmed
Olexandr Isayev, University of North Carolina at Chapel Hill, USA
14.30
Dreaming functional molecules with generative ML models
Christoph Kreisbeck, Kebotix, USA
15.00
Refreshments, exhibition and posters
15.30
Presentation title to be confirmed
Keynote: Adrian Roitberg, University of Florida, USA
16.30
Close

You can get more information and register here https://www.maggichurchouseevents.co.uk/bmcs/AI-2019.htm.


RSC Elections

Voting for the Royal Society of Chemistry 2019 elections is now open and you should have been notified.

This year, they are holding elections for the following positions:

  • RSC President (one vacancy)
  • Elected Trustees (three vacancies)
  • Elected member of Professional Standards Board (one vacancy)
  • President of Analytical Division (one vacancy)
  • President of Chemistry Biology Interface Division (one vacancy)
  • President of Education Division (one vacancy)
  • President of Environment, Sustainability and Energy Division (one vacancy)
  • Elected member of Analytical Division Council (two vacancies)
  • Elected member of Education Division Council (two vacancies)
  • Elected member of Environment, Sustainability and Energy Division Council (two vacancies)
  • Elected member of Faraday Division Council (two vacancies)
  • Elected member of Materials Chemistry Division Council (two vacancies)
  • Elected member of Organic Division Council (two vacancies)

Voting closes at 17:00 (UK time) on Friday 21 June 2019 so I'd urge you to vote ASAP.

On a personal note, David Rees is standing for RSC President. I've known David for many years and I can't think of a better person to lead the RSC in these uncertain times. He is a really top class scientist with an excellent career in Drug Discovery, who has maintained contacts with academic research and held important roles within the RSC.


Comments

SilcsBio Software

 

A recent publication "Optimization and Evaluation of Site-Identification by Ligand Competitive Saturation (SILCS) as a Tool for Target-Based Ligand Optimization" DOI caught my eye. Predicting ligand binding affinities is a very challenging process and whilst free energy perturbation methods have proved useful they are very computationally demanding. SILCS looks to give similar accuracy but with reduced computational demands.

The software is available from SILCSBIO. Whilst it requires significant compute resources or access to a virtual cluster using Amazon Web Services, the SilcsBio Graphical User Interface (GUI), available for Mac OSX and Windows, lets you run SILCS and SSFEP simulations and analyse the results without using the command line. Visualisation of results uses VMD or PyMOL plugins.


Comments

Samson tutorials

 

I've been keeping an eye on Samson for a while now and, whilst we still wait for the version 1 release, the current version sports some interesting developments.

SAMSON is the quickly growing platform for molecular modeling. SAMSON's goal is to make it faster for everyone to design drugs, materials and nanosystems.

There are an increasing number of Elements

SAMSON Elements are modules for SAMSON that you add from SAMSON Connect. The first time you start SAMSON, some default SAMSON Elements are automatically installed

The documentation now also includes some tutorials; for example, the GROMACS Wizard Light helps you to easily run GROMACS simulations and get results as plots and simulation trajectories.

As I've mentioned before the Samson scripting API allows you to control Samson using Python.


Comments

Mnova 14 is released

 

Mnova 14 has just been released, with new features in most plugins: NMR, MS, NMRPredict, Screen, DB and Structure Elucidation.

We have added a new module, Mnova ElViS, for Electronic and Vibrational Spectroscopies, as we continue to add new analytical data that can be read, processed, archived and reported using Mnova. For those working on regulated markets and having to comply with 21 CFR Part 11 or GxP rules, Mnova 14 includes brand new Audit Trail and Digital signatures features


Comments

Amsterdam Modeling Suite 2019

 

The Amsterdam Modeling Suite 2019 (AMS2019) has been released.

Full details and release notes are here

Full documentation https://www.scm.com/doc/Documentation/index.html

Download binaries here https://www.scm.com/support/downloads/.


Comments

A review of BioTransformer

 

A recent publication described BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification DOI. There are a number of tools that predict sites of metabolism on a molecule, and I've mentioned a couple, FAME and SMARTCyp, in the past. These packages flag potential metabolic hot spots (mainly for CYP mediated metabolism) but don't attempt to provide any information on the putative metabolites.

BioTransformer combines a machine learning approach with a knowledge-based approach to predict small molecule metabolism, and then calculates properties for the putative metabolites to aid metabolite ID.

The full review of BioTransformer is here.


Comments

CGRtools: Python Library for Molecule, Reaction and Condensed Graph of Reaction Processing

 

CGRtools is a set of tools for processing reactions based on the Condensed Graph of Reaction (CGR) approach; details are on GitHub https://github.com/cimm-kzn/CGRtools. It has been published in JCIM DOI.

Basic operations:

  • Read/write/convert formats: MDL .RDF and .SDF, SMILES, .MRV
  • Standardize reactions and check structure validity.
  • Produce CGRs.
  • Perform subgraph search.
  • Build/correct molecules and reactions.
  • Produce template based reactions.

The stable version is available through PyPI

pip install CGRTools

Install the development (DEV) version of the CGRtools library for features that are not yet well tested

pip install -U git+https://github.com/cimm-kzn/CGRtools.git@master#egg=CGRtools

There is also a tutorial using Jupyter notebook https://github.com/cimm-kzn/CGRtools/tree/master/tutorial
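To make the Condensed Graph of Reaction idea a little more concrete, here is a minimal sketch in plain Python. It does not use CGRtools or its data model; the dictionary-of-bonds representation and the function names are assumptions made purely for illustration. Atom-mapped reactants and products are superimposed so that every bond carries its order before and after the reaction.

```python
def bond(a, b):
    """Key for an undirected bond between two atom-map numbers."""
    return frozenset((a, b))


def condensed_graph(reactant_bonds, product_bonds):
    """Superimpose mapped reactant and product bonds into one graph.

    Each input maps a bond key to a bond order. The result maps the same
    keys to (order_before, order_after); None means the bond is absent
    on that side of the reaction.
    """
    all_bonds = set(reactant_bonds) | set(product_bonds)
    return {b: (reactant_bonds.get(b), product_bonds.get(b)) for b in all_bonds}


def dynamic_bonds(cgr):
    """Keep only bonds that are formed, broken or change order."""
    return {b: orders for b, orders in cgr.items() if orders[0] != orders[1]}


# Toy example, heavy atoms only: bromoethane + hydroxide -> ethanol + bromide
# Atom maps (hypothetical): 1 = C, 2 = C, 3 = Br, 4 = O
reactants = {bond(1, 2): 1, bond(2, 3): 1}
products = {bond(1, 2): 1, bond(2, 4): 1}

cgr = condensed_graph(reactants, products)
changes = dynamic_bonds(cgr)
for b, (before, after) in sorted(changes.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(b), before, "->", after)
```

The C-C bond is present in the CGR but unchanged, while the C-Br and C-O bonds are flagged as dynamic; this single-graph view of a reaction is what the library builds on.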


Comments

MacOSX Market Share

 

I occasionally get questions from scientific software developers about MacOSX market share. Among visitors to my site it currently runs at 55% MacOSX, 23% Windows, 13% iOS, 4% Android, 4% Linux. However, I am aware that the readership is a very science-focussed audience and does not represent worldwide use. So I occasionally have a look at Statcounter to get a more general view, as shown below.

StatCounter

The mobile market is of course much larger and there are an increasing number of Mobile Science apps but at the moment it looks like Desktop/Laptop still dominates science.
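As a quick check on the site figures quoted above, the platform percentages can be grouped into desktop/laptop and mobile. The grouping (iOS and Android counted as mobile, everything else as desktop/laptop) is my assumption for this sketch.

```python
# Visitor share by platform, in percent, as quoted for this site
site_share = {"MacOSX": 55, "Windows": 23, "iOS": 13, "Android": 4, "Linux": 4}

# Assumption: iOS and Android are mobile, the rest are desktop/laptop
mobile_platforms = {"iOS", "Android"}

mobile = sum(share for name, share in site_share.items() if name in mobile_platforms)
desktop = sum(share for name, share in site_share.items() if name not in mobile_platforms)

print(f"Desktop/laptop: {desktop}%  Mobile: {mobile}%")
```

The quoted figures sum to 99% rather than 100%, presumably because of rounding.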


Comments

HELM notation in Jupyter Notebook

 

I was recently asked for a way to visualise HELM notation

HELM (Hierarchical Editing Language for Macromolecules) enables the representation of a wide range of biomolecules such as proteins, nucleotides, antibody drug conjugates etc. whose size and complexity render existing small-molecule and sequence-based informatics methodologies impractical or unusable.

The RDKit provides limited support for HELM notation (currently peptide) and a simple Jupyter Notebook provides an easy interface as shown here
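To show what a HELM string looks like, here is a minimal sketch in plain Python that splits a simple peptide HELM string into its polymers and connections. It is not a HELM implementation: the function name is hypothetical, and it only handles the simplest case (monomers separated by dots, no inline SMILES). In a notebook you would normally hand the string to the RDKit (Chem.MolFromHELM) rather than parse it yourself.

```python
import re


def parse_simple_helm(helm):
    """Split a simple HELM string into polymers and connections.

    Sections of a HELM string are separated by '$'. The first section
    lists polymers such as PEPTIDE1{A.G.C}; the second lists connections
    between monomers. Anything more advanced is ignored here.
    """
    sections = helm.split("$")
    polymers = {}
    for match in re.finditer(r"([A-Z]+\d+)\{([^}]*)\}", sections[0]):
        polymer_id, body = match.groups()
        polymers[polymer_id] = body.split(".")
    connections = []
    if len(sections) > 1:
        connections = [c for c in sections[1].split("|") if c]
    return polymers, connections


# A ten-residue peptide with a bond between the side chains of residues 8 and 3
helm = "PEPTIDE1{A.R.C.A.A.K.T.C.D.A}$PEPTIDE1,PEPTIDE1,8:R3-3:R3$$$"
polymers, connections = parse_simple_helm(helm)
print(polymers["PEPTIDE1"])
print(connections)
```

Residues 3 and 8 are both cysteine, so the single connection describes a disulfide bridge.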


Comments

Special Issue on Emerging Architectures in Computational Chemistry

 

This may be of interest: a Special Issue on Emerging Architectures in Computational Chemistry. Papers include:

Multithreaded parallelization of the energy and analytic gradient in the fragment molecular orbital method

OpenMP in VASP: Threading and SIMD

Field‐programmable gate arrays and quantum Monte Carlo: Power efficient coprocessing for scalable high‐performance computing

Coupled‐cluster singles, doubles and perturbative triples with density fitting approximation for massively parallel heterogeneous platforms

Domain‐specific virtual processors as a portable programming and execution model for parallel computational workloads on modern heterogeneous high‐performance computing architectures


Comments

infiniSee from BioSolveIT

 

BioSolveIT have issued a new desktop app that gives easy and graphical access to searches in infinitely large chemical spaces of tangible molecules. The new infiniSee finds molecules based on a fuzzy pharmacophore-like similarity to an input query.

Users can swiftly mine the 5 billion (5 x 10^9) molecule Enamine REAL Space (not to be confused with the ten times smaller REAL database), or use the company’s free KnowledgeSpace, which is based on publicly available building blocks and published reactions. More spaces are in the making, and the company has previously helped several pharmas create in-house spaces (Evotec, AstraZeneca, Merck etc.)

The output can either be purchased directly from BioSolveIT’s partner Enamine, or users can synthesize the results themselves with a very high likelihood of success due to the setup of the chemical spaces. Classical library searches can also be performed; these are processed quickly with parallel computing strategies that exploit multi-node architectures on standard computers and laptops.

Goodies include a forced match of user-selected subgroups; likewise very helpful is the 2D color-coding of molecular sketches, which helps the user understand the computed similarities. Results can be exported as SD files, remarkably with 3D coordinates on demand, or as CSV files for post-processing with a text editor. Queries may be masked to maintain IP safety.

Dark and light schemes are supported, and the software runs under Mac OSX, Windows 64bit, and Linux, not requiring any access to the external world, so that no user information is conveyed across the net.

The website and links to fully functional test installations are at https://www.biosolveit.de/infiniSee


Comments

Updating to Open Babel 3.0 from 2.x

 

Just a quick heads up, the update to version 3.0 breaks the API in a number of cases, and introduces some new behaviour behind-the-scenes.

For full details see https://open-babel.readthedocs.io/en/latest/UseTheLibrary/migration.html.

In particular:

  • The babel executable has been removed, and obabel should be used instead.
  • In OB 3.x, both openbabel.py and pybel.py live within the openbabel module.
  • There are also changes to the handling of elements, atoms and bond flags, and aromaticity.
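For scripts that need to run against both versions, the change in module layout can be handled with a guarded import. This is a minimal sketch; the helper name import_openbabel is my own, and it simply returns None values if Open Babel is not installed at all.

```python
def import_openbabel():
    """Return (openbabel, pybel, layout) for whichever version is installed.

    layout is "3.x" when both modules live inside the openbabel package,
    "2.x" for the older top-level modules, or None if neither import works.
    """
    try:
        # Open Babel 3.x: both modules live within the openbabel package
        from openbabel import openbabel as ob
        from openbabel import pybel
        return ob, pybel, "3.x"
    except ImportError:
        pass
    try:
        # Open Babel 2.x: separate top-level modules
        import openbabel as ob
        import pybel
        return ob, pybel, "2.x"
    except ImportError:
        return None, None, None


ob, pybel, layout = import_openbabel()
print("Open Babel Python layout:", layout)
```

Trying the 3.x layout first avoids relying on the old top-level imports, which are exactly the part of the API that moved.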



Comments

AmberTools19 is now available

The latest update to AmberTools is now available.

AmberTools consists of several independently developed packages that work well by themselves, and with Amber18 itself. The suite can also be used to carry out complete molecular dynamics simulations, with either explicit water or generalized Born solvent models.

AmberTools19 (released on April 25, 2019) consists of the following major codes:

  • NAB/sff: a program to build molecules, run MD or apply distance geometry restraints using generalized Born, Poisson-Boltzmann or 3D-RISM implicit solvent models
  • antechamber and MCPB: programs to create force fields for general organic molecules and metal centers
  • tleap and parmed: basic preparatory tools for Amber simulations
  • sqm: semiempirical and DFTB quantum chemistry program
  • pbsa: performs numerical solutions to Poisson-Boltzmann models
  • 3D-RISM: solves integral equation models for solvation
  • sander: workhorse program for molecular dynamics simulations
  • mdgx: a program for pushing the boundaries of Amber MD, primarily through parameter fitting. Also includes customizable virtual sites and explicit solvent MD capabilities.
  • cpptraj and pytraj: tools for analyzing structure and dynamics in trajectories
  • MMPBSA.py and amberlite: energy-based analyses of MD trajectories


Comments

Take a screenshot of math and paste the LaTeX into your editor

 

Mathpix have developed an app (Snip) that automates the tedious aspects of typing documents containing math, and they also provide an API (MathpixOCR) for developers to integrate OCR capabilities into their own application.

Take a screenshot of math and paste the LaTeX into your editor, all with a single keyboard shortcut.

This also works for hand drawn equations

https://twitter.com/cosmic_mar/status/1119360017151315968


Comments

SeeSAR 9 released

 

SeeSAR, a structure-based design tool, has been updated.

Version 9 represents another major leap in SeeSAR's evolution, fully adopting the 'modes' concept. Molecules can be transferred freely between modes as you carry out various different tasks. This gives you much more flexibility while maintaining a structured overview. To help you keep track of where you are, we distinguish the modes using a beautiful backlit color scheme, focused top center on the mode switch but found throughout the tool to guide navigation.


Comments

Science Foods

 

Given that this is the International Year of the Periodic Table, it is not surprising that we are seeing many interesting ways to highlight it.

This post on Instagram caught my eye recently.

https://www.instagram.com/p/BwcSwd7B5Gs/?



Comments

Mathematica version 12 released

 

I'm sure many will have noticed the release of Mathematica version 12. The website contains details of the many new features, but I happened to notice there were a couple of Chemistry features, including:

The Molecule is a symbolic representation of a chemical species and is a fully computable first-class member of the Wolfram Language. More details here.


Comments

Mobile Science site

 

There are now 250 entries in the Mobile Science Database covering every scientific discipline.

The most often viewed apps are


Comments

ChemDoodle 3D v4 is available

The latest update to the popular chemical 3D modelling and visualisation tool ChemDoodle 3D is out.

ChemDoodle® 3D is a molecular modeling and scientific visualization platform with a focus on user customizability and universal support. Just like its 2D counterpart, all of the graphics are fully customizable and controllable. The large feature set is well organized for intuitive access and we develop ChemDoodle 3D to work with the vast majority of graphics cards in use.

This is a major update to ChemDoodle 3D; some notable features are:

  • Major update to the molecular modelling engine. A very accurate implementation of the MMFF94 and MMFF94s force fields can now be used by the Minimizer widget when building molecules.
  • You can now set which optimizer function is used, between steepest descent, conjugate gradients and BFGS.
  • The Minimizer widget now presents an option to display force field specific atom typing labels for the referenced structure. For the MMFF94 force fields, both the generic Numeric and specific Symbolic atom types are properly attributed in ChemDoodle 3D.
  • An advanced system has been added to place new bond connections to atoms in the most optimal location in 3D around the atom. Use this to efficiently create molecules and quickly build coordination complexes.
  • A new widget for creating atomic orbital graphics.
  • Connolly surfaces and surface colour functions.
  • New model types for proteins, including an advanced cylinder and plank model as well as a cartoon model.
  • Generation of armchair/zigzag/chiral carbon nanotube geometries.
  • Significant improvements to the MMTF interpreters, fixing all reported and known issues. Improvements to the CIF interpreters to read a wider range of files. Also fixed a centering issue with CIF input.
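For readers unfamiliar with the optimizer options mentioned above, here is a minimal sketch of steepest descent, the simplest of the three. It is not ChemDoodle's implementation and does not use MMFF94; it minimises a single harmonic bond-stretch term, and the force constant, reference length and step size are illustrative assumptions.

```python
def harmonic_energy(r, k=300.0, r0=1.53):
    """Harmonic bond-stretch energy 0.5 * k * (r - r0)**2 (toy parameters)."""
    return 0.5 * k * (r - r0) ** 2


def harmonic_gradient(r, k=300.0, r0=1.53):
    """Derivative of the harmonic energy with respect to the bond length."""
    return k * (r - r0)


def steepest_descent(r, step=0.001, tol=1e-6, max_iter=1000):
    """Repeatedly move downhill along the negative gradient until it vanishes."""
    for iteration in range(max_iter):
        grad = harmonic_gradient(r)
        if abs(grad) < tol:
            break
        r -= step * grad
    return r, iteration


# Start from a stretched bond and relax it towards the reference length
r_min, n_steps = steepest_descent(1.80)
print(f"relaxed bond length {r_min:.4f} after {n_steps} steps")
```

Conjugate gradients and BFGS follow the same loop but choose smarter search directions, which usually means fewer energy and gradient evaluations on real molecules.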


Comments

A review of alvaDesc

 

alvaDesc is a desktop tool for the calculation of a wide range of molecular descriptors and a number of molecular fingerprints from https://www.alvascience.com. alvaDesc can be used to determine over 5000 different descriptors (the full list is here).

It can be accessed via the command line or via a GUI.

3Dplot

The complete review is here.



Comments

Chemfiles 0.9

 

Just got this message

We are very happy to announce the release of the 0.9 version of Chemfiles. Chemfiles is a C++ library providing write and read access to chemistry file formats. Chemfiles also has bindings to other languages and can be used from C, Fortran, Python, Julia and Rust.

Source code is available on GitHub, and the library is described in detail here DOI.

It can be installed using Conda

conda install -c conda-forge chemfiles

There are other libraries for file conversion, in particular OpenBabel, a C++ library providing conversions between more than 110 formats.


Comments

A Quick look at Flare and Python

I recently wrote a review of Flare Version 2, a recent extension to the Cresset portfolio that introduces Electrostatic Complementarity (EC), i.e. a comparison of electrostatics on both the small molecule ligand and the target protein. In addition, Flare version 2 includes a new Python API that allows users to automate tasks by scripting and to integrate with other Python packages such as the RDKit cheminformatics toolkit, Python modules for graphing and statistics (NumPy, SciPy, MatPlotLib), and Jupyter notebooks. It is this aspect of Flare that is the subject of this review.


Comments

Open Forcefield 0.2

 

The 0.2.0 release of the Open Force Field Toolkit, featuring RDKit support and the new-and-improved SMIRNOFF v0.2 force field spec has been announced. https://openforcefield.org/news/openforcefield-toolkit-0.2-release/.

We're excited to announce the public release of the Open Force Field toolkit version 0.2.0! Most notably, this release adds the ability to assign SMIRNOFF parameters and AM1-BCC charges with a completely open-source backend, adding support for the RDKit and AmberTools via a new ToolkitWrapper infrastructure that can be extended in the future to support additional cheminformatics toolkits. The OpenEye Toolkit will continue to be supported, as well as used internally in our parameter-fitting pipelines in the short term. We're extremely grateful to the long list of contributors that have made this release possible, especially Shuzhe Wang from the Riniker group for piloting much of the RDKit functionality.


Comments

Python Collection

 

I was sent this recently.

Python Collection

This collection publishes articles describing new Python modules and libraries, as well as applications developed in Python. Python is a free, open source programming language with an emphasis on readability which is widely used in science due to its ease of use and high-performance. Python’s usefulness in research is further bolstered by scientific libraries and tools such as Numpy, Scipy, Pandas, IPython and MatPlotlib. As for example demonstrated by Biopython, Python libraries can be incredibly valuable to other researchers. Publishing a citable, peer reviewed article outlining a new package boosts its visibility and enables its creators to receive proper credit for their contribution.

Very little there at present but I'll keep an eye on it for the future.


Comments

Backup, backup, backup

 

This is often a time of year when people do some spring cleaning. Of course it is always too easy to delete something that you may need later, so having a good backup strategy in place is strongly recommended.

I was reminded when I saw the article in Nature recently "11 ways to avert a data-storage disaster. Hard-drive failures are inevitable, but data loss doesn’t have to be" Link.

I have three types of backup. The first is an archive that stores critical versions of files; I keep copies in Time Machine, on an external server, and on Amazon Web Services for external storage.

The second runs every evening, generates a copy of the hard drives and stores them on Amazon Web Services.

The final type is the everyday stuff, and for that I use Time Machine. This gives me virtually instant access to accidentally deleted or corrupted files, and it is stored on an external Synology server.

I recently had to restore a machine from a backup and was delighted to be up and running in a couple of hours.


Comments

Javascript Molecule Viewers

 

In the dim and distant past the only option for viewing molecules (particularly biomolecules) was a Unix workstation and specialist software. With the advent of web technologies a number of Java applets were developed that enabled users to visualise proteins etc. within the web browser. However, due to security concerns Java applets have now been discontinued and a range of JavaScript-based molecular viewers have been developed. This is a list of the viewers that I have come across; please feel free to contact me with any I've missed.

A list of Javascript/WebGL molecule viewers

Structure displayed using 3Dmol.js.

Mouse Controls

Movement      Mouse Input                                            Touch Input
Rotation      Primary Mouse Button                                   Single touch
Translation   Middle Mouse Button or Ctrl+Primary                    Triple touch
Zoom          Scroll Wheel or Second Mouse Button or Shift+Primary   Pinch (double touch)
Slab          Ctrl+Second                                            Not Available

You might also be interested in these pages

Open Source Cheminformatics Toolkits

Open Source Python Data Science Libraries


Comments

QUBEKit: QUantum BEspoke FF toolKit

 

Just saw an interesting paper "QUBEKit: Automating the Derivation of Force Field Parameters from Quantum Mechanics" DOI.

QUBEKit is a Python-based force field derivation toolkit that allows users to derive accurate molecular mechanics parameters directly from quantum mechanical calculations.

Code is available on GitHub QUBEKit, and there is a user tutorial on the Wiki Page.

Requirements:

  • Anaconda3
  • Biochemical and Organic Simulation System (BOSS)
  • OpenMM
  • Gaussian09
  • ONETEP
  • Matlab 2017

Python modules used:

  • numpy
  • argparse
  • collections
  • colorama
  • matplotlib

Comments

Swift 5 Released

 

Swift 5 is a major milestone in the evolution of the language.

Swift 5 makes shipping apps dramatically better. The Swift runtime is now built right in to iOS, macOS, watchOS and tvOS. Your app no longer needs to bundle this library for these latest OS releases. And with great App Store support, your users will get faster downloads and smaller apps.

Additional Features in Swift 5

  • String reimplemented with UTF-8 encoding which can often result in faster code
  • Exclusive access to memory is now enforced by default on debug and release builds
  • SIMD Vector and Result types added to the Standard Library
  • Performance improvements to Dictionary and Set
  • Support for dynamically callable types to improve interoperability with dynamic languages such as Python, JavaScript and Ruby


Comments

Py-ChemShell 2019 released

 

Just got this notice.

We are pleased to announce Py-ChemShell 2019 (v19.0), the first beta release of the Python-based version of ChemShell. Py-ChemShell 2019 offers new interfaces to ORCA and DL_POLY 4, a complete task-farmed parallelisation framework (including parallel finite difference gradients), RESP charge fitting procedures, and case studies for problems in materials modelling.

Py-ChemShell can be downloaded free of charge under the open source GNU LGPL v3 licence from this site.

ChemShell is a scriptable computational chemistry environment for multiscale modelling. While it supports standard quantum chemical or force field calculations, its main strength lies in hybrid QM/MM calculations. The concept is to leave the time-consuming energy evaluation to external specialised codes, while ChemShell takes over the communication and data handling.

ChemShell provides interfaces to a variety of QM and MM codes, including:

  • GAMESS-UK
  • NWChem
  • FHI-aims
  • DALTON
  • MNDO
  • TURBOMOLE
  • Orca
  • Molpro
  • Gaussian
  • DMol3
  • Q-Chem
  • DL_POLY
  • GULP
  • CHARMM
  • GROMOS

Comments

The Chemfp Project

 

The Chemfp project started as a way to promote the FPS format for cheminformatics fingerprint exchange and has evolved into a set of command-line tools and a Python library for fingerprint generation and high-performance similarity search. The 10 years of work and research results of the chemfp project have now been described in an excellent publication.

I looked at Chemfp when comparing various options for clustering large datasets and Chemfp was one of the highest performing, and Andrew Dalke was very responsive to questions.
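To give a flavour of the FPS format and of a similarity search, here is a minimal sketch in plain Python. It does not use chemfp and is nowhere near as fast; the three 16-bit fingerprints and their identifiers are made up for illustration. An FPS file is plain text: header lines start with '#', and each record is a hex-encoded fingerprint, a tab, and an identifier.

```python
# Toy FPS data with hypothetical 16-bit fingerprints
FPS_TEXT = "#FPS1\n#num_bits=16\nf00f\tmol_A\nf0ff\tmol_B\n000f\tmol_C\n"


def parse_fps(text):
    """Parse FPS text into a list of (identifier, fingerprint-as-int)."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/metadata lines
        hex_fp, identifier = line.split("\t")[:2]
        records.append((identifier, int(hex_fp, 16)))
    return records


def tanimoto(a, b):
    """Tanimoto similarity: bits in common divided by bits set in either."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


records = parse_fps(FPS_TEXT)
query_id, query_fp = records[0]

# Rank the remaining fingerprints by similarity to the query
hits = sorted(
    ((tanimoto(query_fp, fp), identifier) for identifier, fp in records[1:]),
    reverse=True,
)
for score, identifier in hits:
    print(f"{query_id} vs {identifier}: {score:.3f}")
```

Because Tanimoto only counts set bits, the order in which the hex digits map onto bits does not affect the score in this sketch.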


Comments

Workshop on Computational Tools for Drug Discovery

Workshop on Computational Tools for Drug Discovery (with SCI).
10 April 2019, The Studio, Birmingham.
https://www.soci.org/events/scirsc-workshop-on-computational-tools-for-drug-discovery

Details of the workshops

Attendees will be able to choose 4 of the 6 sessions.

Optibrium: Guided multi-parameter optimisation of 2D and 3D SAR

In this workshop, we will explore the concept of multi-parameter optimisation (MPO) and its application to quickly target high-quality compounds with a balance of potency and appropriate absorption, distribution, metabolism and excretion (ADME) properties. We will further illustrate how this concept can be combined with an understanding of 2D and 3D structure-activity relationships (SAR) to guide the design of new, improved compounds.

The workshop will be based on practical 'hands-on' examples using our StarDrop™ software and all participants will get a 1-month free trial license to use StarDrop following the workshop. For more information on StarDrop, please visit our website or watch some videos of StarDrop in action at www.optibrium.com/community/videos.

Cresset: Next generation structure-based design with Flare

Learn how simple structure-based design can be within small molecule discovery projects. The workshop will cover ligand design in the protein active site, Electrostatic Complementarity™ maps and scores, ensemble docking of ligands with Lead Finder, calculations of water stability and locations using 3D-RISM, energetics of ligand binding using WaterSwap and use of Python extensions. Applications you will use: Flare™ , Lead Finder™.

Dotmatics: Data visualisation and analysis with Dotmatics

Dotmatics offers a comprehensive scientific software platform for knowledge management, data storage, enterprise searching and reporting. The focus of the workshop will be the Dotmatics visualisation and data analysis software in small molecule drug discovery workflows around compound selection from vendor catalogues and analysis of lead optimisation datasets as typically found in drug discovery.

BioSolveIT: Fast – Visual – Easy – computer-aided drug design for all chemists

In this workshop you will learn - hands-on - to use modern software for hit-finding, hit-to-lead and lead optimization. We will walk you around the drug discovery cycle and show you: how to assess your protein and discover a binding site; how simple modifications to the bound molecule affect the binding affinity; how to replace a scaffold or explore sub-pockets for improved binding; how to keep all your key ADME-parameters in check, while you optimize your lead; and last but not least how to quickly find new starting points in a giant 3.8 billion vendor catalog of compounds ready for purchase.

Instead of dry theory, we will explain those use cases based on real-world scenarios and interesting targets such as Thrombin, BTK, Endothiapepsin and BRD4. Bring your own laptop to try this out for yourself right away and receive the software as well as a free trial license on top. The Software tools are called:

SeeSAR – "modeling for all chemists" and REAL Space Navigator – "the world’s largest searchable catalog of compounds on demand".

Knime: An interactive workflow for hit list triaging

In this workshop I will introduce a workflow built using the open source KNIME Analytics Platform for doing hit-list triaging and selecting compounds for confirmatory assays or other followup testing. We will use a real-world HTS dataset and work through reading the data in, flagging molecules that are likely to have interfered with the assay, manual "rescue" of compounds removed by the filters, and selecting a compound subset that covers the chemical diversity of the hits yet still allows learning some SAR from subsequent experiments. Participants will be provided with both the dataset and the workflow used during the workshop so that they can adapt it to their own needs.

ChemAxon: Computational intelligence driven drug design

The most recent era of vast data sources, rapid data processing and model building enables drug designers to propose high quality structures in ideation phase in lean ("fail-early") discovery cycles. The goal of this workshop is to demonstrate an integrated system (Marvin Live) to:

  • freely create, store and manage ideas
  • utilize computational models such as phys-chem properties, 3D alignment, predictive models (created in KNIME)
  • exploit existing evidence (MMP, various data sources) during the design session.

The dynamic plugin system facilitates balancing attributes through comparison and triage of hypothetical compounds on a single interface.

Comments

Cambridge Structural Database 2019

 

The Cambridge Crystallographic Data Centre (CCDC) has announced the first CSD data and software update of 2019.

The 2019 CSD Release contains 957,868 unique structures and 973,630 entries (CSD version 5.40) – an increase of more than 57,000 entries. We are currently on course to reach a million structures by summer 2019.

The update includes an exciting new polyhedra display option in our visualisation software Mercury.

Read more here….


Comments

CICAG meetings 2019

 

Meetings for 2019 that CICAG (http://www.rsccicag.org) is involved with.

Workshop on Computational Tools for Drug Discovery (with SCI).
10 April 2019, The Studio, Birmingham.
https://www.soci.org/events/scirsc-workshop-on-computational-tools-for-drug-discovery
A great opportunity to get hands-on training to get you started on a variety of important software tools. All software and training materials required for the workshop will be provided for attendees to install and run on their own laptops and use for a limited period afterwards.

Eighth Joint Sheffield Conference on Chemoinformatics, The Edge, University of Sheffield, UK, Monday 17th – Wednesday 19th June, 2019.
https://cisrg.shef.ac.uk/shef2019/
CICAG are really delighted to be sponsoring this meeting.

AI in chemistry (with RSC-BMCS).
Two-day meeting to be held at Fitzwilliam College, Cambridge on 2nd and 3rd September 2019.
https://www.maggichurchouseevents.co.uk/bmcs/AI-2019.htm
The first, very successful meeting in London was heavily oversubscribed. The closing date for oral abstracts is 31 March and for posters 5 July.

Post-grad Cheminformatics/CompChem symposium, Wednesday 4th Sept 2019 Cambridge Chemistry Dept.
Opportunity for post-grads to meet and present their work. Keep the date free; meeting details will be published soon. The Cambridge Cheminformatics Network meeting will immediately follow, so why not make a day of it.

20 years of Ro5 (with RSC-BMCS).
Wednesday, 20th November 2019, Sygnature Discovery, BioCity, Nottingham, UK.
It has been over 20 years since Lipinski published his work determining the properties of drug molecules associated with good solubility and permeability. Since then, there have been a number of additions and expansions to these “rules”. There has also been keen interest in the application of these guidelines in the drug discovery process and how these apply to new emerging chemical structures such as macrocycles. This symposium will bring together researchers from a number of different areas of drug discovery and will provide a historical overview of the use of Lipinski’s rules as well as look to the future and how we use these rules in the changing drug compound landscape. Details will be on https://www.maggichurchouseevents.co.uk/bmcs/ in the near future.

Comments

Happy birthday World Wide Web

 

The Google Doodle today celebrates the birth of the world wide web. It is a shame, however, that they use a generic PC icon rather than the computer on which the web was first built, a NeXT Cube.

A NeXT Computer and its object oriented development tools and libraries were used by Tim Berners-Lee and Robert Cailliau at CERN to develop the world's first web server software, CERN httpd, and also used to write the first web browser, as shown in the image below.

First_Web_Server

CERN are running a webinar to celebrate the event.

Welcome and Introduction

  • Welcome by Anna Cook - master of ceremonies

  • Opening talk by Fabiola Gianotti - CERN Director General

Let’s Share What We Know - panel discussion

This session highlights the importance of sharing what we know in the context of the early days of the Web. The Web has had a huge influence on the way we collaborate and share knowledge in society as a whole. Collaboration and sharing knowledge were also core values at the heart of its early evolution.

Chair: Frédéric Donck

Speakers: Tim Berners-Lee, Robert Cailliau, Jean-François Groff, Lou Montulli, Zeynep Tufekci

For Everyone - conversation

The Web was designed For Everyone!

Conversation between Sir Tim Berners-Lee and Bruno Giussani

Towards the Future - panel discussion

This session will focus on what the evolution of technology can bring us.

Chair: Bruno Giussani

Speakers: Doreen Bogdan-Martin, Jovan Kurbalija, Monique Morrow, Zeynep Tufekci

Closing Remarks

  • Closing remarks by Charlotte Warakaulle - CERN Director for International Relations
Comments

Review of Flare version 2

 

Cresset provide a variety of software packages to support small molecule design, built on the foundation of their XED forcefield. When I first reviewed several Cresset products (FieldView, FieldAlign and Forge), the forcefield was only applicable to small molecules. However, the forcefield has been constantly developed and can now be applied to proteins.

Flare Version 2 is a recent extension to the portfolio with the introduction of Electrostatic Complementarity (EC), i.e. a comparison of electrostatics on both the small molecule ligand and the target protein DOI.

Electrostatic interactions between small molecules and their respective receptors are essential for molecular recognition and are also key contributors to the binding free energy. Assessing the electrostatic match of protein-ligand complexes therefore provides important insights into why ligands bind and what can be changed to improve binding. Ideally, ligand and protein electrostatic potentials at the protein-ligand interaction interface should maximize their complementarity while minimizing desolvation penalties.

In addition, Flare version 2 includes a new Python API that allows users to automate tasks by scripting and to integrate with other Python packages, such as the RDKit cheminformatics toolkit and Python modules for graphing and statistics (NumPy, SciPy, Matplotlib), as well as with Jupyter notebooks.

waterswapview

Flare gives access to a very powerful set of tools designed to aid ligand binding, docking, electrostatic modelling and WaterSwap, all within a well thought-out interface. The storyboard feature also allows the user to store snapshots of progress and coupled with the log acts like a notebook.

You can read the full review here.

Comments

Chembience updated

 

Update to RDKit 2018.09.2 and Postgres 10.7.

Chembience is a Docker based platform supporting the fast development of chemoinformatics-centric web applications and microservices. It creates a clean separation between your scientific web service implementation and any host-specific or infrastructure-related configuration requirements.


Comments

Chemical Structure Association Trust grants

 

Application deadline for the 2019 Grant is April 19, 2019. Successful applicants will be notified no later than May 24, 2019.

The Grant Program has been created to provide funding for the career development of young researchers who have demonstrated excellence in their education, research or development activities that are related to the systems and methods used to store, process and retrieve information about chemical structures, reactions and compounds. One or more Grants will be awarded annually up to a total combined maximum of ten thousand U.S. dollars ($10,000). Grantees have the option of payments being made in U.S. dollars or in British Pounds equivalent to the U.S. dollar amount. Grants are awarded for specific purposes, and within one year each grantee is required to submit a brief written report detailing how the grant funds were allocated. Grantees are also requested to recognize the support of the Trust in any paper or presentation that is given as a result of that support.

There are more details of the requirements on the website

The 2018 awards went to


Stephen Capuzzi, Division of Chemical Biology and Medicinal Chemistry at the University of North Carolina Eshelman School of Pharmacy, Chapel Hill (USA), was awarded a Grant to attend the 31st ICAR in Porto, Portugal from 06/11/2018 to 06/15/2018, where he presented his research entitled “Computer-Aided Discovery and Characterization of Novel Ebola Virus Inhibitors.”

Christopher Cooper, Cavendish Laboratory, University of Cambridge, UK, was awarded a Grant to present his current research on systematic, high-throughput screening of organic dyes for co-sensitized dye-sensitized solar cells. He presented his work at the Solar Energy Conversion Gordon Research Conference and Seminar held June 16-22, 2018 in Hong Kong.

Mark Driver, Chemistry Department, University of Cambridge, UK, was awarded a Grant to offset costs to attend the 7th EuCheMS conference where he will present a poster on his research that focuses on the development and applications of a theoretical approach to model hydrogen bonding.

Genqing Wang, La Trobe Institute for Molecular Sciences, La Trobe University, Australia, was awarded a Grant to present his work at the Fragment-Based Lead Discovery Conference (FBLD2018) in San Diego, USA in October 2018. The current focus of his work is the development of novel anti-virulence drugs which potentially overcome the problems of antibiotic resistance of Gram-negative bacteria.

Roshan Singh, University of Oxford, UK, was awarded a Grant to conduct research within Dr. Marcus Lundberg’s Group at Uppsala University, Sweden, as part of a collaboration that he has set up between them and Professor Edward Solomon’s Group at Stanford University, California. He conducts research within Professor John McGrady’s group at the University of Oxford. The collaboration will look to consolidate the experimental studies on heme Fe(IV)=O complexes currently being carried out by Solomon’s Group with future multi-reference calculations to be conducted within Lundberg’s Group.


Comments

Counting Identical structures in two datasets

 

Sometimes I have two datasets and I just want to know the overlap of identical structures. This Vortex script counts the number of identical structures by comparing InChIKeys. It then displays a matrix showing how many unique molecules are in each dataset and how many molecules are in both datasets.

results
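The counting logic can be sketched in a few lines of plain Python. This is not the Vortex script itself, just a minimal illustration of the set comparison, assuming each dataset has already been reduced to a list of InChIKey strings. The function name `overlap_matrix` and the example data are made up for illustration.

```python
def overlap_matrix(keys_a, keys_b):
    """Count unique and shared structures from two lists of InChIKeys."""
    # Converting to sets removes duplicates within each dataset
    set_a, set_b = set(keys_a), set(keys_b)
    return {
        "only_a": len(set_a - set_b),  # structures found only in dataset A
        "only_b": len(set_b - set_a),  # structures found only in dataset B
        "both": len(set_a & set_b),    # structures present in both datasets
    }


# Illustrative data only: InChIKeys for aspirin, caffeine and ethanol
dataset_a = [
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",  # aspirin
    "RYYVLZVUVIJVGH-UHFFFAOYSA-N",  # caffeine
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",  # aspirin again (duplicate)
]
dataset_b = [
    "RYYVLZVUVIJVGH-UHFFFAOYSA-N",  # caffeine
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",  # ethanol
]

print(overlap_matrix(dataset_a, dataset_b))
```

Because the comparison is on the full InChIKey, stereoisomers and different protonation states count as different structures in this sketch.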

Comments

Data Extractor

 

Data Extractor has been updated to version 1.7.1 with a number of internal improvements.

Data Extractor lets you extract data contained inside text documents and collect it in an internal table organized into fields and records. It can parse all the text files you specify and analyze them, working out from text tags what to extract and where to put it.

Data Extractor requires Mac OS X 10.10 or later.

There are more Data Analysis tools here.

Comments

MOE 2019.01 released

 

The 2019.01 release of Chemical Computing Group's Molecular Operating Environment software includes a variety of new features and enhancements.

Full release notes are here.

The release includes:

  • Calculate and Analyze pH-Dependent Protein Properties
  • MOEsaic Session Sharing and Project Customization
  • Determine Conformation Population from NMR NOE Data
  • Predict Relative Binding Energies with AMBER Thermodynamic Integration

Worth noting there is an updated Version of Flexera License Manager.

MOE now uses an updated version of the Flexera license manager. The license manager server components lmgrd, chemcompd, and lmutil have all been updated to version 11.16.0.0. Note that older versions of MOE will continue to run with updated license manager servers.


Comments

Update to MayaChemTools

 

I just heard that the following command line scripts, available as part of the MayaChemTools package, now implement multiprocessing functionality:

  • RDKitCalculateMolecularDescriptors.py
  • RDKitCalculatePartialCharges.py
  • RDKitGenerateConformers.py
  • RDKitFilterChEMBLAlerts.py
  • RDKitFilterPAINS.py
  • RDKitPerformMinimization.py
  • RDKitRemoveSalts.py
  • RDKitSearchSMARTS.py


Comments

SCI-RSC Workshop on Computational Tools for Drug Discovery

 

I'm delighted to report that this meeting seems to be filling up fast.

All scientists working in drug discovery need tools and techniques for handling chemical information. This workshop offers a unique opportunity to try out a range of software packages for themselves with expert tuition in different aspects of pre-clinical drug discovery. Attendees will be able to choose from sessions covering data processing and visualisation, ligand and structure-based design, or ADMET prediction, all run by the software providers. All software and training materials required for the workshop will be provided for attendees to install and run on their own laptops and use for a limited period afterwards.

More details

Presentations from Optibrium / Cresset / Dotmatics / BioSolveIT / KNIME / ChemAxon

Comments

BBEdit Updated

 

BBEdit 12.6 has been released and this is a very significant update. BBEdit is now a sandboxed application, which means there are a number of changes to the way permissions are handled.

It is well worth reading the Release Notes which offer a very detailed explanation of the situation.

Without unrestricted access to your files and folders, many of BBEdit’s most useful features, from the basic to the most powerful, won't work at all; or they may misbehave in unexpected ways. At the very least, this hinders your ability to get work done.

In order to resolve this fundamental conflict between security and usability, we have devised a solution in which BBEdit requests that you permit it the same sort of access to your files and folders that would be available to a non-sandboxed version.

There are also many additions, changes and fixes.

Comments

Data curation workflow

 

One of the most time-consuming parts of any data analysis is curating the input data prior to any model building. This KNIME workflow is fully documented and described, and as such is an invaluable starting point.

A semi-automated procedure is made available to support scientists in data preparation for modelling purposes. The procedure addresses:

  • Automatic chemical data retrieval (i.e., SMILES) from different, orthogonal web-based databases, by using two different identifiers, i.e. chemical name and CAS registration number. Records were scored based on the coherence of information retrieved from different web sources.
  • Data curation procedure performed on top-scored records. The procedure includes removal of inorganic and organometallic compounds and mixtures, neutralization of salts, removal of duplicates, checking of tautomeric forms.
  • Standardization of chemical structures, yielding ready-to-use data for the development of QSARs.
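To give a feel for the "score by coherence, then keep the top-scored records" idea, here is a minimal plain-Python sketch. It is not the KNIME workflow. The scoring rule (the fraction of responding sources that agree), the `min_score` threshold, the source labels and the function names are all assumptions made for illustration. It also assumes the SMILES have already been canonicalised, so that comparing them as strings is meaningful.

```python
from collections import Counter


def score_record(smiles_by_source):
    """Return (consensus_smiles, score) for one compound.

    smiles_by_source maps a source label to the SMILES it returned (or None).
    The score is the fraction of responding sources that agree with the most
    common answer; this rule is an assumption, not the published one.
    """
    answers = [s for s in smiles_by_source.values() if s]
    if not answers:
        return None, 0.0
    consensus, votes = Counter(answers).most_common(1)[0]
    return consensus, votes / len(answers)


def curate(records, min_score=0.6):
    """Keep top-scored records and drop duplicate structures."""
    seen, kept = set(), []
    for name, smiles_by_source in records.items():
        smiles, score = score_record(smiles_by_source)
        if smiles is None or score < min_score or smiles in seen:
            continue  # no data, incoherent sources, or a duplicate structure
        seen.add(smiles)
        kept.append((name, smiles, score))
    return kept


# Hypothetical lookups: keys are "<database>_<identifier type>"
records = {
    "ethanol": {"db1_name": "CCO", "db1_cas": "CCO", "db2_name": "CCO"},
    "ethyl alcohol": {"db1_name": "CCO", "db2_name": "CCO"},  # duplicate
    "ambiguous": {"db1_name": "CCN", "db2_name": "CCC"},      # sources disagree
}

print(curate(records))
```

The real workflow goes on to remove inorganics and mixtures, neutralise salts and check tautomers, steps that need a cheminformatics toolkit and are left out here.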

Comments

The official release of GROMACS 2016.6 is available

 

This release fixes remaining issues found since version 2016.5. All users of the 2016 series are encouraged to update to 2016.6. Please see the link to the release notes below for more details.

You can find the code, documentation, release notes, and test suite at the links below.

Code: ftp://ftp.gromacs.org/pub/gromacs/gromacs-2016.6.tar.gz
Documentation: http://manual.gromacs.org/documentation/2016.6/index.html (including release notes, install guide, user guide, reference manual)
Test Suite: http://gerrit.gromacs.org/download/regressiontests-2016.6.tar.gz


Comments

PyMOL 2.3 released

 

Just got this message

We are happy to announce the release of PyMOL 2.3. Download ready-to-use bundles from https://pymol.org/2/ or update your installation with "conda install -c schrodinger pymol". New features include:

  • Atom-level cartoon transparency
  • Fast MMTF export
  • Sequence viewer gaps display

This is the first time there are PyMOL bundles with Python 3. If you use custom or third-party Python 2 scripts, they might stop working until you convert them.
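For anyone with old scripts, the changes needed are often small. The sketch below shows a made-up helper (it is not taken from any real PyMOL script) with the Python 2 idioms in comments and the Python 3 equivalents in the code.

```python
# Python 2 original (will not run under Python 3):
#   print "residue", name, n
#   for name, n in counts.iteritems(): ...
#   half = total / 2          # integer division in Python 2


def summarise(counts):
    """Python 3 version of a small, hypothetical residue-counting helper."""
    total = sum(counts.values())
    half = total // 2                  # floor division must now be explicit
    names = sorted(counts.keys())      # keys() returns a view, so sort into a list
    for name, n in counts.items():     # items() replaces iteritems()
        print("residue", name, n)      # print is a function in Python 3
    return names, half


print(summarise({"ALA": 3, "GLY": 2}))
```

The most common culprits are print statements, `iteritems()`-style dictionary methods and integer division, so these are the first things to check when a script stops working.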

Full release notes are here: https://pymol.org/dokuwiki/?id=media:new23


Comments

International Year of the Periodic Table of Chemical Elements (IYPT 2019)

 

The United Nations General Assembly during its 74th Plenary Meeting proclaimed 2019 as the International Year of the Periodic Table of Chemical Elements (IYPT 2019) on 20 December 2017.

1869 is considered as the year of discovery of the Periodic System by Dmitri Mendeleev. 2019 will be the 150th anniversary of the Periodic Table of Chemical Elements and has therefore been proclaimed the "International Year of the Periodic Table of Chemical Elements (IYPT2019)" by the United Nations General Assembly and UNESCO.

The IYPT website gives details of events and you can find out more by looking for the hashtag #IYPT2019. Of particular note is Mendeleev 150: 4th International Conference on the Periodic Table endorsed by IUPAC.

simplePT

The periodic table has always been a popular source of apps, with a variety of mobile apps available. There are a couple that I would highlight.

The Elements in Action

The periodic table comes to life with 79 video explorations of the weird, wonderful, and sometimes alarming properties of the elements. Filmed by BAFTA award winner Max Whitby in partnership with Theodore Gray, author of the iconic book and app The Elements, and previously available only in a few museum installations, this is the most beautifully filmed collection of videos ever assembled to explore and explain what makes each element unique and fascinating.

There is a companion book The Elements: A Visual Exploration of Every Known Atom in the Universe that is also very popular.

The Periodic Table Project

To celebrate the International Year of Chemistry (IYC), Chem 13 News magazine together with the University of Waterloo's Department of Chemistry and the Faculty of Science encouraged chemistry educators and enthusiasts worldwide to adopt an element and artistically interpret that element. The project created a periodic table as a mosaic of science and art. The apps include the creative process behind each tile along with basic atomic properties of the element. The apps work to truly highlight the artistic expression of the Periodic Table Project.

Periodic Table

Created by the Royal Society of Chemistry. Did you know that neodymium is used in microphones? Or europium in Euro bank notes to help stop counterfeiting? These are just two of the absorbing facts in our free, user-friendly and customisable app, based on our popular and well-respected Royal Society of Chemistry Periodic Table website.

The RSC also created Top Trumps Chemistry, a card game to learn more about the elements.

There are also several online interactive periodic tables

Ptable

RSCperiodictable

ElementsTable

PeriodicTable

There are many events around the world registered on the IYPT website, and if you are organising something you can add them to the list.


Comments

2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry

 

In June 2018 the First RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry meeting was held in London. This proved to be enormously popular: there were more oral abstract and poster submissions than we had space for, and it was so over-subscribed that we could have filled a venue double the size.

Planning for the second meeting is now in full swing, and it will be held in Cambridge 2-3 September 2019.

Event : 2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry
Dates : Monday-Tuesday, 2nd to 3rd September 2019
Place : Fitzwilliam College, Cambridge, UK
Websites : Event website, and RSC website.

Twitter #AIChem19

aifirst_announcement-

Applications for both oral and poster presentations are welcomed. Posters will be displayed throughout the day and applicants are asked if they wish to provide a two-minute flash oral presentation when submitting their abstract. The closing dates for submissions are:

  • 31st March for oral and
  • 5th July for poster

Full details can be found on the Event website.


Comments

A new method for 3D printing

 

Whereas the usual 3D printing methods create a 3D object by printing it layer by layer, a recent publication in Science DOI described a method by which the objects were created by passing light through liquid acrylate. Computed Axial Lithography (CAL) allows generation of arbitrary geometries volumetrically through photopolymerization. The process was inspired by the 3D image construction used in CT scans.

The exposure process takes about two minutes for an object a few centimetres across; the team recreated a version of Auguste Rodin’s sculpture The Thinker a few centimetres tall.

The supplementary materials include details for printing several objects.

I've added it to the 3D printing page.


Comments

New release of MayaChemTools

 

A new release of MayaChemTools is now available. It comprises a fantastic collection of Perl and Python scripts, modules, and classes to support a variety of day-to-day computational discovery needs.

The core set of command line Perl scripts available in the current release of MayaChemTools has no external dependencies and provides functionality for the following tasks:

  • Manipulation and analysis of data in SD, CSV/TSV, sequence/alignments, and PDB files
  • Listing information about data in SD, CSV/TSV, Sequence/Alignments, PDB, and fingerprints files
  • Calculation of a key set of physicochemical properties, such as molecular weight, hydrogen bond donors and acceptors, logP, and topological polar surface area
  • Generation of 2D fingerprints corresponding to atom neighborhoods, atom types, E-state indices, extended connectivity, MACCS keys, path lengths, topological atom pairs, topological atom triplets, topological atom torsions, topological pharmacophore atom pairs, and topological pharmacophore atom triplets
  • Generation of 2D fingerprints with atom types corresponding to atomic invariants, DREIDING, E-state, functional class, MMFF94, SLogP, SYBYL, TPSA and UFF
  • Similarity searching and calculation of similarity matrices using available 2D fingerprints
  • Listing properties of elements in the periodic table, amino acids, and nucleic acids
  • Exporting data from relational database tables into text files

The command line Python scripts based on RDKit provide functionality for the following tasks:

  • Calculation of molecular descriptors and partial charges
  • Comparison of 3D molecules based on RMSD and shape
  • Conversion between different molecular file formats
  • Enumeration of compound libraries and stereoisomers
  • Filtering molecules using SMARTS, PAINS, and names of functional groups
  • Generation of graph and atomic molecular frameworks
  • Generation of images for molecules
  • Performing structure minimization and conformation generation based on distance geometry and forcefields
  • Performing R group decomposition
  • Picking and clustering molecules based on 2D fingerprints and various clustering methodologies
  • Removal of duplicate molecules and salts from molecules

The command line Python scripts based on PyMOL provide functionality for the following tasks:

  • Aligning macromolecules
  • Splitting macromolecules into chains and ligands
  • Listing information about macromolecules
  • Calculation of physicochemical properties
  • Comparison of macromolecules based on RMSD
  • Conversion between different ligand file formats
  • Mutating amino acids and nucleic acids
  • Generating Ramachandran plots
  • Visualizing X-ray electron density and cryo-EM density
  • Visualizing macromolecules in terms of chains, ligands, and ligand binding pockets
  • Visualizing cavities and pockets in macromolecules
  • Visualizing macromolecular interfaces
  • Visualizing surface and buried residues in macromolecules

Comments

Programming Languages for Chemical Information

 

This looks like it should be well worth bookmarking.

https://www.biomedcentral.com/collections/programming-languages

This thematic series comprises a set of invited papers, each one describing the use of a single language for the development of cheminformatics software that implements algorithms and analyses, and aims to cover a variety of language paradigms. The issue will be rolling, such that as papers on new languages are submitted they will be automatically added to this issue.

The first article DOI is by Kevin Theisen (of ChemDoodle fame) reviewing HTML5/JavaScript. Apparently there have been more lines of JavaScript written than all other programming languages combined, so it seems appropriate as a kick-off article.


Comments

Using the Python 3 library FPSim2 for similarity searches

 

FPSim2 is a new tool for fast similarity search on big compound datasets (>100 million) being developed at ChEMBL. It was developed as a Python3 library to support either in memory or out-of-core fast similarity searches on such dataset sizes.

It is built using RDKit and can be installed using conda. It requires Python 3.6 and a recent version of RDKit.

I've written a couple of Jupyter notebooks to demonstrate its use.

You can read the full tutorial here, and download the notebooks.
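For readers new to fingerprint similarity searching, the underlying calculation is simple. The sketch below is plain Python and does not use the FPSim2 API: it stores each fingerprint as a Python integer used as a bit string and does a linear scan, which is fine for illustration but nowhere near fast enough for the dataset sizes FPSim2 targets. The function names and the toy 8-bit fingerprints are assumptions for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints stored as Python ints."""
    common = bin(fp_a & fp_b).count("1")                            # bits set in both
    total = bin(fp_a).count("1") + bin(fp_b).count("1") - common    # bits set in either
    return common / total if total else 0.0


def search(query, database, threshold=0.7):
    """Return (id, similarity) pairs at or above threshold, best first."""
    hits = [(mol_id, tanimoto(query, fp)) for mol_id, fp in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy 8-bit fingerprints; real fingerprints are typically 1024 or 2048 bits
database = {"mol1": 0b11110000, "mol2": 0b11100001, "mol3": 0b00001111}
query = 0b11110000

print(search(query, database, threshold=0.5))
```

The Tanimoto coefficient is the number of bits set in both fingerprints divided by the number set in either, so identical fingerprints score 1.0 and fingerprints with no bits in common score 0.0.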






Comments

Easy Markdown updated

 

Easy Markdown has been updated to version 1.8.

Text written in Markdown is plain text that reads naturally to humans and translates automatically into correctly coded HTML web pages. In Easy Markdown the window is split into two parts: as you type plain text on the left, you see on the right the resulting web page as it will appear on the web.

There are many other Markdown editors detailed here.


Comments

A few thoughts on scientific software

 

When I wrote "A few thoughts on scientific software" I was somewhat surprised by the interest and amount of feedback I got. I've since added two more pages based on the feedback:

A listing of open-source cheminformatics toolkits and Open Source Python Data Science Libraries.

If you have any other suggestions feel free to let me know.


Comments

Chemical reactions from US patents (1976-Sep2016)

 

Great work by NextMove, an open, machine-readable, freely-reusable, annotated reaction data set, available for download here https://figshare.com/articles/ChemicalreactionsfromUSpatents1976-Sep2016/5104873

Reactions extracted by text-mining from United States patents published between 1976 and September 2016. The reactions are available as CML or reaction SMILES. Note that the reaction SMILES are derived from the CML.

Reaction SMILES

For convenience the reaction SMILES includes tab delimited columns for: PatentNumber, ParagraphNum, Year, TextMinedYield, CalculatedYield
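A line in this format can be split up with a few lines of Python. This is a minimal sketch under some assumptions: that the reaction SMILES is the first column, followed by the five columns listed above in that order, and that any annotation following a space after the reaction SMILES can be dropped. The `Reaction` class, the `parse_line` function and the sample line (including its patent number) are invented for illustration and are not taken from the dataset.

```python
from typing import NamedTuple, Optional


class Reaction(NamedTuple):
    reactants: list
    agents: list
    products: list
    patent_number: str
    paragraph_num: str
    year: int
    text_mined_yield: Optional[str]
    calculated_yield: Optional[str]


def parse_line(line):
    """Split one tab-delimited line into its reaction and metadata fields."""
    fields = line.rstrip("\n").split("\t")
    rxn, patent, para, year, tm_yield, calc_yield = fields[:6]
    rxn = rxn.split(" ")[0]  # assumption: drop any annotation after a space
    # Reaction SMILES have the form reactants>agents>products
    reactants, agents, products = (
        part.split(".") if part else [] for part in rxn.split(">")
    )
    return Reaction(reactants, agents, products, patent, para,
                    int(year), tm_yield or None, calc_yield or None)


# Hypothetical line: esterification of ethanol with acetic acid
sample = "CCO.CC(=O)O>OS(=O)(=O)O>CCOC(C)=O\tUS00000000\t0001\t2000\t85%\t"

print(parse_line(sample))
```

Splitting each part on "." treats every dot-separated component as a separate molecule, so salts and other multi-component species will be broken apart by this simple approach.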

Now that we have a large initial data set it would be great if others could contribute using the same format.

There is a fabulous detailed review of this invaluable resource on the Depth-First blog http://depth-first.com/articles/2019/01/28/the-nextmove-patent-reaction-dataset/


Comments

Alvascience

 

Just came across this.

Alvascience cheminformatics tools.

BMFpred is an easy-to-use software implementing the QSAR models described in “F. Grisoni, V.Consonni, M.Vighi (2018). Acceptable-by-design QSARs to predict the dietary biomagnification of organic chemicals in fish, Integrated Environmental Assessment and Management” to predict the laboratory-based fish Biomagnification Factor (BMF) of chemicals.

alvaDesc is the next generation tool for the calculation of a wide range of molecular descriptors and a number of molecular fingerprints. Specifically it calculates almost 4000 descriptors independent of 3-dimensional information, such as constitutional, topological and pharmacophore descriptors. It includes ETA and Atom-type E-state indices together with functional groups and fragment counts. Additionally, alvaDesc implements an extensive number of 3-dimensional descriptors such as 3D-autocorrelation, Weighted Holistic Invariant Molecular descriptors (WHIM) and GETAWAY.

alvaDesc_mac

Also available as a KNIME node, the alvaDesc KNIME Plugin contains three KNIME nodes:

  • Descriptor: calculates molecular descriptors
  • Fingerprint: calculates molecular fingerprints
  • Molecule Reader: reads standard molecule files and can be used as a source for the other two nodes (which are also compatible with KNIME standard molecule nodes)

Comments

Fortran

 

Just looking at the Archer application usage over the last month, lots of materials modelling codes, electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.

Screenshot 2019-01-23 at 16.10.35

ARCHER is the latest UK National Supercomputing Service. The ARCHER Service started in November 2013 and is presently expected to run till November 2019. ARCHER provides a capability resource to allow researchers to run simulations and calculations that require large numbers of processing cores working in a tightly-coupled, parallel fashion.

It is also clear that Fortran still dominates.

Screenshot 2019-01-23 at 16.11.19

As I've said many times I'm not a big Fortran user but the Fortran on a Mac page is the most accessed page on the site.


Comments

Comparison of bioactivity predictions

 

Small molecules can potentially bind to a variety of biomolecular targets, and whilst counter-screening against a wide variety of targets is feasible, it can be rather expensive and is probably only realistic once a compound has been identified as being of particular interest. For this reason there is considerable interest in building computational models to predict potential interactions. With the advent of large data sets of well annotated biological activity such as ChEMBL and BindingDB this has become possible.

ChEMBL 24 contains 15,207,914 activity data points on 12,091 targets and 2,275,906 compounds; BindingDB contains 1,454,892 binding data points for 7,082 protein targets and 652,068 small molecules.

These predictions may aid understanding of the molecular mechanisms underlying a molecule's bioactivity and help predict potential side effects or cross-reactivity.

Whilst there are a number of sites that can be used to predict bioactivity data, I'm going to compare one site, Polypharmacology Browser 2 (PPB2) http://ppb2.gdb.tools, with two tools that can be downloaded to run the predictions locally: one based on models built using ChEMBL data and distributed as Jupyter notebooks by the ChEMBL group https://github.com/madgpap/notebooks/blob/master/targetpred21_demo.ipynb, and a more recent random forest model, PIDGIN. If you are using proprietary molecules it is unwise to use the online tools.

Read the article here

Comments

CrystalMaker 10.4

 

The much-awaited CrystalMaker 10.4 is now shipping - complete with full “Dark Mode”.

CrystalMaker and CrystalDiffract are real 64-bit Mac programs, written in Cocoa/Objective-C, with beautiful real Mac interfaces. They’re not Windows or Unix applications reskinned via Qt, Java or wxWidgets; they’re the real deal: 100% pure Mac. Thus they are able to offer:

  • Retina display
  • Multi-touch
  • Force touch
  • Haptic feedback
  • Touch bar interface (MacBook Pro)
  • Dark Mode
  • Full-screen mode and Spaces
  • Quick Look
  • Finder thumbnails
  • QuickTime video
  • Apple Help
  • Code-signed, sandboxed, with “hardened runtime” for maximum security

CrystalMaker 10.4 has over 60 new features, of which the most-important are probably:-

  1. Dark Mode
  2. Sleek new structures library with integrated CrystalViewer (1,100 structures; fully customizable)
  3. New energy-modelling engine: makes designing your own molecules quick and easy, with vibrational spectra simulation.
  4. Live powder diffraction: link CrystalMaker 10.4 with CrystalDiffract 6.8 so that editing a crystal in CrystalMaker automatically updates its simulated powder diffraction pattern in CrystalDiffract.
  5. Interpolate Structures command - makes animating structural behaviour smooth and seamless.
  6. Customizable Atoms Inspector and coordinates display.
  7. Spring-loaded sidebars: move the mouse to the edge of the screen to show the relevant sidebar (works great in full-screen mode).
  8. Powerful video sizing/compression options.
  9. Fat sticks display option (great for emphasising structural channels).
  10. Advanced control over axial vectors with scaling, fonts, positioning, inset etc.

More details are available from the download page http://crystalmaker.com/crystalmaker/

They also have a set of video tutorials available.

CrystalMaker X lets you import data from over 40 different formats, with instant display and powerful customization. CrystalMaker can handle multi-structure files such as DL_POLY HISTORY: use CrystalMaker's synchronization and animation capabilities to rapidly understand structural behaviour, lattice dynamics, or visualize the trajectory of a simulation. CrystalMaker X can also handle truly massive structures. Take advantage of our unique "Depth Profiling" tool to rapidly scan areas of interest in massive structures, ideal for characterizing the results from computer models.



Comments

TS Calc: The mathematical equations tool

 

TS Calc is a document-based application whose documents can be created and used as calculation models for specific mathematical and technical problems. It is a completely different approach to solving math problems compared with the usual one of using spreadsheets.



Comments

Optimizing colormaps with consideration for color vision deficiency to enable accurate interpretation of scientific data

 

Around 4% of the population suffer from colour blindness in one form or another, with red/green colour blindness being the most common. Sadly, in many plots, graphs and presentations little effort is made to make things easier for people with colour blindness.

Color blindness, also known as color vision deficiency (CVD), is the decreased ability to see color or differences in color. Simple tasks such as selecting ripe fruit, choosing clothing, and reading traffic lights can be more challenging. Color blindness may also make some educational activities more difficult.

A recent publication seeks to address this need, Optimizing colormaps with consideration for color vision deficiency to enable accurate interpretation of scientific data DOI

While there have been some attempts to make aesthetically pleasing or subjectively tolerable colormaps for those with CVD, our goal was to make optimized colormaps for the most accurate perception of scientific data by as many viewers as possible. We developed a Python module, cmaputil, to create CVD-optimized colormaps, which imports colormaps and modifies them to be perceptually uniform in CVD-safe colorspace while linearizing and maximizing the brightness range. The module is made available to the science community to enable others to easily create their own CVD-optimized colormaps.

journal.pone.0199239.g001


Comments

LICHEM: Layered Interacting CHEmical Models

 

An update to LICHEM: Layered Interacting CHEmical Models has been published DOI

LICHEM is an open-source (GPLv3) interface between QM and MM software so that QM/MM calculations can be performed with polarizable and frozen electron density force fields. Functionality is also present for standard point-charge based force fields, pure MM, and pure QM calculations.

Available from GitHub https://github.com/CisnerosResearch/LICHEM.

Note: on OS X machines, the SEDI, TEX, BIB, and CXXFLAGS variables will need to be modified.

Comments

Workshop on Computational Tools for Drug Discovery

 

Registration opened just before Christmas and apparently a number of people signed up over the festive period. Remember there are a limited number of places and it is first come, first served.

Registration and full details are here.

Computational Tools Flyer

This workshop is intended to provide expert tutorials to get you started and show what can be achieved with the software.

Comments

GROMACS 2019 released

 

I just noticed GROMACS 2019 was released on Dec 31 2018.

GROMACS http://www.gromacs.org is one of the major software packages for the simulation of biological macromolecules. It is aimed at performing the simulation of large, biologically relevant systems, with a focus on both being efficient and flexible to allow the research of a number of different systems

Several important performance improvements:

  • Simulations now automatically run using update groups of atoms whose coordinate updates have only intra-group dependencies. These can include both constraints and virtual sites. This improves performance by eliminating overheads during the update, at no cost.
  • Intel integrated GPUs are now supported with OpenCL for offloading non-bonded interactions.
  • PME long-ranged interactions can now also run on a single AMD GPU using OpenCL, which means many fewer CPU cores are needed for good performance with such hardware.
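
The idea behind update groups can be illustrated with a small sketch: atoms linked by constraints are collected into groups, so that each group's coordinate update depends only on atoms inside the group. This is a simplified illustration using a union-find structure, not GROMACS's own implementation, which is not reproduced here; the function name and the toy system (two three-atom molecules plus a lone ion) are hypothetical, and virtual sites are left out.

```python
def update_groups(n_atoms, constraints):
    """Group atoms so that every constraint lies inside a single group.

    n_atoms     -- number of atoms, indexed 0 .. n_atoms - 1
    constraints -- iterable of (i, j) atom index pairs joined by a constraint
    Returns a sorted list of groups, each a sorted list of atom indices.
    """
    parent = list(range(n_atoms))

    def find(i):
        # Walk up to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in constraints:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lower index as the root of the merged group.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for atom in range(n_atoms):
        groups.setdefault(find(atom), []).append(atom)
    return sorted(groups.values())


# Toy system: atoms 0-2 and 3-5 are constrained molecules, atom 6 is free.
toy_constraints = [(0, 1), (0, 2), (3, 4), (3, 5)]
print(update_groups(7, toy_constraints))  # -> [[0, 1, 2], [3, 4, 5], [6]]
```

Because no constraint crosses a group boundary, each group could in principle be updated without exchanging data with any other group, which is where the saving in overhead comes from.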

Release notes here



Macs in Chemistry Annual Site Review

 

At the end of each year I have a look at the website analytics to see which items were the most popular.

Over the year there were 70,000 unique visitors, with 25% visiting the site on multiple occasions. The US provided 30% of the visitors and the UK 7%, with Germany, India and France each around 5%. As you might expect the majority were Mac users (56%), but there were also Windows (25%), iOS (12%), and Android (2.5%) users.

Of the Mac users, 51% are now using 10.14 (Mojave), 27% 10.13 with all older versions each well below 10%.

Chrome and Safari were the preferred browsers with both around 40%.

The most popular web pages were (other than the main page)

The popularity of the Fortran on a Mac page has continued for several years now and it has been updated several times with user provided information.

The most viewed blog pages in 2018 were

The update to Mojave seems to have been another fairly smooth transition, with most software developers reporting no major issues.

A couple of recent additions have generated significant interest.

As has an article I wrote on my thoughts on scientific software


A Jupyter Kernel for Swift

 

I'm constantly impressed by the expansion of Jupyter; it is rapidly becoming the first-choice platform for interactive computing.

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

The latest expansion is a Jupyter Kernel for Swift, intended to make it possible to use Jupyter with the Swift for TensorFlow project.

Swift for TensorFlow is a new way to develop machine learning models. It gives you the power of TensorFlow directly integrated into the Swift programming language. With Swift, you can write the following imperative code, and Swift automatically turns it into a single TensorFlow Graph and runs it with the full performance of TensorFlow Sessions on CPU, GPU and TPU.

Requires macOS 10.13.5 or later, with Xcode 10.0 beta or later.



The International Year of the Periodic Table

 

2019 is the International Year of the Periodic Table.

1869 is considered as the year of discovery of the Periodic System by Dmitri Mendeleev. 2019 will be the 150th anniversary of the Periodic Table of Chemical Elements and has therefore been proclaimed the "International Year of the Periodic Table of Chemical Elements (IYPT2019)" by the United Nations General Assembly and UNESCO.

If you want to brush up on the table there are a couple of mobile apps to help you.

Periodic Table
The Elements in Action
The Periodic Table Project

You can follow events for the year on Twitter: https://twitter.com/iypt2019
