Macs in Chemistry

Insanely Great Science

WebMolKit: switched to Apache 2.0

Just saw this.

WebMolKit is a cheminformatics library that I’ve been working on for a long time: it runs on all kinds of JavaScript engines (browsers, desktop via Electron, command line via NodeJS). Its flagship feature is a powerful chemical sketcher, but it also has many supporting functions for handling molecules. As of now, the licensing terms have been switched to Apache 2.0, which basically means you are allowed to use it for non-open projects, as long as proper credit is given.

I've updated the Open Source Cheminformatics Toolkits page.


Beta of RDKit 2020.03 released


The beta of the 2020.03 RDKit release is now available on GitHub.

Backwards incompatible changes:

  • Searches for equal molecules (i.e. mol1 @= mol2) in the PostgreSQL cartridge now use the do_chiral_sss option. So if do_chiral_sss is false (the default), the molecules CC(F)Cl and C[C@H](F)Cl will be considered to be equal. Previously these molecules were always considered to be different.
  • Attempting to create a MolSupplier from a filename pointing to an empty file, a file that does not exist, or something that is not a standard file (e.g. a directory) now generates an exception.
  • The cmake option RDK_OPTIMIZE_NATIVE has been renamed to RDK_OPTIMIZE_POPCNT.
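
Since supplier creation now fails loudly in those three cases, scripts that loop over many input files may want to check each path first and report a clearer message. The following is only a minimal sketch using the Python standard library; the function name check_supplier_input is hypothetical and is not part of RDKit, and the error types are my own choice rather than whatever RDKit itself raises.

```python
import os


def check_supplier_input(path):
    """Hypothetical pre-flight check mirroring the three failure cases above.

    Returns the path unchanged if it looks usable, otherwise raises.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"no such file: {path}")
    if not os.path.isfile(path):
        # Covers directories and other things that are not standard files.
        raise ValueError(f"not a regular file: {path}")
    if os.path.getsize(path) == 0:
        raise ValueError(f"file is empty: {path}")
    return path
```

A caller would pass the returned path on to whichever supplier it uses, and catch FileNotFoundError or ValueError to skip bad inputs.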


Highlights:

  • The drawings generated by the MolDraw2D objects are now significantly improved and can include simple atom and bond annotations
  • An initial implementation of a modified scaffold network algorithm is now available
  • A few new descriptor/fingerprint types are available - BCUTs, Morse atom fingerprints, Coulomb matrices, and MHFP and SECFP fingerprints

Plus lots of bug fixes.


OpenEye Toolkits v2019.Oct


OpenEye have announced the release of OpenEye Toolkits v2019.Oct. These libraries include the usual support for C++, C#, Java, and Python.


  • Spruce TK, a new toolkit for preparing biomolecular structures for modeling applications, is now available in both C++ and Python APIs. Full details of the methodology are here.
  • OEChem TK now provides improved substructure search capability, allowing users to search tens of millions of molecules in seconds.
  • SMIRNOFF, a small molecule force field from the Open Force Field Initiative, is now integrated into OEFF TK. The force field can handle almost all pharmaceutically relevant chemical space.

Full details are in the release notes.

This is the last release to support macOS 10.12.


RDKit 2019_09_1 (Q3 2019) Release


A new version of RDKit has been released.


  • The substructure matching code is now about 30% faster. This also improves the speed of reaction matching and the FMCS code.
  • A minimal JavaScript wrapper has been added as part of the core release.
  • It's now possible to get information about why molecule sanitization failed.
  • A flexible new molecular hashing scheme has been added.
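
To give a feel for the last two items, here is a short sketch using the RDKit Python API (Chem.DetectChemistryProblems and the rdMolHash module). It assumes RDKit 2019.09 or later is installed; the import is guarded so the file still loads without it. The helper names and the example SMILES are my own choices for illustration.

```python
# Assumes RDKit >= 2019.09; the guard lets this file load even without RDKit.
try:
    from rdkit import Chem
    from rdkit.Chem import rdMolHash
    HAVE_RDKIT = True
except ImportError:
    HAVE_RDKIT = False


def sanitization_problems(smiles):
    """Return (problem type, message) pairs for a SMILES parsed unsanitized."""
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    return [(p.GetType(), p.Message()) for p in Chem.DetectChemistryProblems(mol)]


def same_anonymous_graph(smiles_a, smiles_b):
    """True if both molecules hash to the same element- and bond-order-free graph."""
    hashes = []
    for smi in (smiles_a, smiles_b):
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            raise ValueError(f"could not parse SMILES: {smi}")
        hashes.append(rdMolHash.MolHash(mol, rdMolHash.HashFunction.AnonymousGraph))
    return hashes[0] == hashes[1]


if __name__ == "__main__" and HAVE_RDKIT:
    # A nitrogen with four neutral substituents fails the valence check.
    for kind, message in sanitization_problems("CN(C)(C)C"):
        print(kind, "-", message)
    # Phenol and cyclohexylamine: a six-membered ring with one substituent.
    print(same_anonymous_graph("c1ccccc1O", "C1CCCCC1N"))
```

AnonymousGraph is only one of the available hash functions; others in rdMolHash.HashFunction keep more or less chemical detail, which is what makes the scheme flexible.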

There are, however, a number of backward-incompatible changes detailed in the documents.

Also, the old MolHash code should be considered deprecated. This release introduces a more flexible alternative.

Binaries have been uploaded. The available conda binaries for this release are:

  • Linux 64bit: python 3.6, 3.7
  • Mac OS 64bit: python 3.6, 3.7
  • Windows 64bit: python 3.6, 3.7

Some things that will be finished over the next couple of days:

  • The conda build scripts will be updated to reflect the new version
  • The homebrew script


CGRtools: Python Library for Molecule, Reaction and Condensed Graph of Reaction Processing


CGRtools is a set of tools for processing reactions based on the Condensed Graph of Reaction (CGR) approach. Details are on GitHub, and the work has been published in JCIM.

Basic operations:

  • Read, write, and convert the MDL .RDF and .SDF, SMILES, and .MRV formats.
  • Standardize reactions and check structures for validity.
  • Produce CGRs.
  • Perform subgraph search.
  • Build and correct molecules and reactions.
  • Produce template-based reactions.

The stable version is available through PyPI:

pip install CGRTools

Install the development version of CGRtools for features that are not yet well tested:

pip install -U git+

There is also a tutorial using Jupyter notebooks.


The Chemfp Project


The Chemfp project started as a way to promote the FPS format for cheminformatics fingerprint exchange and has evolved into a set of command-line tools and a Python library for fingerprint generation and high-performance similarity search. The 10 years of work and research results of the chemfp project have now been described in an excellent publication.

I looked at Chemfp when comparing various options for clustering large datasets. It was one of the highest performing, and Andrew Dalke was very responsive to questions.
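
For anyone who has not met the FPS format, the idea is simple: header lines start with "#", and each data line is a hex-encoded fingerprint, a tab, and an identifier. The sketch below is plain Python and does not use chemfp's own API (which is far faster); the function names and the tiny 16-bit sample fingerprints are made up for illustration.

```python
def parse_fps(text):
    """Yield (identifier, fingerprint as int) pairs from FPS-formatted text."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        hex_fp, identifier = line.split("\t")[:2]
        yield identifier, int(hex_fp, 16)


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints stored as integers."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return bin(a & b).count("1") / union


def search(query, targets, threshold):
    """Return (identifier, score) for targets scoring at least threshold, best first."""
    hits = [(ident, tanimoto(query, fp)) for ident, fp in targets]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


# Made-up sample data: three 16-bit fingerprints.
SAMPLE = "#FPS1\n#num_bits=16\n0f00\tmolA\n0700\tmolB\nf000\tmolC\n"

if __name__ == "__main__":
    targets = list(parse_fps(SAMPLE))
    query = targets[0][1]
    print(search(query, targets, threshold=0.5))
```

This brute-force loop is fine for a few thousand fingerprints; the point of chemfp is that it does the same job efficiently at a much larger scale.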


Open Source Python Data Science Libraries


When I wrote the article entitled A few thoughts on scientific software, one of the responses I got was that people did not know about the existence of open-source chemistry toolkits, so I thought I'd publish a page that would hopefully stop people reinventing the wheel. Here are a few open-source cheminformatics toolkits that I'm aware of.

As a follow-up, I thought I'd put together a list of useful Python libraries for data science.

As always, I'm happy to hear comments or suggestions for additions.


OpenEye Toolkits v2018.Oct released


OpenEye have announced the release of OpenEye Toolkits v2018.Oct. These libraries include the usual support for C++, Python, C#, and Java.

Highlights:

  • Omega TK now includes a method specifically tuned to sample macrocyclic conformational space.
  • FastROCS TK is now available in C++ and Java.
  • Quacpac TK includes improvements to the tautomer functionality.

Full details are in the Release notes.