Macs in Chemistry

Insanely Great Science

The beta of the 2020.03 RDKit released

 

The beta of the 2020.03 RDKit is now available on GitHub https://github.com/rdkit/rdkit/releases/tag/Release_2020_03_1b1.

Backwards incompatible changes:

  • Searches for equal molecules (i.e. mol1 @= mol2) in the PostgreSQL cartridge now use the do_chiral_sss option. So if do_chiral_sss is false (the default), the molecules CC(F)Cl and C[C@H](F)Cl will be considered to be equal. Previously these molecules were always considered to be different.
  • Attempting to create a MolSupplier from a filename pointing to an empty file, a file that does not exist, or something that is not a standard file (e.g. a directory) now generates an exception.
  • The cmake option RDK_OPTIMIZE_NATIVE has been renamed to RDK_OPTIMIZE_POPCNT.
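
Scripts that build a MolSupplier from user-supplied paths may want to check the path first so that the error message is specific. The sketch below uses only the Python standard library to test for the three cases listed above. The helper names (check_supplier_path, describe) are hypothetical and not part of RDKit, and the sketch does not claim to reproduce RDKit's own checks or exception types.

```python
import os
import tempfile


def check_supplier_path(path):
    """Hypothetical pre-flight check; not part of RDKit.

    Raises OSError for the three cases named in the release notes:
    a missing file, a non-regular file (e.g. a directory), or an empty file.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"no such file: {path}")
    if not os.path.isfile(path):
        raise OSError(f"not a regular file: {path}")
    if os.path.getsize(path) == 0:
        raise OSError(f"file is empty: {path}")
    return path


def describe(path):
    """Return 'ok' or the error message, for demonstration."""
    try:
        check_supplier_path(path)
        return "ok"
    except OSError as exc:
        return str(exc)


# Exercise the three failure cases inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty.sdf")
    open(empty, "w").close()
    results = {
        "missing": describe(os.path.join(tmp, "missing.sdf")),
        "directory": describe(tmp),
        "empty": describe(empty),
    }
```

Calling check_supplier_path before constructing the supplier keeps the three failure modes distinguishable in your own logs, whatever exception the toolkit itself raises.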

Highlights:

  • The drawings generated by the MolDraw2D objects are now significantly improved and can include simple atom and bond annotations
  • An initial implementation of a modified scaffold network algorithm is now available
  • A few new descriptor/fingerprint types are available - BCUTs, Morse atom fingerprints, Coulomb matrices, and MHFP and SECFP fingerprints

Plus lots of bug fixes.




OpenEye Toolkits v2019.Oct

 

OpenEye have announced the release of OpenEye Toolkits v2019.Oct. These libraries include the usual support for C++, C#, Java, and Python.

HIGHLIGHTS

  • Spruce TK, a new toolkit for preparing biomolecular structures for modeling applications, is now available in both C++ and Python APIs. Full details of the methodology are here.
  • OEChem TK now provides improved substructure search capability, allowing users to search tens of millions of molecules in seconds.
  • SMIRNOFF, a small molecule force field from the Open Force Field Initiative, is now integrated into OEFF TK. The force field can handle almost all pharmaceutically relevant chemical space.

Full details are in the release notes.

This is the last release to support macOS 10.12.


RDKit 2019_09_1 (Q3 2019) Release

 

A new version of RDKit has been released https://github.com/rdkit/rdkit/releases/tag/Release_2019_09_1.

Highlights:

  • The substructure matching code is now about 30% faster. This also improves the speed of reaction matching and the FMCS code.
  • A minimal JavaScript wrapper has been added as part of the core release.
  • It's now possible to get information about why molecule sanitization failed.
  • A flexible new molecular hashing scheme has been added.

There are, however, a number of backward-incompatible changes detailed in the release documentation.

The old MolHash code should also be considered deprecated; this release introduces a more flexible alternative.

Binaries have been uploaded to anaconda.org (https://anaconda.org/rdkit). The available conda binaries for this release are:

  • Linux 64bit: python 3.6, 3.7
  • Mac OS 64bit: python 3.6, 3.7
  • Windows 64bit: python 3.6, 3.7

Some things that will be finished over the next couple of days:

  • The conda build scripts will be updated to reflect the new version
  • The homebrew script


CGRtools: Python Library for Molecule, Reaction and Condensed Graph of Reaction Processing

 

CGRtools is a set of tools for processing reactions based on the Condensed Graph of Reaction (CGR) approach. Details are on GitHub https://github.com/cimm-kzn/CGRtools. Published in JCIM DOI

Basic operations:

  • Read/write/convert formats: MDL .RDF and .SDF, SMILES, .MRV.
  • Standardize reactions and check the validity of structures.
  • Produce CGRs.
  • Perform subgraph search.
  • Build/correct molecules and reactions.
  • Produce template-based reactions.

The stable version is available through PyPI:

pip install CGRTools

For features that are not yet well tested, install the development version of the CGRtools library:

pip install -U git+https://github.com/cimm-kzn/CGRtools.git@master#egg=CGRtools

There is also a Jupyter notebook tutorial at https://github.com/cimm-kzn/CGRtools/tree/master/tutorial



The Chemfp Project

 

The Chemfp project started as a way to promote the FPS format for cheminformatics fingerprint exchange and has evolved into a set of command-line tools and a Python library for fingerprint generation and high-performance similarity search. The 10 years of work and research results of the chemfp project have now been described in an excellent publication.

I looked at Chemfp when comparing various options for clustering large datasets. It was one of the highest performing, and Andrew Dalke was very responsive to questions.
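
To give a feel for what a fingerprint exchange format and a similarity search involve, here is a minimal pure-Python sketch. It assumes FPS-style records (a hex-encoded fingerprint, a tab, then an identifier, with header lines starting with "#") and computes Tanimoto similarity by counting bits. The function names and the 16-bit fingerprints are made up for illustration. This is not chemfp's API, and it is nowhere near as fast as chemfp's optimized search.

```python
def parse_fps(text):
    """Parse FPS-style records: hex fingerprint, a tab, then the identifier.

    Lines starting with '#' are header/metadata lines and are skipped.
    """
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        hex_fp, identifier = line.split("\t")[:2]
        records.append((identifier, bytes.fromhex(hex_fp)))
    return records


def tanimoto(fp1, fp2):
    """Tanimoto similarity between two equal-length byte fingerprints."""
    a = int.from_bytes(fp1, "big")
    b = int.from_bytes(fp2, "big")
    union = bin(a | b).count("1")
    if union == 0:
        # Convention used here: two empty fingerprints score 0.0.
        return 0.0
    return bin(a & b).count("1") / union


# Made-up 16-bit fingerprints, for illustration only.
sample = "#FPS1\n#num_bits=16\nf00f\tmol_a\nf0ff\tmol_b\n0000\tmol_c\n"
records = parse_fps(sample)
query_id, query_fp = records[0]

# Score every other record against the first one.
scores = {name: tanimoto(query_fp, fp) for name, fp in records[1:]}
```

Here mol_a shares 8 set bits with mol_b out of 12 in their union, so the score is 8/12, while the all-zero mol_c scores 0.0.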



Open Source Python Data Science Libraries

 

When I wrote the article entitled A few thoughts on scientific software, one of the responses I got was that people did not know about the existence of open-source chemistry toolkits, so I thought I'd publish a page that will hopefully stop people reinventing the wheel. Here are a few open-source cheminformatics toolkits that I'm aware of.

As a follow-up, I thought I'd put together a list of useful Python libraries for data science.

As always, I'm happy to hear comments or suggestions for additions.




OpenEye Toolkits v2018.Oct released

 

OpenEye have announced the release of OpenEye Toolkits v2018.Oct. These libraries include the usual support for C++, Python, C#, and Java.

HIGHLIGHTS

  • Omega TK now includes a method specifically tuned to sample macrocyclic conformational space.
  • FastROCS TK is now available in C++ and Java.
  • Quacpac TK includes improvements to the tautomer functionality.

Full details are in the Release notes.

