Macs in Chemistry

Insanely great science


Jupyter notebook to look at molecular similarity

I was recently asked for a tool to compare each molecule in a list with every other molecule in the same list. I suspect there are commercial tools that do this, but for small numbers of compounds the similarities are easy to calculate and visualise in a Jupyter notebook using RDKit.

The RDKit has a variety of built-in functionality for generating molecular fingerprints and using them to calculate molecular similarity. Morgan fingerprints, better known as circular fingerprints, are built by applying the Morgan algorithm to a set of user-supplied atom invariants. The generated fingerprints are then compared using the Dice similarity metric, which ranges from 0 (no features in common) to 1 (identical fingerprints).
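As a rough sketch of the fingerprint and comparison step (not necessarily the code used in the notebook), the two example molecules from the sample input can be compared as follows. The radius of 2 and the 2048-bit fingerprint size are assumed, commonly used settings rather than values taken from the notebook.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# SMILES strings taken from the sample input rows in this post.
smiles = {
    "OSA_000001": "CN1CCN(CC1)c1ccc(cc1)C#N",
    "OSA_000002": "CN1CCN(CC1)C(=O)NC1=CC=C(F)C=C1",
}

# Assumption: radius 2 and 2048 bits are common choices, not the notebook's settings.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)

fingerprints = {}
for mol_id, smi in smiles.items():
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        # MolFromSmiles returns None when the SMILES cannot be parsed.
        raise ValueError(f"Could not parse SMILES for {mol_id}: {smi}")
    fingerprints[mol_id] = generator.GetFingerprint(mol)

# Dice similarity between the two different molecules.
similarity = DataStructs.DiceSimilarity(
    fingerprints["OSA_000001"], fingerprints["OSA_000002"]
)
# Comparing a fingerprint with itself gives the maximum score.
self_similarity = DataStructs.DiceSimilarity(
    fingerprints["OSA_000001"], fingerprints["OSA_000001"]
)

print(f"OSA_000001 vs OSA_000002: {similarity:.3f}")
print(f"OSA_000001 vs itself:     {self_similarity:.3f}")
```

Both molecules contain a methylpiperazine ring, so the score should fall somewhere between 0 and 1 rather than at either extreme.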

The input file is tab-separated text with a header row, for example:

Mol_ID      SMILES_parent                    Name
OSA_000001  CN1CCN(CC1)c1ccc(cc1)C#N         4-(4-methylpiperazin-1-yl)benzonitrile
OSA_000002  CN1CCN(CC1)C(=O)NC1=CC=C(F)C=C1  N-(4-fluorophenyl)-4-methylpiperazine-1-carboxamide
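A file in this layout can be read with nothing more than the Python standard library. The sketch below parses the two sample rows from an in-memory string so that it runs on its own; the function name `read_molecule_table` is just an illustrative choice, and the notebook itself may well load the file differently.

```python
import csv
import io

# The two sample rows from this post, with real tab characters between columns.
sample_text = (
    "Mol_ID\tSMILES_parent\tName\n"
    "OSA_000001\tCN1CCN(CC1)c1ccc(cc1)C#N\t4-(4-methylpiperazin-1-yl)benzonitrile\n"
    "OSA_000002\tCN1CCN(CC1)C(=O)NC1=CC=C(F)C=C1\t"
    "N-(4-fluorophenyl)-4-methylpiperazine-1-carboxamide\n"
)


def read_molecule_table(handle):
    """Return one dict per data row, keyed by the column names in the header."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


records = read_molecule_table(io.StringIO(sample_text))
for record in records:
    print(record["Mol_ID"], record["SMILES_parent"])
```

For a file on disk, the same function can be given an open file handle instead, for instance one obtained with `open(path, newline="")`, where `path` is whatever your own file is called.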

The results are displayed as a colour-coded matrix, as shown below.
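The matrix itself is just every item scored against every other item. Here is a minimal sketch of that all-against-all step in plain Python, using small sets of integers as toy stand-ins for fingerprint on-bits; the helper names `dice` and `similarity_matrix` are illustrative and are not taken from the notebook.

```python
def dice(a, b):
    """Dice similarity of two sets: 2 * |A intersect B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # assumed convention for two empty sets
    return 2.0 * len(a & b) / (len(a) + len(b))


def similarity_matrix(items, similarity):
    """Build a symmetric matrix of pairwise scores as a list of lists."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = similarity(items[i], items[j])
            # Dice is symmetric, so each pair is scored once and mirrored.
            matrix[i][j] = score
            matrix[j][i] = score
    return matrix


# Toy "fingerprints": each set holds the positions of the bits that are on.
toy_fingerprints = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8}]
matrix = similarity_matrix(toy_fingerprints, dice)

for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
# prints:
# 1.00 0.50 0.00
# 0.50 1.00 0.00
# 0.00 0.00 1.00
```

With real data, the list would hold RDKit fingerprints and the similarity function would be `DataStructs.DiceSimilarity`. One way to colour the result in a notebook is to put the matrix in a pandas DataFrame and apply `style.background_gradient()`, though that is only one option and may not be how the notebook does it.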


You can view the whole notebook here

or download the notebook and example file here MolSimNotebook

Last Updated 30 July 2019