Jupyter notebook to look at molecular similarity
I was recently asked for a tool to compare each molecule in a list with every other molecule in the same list. I suspect there are commercial tools to do this, but for small numbers of compounds the results are easy to visualise in a Jupyter notebook using RDKit.
RDKit has a variety of built-in functions for generating molecular fingerprints and using them to calculate molecular similarity. Morgan fingerprints, better known as circular fingerprints, are built by applying the Morgan algorithm to a set of user-supplied atom invariants. The generated fingerprints are then compared using the Dice similarity metric.
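As a rough illustration of the fingerprint and comparison steps described above, here is a minimal sketch rather than the notebook's actual code. It uses the two example SMILES from the input file in this post. The radius of 2, the 2048-bit fingerprint size, and the use of bit-vector fingerprints are assumptions, as is the helper name `fingerprint`.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# Example molecules taken from the input file shown in this post.
smiles = {
    "OSA_000001": "CN1CCN(CC1)c1ccc(cc1)C#N",
    "OSA_000002": "CN1CCN(CC1)C(=O)NC1=CC=C(F)C=C1",
}

# Assumed settings: radius 2 and 2048 bits are common choices,
# not necessarily the ones used in the notebook.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)


def fingerprint(smi):
    """Hypothetical helper: parse a SMILES string and return its Morgan bit vector."""
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smi}")
    return generator.GetFingerprint(mol)


ids = list(smiles)
fps = [fingerprint(smiles[mol_id]) for mol_id in ids]

# All-against-all Dice similarity matrix (symmetric, 1.0 on the diagonal).
matrix = [[DataStructs.DiceSimilarity(a, b) for b in fps] for a in fps]

for mol_id, row in zip(ids, matrix):
    print(mol_id, " ".join(f"{value:.2f}" for value in row))
```

For a list of N molecules this computes N x N comparisons, which is fine for the small compound sets discussed here. For larger sets, `DataStructs.BulkDiceSimilarity` compares one fingerprint against a whole list in a single call.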
The input file is tab-separated text with three columns: a molecule identifier, the SMILES string, and the compound name.
Mol_ID	SMILES_parent	Name
OSA_000001	CN1CCN(CC1)c1ccc(cc1)C#N	4-(4-methylpiperazin-1-yl)benzonitrile
OSA_000002	CN1CCN(CC1)C(=O)NC1=CC=C(F)C=C1	N-(4-fluorophenyl)-4-methylpiperazine-1-carboxamide
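A file in this format can be read with Python's standard `csv` module. The sketch below is one possible approach, not necessarily what the notebook does. The function name `read_molecules` is hypothetical, and the in-memory `sample` string stands in for the real file so the example is self-contained.

```python
import csv
import io

# Stand-in for the input file; in practice, pass an open file handle instead.
sample = (
    "Mol_ID\tSMILES_parent\tName\n"
    "OSA_000001\tCN1CCN(CC1)c1ccc(cc1)C#N\t"
    "4-(4-methylpiperazin-1-yl)benzonitrile\n"
    "OSA_000002\tCN1CCN(CC1)C(=O)NC1=CC=C(F)C=C1\t"
    "N-(4-fluorophenyl)-4-methylpiperazine-1-carboxamide\n"
)


def read_molecules(handle):
    """Hypothetical helper: return (Mol_ID, SMILES, Name) tuples from a tab-separated file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["Mol_ID"], row["SMILES_parent"], row["Name"]) for row in reader]


molecules = read_molecules(io.StringIO(sample))
for mol_id, smi, name in molecules:
    print(mol_id, smi, name)
```

With a real file, the same function can be called as `read_molecules(open(path, newline=""))`, where `path` points to the tab-separated input.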
The results are displayed as a colour-coded matrix, as shown below.
You can view the whole notebook here
or download the notebook and example file here MolSimNotebook
Last Updated 30 July 2019