Cheminformatics on a Mac
I’ve recently needed to set up a new Mac and I realised that the current installation process for all the applications, tools, chemistry toolboxes, and associated dependencies was unmanageable. I have a mixture of apps: some I compiled myself, some installed from precompiled binaries, others from MacPorts, etc.
So I decided to start over and try Homebrew.
Installation of Homebrew
Homebrew is a package manager for Mac OS X that installs packages in its own directory and then symlinks the files into /usr/local. The reason I went with Homebrew rather than MacPorts is that I found MacPorts occasionally overwrote existing files. Homebrew instead warns you of any clashes and allows you to decide which version to keep. If you need to remove MacPorts there is a detailed guide. You may also need to update your Bash profile.
To install Homebrew you first need the command line tools for Xcode. The easiest way to get them is to download Xcode from the Mac App Store, then:
- Start Xcode on the Mac.
- Choose Preferences from the Xcode menu.
- In the General panel, click Downloads.
- On the Downloads window, choose the Components tab.
- Click the Install button next to Command Line Tools. You are asked for your Apple Developer login during the install process.
Alternatively, you can download the Xcode command line tools directly from the developer portal as a .dmg file: https://developer.apple.com/downloads/index.action. On the "Downloads for Apple Developers" list, select the Command Line Tools entry that you want.
For many scientific applications you will also need X11; the easiest way to get this is to install XQuartz. The XQuartz project is an open-source effort to develop a version of the X.Org X Window System that runs on OS X. Together with supporting libraries and applications, it forms the X11.app. The latest downloads are available from the XQuartz project site.
To install Homebrew type this command in the Terminal:
ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
The 'brew doctor' command checks that everything is fine. For example, it will warn you if the developer tools are missing, or if there are unexpected items in /usr/local/bin and /usr/local/lib that may clash and might need to be deleted.
I got this message:
warning: Unbrewed dylibs were found in /usr/local/lib.
If you didn't put them there on purpose they could cause problems when building Homebrew formulae, and may need to be deleted.
Unexpected dylibs:
/usr/local/lib/libavogadro.0.9.2.dylib
/usr/local/lib/libcdt.4.0.0.dylib
/usr/local/lib/libcgraph.4.0.0.dylib
/usr/local/lib/libFileMgmt.dylib
/usr/local/lib/libinchi.0.0.1.dylib
/usr/local/lib/libinchi.0.3.1.dylib
/usr/local/lib/libinchi.0.4.1.dylib
/usr/local/lib/libopenbabel.4.0.1.dylib
/usr/local/lib/libopenbabel.4.0.2.dylib
/usr/local/lib/libpathplan.4.0.0.dylib
.....
Most of these are from my previous installation of OpenBabel and needed to be deleted. Don’t worry if you forget to delete a file: when you come to the brew install of that package, Homebrew will first check and warn you of any files that should be removed.
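If the list of unexpected dylibs is long, it can help to group the paths by library before deciding what to delete. The following is only a minimal sketch in Python: it works on a copy of the warning text pasted into a string (shortened here to three of the entries above), and the function names unexpected_dylibs and group_by_library are my own, not part of Homebrew.

```python
import re

# Sample text copied from the warning above, shortened to three entries.
DOCTOR_OUTPUT = """\
warning: Unbrewed dylibs were found in /usr/local/lib.
Unexpected dylibs:
/usr/local/lib/libavogadro.0.9.2.dylib
/usr/local/lib/libinchi.0.0.1.dylib
/usr/local/lib/libopenbabel.4.0.1.dylib
"""


def unexpected_dylibs(report):
    """Return every token in the report that looks like a .dylib path."""
    return re.findall(r"/\S+\.dylib\b", report)


def group_by_library(paths):
    """Group paths by library stem, e.g. 'libinchi' for libinchi.0.0.1.dylib."""
    groups = {}
    for path in paths:
        filename = path.rsplit("/", 1)[-1]
        stem = filename.split(".", 1)[0]
        groups.setdefault(stem, []).append(path)
    return groups


for stem, paths in sorted(group_by_library(unexpected_dylibs(DOCTOR_OUTPUT)).items()):
    print(stem, len(paths))
```

The sketch only reads text; it deletes nothing, so the decision about which files to remove stays with you.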
Installing packages using Homebrew
Homebrew installs everything to /usr/local/Cellar, then creates symlinks in /usr/local/bin and /usr/local/lib so that they are on your $PATH. You will likely have things manually installed in these directories. This is fine, but if they can be installed using Homebrew it's best to delete them and then reinstall using Homebrew.
It is a good idea to first update the package list with "brew update".
Then we can just run "brew install" followed by the name of each package:
brew install pkg-config
brew install git
brew install subversion
brew install gcc
brew install python
brew install boost --build-from-source
brew install tcl-tk
gfortran is now part of gcc. To check that gfortran is installed, in the Terminal type
man -k fortran
gfortran(1) - GNU Fortran compiler
and check its location
which gfortran
/usr/local/bin/gfortran
which should be a symlink to "/usr/local/Cellar/gcc/4.8.3/bin/gfortran".
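The same check can be done from Python by following the symlink and testing whether the target lies inside the Cellar. This is a minimal sketch, assuming the default /usr/local/Cellar location mentioned above; the function name cellar_target is hypothetical.

```python
import os


def cellar_target(link_path, cellar="/usr/local/Cellar"):
    """Follow link_path and return its real location if it is inside cellar.

    Returns None when the resolved path lies outside the Cellar directory,
    which would suggest the file was not installed by Homebrew.
    """
    real = os.path.realpath(link_path)
    cellar_real = os.path.realpath(cellar)
    if os.path.commonpath([cellar_real, real]) == cellar_real:
        return real
    return None


# Path taken from the 'which gfortran' output above.
print(cellar_target("/usr/local/bin/gfortran"))
```

On a machine without Homebrew the call simply prints None, so it is safe to run anywhere.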
To install a range of cheminformatics packages we can use a custom “tap” created by Matt
brew tap mcs07/cheminformatics
brew install cdk
brew install chemspot
brew install indigo
brew install inchi
brew install opsin
brew install osra
brew install rdkit
Now outdated, see below.
There is already an (outdated) open-babel formula in the main Homebrew repository that clashes with this one, so use the full path including the "mcs07/cheminformatics" part. The "--HEAD" part means install the latest development version from GitHub. This isn't ideal, but the latest version available as an installer, 2.3.1, is so outdated now that there are problems compiling it on Mavericks.
brew install mcs07/cheminformatics/open-babel --HEAD
If you have previously installed OpenBabel using
brew install mcs07/cheminformatics/open-babel --HEAD
then you have the development version from GitHub. The latest release of OpenBabel is now available, so it can be installed directly: uninstall the HEAD build first, then install the release.
brew uninstall mcs07/cheminformatics/open-babel
Uninstalling /usr/local/Cellar/open-babel/HEAD... (309 files, 14.6M)
brew install mcs07/cheminformatics/open-babel
You can check you have the latest version installed by typing this in a Terminal window
obabel -V
Open Babel 2.4.0 -- Sep 24 2016 -- 14:01:18
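If you want a script to confirm that the installed version is recent enough, the banner can be parsed into a tuple and compared. This is a sketch only: it works on the banner text shown above rather than calling obabel itself, and the function name parse_obabel_version is my own.

```python
import re


def parse_obabel_version(banner):
    """Extract (major, minor, patch) from an 'Open Babel X.Y.Z' banner line."""
    match = re.search(r"Open Babel (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no Open Babel version found in: %r" % banner)
    return tuple(int(part) for part in match.groups())


# Banner copied from the 'obabel -V' output above.
banner = "Open Babel 2.4.0 -- Sep 24 2016 -- 14:01:18"
print(parse_obabel_version(banner) >= (2, 4, 0))
```

Comparing tuples of integers avoids the trap of comparing version strings, where "2.10.0" would sort before "2.4.0".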
You may get errors something like this:
Warning: Could not link inchi. Unlinking...
Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
You can try again using `brew link inchi'
Possible conflicting files are:
/usr/local/include/inchi/inchi_api.h
/usr/local/lib/libinchi.dylib -> /usr/local/lib/libinchi.0.dylib
==> Summary
/usr/local/Cellar/inchi/1.04: 38 files, 1.8M, built in 61 seconds
I deleted the conflicting files listed in the message, then ran
brew link inchi
Linking /usr/local/Cellar/inchi/1.04... 37 symlinks created
There are several applications that rely on OpenBabel that we can now install:
brew install filter-it
brew install strip-it
brew install align-it
brew install shape-it
So what have we installed?
cdk: The Chemistry Development Kit is a scientific, LGPL-ed library for bio- and cheminformatics and computational chemistry written in Java.
chemspot: ChemSpot is a set of tools for named entity recognition and classification of chemicals in natural language texts, including trivial names, abbreviations, molecular formulas and IUPAC entities.
indigo: Indigo is an organic chemistry toolkit.
inchi: InChI is a non-proprietary, international standard to represent chemical structures.
opsin: OPSIN is an Open Parser for Systematic IUPAC Nomenclature.
osra: OSRA (Optical Structure Recognition Application) is a utility designed to convert graphical representations of chemical structures and reactions, as they appear in journal articles, patent documents, textbooks, trade magazines etc., into SMILES or MOL files.
rdkit: RDKit is an open-source cheminformatics and machine learning toolkit.
filter-it, strip-it, align-it, and shape-it are a set of tools created by Silicos-it. They are built on top of the OpenBabel open source C++ API for rapid calculation of molecular properties.
Filter-it™ is a command-line program for filtering molecules with unwanted properties out of a set of molecules.
Strip-it™ is a tool to extract molecular scaffolds according to predefined rules based on definitions as described by Murcko (J. Med. Chem. 1996, 39, 2887), Pollock (J. Chem. Inf. Model. 2008, 48, 1304) and Schuffenhauer (J. Chem. Inf. Model. 2007, 47, 47).
Align-it™ is a pharmacophore-based tool to align molecules by representing pharmacophoric features as Gaussian 3D volumes.
Shape-it™ is a shape-based alignment tool that represents molecules as a set of atomic Gaussians. The software is based on the alignment method described by Grant and Pickup (J. Phys. Chem. 1995, 99, 3503).
PyMOL can also be installed using Homebrew
brew tap homebrew/science
brew tap homebrew/dupes
brew install python --with-brewed-tk --enable-threads --with-x11
brew install pymol
This installation switches the stereo/mono graphics default. Recent builds of OS X on Intel chips seem to crash with stereo graphics. Therefore, Homebrew-installed PyMOL defaults to assuming the "-M" flag has been passed to it. You can switch to stereo graphics with the "-S" flag when you start PyMOL.
You will also need a number of Python bindings to access the toolkits from Python scripts. To install these we use pip, a tool for installing and managing Python packages.
curl -LO https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py
Then make sure everything is up to date.
pip install --upgrade pip
pip install --upgrade setuptools
Then install the packages using
pip install Pillow
pip install numpy
pip install scipy
pip install scikit-learn
pip install pandas
pip install matplotlib
pip install lxml
pip install pycairo
pip install chembl_beaker
pip install standardiser
pip install openbabel
Update: I’ve heard of issues with installing pycairo using pip. If you have problems, try
brew install py2cairo
Checking it all works
We can now test that the applications are working. To test OPSIN, in a Terminal window type:
echo "iodobenzene" | opsin -o smi
Run the jar using the -h flag for help. Enter a chemical name to begin:
INFO - Initialising OPSIN...
INFO - OPSIN initialised
IC1=CC=CC=C1
To check that the RDKit Python bindings are working, type 'python' to enter the Python interpreter, then try:
python
Python 2.7.6 (default, Mar 13 2014, 10:34:57)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles('C1=CC=CC=C1F')
>>> Chem.MolToSmiles(mol)
'Fc1ccccc1'
>>>
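Before testing each toolkit in turn, it can be useful to see which Python bindings are visible to the interpreter at all. The following is a minimal sketch that only looks the modules up without importing them. The import names in the list are assumptions on my part, since a package's import name can differ from the name given to pip (Pillow, for example, is imported as PIL), and the function name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for some of the packages installed above.
wanted = ["rdkit", "openbabel", "numpy", "scipy", "pandas", "PIL"]
print("Missing:", missing_modules(wanted))
```

An empty list means every module was found; it does not prove that a module loads correctly, so the interactive checks are still worth running.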
To check OpenBabel is working, type this in a Terminal window:
obabel -:'C1=CC=CC=C1F' -ocan
Fc1ccccc1
1 molecule converted
To test InChI type this in the Terminal:
inchi_main
InChI ver 1, Software version 1.04 (Library call example; classic interface) Build of September 9, 2011.
Usage: inchi_main inputFile [outputFile [logFile [problemFile]]] [-option[ -option...]]
Options:
Input
STDIO Use standard input/output streams
InpAux Input structures in InChI default aux. info format (for use with STDIO)
SDF:DataHeader Read from the input SDfile the ID under this DataHeader
Output
........
To test ChemSpot, type the following in the Terminal (change username to your own user name). Beware, ChemSpot is pretty resource hungry.
echo "Iodobenzene was added dropwise to a suspension of magnesium turnings in tetrahydrofuran containing a crystal of iodine, after addition a solution of benzaldehyde in diethyl ether was added dropwise." > /Users/username/Desktop/test.txt
This will create a file on your desktop called test.txt. Now type the following:
chemspot -t /Users/username/Desktop/test.txt -o /Users/username/Desktop/untitled.txt
This will create a file on your desktop called untitled.txt
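Typing the home directory by hand is easy to get wrong. As a minimal sketch, the two Desktop paths can be built in Python and the command assembled from them. The helper names desktop_file and chemspot_command are hypothetical, and the -t and -o flags are simply copied from the command above.

```python
from pathlib import Path


def desktop_file(filename, home=None):
    """Build the path to a file on the Desktop of the given home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / "Desktop" / filename


def chemspot_command(home=None):
    """Assemble the chemspot call from the text above as an argument list."""
    return [
        "chemspot",
        "-t", str(desktop_file("test.txt", home)),
        "-o", str(desktop_file("untitled.txt", home)),
    ]


# An explicit home directory is passed here to keep the example predictable.
print(" ".join(chemspot_command("/Users/username")))
```

Calling chemspot_command() with no argument uses the home directory of whoever runs the script, so there is no user name to mistype.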
If you open it in a text editor it should read
-1 9 Iodobenzene 000591504 591-50-4 InChI=1S/C6H5I/c7-6-4-2-1-3-5-6/h1-5H C031905
49 57 magnesium 022537220 CHEBI:18420 7439-95-4 888 22394505 InChI=1/Mg/q+2 DB01378 HMDB00547 C00305 D008274
71 85 tetrahydrofuran 000109999 CHEBI:26911 109-99-9 8028 36538472 InChI=1/C4H8O/c1-2-4-5-3-1/h1-4H2 HMDB00246
111 116 iodine CHEBI:24859 14362-44-8 InChI=1/I
148 159 benzaldehyde 000100527 CHEBI:17169 100-52-7 240 7849373 InChI=1/C7H6O/c8-6-7-4-2-1-3-5-7/h1-6H HMDB06115 C00261 D02314 C032175
164 176 diethyl ether 000060297 CHEBI:35702 60-29-7 585984 InChI=1/C4H10O/c1-3-5-4-2/h3-4H2,1-2H3 C13240 D01772 D004986
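The two numbers at the start of each entry look like character offsets into the input sentence. In this particular output they line up with Python's zero-based indexing only after adding one to each, with the end offset inclusive. That is an observation about the output shown here, not documented ChemSpot behaviour, so treat the shift in this sketch as an assumption; the function name span_text is my own.

```python
# The sentence written to test.txt above.
TEXT = ("Iodobenzene was added dropwise to a suspension of magnesium "
        "turnings in tetrahydrofuran containing a crystal of iodine, "
        "after addition a solution of benzaldehyde in diethyl ether "
        "was added dropwise.")

# (start, end, name) triples copied from the output above.
ANNOTATIONS = [
    (-1, 9, "Iodobenzene"),
    (49, 57, "magnesium"),
    (71, 85, "tetrahydrofuran"),
    (111, 116, "iodine"),
    (148, 159, "benzaldehyde"),
    (164, 176, "diethyl ether"),
]


def span_text(text, start, end, shift=1):
    """Slice text for an inclusive (start, end) pair after adding shift."""
    return text[start + shift:end + shift + 1]


for start, end, name in ANNOTATIONS:
    print(name, span_text(TEXT, start, end) == name)
```

If a different ChemSpot version reports offsets differently, changing the shift argument is the only adjustment needed.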
To test OSRA you will need an image of a chemical structure; you can drag the image below to your desktop.
Then in the Terminal type:
osra /Users/username/Desktop/indole.png
c1ccc2c(c1)[nH]cc2
For some reason screenshots don’t work at present. A workaround is to open the screenshot image in Preview and save it as a JPEG file.
Or it can be done from the command line using GraphicsMagick
gm convert /Users/swain/Desktop/indole.png /Users/swain/Desktop/indole.gif
gm convert /Users/swain/Desktop/indole.png -strip /Users/swain/Desktop/indoleupdated.png
To test filter-it (and the other tools from Silicos-it), in the Terminal type
filter-it
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Filter-it v1.0.2 | Apr 5 2014 12:28:23
-> GCC: 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)
-> Open Babel: 2.3.90
Copyright 2012 by Silicos-it, a division of Imacosi BVBA
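The command-line checks above can be preceded by a quick look at whether each tool is on the $PATH at all. This is a minimal sketch using only the Python standard library; the list of command names is taken from the tests above, and the function name locate is hypothetical.

```python
import shutil

# Command names as used in the checks above.
COMMANDS = ["opsin", "obabel", "inchi_main", "chemspot", "osra", "filter-it"]


def locate(commands):
    """Map each command name to its path on $PATH, or None if absent."""
    return {name: shutil.which(name) for name in commands}


for name, path in locate(COMMANDS).items():
    print("%-12s %s" % (name, path or "NOT FOUND"))
```

A tool reported as NOT FOUND usually means the brew link step did not complete for that package.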
The chembl_beaker package was developed by the ChEMBL group at EMBL-EBI, Cambridge, UK. It is a wrapper for RDKit and OSRA, which exposes the following methods:
- Format conversion
- Compound recognition
- Raster image (PNG) generation
- Vector image (SVG) generation
- HTML5 ready compound representation
- Maximum Common Substructure
- Similarity maps
- ChEMBL standardisation process, consisting of neutralisation, bond breaking, salt removal and applying various rules.
- 3D coordinates generation, using Universal Force Field
- Various other calculations (for example kekulisation)
- Marvin 4 JS compliant webservices
It runs as a portable, lightweight, CORS-ready, REST-speaking, SPORE-documented web server. This particular implementation wraps RDKit in Bottle on Tornado. To start the web server, in a Terminal window type
run_beaker
Bottle v0.12.5 server starting up (using TornadoServer())...
Listening on http://localhost:8080/
Hit Ctrl-C to quit.
Then open a web browser and type in the URL
and you should see the following
If you select smiles23D/:CTAB and type in a SMILES string and click on the green “GET” button you should see the following response.
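If the browser shows nothing, it is worth checking that something is actually listening on the address printed by run_beaker. The following sketch only tests whether a TCP connection can be opened; the host and port are taken from the "Listening on http://localhost:8080/" line above, and the function name is_listening is my own. It makes no request to the beaker API itself.

```python
import socket


def is_listening(host="localhost", port=8080, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("beaker reachable:", is_listening())
```

A False result while run_beaker is still running in another Terminal window would point to a different port or a firewall rather than a problem with the toolkits.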
Installing Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field
A recent paper in J. Cheminformatics, "Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field", described a free and open source tool for both computer aided drug discovery (CADD) developers and researchers. Open Drug Discovery Toolkit is released under a permissive 3-clause BSD license for both academic and industrial use. ODDT’s source code, additional examples and documentation are available on GitHub.
ODDT (Open Drug Discovery Toolkit)
Programming language: Python
at least one of the toolkits:
ffnet (0.7.1+), only for neural network functionality.
Installation of the toolkits using Homebrew is described above.
The easiest way to install ODDT on a Mac is to use pip
pip install oddt
You may get messages suggesting you upgrade some of the dependencies such as scipy. This can be done using pip
pip install --upgrade scipy
You can easily check all is working by running python in a terminal window
python
Python 2.7.10 (default, Jun 3 2015, 09:19:56)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import oddt
>>> mol = oddt.toolkit.readstring('smi', 'Cc1c(cc(cc1[N+](=O)[O-])[N+](=O)[O-])[N+](=O)[O-]')
>>> mol.atom_dict['atomtype']
array(['C.3', 'C.ar', 'C.ar', 'C.ar', 'C.ar', 'C.ar', 'C.ar', 'N.pl', 'O.2', 'O.co', 'N.pl', 'O.2', 'O.co', 'N.pl', 'O.2', 'O.co'], dtype='|S4')
>>> mol.atom_dict['isacceptor']
array([False, False, False, False, False, False, False, False, True, True, False, True, True, False, True, True], dtype=bool)
>>>
The publication also includes a series of IPython notebooks to get you started.
Update for El Capitan
When El Capitan first came out I upgraded a machine with an existing installation of a variety of cheminformatics tools installed using Homebrew and pip as described above. In this situation PyMOL worked without problem. However, I have had a few readers email me saying they are having problems with PyMOL, so I took a new machine running El Capitan and tried to install the same cheminformatics tools, including PyMOL, using Homebrew and pip. All worked fine except PyMOL, which opened but crashed with the following error.
Username:~ prompt$ pymol
PyMOL(TM) Molecular Graphics System, Version 126.96.36.199.
Copyright (c) Schrodinger, LLC.
All Rights Reserved.
Created by Warren L. DeLano, Ph.D.
PyMOL is user-supported open-source software. Although some versions are freely available, PyMOL is not in the public domain.
If PyMOL is helpful in your work or study, then please volunteer support for our ongoing efforts to create open and affordable scientific software by purchasing a PyMOL Maintenance and/or Support subscription.
More information can be found at "http://www.pymol.org".
Enter "help" for a list of commands.
Enter "help <command-name>" for information on a specific command.
Hit ESC anytime to toggle between text and graphics.
Detected OpenGL version 2.0 or greater. Shaders available.
Detected GLSL version 1.20.
OpenGL graphics engine:
GL_VENDOR: NVIDIA Corporation
GL_RENDERER: NVIDIA GeForce 8600M GT OpenGL Engine
GL_VERSION: 2.1 NVIDIA-10.0.40 310.90.10.05b12
Detected 2 CPU cores. Enabled multithreaded rendering.
libpng warning: Application built with libpng-1.6.19 but running with 1.5.23
/usr/local/bin/pymol: line 4: 3628 Segmentation fault: 11 "/usr/local/opt/python/bin/python2.7" "/usr/local/Cellar/pymol/188.8.131.52/libexec/lib/python2.7/site-packages/pymol/__init__.py" "$@"
The helpful people on the PyMOL user list pointed me to this message on the Homebrew-Science issues
First uninstall pymol and libpng
brew uninstall pymol
brew uninstall libpng
then install pymol first.
brew install pymol
brew install libpng
If you now type pymol in a Terminal window it should start fine.
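The telltale line in the crash output is the libpng warning, which reports one version at build time and another at run time. If you save the Terminal output to a file, a short script can pick that line out. This is a minimal sketch that matches only the wording of the warning shown above; the function name libpng_mismatch is hypothetical.

```python
import re


def libpng_mismatch(log_text):
    """Return (built_with, running_with) if the libpng warning is present."""
    match = re.search(
        r"built with libpng-(\d+(?:\.\d+)*) but running with (\d+(?:\.\d+)*)",
        log_text,
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


# Warning line copied from the error output above.
line = ("libpng warning: Application built with libpng-1.6.19 "
        "but running with 1.5.23")
print(libpng_mismatch(line))
```

After reinstalling in the order given above, the same check on a fresh log should return None.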
Updated 12 December 2017