Macs in Chemistry

Insanely great science

 

Cheminformatics on a Mac

I’ve recently needed to set up a new Mac and I realised that my installation process for all the applications, tools, chemistry toolboxes, and associated dependencies had become unmanageable. I had a mixture of apps that I had compiled myself, others installed from precompiled binaries, others from MacPorts, and so on.

So I decided to start over and try Homebrew.

Installation of Homebrew

Homebrew is a package manager for Mac OS X that installs packages in its own directory and then symlinks the files into /usr/local. I went with Homebrew rather than MacPorts because I found that MacPorts occasionally overwrote existing files. Homebrew instead warns you of any clashes and lets you decide which version to keep. If you need to remove MacPorts there is a detailed guide. You may also need to update your bash profile.

To install Homebrew you first need the command line tools for Xcode. The easiest way to get them is to download Xcode from the Mac App Store, then:

  1. Start Xcode on the Mac.
  2. Choose Preferences from the Xcode menu.
  3. In the General panel, click Downloads.
  4. On the Downloads window, choose the Components tab.
  5. Click the Install button next to Command Line Tools. You are asked for your Apple Developer login during the install process.

Alternatively, you can download the Xcode command line tools directly from the developer portal as a .dmg file: https://developer.apple.com/downloads/index.action. On the "Downloads for Apple Developers" list, select the Command Line Tools entry that you want.

For many scientific applications you will also need X11; the easiest way to get this is to install XQuartz. The XQuartz project is an open-source effort to develop a version of the X.Org X Window System that runs on OS X. Together with supporting libraries and applications, it forms the X11.app. The latest downloads are available from the XQuartz project site.

To install Homebrew type this command in the Terminal

ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"

Then type

brew doctor

The 'brew doctor' command checks that everything is fine. For example, it will warn you if the developer tools are missing, or if there are unexpected items in /usr/local/bin and /usr/local/lib that may clash and might need to be deleted.

I got this message

warning: Unbrewed dylibs were found in /usr/local/lib.
If you didn't put them there on purpose they could cause problems when
building Homebrew formulae, and may need to be deleted.

Unexpected dylibs:
/usr/local/lib/libavogadro.0.9.2.dylib
/usr/local/lib/libcdt.4.0.0.dylib
/usr/local/lib/libcgraph.4.0.0.dylib
/usr/local/lib/libFileMgmt.dylib
/usr/local/lib/libinchi.0.0.1.dylib
/usr/local/lib/libinchi.0.3.1.dylib
/usr/local/lib/libinchi.0.4.1.dylib
/usr/local/lib/libopenbabel.4.0.1.dylib
/usr/local/lib/libopenbabel.4.0.2.dylib
/usr/local/lib/libpathplan.4.0.0.dylib
.....

Most of these are from my previous installation of OpenBabel and needed to be deleted. Don’t worry if you forget to delete a file: when you come to the brew install of that package, it will first check and warn you of any files that should be removed.
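If the list is long, it can help to pull the paths out of the 'brew doctor' text so they can be reviewed one by one. The following is only a minimal sketch in Python 3: the function name unexpected_dylibs is my own, and it assumes the output has the same layout as the message above (an "Unexpected dylibs:" heading followed by one path per line). It only lists the files; it does not delete anything.

```python
def unexpected_dylibs(doctor_output):
    """Collect the paths printed under an 'Unexpected dylibs:' heading."""
    paths = []
    in_list = False
    for line in doctor_output.splitlines():
        stripped = line.strip()
        if stripped == "Unexpected dylibs:":
            in_list = True
        elif in_list and stripped.startswith("/"):
            paths.append(stripped)
        elif in_list:
            # Anything that is not a path (blank line, dots, new heading) ends the list.
            in_list = False
    return paths


# Sample text copied from the warning shown above (shortened).
sample = """Unexpected dylibs:
/usr/local/lib/libavogadro.0.9.2.dylib
/usr/local/lib/libinchi.0.0.1.dylib
.....
"""
for path in unexpected_dylibs(sample):
    print(path)
```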

Installing packages using Homebrew

Homebrew installs everything to /usr/local/Cellar, then creates symlinks in /usr/local/bin and /usr/local/lib so they are on your $PATH. You will likely have things manually installed in these directories. This is fine, but if they can be installed using Homebrew it's best to delete them and then reinstall using Homebrew.

It is a good idea to first update the package list

brew update

Then we can just run "brew install" followed by a formula name for everything we want. A few examples that you may want:

brew install pkg-config
brew install git
brew install subversion
brew install gcc
brew install python
brew install boost --build-from-source
brew install tcl-tk

gfortran is now part of gcc. To check that gfortran is installed, type in the Terminal

man -k fortran
gfortran(1)              - GNU Fortran compiler

and the location

which gfortran
/usr/local/bin/gfortran

which should be a symlink to "/usr/local/Cellar/gcc/4.8.3/bin/gfortran".
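The same check can be scripted. This is a small sketch, assuming Python 3 (shutil.which is not available in Python 2); resolve_command and its search_path parameter are names I have made up for illustration. It finds a command on the PATH and follows any symlinks to the real file, which for a Homebrew-installed tool should be somewhere under /usr/local/Cellar.

```python
import os
import shutil


def resolve_command(name, search_path=None):
    """Return the real file behind a command, or None if it is not found.

    search_path is an optional PATH-style string; by default the
    current $PATH is used.
    """
    found = shutil.which(name, path=search_path)
    if found is None:
        return None
    # realpath follows symlinks, so a link in /usr/local/bin resolves
    # to the file it points at.
    return os.path.realpath(found)


print(resolve_command("gfortran"))
```

If this prints None, the command is not on your PATH at all; if it prints a path outside the Cellar, an older manual installation is probably being picked up first.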

To install a range of cheminformatics packages we can use a custom “tap” created by Matt

brew tap mcs07/cheminformatics

Then run

brew install cdk
brew install chemspot
brew install indigo
brew install inchi
brew install opsin
brew install osra
brew install rdkit

Now outdated, see the update below

There is already an (outdated) open-babel formula in the main Homebrew repository that clashes with this one, so use the full name including the "mcs07/cheminformatics" prefix. The "--HEAD" part means install the latest development version from GitHub. This isn't ideal, but the latest version available as an installer, 2.3.1, is now so outdated that there are problems compiling it on Mavericks.

brew install mcs07/cheminformatics/open-babel --HEAD

Update

If you previously installed OpenBabel using

brew install mcs07/cheminformatics/open-babel --HEAD

then you have the development version, because the "--HEAD" part means install the latest development version from GitHub. A newer release of OpenBabel is now available, so it can be installed directly.

brew uninstall mcs07/cheminformatics/open-babel
Uninstalling /usr/local/Cellar/open-babel/HEAD... (309 files, 14.6M)
brew install mcs07/cheminformatics/open-babel

You can check that you have the latest version installed by typing this in a Terminal window

obabel -V
Open Babel 2.4.0 -- Sep 24 2016 -- 14:01:18
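If you want to check the version from a script, the banner can be parsed. This is only a sketch in Python 3; parse_obabel_version is a hypothetical helper of mine, and it assumes the banner keeps the "Open Babel x.y.z" form shown above.

```python
import re


def parse_obabel_version(banner):
    """Extract the version from an 'obabel -V' banner as a tuple of ints."""
    match = re.search(r"Open Babel (\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


# Banner text copied from the output shown above.
version = parse_obabel_version("Open Babel 2.4.0 -- Sep 24 2016 -- 14:01:18")
print(version)
```

Tuples compare element by element, so a test such as version >= (2, 4, 0) can then be used to decide whether an upgrade is needed.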

You may get errors like this

Warning: Could not link inchi. Unlinking...
Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
You can try again using `brew link inchi'

Possible conflicting files are:
/usr/local/include/inchi/inchi_api.h
/usr/local/lib/libinchi.dylib -> /usr/local/lib/libinchi.0.dylib
==> Summary
/usr/local/Cellar/inchi/1.04: 38 files, 1.8M, built in 61 seconds

I deleted the files listed as possible conflicts, then ran

brew link inchi
Linking /usr/local/Cellar/inchi/1.04... 37 symlinks created

There are several applications that rely on OpenBabel that we can now install

brew install filter-it
brew install strip-it
brew install align-it
brew install shape-it
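Once everything is installed, a quick way to see whether anything failed to link is to ask which commands are missing from the PATH. This is a minimal Python 3 sketch; missing_tools is my own name. The command names obabel, opsin, osra, inchi_main, chemspot and filter-it are the ones used elsewhere on this page, while strip-it, align-it and shape-it are assumed to match their formula names.

```python
import shutil

# Command names; the last three are assumed to match the formula names.
TOOLS = [
    "obabel", "opsin", "osra", "inchi_main", "chemspot",
    "filter-it", "strip-it", "align-it", "shape-it",
]


def missing_tools(names):
    """Return the commands from names that cannot be found on $PATH."""
    return [name for name in names if shutil.which(name) is None]


print(missing_tools(TOOLS))
```

An empty list means every command was found; anything listed may need a 'brew link' or a reinstall.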

So what have we installed?

cdk: The Chemistry Development Kit (CDK) is a scientific, LGPL-ed library for bio- and cheminformatics and computational chemistry, written in Java.

chemspot: ChemSpot is a set of tools for named entity recognition and classification of chemicals in natural language texts, including trivial names, abbreviations, molecular formulas and IUPAC entities.

indigo: Indigo is an organic chemistry toolkit.

inchi: InChI is a non-proprietary, international standard for representing chemical structures.

opsin: OPSIN is an Open Parser for Systematic IUPAC Nomenclature.

osra: OSRA (Optical Structure Recognition Application) is a utility designed to convert graphical representations of chemical structures and reactions, as they appear in journal articles, patent documents, textbooks, trade magazines etc., into SMILES or MOL files.

rdkit: RDKit is an open-source cheminformatics and machine learning toolkit.

filter-it, strip-it, align-it, and shape-it are a set of tools created by Silicos-it. They are built on top of the OpenBabel open source C++ API for rapid calculation of molecular properties.
Filter-it™ is a command-line program for filtering molecules with unwanted properties out of a set of molecules.
Strip-it™ is a tool to extract molecular scaffolds according to predefined rules based on definitions as described by Murcko (J. Med. Chem. 1996, 39, 2887), Pollock (J. Chem. Inf. Model. 2008, 48, 1304) and Schuffenhauer (J. Chem. Inf. Model. 2007, 47, 47).
Align-it™ is a pharmacophore-based tool to align molecules by representing pharmacophoric features as Gaussian 3D volumes.
Shape-it™ is a shape-based alignment tool that represents molecules as a set of atomic Gaussians. The software is based on the alignment method described by Grant and Pickup (J. Phys. Chem. 1995, 99, 3503).

PyMOL can also be installed using Homebrew

brew tap homebrew/science
brew tap homebrew/dupes
brew install python --with-brewed-tk --enable-threads --with-x11
brew install pymol

This installation switches the stereo/mono graphics default. Recent builds of OS X on Intel chips seem to crash with stereo graphics, so Homebrew-installed PyMOL defaults to behaving as if the "-M" flag had been passed to it. You can switch to stereo graphics with the "-S" flag when you start PyMOL.

pymol -S

You will also need a number of Python bindings to access the toolkits from Python scripts. To install these we use pip, a tool for installing and managing Python packages.

curl -LO https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py

Then make sure everything is up to date.

pip install --upgrade pip
pip install --upgrade setuptools

Then install using

pip install Pillow
pip install numpy
pip install scipy
pip install scikit-learn
pip install pandas
pip install matplotlib
pip install lxml
pip install pycairo
pip install chembl_beaker
pip install standardiser
pip install openbabel
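To confirm that the packages can actually be imported, something like the following can be run afterwards. It is a sketch that assumes Python 3.4 or later (for importlib.util.find_spec), and missing_modules is my own helper name. Note that import names differ from package names in a few cases: Pillow is imported as PIL, scikit-learn as sklearn and pycairo as cairo. I have left out chembl_beaker and standardiser because I am not certain of their import names.

```python
import importlib.util

# Import names, which are not always the same as the pip package names.
MODULES = [
    "PIL", "numpy", "scipy", "sklearn", "pandas",
    "matplotlib", "lxml", "cairo", "openbabel", "rdkit",
]


def missing_modules(names):
    """Return the top-level modules from names that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print(missing_modules(MODULES))
```

This only checks that each module can be located, without importing it, so it is quick even for the larger packages.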

Update: I’ve heard of issues with installing pycairo using pip. If you have problems, try

brew install py2cairo

Checking it all works

We can now test that the applications are working. To test OPSIN, in a Terminal window type:

echo "iodobenzene" | opsin -o smi
Run the jar using the -h flag for help. Enter a chemical name to begin:
INFO - Initialising OPSIN... 
INFO - OPSIN initialised
IC1=CC=CC=C1

To check that the RDKit Python bindings are working, type 'python' to enter the Python interpreter, then try:

python
Python 2.7.6 (default, Mar 13 2014, 10:34:57) 
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles('C1=CC=CC=C1F')
>>> Chem.MolToSmiles(mol)
'Fc1ccccc1'
>>>
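The same check can be wrapped in a small function so that a missing RDKit installation gives a clear result rather than a traceback. This is only a sketch; canonical_smiles is a name I have chosen, and it uses the same two RDKit calls as the interactive session above.

```python
def canonical_smiles(smiles):
    """Return RDKit's canonical SMILES for the input.

    Returns None if RDKit cannot be imported or the SMILES cannot be parsed.
    """
    try:
        from rdkit import Chem
    except ImportError:
        return None
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None for input it cannot parse.
        return None
    return Chem.MolToSmiles(mol)


print(canonical_smiles("C1=CC=CC=C1F"))
```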

To check OpenBabel is working type this in a Terminal window:

obabel -:'C1=CC=CC=C1F' -ocan 
Fc1ccccc1   
1 molecule converted

To test InChI, type this in the Terminal:

inchi_main
InChI ver 1, Software version 1.04 (Library call example; classic interface) Build of September 9, 2011.

Usage:
inchi_main inputFile [outputFile [logFile [problemFile]]] [-option[ -option...]]

Options:

Input
STDIO       Use standard input/output streams
  InpAux      Input structures in InChI default aux. info format
          (for use with STDIO)
SDF:DataHeader Read from the input SDfile the ID under this DataHeader
Output
........

To test chemspot, type the following in the Terminal (change username to your own user name). Beware: chemspot is pretty resource hungry.

  echo "Iodobenzene was added dropwise to a suspension of magnesium turnings in tetrahydrofuran containing a crystal of iodine, after addition a solution of benzaldehyde in diethyl ether was added dropwise." > /Users/username/Desktop/test.txt

This will create a file on your desktop called test.txt, now type the following

chemspot -t /Users/username/Desktop/test.txt -o /Users/username/Desktop/untitled.txt

This will create a file on your desktop called untitled.txt

If you open it in a text editor it should read

-1  9   Iodobenzene 000591504       591-50-4            InChI=1S/C6H5I/c7-6-4-2-1-3-5-6/h1-5H                   C031905
49  57  magnesium   022537220   CHEBI:18420 7439-95-4   888 22394505    InChI=1/Mg/q+2  DB01378 HMDB00547   C00305      D008274
71  85  tetrahydrofuran 000109999   CHEBI:26911 109-99-9    8028    36538472    InChI=1/C4H8O/c1-2-4-5-3-1/h1-4H2       HMDB00246           
111 116 iodine      CHEBI:24859 14362-44-8          InChI=1/I                   
148 159 benzaldehyde    000100527   CHEBI:17169 100-52-7    240 7849373 InChI=1/C7H6O/c8-6-7-4-2-1-3-5-7/h1-6H      HMDB06115   C00261  D02314  C032175
164 176 diethyl ether   000060297   CHEBI:35702 60-29-7     585984  InChI=1/C4H10O/c1-3-5-4-2/h3-4H2,1-2H3          C13240  D01772  D004986
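To work with this output in a script, each line can be split into its fields. The sketch below (Python 3) makes two assumptions that I have not checked against the ChemSpot documentation: that the fields are tab-separated, and that the first three are the start offset, end offset and the matched name, with the remaining columns being database identifiers. Both function names are hypothetical. The desktop_path helper builds the file location from your home directory, which avoids typing the user name by hand.

```python
import os


def desktop_path(filename, home=None):
    """Build a path on the Desktop; home defaults to the current user's home."""
    if home is None:
        home = os.path.expanduser("~")
    return os.path.join(home, "Desktop", filename)


def parse_chemspot_line(line):
    """Split one output line into offsets, name and non-empty identifiers.

    Assumes tab-separated fields with start, end and name first.
    """
    fields = line.rstrip("\n").split("\t")
    return {
        "start": int(fields[0]),
        "end": int(fields[1]),
        "name": fields[2],
        "identifiers": [field for field in fields[3:] if field.strip()],
    }


# A shortened line modelled on the output above, with tabs between fields.
sample = "164\t176\tdiethyl ether\t000060297\tCHEBI:35702\t60-29-7\t\t585984"
print(desktop_path("untitled.txt"))
print(parse_chemspot_line(sample))
```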

To test OSRA you will need an image of a chemical structure. You can drag the image below to your desktop.

[Image: indole]

Then in the Terminal type:

osra /Users/username/Desktop/indole.png 
c1ccc2c(c1)[nH]cc2

For some reason screenshots don’t work at present. A workaround is to open the screenshot image in Preview and save it as a JPEG file.

Or it can be done from the command line using GraphicsMagick

gm convert /Users/swain/Desktop/indole.png /Users/swain/Desktop/indole.gif

or

gm convert /Users/swain/Desktop/indole.png -strip /Users/swain/Desktop/indoleupdated.png

To test filter-it (and the other tools from Silicos-it), in the Terminal type

 filter-it
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Filter-it v1.0.2 | Apr  5 2014 12:28:23

-> GCC:        4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)
-> Open Babel: 2.3.90

Copyright 2012 by Silicos-it, a division of Imacosi BVBA

Beaker

The chembl_beaker package was developed by the ChEMBL group at EMBL-EBI, Cambridge, UK. It is a wrapper for RDKit and OSRA that exposes their methods through a portable, lightweight, CORS-ready, REST-speaking, SPORE-documented webserver. This particular implementation wraps RDKit in Bottle on Tornado. To start the web server, in a Terminal window type

run_beaker
Bottle v0.12.5 server starting up (using TornadoServer())...
Listening on http://localhost:8080/
Hit Ctrl-C to quit.
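With the server running in one Terminal window, you can check from a second window that it is answering. This is a rough sketch using only the Python 3 standard library; server_is_up is my own name, and the only thing it assumes about Beaker is the documentation URL on port 8080 shown in the startup message above.

```python
import urllib.error
import urllib.request


def server_is_up(url, timeout=2.0):
    """Return True if url answers with a success or redirect status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


print(server_is_up("http://localhost:8080/docs"))
```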

Then open a web browser and type in the URL

http://localhost:8080/docs

and you should see the following

[Image: beaker]

If you select smiles23D/:CTAB and type in a SMILES string and click on the green “GET” button you should see the following response.

[Image: smiles23d]

Installing Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field

A recent paper in J. Cheminformatics, "Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field", described a free and open source tool for both computer aided drug discovery (CADD) developers and researchers. Open Drug Discovery Toolkit is released under a permissive 3-clause BSD license for both academic and industrial use. ODDT’s source code, additional examples and documentation are available on GitHub.

ODDT (Open Drug Discovery Toolkit)

Programming language: Python

Other requirements:

at least one of the toolkits:

OpenBabel (2.3.2+),

RDKit (2012.03)

Python (2.7+)

Numpy (1.6.2+)

Scipy (0.10+)

Sklearn (0.11+)

ffnet (0.7.1+), only for neural network functionality.
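If you are unsure whether an installed package meets one of these minimum versions, the version strings can be compared numerically. This is a deliberately crude sketch in Python 3: both function names are my own, and it simply compares the numbers found in each string, so it ignores suffixes such as release candidates.

```python
import re


def version_tuple(text):
    """Turn a version string such as '1.6.2' into a tuple of ints."""
    return tuple(int(number) for number in re.findall(r"\d+", text))


def meets_minimum(installed, minimum):
    """Return True if installed is at least minimum, compared numerically."""
    have, need = version_tuple(installed), version_tuple(minimum)
    # Pad the shorter tuple with zeros so that '0.10' equals '0.10.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


print(meets_minimum("1.9.2", "1.6.2"))
print(meets_minimum("0.9.1", "0.10"))
```

The installed version of most of these packages can be read from the module's __version__ attribute, for example numpy.__version__. Comparing the strings directly would get cases like "0.9.1" versus "0.10" wrong, which is why the numbers are compared instead.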

Installation of the toolkits using Homebrew is described above.

The easiest way to install ODDT on a Mac is to use pip

pip install oddt

You may get messages suggesting you upgrade some of the dependencies, such as scipy. This can be done using pip

pip install --upgrade scipy

You can easily check all is working by running python in a terminal window

python
Python 2.7.10 (default, Jun  3 2015, 09:19:56) 
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import oddt
>>> mol = oddt.toolkit.readstring('smi', 'Cc1c(cc(cc1[N+](=O)[O-])[N+](=O)[O-])[N+](=O)[O-]')
>>> mol.atom_dict['atomtype']
array(['C.3', 'C.ar', 'C.ar', 'C.ar', 'C.ar', 'C.ar', 'C.ar', 'N.pl',
   'O.2', 'O.co', 'N.pl', 'O.2', 'O.co', 'N.pl', 'O.2', 'O.co'], 
  dtype='|S4')
>>> mol.atom_dict['isacceptor']
array([False, False, False, False, False, False, False, False,  True,
    True, False,  True,  True, False,  True,  True], dtype=bool)
>>>

The publication also includes a series of IPython notebooks to get you started.

Update for El Capitan

When El Capitan first came out I upgraded a machine that already had a variety of cheminformatics tools installed using Homebrew and pip as described above. In that situation PyMOL worked without problems. However, a few readers have emailed me to say they are having problems with PyMOL, so I took a new machine running El Capitan and tried to install the same cheminformatics tools, including PyMOL, using Homebrew and pip. Everything worked fine except PyMOL, which opened but then crashed with the following error.

Username:~ prompt$ pymol
PyMOL(TM) Molecular Graphics System, Version 1.7.6.0.
 Copyright (c) Schrodinger, LLC.
All Rights Reserved.

    Created by Warren L. DeLano, Ph.D. 

    PyMOL is user-supported open-source software.  Although some versions
    are freely available, PyMOL is not in the public domain.

    If PyMOL is helpful in your work or study, then please volunteer 
    support for our ongoing efforts to create open and affordable scientific
    software by purchasing a PyMOL Maintenance and/or Support subscription.

    More information can be found at "http://www.pymol.org".

    Enter "help" for a list of commands.
    Enter "help <command-name>" for information on a specific command.

 Hit ESC anytime to toggle between text and graphics.

 Detected OpenGL version 2.0 or greater. Shaders available.
 Detected GLSL version 1.20.
 OpenGL graphics engine:
  GL_VENDOR:   NVIDIA Corporation
 GL_RENDERER: NVIDIA GeForce 8600M GT OpenGL Engine
  GL_VERSION:  2.1 NVIDIA-10.0.40 310.90.10.05b12
 Detected 2 CPU cores.  Enabled multithreaded rendering.
libpng warning: Application built with libpng-1.6.19 but running with 1.5.23
/usr/local/bin/pymol: line 4:  3628 Segmentation fault: 11  "/usr/local/opt/python/bin/python2.7" "/usr/local/Cellar/pymol/1.7.6.0/libexec/lib/python2.7/site-packages/pymol/__init__.py" "$@"
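The libpng warning just before the segmentation fault appears to be the relevant clue: PyMOL was built against one version of libpng but is running with another. If you have saved the Terminal output, the two versions can be picked out with a short script. This is a sketch in Python 3; libpng_mismatch is a hypothetical helper, and it assumes the warning is worded exactly as in the output above.

```python
import re


def libpng_mismatch(log_text):
    """Return (built_with, running_with) from a libpng warning, or None."""
    match = re.search(
        r"built with libpng-(\d+(?:\.\d+)*) but running with (\d+(?:\.\d+)*)",
        log_text,
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


# Warning line copied from the output shown above.
line = "libpng warning: Application built with libpng-1.6.19 but running with 1.5.23"
print(libpng_mismatch(line))
```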

The helpful people on the PyMOL user list pointed me to this message in the Homebrew-Science issues.

First uninstall pymol and libpng

brew uninstall pymol
brew uninstall libpng

Then reinstall both, installing pymol first.

brew install pymol
brew install libpng

If you now type pymol in a Terminal window it should start fine.

Updated 12 December 2015