Macs in Chemistry

Insanely great science

 

Scripting Vortex 4

This is the fourth tutorial on scripting Vortex, a chemically intelligent data visualisation package. In the previous tutorials we looked at getting data from OpenBabel, sieve, and cxcalc; in this tutorial we will use MOE as the compute engine.

MOE from Chemical Computing Group (CCG) is probably best known as a graphical user interface to a suite of computational chemistry tools. While this is undoubtedly how many users interact with the program, it is worth finding out about the command-line tools that are available. These tools are often called from pipeline tools such as Knime to allow rapid processing of large files. CCG provides four very useful command-line tools:

dataflow

It is the last of these, sddesc, that we will be using to add descriptors to a Vortex table.

Typing

sddesc -help
in a Terminal window gives a list of the options available.

Usage:

sddesc [options...] [infiles...] [-o outfile]

infile                     name of input file  (- for stdin)
outfile                    name of output file (- for stdout, . for null)

Options:
-help                      prints helpful information
-verbose                   enable information printing
-quiet                     disable information printing
-records       range       process only given range of records
-sdf                       output SD file (default)
-ascii                     output ascii comma separated files with SMILES
-keepfield     field       SD field to transfer to ASCII output file
-comma                     comma/quote separated ASCII output (default)
-tab                       tab separated ASCII output
-calc          code_list   calculate descriptors (comma separated)
-nocalc        skip_list   skip a set of descriptors (comma separated)
-class         class       calculate descriptors in class
-forcefield    filename    use given forcefield file for 3D descriptors

Range Syntax:
range = n                  equal to n
range = n-                 less than or equal to n
range = n+                 greater than or equal to n
range = n,m                n through m (inclusive)

So while the command is designed to work with SD files, it can also generate ASCII output as either comma-delimited or tab-delimited text. After a little experimentation I found that this command gave the desired result:

sddesc -ascii -tab -calc Weight /Users/username/Desktop/ChemicalStructures/acetophenones.sdf

Note that you have to include both “-ascii” and “-tab”. In the above example I’ve only calculated the molecular weight, but MOE can calculate many, many more descriptors. For a full list of the 300+ molecular descriptors, both 2D and 3D, available for calculation in MOE, contact Chemical Computing Group through their website, www.chemcomp.com. Extra, custom descriptors are very straightforward to code up in MOE's Scientific Vector Language platform. It is important to note that if you submit a 2D structure file to the calculations, any 3D descriptors generated will not be meaningful.

When I first tried this command in a Vortex script I got no output and a number of cryptic error messages. I then included the full path to sddesc:

 /Applications/moe2011/bin/sddesc -ascii -tab -calc Weight /Users/username/Desktop/ChemicalStructures/acetophenones.sdf

This still gave no output, and the following error message appeared in the console:

Vortex: /Applications/moe2011/bin/sddesc: line 3: /bin/moebatch: No such file or directory

After generous help from Matt, Dotmatics and CCG I worked out what was wrong. It seems that line 3 in $MOE/bin/sddesc is

$MOE/bin/moebatch -run $0 $*

which will open a MOE/batch session, and “run” $MOE/bin/sddesc as an SVL file, using the arguments that were sent when $MOE/bin/sddesc was launched. The problem is that the program is running in a shell that does not have access to all the environment variables defined in my .bash_profile. Because $MOE is not set there, $MOE/bin/moebatch expands to /bin/moebatch, which is exactly the path in the error message. We can define the environment variables needed by moebatch thus:

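my_env = dict(os.environ)  # work on a copy of the current environment
# Prepend the MOE bin directory to PATH; os.pathsep (':' on a Mac) separates entries
my_env["PATH"] = '/Applications/moe2011/bin' + os.pathsep + my_env.get('PATH', '')
# The sddesc wrapper uses $MOE to locate moebatch
my_env["MOE"] = '/Applications/moe2011'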
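As a minimal sketch, the same setup can be wrapped in a small helper that returns a modified copy of the environment instead of changing os.environ in place. The function name build_moe_env is my own invention (it is not part of Vortex or MOE), and the installation path is simply the one used in this tutorial.

```python
import os

def build_moe_env(moe_root, base_env=None):
    """Return a copy of the environment with MOE and PATH set up.

    moe_root is the MOE installation directory; base_env defaults to
    the current process environment. (Hypothetical helper.)
    """
    if base_env is None:
        base_env = os.environ
    env = dict(base_env)
    # The sddesc wrapper script expands $MOE to find moebatch
    env['MOE'] = moe_root
    bin_dir = os.path.join(moe_root, 'bin')
    old_path = env.get('PATH', '')
    if old_path:
        # os.pathsep keeps the new entry separate from the existing ones
        env['PATH'] = bin_dir + os.pathsep + old_path
    else:
        env['PATH'] = bin_dir
    return env
```

Calling build_moe_env('/Applications/moe2011') gives a dictionary that can be passed as the env argument of subprocess.Popen.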

The command to run sddesc then becomes

p = subprocess.Popen(['/Applications/moe2011/bin/sddesc', '-ascii', '-tab', '-calc', 'Weight,SlogP,mr,TPSA', sdfFile], stdout=subprocess.PIPE, env=my_env)
output = p.communicate()[0]
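Since my first attempts failed silently, it may be worth capturing stderr and checking the exit status as well. The following is only a sketch: run_tool is a hypothetical helper name, and it assumes the external program signals failure with a non-zero exit status, which I have not verified for sddesc.

```python
import subprocess

def run_tool(args, env=None):
    """Run a command and return its stdout as text.

    Raises RuntimeError, including whatever was written to stderr,
    if the command exits with a non-zero status. (Hypothetical helper.)
    """
    p = subprocess.Popen(args,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         env=env,
                         universal_newlines=True)
    out, err = p.communicate()
    if p.returncode != 0:
        raise RuntimeError('%s exited with status %d: %s'
                           % (args[0], p.returncode, err.strip()))
    return out
```

With this in place, a missing environment variable would surface as an exception carrying the shell's error text rather than as an empty result.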

The remainder of the script parses the data, adds columns and headers, and then inserts the data. Again the beauty of this approach is that more descriptors can be added to the list for calculation and they will be automatically added to the Vortex table.
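The parsing step can be tried outside Vortex with a few lines of plain Python. This is a sketch under assumptions: parse_tab_output is a hypothetical helper, and the sample text below is made up for illustration (a header line followed by one tab-separated line per structure), not captured sddesc output.

```python
def parse_tab_output(text):
    """Split tab-delimited text into (header, rows), skipping blank lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return [], []
    header = lines[0].split('\t')
    rows = [ln.split('\t') for ln in lines[1:]]
    return header, rows

# Made-up sample in the assumed layout; real sddesc output may differ
sample = 'SMILES\tWeight\nC\t16.04\nCCO\t46.07\n'
header, rows = parse_tab_output(sample)
```

Each entry in header would become a column name, and each list in rows supplies the values for one table row, so adding a descriptor to the -calc list just adds one more column.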

The Vortex Script

import sys

# Uncomment the following 2 lines if running in console
#vortex = console.vortex
#vtable = console.vtable

sys.path.append(vortex.getVortexFolder() + '/modules/jythonlib')

import subprocess
import os

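my_env = dict(os.environ)  # work on a copy of the current environment
# Prepend the MOE bin directory to PATH; os.pathsep (':' on a Mac) separates entries
my_env["PATH"] = '/Applications/moe2011/bin' + os.pathsep + my_env.get('PATH', '')
# The sddesc wrapper uses $MOE to locate moebatch
my_env["MOE"] = '/Applications/moe2011'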



# Get the path to the currently open sdf file
sdfFile = vortex.getFileForPropertyCalculation(vtable)

# Run sddesc on the file
#  /Applications/moe2011/bin/sddesc -ascii -tab -calc Weight /Users/swain/Desktop/ChemicalStructures/acetophenones.sdf 
p = subprocess.Popen(['/Applications/moe2011/bin/sddesc', '-ascii', '-tab', '-calc', 'Weight,SlogP,mr,TPSA', sdfFile], stdout=subprocess.PIPE, env=my_env)
output = p.communicate()[0]

# Create new columns in table if needed
# The first line of the output holds the column names; blank lines are dropped
lines = [l for l in output.split('\n') if l.strip()]
colName = lines[0].split('\t')
for c in colName:
    column = vtable.findColumnWithName(c, 1)
    vtable.fireTableStructureChanged()

# Parse the output: one line per structure, in the same order as the table rows
rows = lines[1:]
for r in range(0, min(len(rows), vtable.getRealRowCount())):
    vals = rows[r].split('\t')
    # Guard against a line with more fields than there are column names
    for j in range(0, min(len(vals), len(colName))):
        column = vtable.findColumnWithName(colName[j], 0)
        column.setValueFromString(r, vals[j])

The script can be downloaded here: CCGcolumns.vpy.zip

The Vortex Scripts

Scripting Vortex Using OpenBabel
Scripting Vortex 2 Using filter-it
Scripting Vortex 3 Using cxcalc
Scripting Vortex 4 Using MOE
Scripting Vortex 5 Calculating similarities using OpenBabel
Scripting Vortex 6 Filtering compounds
Scripting Vortex 7 Using MayaChemTools
Scripting Vortex 8 Molecular Shape matching
Scripting Vortex 9 Getting a 2D depiction
Scripting Vortex 10 Interacting with the user
Scripting Vortex 11 Interacting with a web service
Scripting Vortex 12 JSON import
Scripting Vortex 13 Using OpenBabel fastsearch
Other Hints, Tips and Tutorials

Last Updated 7 Feb 2012