Scripting Vortex 3
ChemAxon's Calculator (cxcalc) is a really useful command line program, included in Marvin Beans and JChem, that performs chemical calculations using calculator plugins. ChemAxon provides many calculations (e.g. charge, pKa, logP, logD), and others can be added by writing custom plugins. Perhaps one of the most useful is the ability to calculate the acidic and basic pKa. Calculating pKa is essential to get a reasonable handle on the LogD of a molecule. LogD is probably the most critical physicochemical property in drug discovery: it has a major influence on absorption, cell penetration, metabolism, CYP450 inhibition and induction, PGP transporter activity and activity at the hERG channel, and it is often a critical component of any structure-activity relationship.
Calculator performs plugin calculations in a uniform way: it processes general parameters referring to input, output, and the SDF file tag names used to store calculation results, as well as plugin-specific parameters that differ for each plugin.

General Options

cxcalc -h, --help                 this help message, list of available calculations
cxcalc <plugin> -h, --help        plugin specific help message
  -o, --output <filepath>         output file path (default: stdout)
  -t, --tag                       name of the SDFile tag to store the calculation results
                                  default tag name: see plugin help
  -i, --id <tag name|format>      SDFile tag that stores the molecule ID
                                  if no such tag exists in the input molecule then
                                  molecule ID is the molecule itself converted to the
                                  specified format (default: ID = molecule index)
  -N, --do-not-display <i|h|ih>   do not display molecule ID and/or table header
                                  (in table output form):
                                  i  - no molecule ID
                                  h  - no table header
                                  ih - neither molecule ID nor table header
  -S, --sdf-output                SDF output with results in SDF tags
  -M, --mrv-output                result molecule output in MRV format
                                  (if neither -S nor -M is specified then
                                  plugin results are written in table form)
  -g, --ignore-error              continue with next molecule on error
  -v, --verbose                   print calculation warnings to the console
The general format from the command line to calculate the pKa is:
'/Applications/ChemAxon/MarvinBeans/bin/cxcalc' /Users/username/Desktop/temp.sdf pka -b 1 -a 1
Here -a 1 and -b 1 limit the output to the first acidic and the first basic ionisation respectively. The tab-delimited text output looks like this:

id apKa1 bpKa1 atoms
1 3.61 4.97 5,2
2 12.07 -4.54 1,5
3 16.30 9.24 3,9
4 10.96 6.43 12,2
5 8.83 9
Here apKa1 is the most acidic pKa, bpKa1 is the most basic pKa, and atoms lists the atom numbers that are ionised. The Vortex script is shown below. The first part gets the path to the SDF file as before and constructs the cxcalc command. The output is then parsed, using \n to separate the lines and \t to separate the values on each line. The first line contains the column names, which are used to create the Vortex columns; the remaining lines contain the data, which is used to populate the table.
import sys

# Uncomment the following 2 lines if running in console
#vortex = console.vortex
#vtable = console.vtable

sys.path.append(vortex.getVortexFolder() + '/modules/jythonlib')

import subprocess

# Get the path to the currently open sdf file
sdfFile = vortex.getFileForPropertyCalculation(vtable)

# Run cxcalc on the file
# '/Applications/ChemAxon/MarvinBeans/bin/cxcalc' /Users/username/Desktop/temp.sdf pka -b 1 -a 1
p = subprocess.Popen(['/Applications/ChemAxon/MarvinBeans/bin/cxcalc', sdfFile, 'pka', '-b', '1', '-a', '1'], stdout=subprocess.PIPE)
# communicate() returns (stdout, stderr); we only need stdout
output = p.communicate()[0]

# Create new columns in table if needed
lines = output.split('\n')
# The first line of the output holds the column names
colName = lines[0].split('\t')
for c in colName:
    column = vtable.findColumnWithName(c, 1)
vtable.fireTableStructureChanged()

keys = []
for i in lines:
    words = i.split('\t')
    if len(words) == 2:
        keys.append(words)

# Parse the output: every line after the header is one molecule
rows = lines[1:len(lines)]
for r in range(0, vtable.getRealRowCount()):
    vals = rows[r].split('\t')
    for j in range(0, len(vals)):
        column = vtable.findColumnWithName(colName[j], 0)
        column.setValueFromString(r, vals[j])
The result can be seen in the image below.
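The parsing step can also be tried outside Vortex. The following is only a minimal sketch: parse_cxcalc_table is a helper name I have made up for illustration, and the sample text is typed in by hand from the first rows of the output shown earlier rather than produced by running cxcalc.

```python
def parse_cxcalc_table(text):
    """Split tab-delimited cxcalc table output into (column_names, rows)."""
    # Skip blank lines, e.g. the empty string left after a trailing newline
    lines = [ln.rstrip('\r') for ln in text.split('\n') if ln.strip()]
    if not lines:
        return [], []
    # The first line holds the column names, the rest hold one molecule each
    header = lines[0].split('\t')
    rows = [ln.split('\t') for ln in lines[1:]]
    return header, rows


# Hand-typed sample in the same layout as the pka output above
sample = (
    "id\tapKa1\tbpKa1\tatoms\n"
    "1\t3.61\t4.97\t5,2\n"
    "2\t12.07\t-4.54\t1,5\n"
)

header, rows = parse_cxcalc_table(sample)
```

Keeping the parsing in a small function like this makes it easy to check the column handling on a saved output file before wiring it into the Vortex table.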
cxcalc can also calculate other properties such as logP, logD, mass, acceptorcount, donorcount, polarsurfacearea and rotatablebondcount, and the script can be modified to calculate all of them. Simply changing the part of the script that calls cxcalc, as shown below, calculates a new set of properties. Because the output follows a standard format, the rest of the script, which parses the output to generate the column headings and populate the data fields, does not need to be altered.

# Run cxcalc on the file
# '/Applications/ChemAxon/MarvinBeans/bin/cxcalc' /Users/swain/Desktop/temp.sdf logp logd -H 7.4
p = subprocess.Popen(['/Applications/ChemAxon/MarvinBeans/bin/cxcalc', sdfFile, 'logp', 'logd', '-H', '7.4', 'mass', 'acceptorcount', 'donorcount', 'polarsurfacearea', 'rotatablebondcount'], stdout=subprocess.PIPE)
# communicate() returns (stdout, stderr); we only need stdout
output = p.communicate()[0]
The results can be seen in the table below.
One of the really nice benefits of command line tools that give their results in a consistent format is that adding further properties becomes trivial: simply add them to the command below, and the output should be parsed and the additional columns added to Vortex without the need to modify the rest of the script.
p = subprocess.Popen(['/Applications/ChemAxon/MarvinBeans/bin/cxcalc', sdfFile, 'logp', 'logd', '-H', '7.4', 'mass', 'acceptorcount', 'donorcount', 'polarsurfacearea', 'rotatablebondcount'], stdout=subprocess.PIPE)
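If the list of properties changes often, it may be easier to build the argument list from a small table than to edit the Popen call by hand. This is just a sketch: build_cxcalc_command is a hypothetical helper of my own, not part of cxcalc or Vortex, and the SDF path is the placeholder used earlier on this page.

```python
def build_cxcalc_command(cxcalc_path, sdf_path, calculations):
    """Return the argument list for subprocess.Popen.

    calculations is a list of (plugin_name, option_list) pairs; the options
    are placed directly after the plugin they belong to.
    """
    command = [cxcalc_path, sdf_path]
    for name, options in calculations:
        command.append(name)
        command.extend(options)
    return command


# Same set of calculations as the command above
calculations = [
    ('logp', []),
    ('logd', ['-H', '7.4']),
    ('mass', []),
    ('acceptorcount', []),
    ('donorcount', []),
    ('polarsurfacearea', []),
    ('rotatablebondcount', []),
]

command = build_cxcalc_command(
    '/Applications/ChemAxon/MarvinBeans/bin/cxcalc',
    '/Users/username/Desktop/temp.sdf',  # placeholder path
    calculations,
)
```

The resulting list can be passed straight to subprocess.Popen in place of the hand-written one, so adding a property means adding one line to the calculations list.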
I often need to simply classify molecules as acid, base, neutral or zwitterion, so I’ve updated the script to create another column containing a text annotation. First we check whether each pKa exists, then score it based on the value of the calculated acidic and basic pKa. We then annotate the molecule based on the resulting scores.

# Calculate abnz
colapka = vtable.findColumnWithName('apKa1', 0)
colbpka = vtable.findColumnWithName('bpKa1', 0)
# Find (or create) the annotation column once, before the loop
column = vtable.findColumnWithName('ABNZ', 1)
rows = vtable.getRealRowCount()
for r in range(0, int(rows)):
    # Start from 0 so a missing pKa, or one exactly on the cutoff, scores 0
    aScore = 0
    bScore = 0
    # Score 1 if an acidic pKa exists and is below 7.0
    if colapka.isDefined(r) and colapka.getValue(r) < 7.0:
        aScore = 1
    # Score 1 if a basic pKa exists and is above 7.5
    if colbpka.isDefined(r) and colbpka.getValue(r) > 7.5:
        bScore = 1
    if aScore == 1 and bScore == 1:
        TheScore = 'Zwitterion'
    elif aScore == 1 and bScore == 0:
        TheScore = 'Acid'
    elif aScore == 0 and bScore == 1:
        TheScore = 'Base'
    else:
        TheScore = 'Neutral'
    column.setValueFromString(r, TheScore)
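The classification rule itself does not depend on Vortex, so it can be written as a plain function and checked on a few values. This is a minimal sketch using the same cutoffs as the script (acidic pKa below 7.0, basic pKa above 7.5); the name classify_abnz and the use of None for a missing pKa are my own assumptions for illustration.

```python
def classify_abnz(apka, bpka, acid_cutoff=7.0, base_cutoff=7.5):
    """Classify a molecule from its most acidic and most basic pKa.

    Pass None for a pKa that could not be calculated.
    """
    is_acid = apka is not None and apka < acid_cutoff
    is_base = bpka is not None and bpka > base_cutoff
    if is_acid and is_base:
        return 'Zwitterion'
    if is_acid:
        return 'Acid'
    if is_base:
        return 'Base'
    return 'Neutral'


# The first three pairs come from the example pka output earlier on the page;
# the last pair is made up to show the zwitterion case.
labels = [
    classify_abnz(3.61, 4.97),
    classify_abnz(12.07, -4.54),
    classify_abnz(16.30, 9.24),
    classify_abnz(4.0, 9.5),
]
```

The cutoffs are keyword arguments so that different thresholds can be tried without touching the logic.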
The scripts can be downloaded from here
The original pKa calculation chemaxon_pka.vpy.zip
Updated script to include acid/base/neutral/zwitterion annotation chemaxon_pka2.vpy.zip
Script for Log P etc. chemaxonlogPlogD_etc.vpy.zip
A reader has contributed a Windows version
Chemists at Cancer Research Technology use this both for SDF files and for Vortex workspaces created from Browser. I can’t test it, but it looks OK; you may need to check the path addressing.
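One way to make the path less fragile is to try a short list of likely locations and use the first one that exists. This is only a sketch: find_cxcalc is a hypothetical helper, and the Windows path below is my guess at a typical install location, not something I have been able to check, so adjust it to match your machine.

```python
import os


def find_cxcalc(candidates):
    """Return the first candidate path that is an existing file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


candidates = [
    # Mac OS X location used in the scripts on this page
    '/Applications/ChemAxon/MarvinBeans/bin/cxcalc',
    # ASSUMPTION: a typical Windows location, not verified
    'C:\\Program Files\\ChemAxon\\MarvinBeans\\bin\\cxcalc.bat',
]

cxcalc_path = find_cxcalc(candidates)
if cxcalc_path is None:
    print('cxcalc not found, please edit the candidates list')
```

The returned path can then be used as the first element of the list passed to subprocess.Popen.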
The Vortex Scripts
Scripting Vortex Using OpenBabel
Scripting Vortex 2 Using filter-it
Scripting Vortex 3 Using cxcalc
Scripting Vortex 4 Using MOE
Scripting Vortex 5 Calculating similarities using OpenBabel
Scripting Vortex 6 Filtering compounds
Scripting Vortex 7 Using MayaChemTools
Scripting Vortex 8 Molecular Shape matching
Scripting Vortex 9 Getting a 2D depiction
Scripting Vortex 10 Interacting with the user
Scripting Vortex 11 Interacting with a web service
Scripting Vortex 12 JSON import
Scripting Vortex 13 Using OpenBabel fastsearch
Other Hints, Tips and Tutorials
Updated 15 March 2012