Macs in Chemistry

Insanely great science

 

Scripting Vortex 3

ChemAxon's Calculator (cxcalc) is a really useful command line program in Marvin Beans and JChem that performs chemical calculations using calculator plugins. ChemAxon provides a lot of calculations (e.g. charge, pKa, logP, logD), and others can be added by writing custom plugins. Perhaps one of the most useful is the ability to calculate the acidic and basic pKa. Calculation of pKa is essential to get a reasonable hold on the LogD of a molecule. LogD is probably the most critical physicochemical property in drug discovery: it has a major influence on absorption, cell penetration, metabolism, CYP450 inhibition and induction, P-gp transporter activity and activity at the hERG channel, and is often a critical component of any structure-activity relationship.
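To illustrate why pKa matters for LogD, here is a minimal sketch of the textbook relationship for a molecule with a single ionisable group, assuming that only the neutral form partitions into the organic phase. This is not how cxcalc calculates logD (its model is more elaborate), and the function names and example values are hypothetical.

```python
import math

def logd_monoprotic_acid(logp, pka, ph=7.4):
    """Textbook logD of a monoprotic acid, assuming only the neutral form partitions."""
    return logp - math.log10(1.0 + 10.0 ** (ph - pka))

def logd_monoprotic_base(logp, pka, ph=7.4):
    """Textbook logD of a monoprotic base; pka is that of the conjugate acid."""
    return logp - math.log10(1.0 + 10.0 ** (pka - ph))

# Hypothetical acid: logP 2.0, pKa 4.4, so it is almost fully ionised at pH 7.4
acid_logd = logd_monoprotic_acid(2.0, 4.4)

# Hypothetical base: logP 2.0, pKa 9.4
base_logd = logd_monoprotic_base(2.0, 9.4)
```

In this simple model each pH unit between the pKa and the pH of interest, on the ionised side, costs roughly one log unit of LogD, which is why a poor pKa estimate feeds straight through into the LogD.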

Calculator performs plugin calculations in a uniform way: it processes general parameters referring to input, output, and SDF file tag names for storing calculation results, as well as plugin-specific parameters that are different for each plugin.

General Options

cxcalc -h, --help              this help message, 
                             list of available calculations
cxcalc <plugin> -h, --help     plugin specific help message
-o, --output <filepath>        output file path (default: stdout)
-t, --tag                      name of the SDFile tag to store the
                             calculation results     
                             default tag name: see plugin help  
-i, --id <tag name|format>     SDFile tag that stores the molecule ID
                             if no such tag exists in the input molecule
                             then molecule ID is the molecule itself
                             converted to the specified format
                             (default: ID = molecule index)
-N, --do-not-display <i|h|ih>  do not display molecule ID and/or
                             table header (in table output form):
                             i  - no molecule ID
                             h  - no table header
                             ih - neither molecule ID nor table header
-S, --sdf-output               SDF output with results in SDF tags
-M, --mrv-output               result molecule output in MRV format
                             (if neither -S nor -M is specified then
                             plugin results are written in table form)
-g, --ignore-error             continue with next molecule on error
-v, --verbose                  print calculation warnings to the console
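As a rough sketch of how these general options might be combined when calling cxcalc from a script, the helper below assembles an argument list for subprocess. The function name, its parameters and the output path are hypothetical, and placing the general options before the input file is an assumption about the expected argument order; only flags from the help text above are used.

```python
def build_cxcalc_args(cxcalc, input_file, plugin_args,
                      output_file=None, sdf_output=False, ignore_errors=False):
    """Assemble a cxcalc argument list using the general options listed above."""
    args = [cxcalc]
    if sdf_output:
        args.append('-S')               # SDF output with results in SDF tags
    if ignore_errors:
        args.append('-g')               # continue with next molecule on error
    if output_file is not None:
        args.extend(['-o', output_file])  # write to a file instead of stdout
    args.append(input_file)
    args.extend(plugin_args)
    return args

args = build_cxcalc_args(
    '/Applications/ChemAxon/MarvinBeans/bin/cxcalc',
    '/Users/username/Desktop/temp.sdf',
    ['pka'],
    output_file='/Users/username/Desktop/temp_pka.sdf',  # hypothetical output path
    sdf_output=True,
    ignore_errors=True,
)
```

The resulting list can be passed straight to subprocess.Popen, which avoids having to quote paths that contain spaces.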

The general format from the command line to calculate the pKa is:

'/Applications/ChemAxon/MarvinBeans/bin/cxcalc' /Users/username/Desktop/temp.sdf pka -b 1 -a 1

Here -a 1 and -b 1 request the first (strongest) acidic and basic pKa values respectively. The tab-delimited text output looks like this:

id  apKa1   bpKa1   atoms
1   3.61    4.97    5,2
2   12.07   -4.54   1,5
3   16.30   9.24    3,9
4   10.96   6.43    12,2
5   8.83        9

Here apKa1 is the most acidic pKa, bpKa1 is the most basic pKa, and atoms lists the numbers of the atoms that are ionised. The Vortex script is shown below. The first part gets the path to the SDF file as before and constructs the cxcalc command. The output is then parsed, using \n to separate each line and \t to separate each value on each line. The first line contains the column names, which are used to create the Vortex columns; the remaining lines contain the data used to populate the table.

import sys

# Uncomment the following 2 lines if running in console
#vortex = console.vortex
#vtable = console.vtable

sys.path.append(vortex.getVortexFolder() + '/modules/jythonlib')

import subprocess

# Get the path to the currently open sdf file
sdfFile = vortex.getFileForPropertyCalculation(vtable)

# Run cxcalc on the file
# '/Applications/ChemAxon/MarvinBeans/bin/cxcalc' /Users/username/Desktop/temp.sdf pka -b 1 -a 1
p = subprocess.Popen(['/Applications/ChemAxon/MarvinBeans/bin/cxcalc', sdfFile, 'pka', '-b', '1', '-a', '1'], stdout=subprocess.PIPE)
output = p.communicate()[0]

# Create new columns in table if needed
lines = output.split('\n')
colName = lines[0].split('\t')
for c in colName:
    # The second argument (1) creates the column if it does not already exist
    column = vtable.findColumnWithName(c, 1)
vtable.fireTableStructureChanged()

keys = []
for i in lines:
    words = i.split('\t')
    if len(words) == 2:
        keys.append(words[0])

# Parse the output
rows = lines[1:]
for r in range(0, vtable.getRealRowCount()):
    vals = rows[r].split('\t')
    for j in range(0, len(vals)):
        column = vtable.findColumnWithName(colName[j], 0)
        column.setValueFromString(r, vals[j])

The result can be seen in the image below.

cxcalc_pka
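The parsing step can also be tried outside Vortex. The sketch below applies the same split-on-newline, split-on-tab approach to the sample output shown earlier and returns one dictionary per molecule. The function name is hypothetical, and the sample assumes that a missing value (the basic pKa of molecule 5) appears as an empty field between two tabs.

```python
# Sample cxcalc output from above, written out with explicit tabs
SAMPLE_OUTPUT = (
    "id\tapKa1\tbpKa1\tatoms\n"
    "1\t3.61\t4.97\t5,2\n"
    "2\t12.07\t-4.54\t1,5\n"
    "3\t16.30\t9.24\t3,9\n"
    "4\t10.96\t6.43\t12,2\n"
    "5\t8.83\t\t9\n"
)

def parse_cxcalc_table(output):
    """Return (column_names, records); each record is a dict keyed by column name."""
    # Drop blank lines, such as the empty string left after the final newline
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return [], []
    col_names = lines[0].split('\t')
    records = []
    for line in lines[1:]:
        values = line.split('\t')
        # Pad short rows so every column has a value ('' means not calculated)
        values += [''] * (len(col_names) - len(values))
        records.append(dict(zip(col_names, values)))
    return col_names, records

col_names, records = parse_cxcalc_table(SAMPLE_OUTPUT)
```

Keeping the id with each record makes it possible to check that a result really belongs to the row it is written to, which matters if cxcalc skips a molecule.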

Cxcalc can also calculate other properties such as logP, logD, mass, acceptorcount, donorcount, polarsurfacearea and rotatablebondcount, and the script can be modified to calculate all of these. Simply changing the part of the script that calls cxcalc, as shown below, calculates a new set of properties. Because the output follows a standard format, the rest of the script, which parses the output to generate the column headings and populate the data fields, does not need to be altered.

# Run cxcalc on the file
# '/Applications/ChemAxon/MarvinBeans/bin/cxcalc' /Users/username/Desktop/temp.sdf logp logd -H 7.4 mass acceptorcount donorcount polarsurfacearea rotatablebondcount
p = subprocess.Popen(['/Applications/ChemAxon/MarvinBeans/bin/cxcalc', sdfFile, 'logp', 'logd', '-H', '7.4', 'mass', 'acceptorcount', 'donorcount', 'polarsurfacearea', 'rotatablebondcount'], stdout=subprocess.PIPE)
output = p.communicate()[0]

The results can be seen in the table below.

cxcalc_all

One of the really nice benefits of having command line tools that give their results in a consistent format is that it becomes trivial to add additional properties. Simply add them to the command below; the output should be parsed and additional columns added to Vortex without the need to modify the rest of the script.

p = subprocess.Popen(['/Applications/ChemAxon/MarvinBeans/bin/cxcalc', sdfFile, 'logp', 'logd', '-H', '7.4', 'mass', 'acceptorcount', 'donorcount', 'polarsurfacearea', 'rotatablebondcount'], stdout=subprocess.PIPE)
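One way to make adding properties even easier is to keep the calculations in a list and build the command from it. This is only a sketch: the names CALCULATIONS and cxcalc_args are hypothetical, and each entry is simply a plugin name followed by any options it takes.

```python
CXCALC = '/Applications/ChemAxon/MarvinBeans/bin/cxcalc'

# One entry per calculation: the plugin name followed by its own options
CALCULATIONS = [
    ['logp'],
    ['logd', '-H', '7.4'],
    ['mass'],
    ['acceptorcount'],
    ['donorcount'],
    ['polarsurfacearea'],
    ['rotatablebondcount'],
]

def cxcalc_args(sdf_file, calculations):
    """Flatten the list of calculations into a single cxcalc argument list."""
    args = [CXCALC, sdf_file]
    for calc in calculations:
        args.extend(calc)
    return args

# Reproduces the argument list used above for a given input file
args = cxcalc_args('/Users/username/Desktop/temp.sdf', CALCULATIONS)
```

Adding a property then means adding one line to CALCULATIONS, and the argument list can be passed to subprocess.Popen exactly as before.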

Updated Script

I often need to simply classify molecules as acid, base, neutral or zwitterion, so I’ve updated the script to create another column containing a text annotation. First we check whether each pKa exists, then score it based on its value: a calculated acidic pKa below 7.0 scores as an acid, and a calculated basic pKa above 7.5 scores as a base. We then annotate each row based on the resulting scores.

# Calculate ABNZ (acid/base/neutral/zwitterion) annotation

colapka = vtable.findColumnWithName('apKa1', 0)
colbpka = vtable.findColumnWithName('bpKa1', 0)
colabnz = vtable.findColumnWithName('ABNZ', 1)
rows = vtable.getRealRowCount()
for r in range(0, int(rows)):
    # A missing pKa, or one on the wrong side of the cutoff, scores 0.
    # Resetting both scores on every row also covers values exactly at a cutoff.
    aScore = 0
    bScore = 0
    if colapka.isDefined(r) and colapka.getValue(r) < 7.0:
        aScore = 1
    if colbpka.isDefined(r) and colbpka.getValue(r) > 7.5:
        bScore = 1
    if aScore == 1 and bScore == 1:
        TheScore = 'Zwitterion'
    elif aScore == 1:
        TheScore = 'Acid'
    elif bScore == 1:
        TheScore = 'Base'
    else:
        TheScore = 'Neutral'
    colabnz.setValueFromString(r, TheScore)

vortex3
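The classification rule can be checked on its own, away from the Vortex table. Below is a minimal sketch using the same cutoffs as the script (acidic pKa below 7.0, basic pKa above 7.5); the function and constant names are hypothetical, and None stands for a pKa that was not calculated.

```python
ACID_CUTOFF = 7.0   # an acidic pKa below this counts as an acid
BASE_CUTOFF = 7.5   # a basic pKa above this counts as a base

def classify_abnz(apka, bpka):
    """Classify a molecule from its strongest acidic and basic pKa (None = missing)."""
    is_acid = apka is not None and apka < ACID_CUTOFF
    is_base = bpka is not None and bpka > BASE_CUTOFF
    if is_acid and is_base:
        return 'Zwitterion'
    if is_acid:
        return 'Acid'
    if is_base:
        return 'Base'
    return 'Neutral'

# The five molecules from the sample pKa output shown earlier
sample = [(3.61, 4.97), (12.07, -4.54), (16.30, 9.24), (10.96, 6.43), (8.83, None)]
labels = [classify_abnz(apka, bpka) for apka, bpka in sample]
```

Keeping the cutoffs as named constants makes it easy to adjust them if a different definition of acid or base is preferred.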

The scripts can be downloaded from here

The original pKa calculation chemaxon_pka.vpy.zip

Updated script to include acid/base/neutral/zwitterion annotation chemaxon_pka2.vpy.zip

Script for Log P etc. chemaxonlogPlogD_etc.vpy.zip

A reader has contributed a Windows version

MedChemchemaxonpka_etc.vpy

Chemists at Cancer Research Technology use this for both SDF files and Vortex workspaces created from Browser. I can’t test it, but it looks OK; you may need to check the path addressing.

The Vortex Scripts

Scripting Vortex Using OpenBabel
Scripting Vortex 2 Using filter-it
Scripting Vortex 3 Using cxcalc
Scripting Vortex 4 Using MOE
Scripting Vortex 5 Calculating similarities using OpenBabel
Scripting Vortex 6 Filtering compounds
Scripting Vortex 7 Using MayaChemTools
Scripting Vortex 8 Molecular Shape matching
Scripting Vortex 9 Getting a 2D depiction
Scripting Vortex 10 Interacting with the user
Scripting Vortex 11 Interacting with a web service
Scripting Vortex 12 JSON import
Scripting Vortex 13 Using OpenBabel fastsearch
Other Hints, Tips and Tutorials

Updated 15 March 2012