Applescript Tutorial 3
Reading, Writing and using Lists
The following AppleScript uses ChemDraw to calculate a variety of molecular properties and then stores them as individual values. These can then be used, as demonstrated rather trivially by the display dialog command. The script can be downloaded here.
tell application "CS ChemDraw Ultra"
    set the_SMILES to SMILES of selection
    set Elem_Anal to Elemental Analysis of selection
    set Exact_mass to Exact Mass of selection
    set Mol_Form to Molecular Formula of selection
    set Mol_weight to Molecular Weight of selection
    -- build one block of text, one property per line
    set Chem_props to "SMILES " & the_SMILES & return & "Chem Analysis " & Elem_Anal & return & "Molecular Formula " & Mol_Form & return & "Exact Mass " & Exact_mass & return & "Molecular Weight " & Mol_weight
    display dialog Chem_props
end tell
This is fine if all you have to do is calculate the properties for a single molecule, but what if you want to perform the calculation for a list of structures such as the one below?
c1ccccc1 benzene
Ic1ccccc1 iodobenzene
O=C1CCCCC1 cyclohexanone
NC1CCCCC1 cyclohexamine
CN(C)c1cccnc1 3-dimethylaminopyridine
N1(c2ccccc2)CCNCC1 phenylpiperazine
You can download the file here: temp_mac.txt (control-click on the link and choose "Save link ...."). What we need to do now is have the user choose a file, read the contents and then store the data in a list. A list is just a group of values placed between braces {}, for example {1,2,3} or {1,"b","hello",{1,3,5}}. As you can see, you can mix types and even have a list within a list. So in the script below we first define the list we will read the molecules into, then get the user to choose a file, and finally read the contents of the file into theData.
set mol_list to {}
set theData to ""
set theFile to (choose file with prompt "Select the file:" of type {"TEXT"}) as alias
open for access theFile
-- each return-terminated line becomes one item of the list
set theData to read theFile using delimiter return
close access theFile
If you copy and paste the above text into Script Editor, compile it, select "Event Log" and click "Run", you can choose the temp_mac.txt file and you should see a result like the one shown below. Each line is read as a value into the list:-
{"c1ccccc1 benzene", "Ic1ccccc1 iodobenzene", "O=C1CCCCC1 cyclohexanone", "NC1CCCCC1 cyclohexamine", "CN(C)c1cccnc1 3-dimethylaminopyridine", "N1(c2ccccc2)CCNCC1 phenylpiperazine"}
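Before going further it may help to try out lists on their own. The following is only a minimal sketch, not part of the tutorial script; the variable names (example_list, item_count and so on) are made up for illustration. It reuses the mixed list from the text above and shows counting, selecting an item, reaching into a nested list and adding a value to the end.

```applescript
-- A small list that mixes types and contains a nested list
set example_list to {1, "b", "hello", {1, 3, 5}}

-- Count the items; the nested list counts as a single item
set item_count to count of example_list

-- Items are numbered from 1, so this picks out "hello"
set third_item to item 3 of example_list

-- Reach inside the nested list (second value of the fourth item)
set nested_value to item 2 of item 4 of example_list

-- Add a new value to the end of the list
copy "new value" to the end of example_list

display dialog "Items: " & item_count & return & "Third item: " & third_item & return & "Nested value: " & nested_value
```

The same "copy ... to the end of" idiom is what the rest of the script uses to build up its lists one value at a time.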
Having read the file we will of course want to write out the results at some point, so this seems a good time to think about the file we will be saving to. We want to save the results in the same folder as the file we read in, and we do this with the help of a simple sub-routine. We pass "theFile" to the sub-routine, which returns the folder in which it resides. It is then a simple task to append the output file name.
set the_file_path to GetParentPath(theFile)
set theSaveFile to the_file_path & "test2.smi"

on GetParentPath(theFile)
    tell application "Finder" to return container of theFile as text
end GetParentPath
Now that we have all the data in a list we can begin to manipulate it. First we need to get the SMILES strings. At the moment the first item in the list is "c1ccccc1 benzene", so we need to separate the two terms. We change the text item delimiter to tab, then a simple repeat loop takes each item in theData, splits it at the tab and copies the result to the end of a new list called "mol_list". Remember to change the delimiter back!
set text item delimiters to tab
repeat with i from 1 to count of theData
    set theLine to text items of item i of theData
    copy theLine to the end of mol_list
end repeat
set text item delimiters to ""
The result is a list of lists:
{{"c1ccccc1", "benzene"}, {"Ic1ccccc1", "iodobenzene"}, {"O=C1CCCCC1", "cyclohexanone"}, {"NC1CCCCC1", "cyclohexamine"}, {"CN(C)c1cccnc1", "3-dimethylaminopyridine"}, {"N1(c2ccccc2)CCNCC1", "phenylpiperazine"}}
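To see the splitting step in isolation, here is a small self-contained sketch that needs no input file. The sample line is built by hand and the variable names (sample_line, old_delims, the_parts) are illustrative assumptions rather than names from the tutorial script. It also saves and restores the previous delimiter, which is a slightly safer habit than resetting it to "".

```applescript
-- Build one tab-separated line by hand, in the same shape as the input file
set sample_line to "c1ccccc1" & tab & "benzene"

-- Remember the current delimiter so it can be restored afterwards
set old_delims to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab

-- Split the line at each tab; the result is a list of two strings
set the_parts to text items of sample_line

-- Always put the delimiter back
set AppleScript's text item delimiters to old_delims

set the_SMILES to item 1 of the_parts
set the_name to item 2 of the_parts
display dialog "SMILES: " & the_SMILES & return & "Name: " & the_name
```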
We can now select both the "SMILES" and "name" of each item of "mol_list" and use ChemDraw to calculate the properties. The lines below sit inside a repeat loop over "mol_list", with i as the loop counter.
set the_compound to item i of mol_list
set the_SMILES to item 1 of the_compound
set the_name to item 2 of the_compound
--display dialog the_SMILES
--display dialog the_name
set the clipboard to the_SMILES
However, getting ChemDraw to create the chemical structure from the SMILES string is not straightforward, because there is no "Paste SMILES" command in the AppleScript dictionary. So we script the menus to paste the SMILES. The rest of the ChemDraw commands you have seen before. We then combine all the different data items for a single compound into a list, "mol_props_list", and add that list to the end of "all_mols_list".
tell application "CS ChemDraw Ultra"
    activate
    if enabled of menu item "Paste" then do menu item "SMILES" of menu "Paste Special" of menu "Edit"
    set the_CD_SMILES to SMILES of selection
    set Elem_Anal to Elemental Analysis of selection
    set Exact_mass to Exact Mass of selection
    set Mol_Form to Molecular Formula of selection
    set Mol_weight to Molecular Weight of selection
    copy the_SMILES to the end of mol_props_list
    copy the_name to the end of mol_props_list
    copy the_CD_SMILES to the end of mol_props_list
    copy Elem_Anal to the end of mol_props_list
    copy Exact_mass to the end of mol_props_list
    copy Mol_Form to the end of mol_props_list
    copy Mol_weight to the end of mol_props_list
    if enabled of menu item "Paste" then do menu item "Clear" of menu "Edit"
    --display dialog (item 3 of mol_props_list)
end tell
copy mol_props_list to the end of all_mols_list
It only remains to convert the list to tab delimited text and then save the result. The repeat loop does the conversion and the sub-routine adds each line to the file. It is probably worth mentioning that having regularly used snippets of code as sub-routines certainly helps the cut and paste school of programming!
repeat with i from 1 to num_compounds
    set mol_list to item i of all_mols_list
    -- convert list to text
    set old_delim to AppleScript's text item delimiters
    set AppleScript's text item delimiters to tab
    set mol_list to mol_list as text
    --set mol_list to mol_list & "\n" needs UNIX line endings
    -- end the line with a return (Mac-style line ending)
    set mol_list to mol_list & return
    set AppleScript's text item delimiters to old_delim
    my write_to_file(mol_list, theSaveFile, true)
end repeat

on write_to_file(this_data, target_file, append_data)
    try
        set the target_file to the target_file as text
        set the open_target_file to ¬
            open for access file target_file with write permission
        -- when not appending, empty the file first
        if append_data is false then ¬
            set eof of the open_target_file to 0
        write this_data to the open_target_file starting at eof
        close access the open_target_file
        return true
    on error
        -- make sure the file is not left open if something went wrong
        try
            close access file target_file
        end try
        return false
    end try
end write_to_file
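The list-to-text conversion inside the loop can also be tried on its own, without ChemDraw or any file. The sketch below is illustrative only: sample_props and its values are typed in by hand to stand in for one compound's entry, and nothing is written to disk.

```applescript
-- A hand-made stand-in for one compound's list of values
set sample_props to {"c1ccccc1", "benzene", "C6H6"}

-- Coercing a list to text joins the items with the current delimiter
set old_delim to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab
set sample_line to sample_props as text
set AppleScript's text item delimiters to old_delim

-- End the line with a return, as the main loop does
set sample_line to sample_line & return

display dialog sample_line
```

Because the delimiter is restored straight after the coercion, later string handling in the same script is unaffected.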
The complete script is available here ChemPropsMac.scpt.
UNIX rears its head again
The problem is that SMILES often arrive as UNIX files, and there are two different line-ending conventions in Mac OS X: Mac-style (lines end with a return: "\r" or ASCII character 13) and Unix-style (lines end with a line feed: "\n" or ASCII character 10). So if we try to read a Unix file (available here: temp_unix.txt) we have a problem: the script looks for returns, finds none, and the entire text is read in as a single value.
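The effect is easy to reproduce without a file. The sketch below is a stand-in under stated assumptions: it builds a two-line Unix-style string by hand and splits it with text item delimiters instead of reading temp_unix.txt with "read ... using delimiter", and the variable names are illustrative. The point is only that splitting on return finds nothing to split on.

```applescript
-- Two records separated by a line feed (ASCII character 10), Unix-style
set unix_text to "c1ccccc1" & tab & "benzene" & (ASCII character 10) & "Ic1ccccc1" & tab & "iodobenzene"

-- Split on return (ASCII character 13), as the tutorial script does
set old_delims to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set the_lines to text items of unix_text
set AppleScript's text item delimiters to old_delims

-- No return is present, so the whole text comes back as one value
display dialog "Number of values found: " & (count of the_lines)
```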
The next tutorial will deal with this type of issue.