Accessing the Chemical Identifier Resolver from Marvin
With the release of Marvin 5.12.0, users can now point Marvin at a custom web service to extend name-to-structure conversion, for instance with corporate IDs or common-name dictionaries. I thought it would be useful to look at this new feature, but I don’t have a corporate web service that I can use. This is where the Chemical Identifier Resolver (CIR) comes into play; I’ve previously written AppleScripts that use CIR.
About the Chemical Identifier Resolver
The Chemical Identifier Resolver (CIR) by the CADD Group at the NCI/NIH is a web service that performs various chemical name to structure conversions. The service works as a resolver for different chemical structure identifiers and allows one to convert a given structure identifier into another representation or structure identifier. It can help you identify and find the chemical structure if you have an identifier such as an InChIKey or CAS Number. You can either use the resolver web form at the web link above or use the following simple URL.
Example: Chemical name to SMILES:
The input identifier can be a chemical name, SMILES, CAS Number, InChI, etc., and the returned representation can be SMILES, SDF, PNG, etc. This is achieved by using a combination of OPSIN, ChemSpider and CIR's own database.
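As a rough illustration of the URL scheme, here is a minimal Python sketch that builds a CIR request. The path layout (identifier first, then representation) is taken from the rewrite rule used later in this post; the function names cir_url and resolve are my own, and percent-encoding the identifier is an assumption about what the service accepts rather than documented behaviour.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the rewrite rule later in this post.
CIR_BASE = "http://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier, representation="smiles"):
    """Build a CIR URL of the form <base>/<identifier>/<representation>.

    The identifier is percent-encoded so that names containing spaces
    still form a valid URL (an assumption about what CIR accepts).
    """
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{representation}"


def resolve(identifier, representation="smiles", timeout=10):
    """Fetch the requested representation as text (needs network access)."""
    with urlopen(cir_url(identifier, representation), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling resolve("aspirin") should return a SMILES string if the service is reachable; cir_url on its own works offline and is handy for checking what would be requested.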
However, Marvin expects the request to be in this format:
So we need a way of reformatting the request.
One way to do this is to use an .htaccess file on a web server, and I’m very grateful to Matt for his help in doing this.
Setting up a web server
The .htaccess file is a configuration file for Apache web servers. The settings in it apply to the directory it is placed in, and every nested subdirectory within that directory.
"RewriteEngine On" just makes sure the mod_rewrite engine is on, which is a tool for rewriting and redirecting URLs. The next line is a rewrite rule. "^(.+)/(.+)$" is a regular expression that captures two strings separated by a slash. In our case this will capture smiles and aspirin. The next part is the URL to redirect to. $1 and $2 are the captured regex groups, so it just swaps around aspirin and smiles. The flags in square brackets are optional: R=301 means this is a permanent redirect, and L ensures that no further rules are evaluated (although we only have one rule anyway).
So we need to create a folder on the web server (I called it cir) and in it create a file called .htaccess. The dot before the name means that this will be an invisible file. I used BBEdit to create the file with the following contents.
RewriteEngine On
RewriteRule ^(.+)/(.+)$ http://cactus.nci.nih.gov/chemical/structure/$2/$1 [R=301,L]
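To see what the rule does without touching a server, the same pattern can be tried in Python. This is only a sketch that mimics the capture-and-swap step; it is not how Apache evaluates the rule internally, and the function name rewrite is hypothetical. It assumes the path is given relative to the folder holding the .htaccess file, e.g. smiles/aspirin.

```python
import re

# Same pattern as the RewriteRule: two non-empty strings separated by a slash.
RULE = re.compile(r"^(.+)/(.+)$")

# Same target as the RewriteRule, written with Python's \1 and \2
# in place of Apache's $1 and $2.
TARGET = r"http://cactus.nci.nih.gov/chemical/structure/\2/\1"


def rewrite(path):
    """Return the redirect target for a path, or None if the rule does not match."""
    match = RULE.match(path)
    if match is None:
        return None
    # expand() substitutes the captured groups into the template,
    # which swaps the representation and the identifier.
    return match.expand(TARGET)


print(rewrite("smiles/aspirin"))
# prints http://cactus.nci.nih.gov/chemical/structure/aspirin/smiles
```

A path without a slash, such as aspirin on its own, does not match and would not be redirected.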
Once done you can test it by typing the following in your web browser
and you should get a SMILES string returned. On my server
Setting the Name to structure server in Marvin
Benanserin is a trivial name for 1-Benzyl-2-methyl-5-methoxytryptamine. In Marvin, if you select “Edit:Import name” and type Benanserin into the dialog box, you will get an error.
Now open the Marvin preferences (Edit:Preferences), switch to the “Save/Load” tab, update the Name import service URL and click OK.
If you now select “Edit:Import name” in Marvin and type Benanserin into the dialog box, the structure will be displayed.
Currently this configuration leaves a little to be desired! In addition to being rather inflexible with respect to the construction of the URL, it does not work with any query that contains a space; ideally the query needs to be URL encoded. I’ve been in touch with the developers and they assure me that this will be addressed in the forthcoming release. That said, despite the minor teething problems, I’m delighted to see the ability to access web services built into applications.
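For anyone unsure what URL encoding would look like here, a short Python sketch follows. It only illustrates percent-encoding of a name that contains a space; it says nothing about how Marvin builds its request, and the function name marvin_style_path and the representation/name ordering are assumptions based on the smiles and aspirin example used for the rewrite rule.

```python
from urllib.parse import quote


def marvin_style_path(name, representation="smiles"):
    """Build a representation/name path with the name percent-encoded."""
    # safe='' means every reserved character is encoded, so a space
    # becomes %20 and the path stays a single valid URL segment.
    return f"{representation}/{quote(name, safe='')}"


print(marvin_style_path("acetylsalicylic acid"))
# prints smiles/acetylsalicylic%20acid
```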