Macs in Chemistry

Insanely Great Science

Tutorials

Predicting sites of metabolism Vortex script

 

It is really useful to have two site-of-metabolism tools available that use contrasting methodologies. FAME 2 uses a curated dataset of experimentally determined metabolism data to build a machine learning model with simple descriptors. In contrast, SMARTCyp uses precomputed activation energies from density functional theory (DFT) calculations on model compounds.

I previously wrote a script displaying the results of a SMARTCyp calculation in a webview. The first part of that script imports smartcyp.jar; however, I was finding issues with each update, so I thought it might be better to simply treat SMARTCyp as a command-line application and use subprocess to access it.
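As a minimal sketch of that idea (not the actual script), the following wraps a jar file in a subprocess call. The jar path and input file name are hypothetical, and I am assuming the tool takes the structure file as its final argument; check the SMARTCyp documentation for the real options.

```python
import subprocess
import sys


def build_command(jar_path, input_file, extra_args=()):
    """Assemble the argument list for running a jar as a separate process."""
    # Passing a list (rather than one shell string) avoids quoting problems
    # with paths that contain spaces.
    return ["java", "-jar", jar_path] + list(extra_args) + [input_file]


def run_tool(command):
    """Run an external command and return its standard output as text."""
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    out, err = process.communicate()
    if process.returncode != 0:
        # Surface the tool's own error message instead of failing silently.
        raise RuntimeError("Command failed: %s" % err.decode("utf-8", "replace"))
    return out.decode("utf-8", "replace")


# Hypothetical usage:
# output = run_tool(build_command("/path/to/smartcyp.jar", "molecule.sdf"))
```

Because the jar is only ever called from the outside, an update to the tool should not break the script unless its command-line arguments change.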

Using a similar script we can also access FAME 2.

More details here.


Dealing with Greek characters in column names

 

This is just a very quick tip for dealing with Greek characters in Vortex column names when creating a script. It may be obvious to many, but I struggled for several hours before finding the problem and a solution.

Read more…



Flexible UniChem Search

 

UniChem is a web resource provided by the EBI. It is a 'Unified Chemical Identifier' system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between multiple databases. UniChem currently contains data from 27 different data sources and provides links to 108,941,995 structures.

Chambers, J., Davies, M., Gaulton, A., Hersey, A., Velankar, S., Petryszak, R., Hastings, J., Bellis, L., McGlinchey, S. and Overington, J.P. UniChem: A Unified Chemical Structure Cross-Referencing and Identifier Tracking System. Journal of Cheminformatics 2013, 5:3 (January 2013). DOI: http://dx.doi.org/10.1186/1758-2946-5-3

The previous script showed how to search using a ChEMBL ID; however, one of the attractions of UniChem is that you can search with any molecule identifier if you know the corresponding data source. This script allows the user to supply any molecule identifiers and then search a specified data source using a common web service.
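To illustrate the kind of lookup involved, here is a rough sketch in plain Python 3. The base URL and the `src_compound_id/<identifier>/<source id>` path layout are my assumptions about the UniChem REST interface and may have changed, so check the current UniChem documentation before relying on them. The function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL for the UniChem REST interface.
UNICHEM_REST = "https://www.ebi.ac.uk/unichem/rest"


def build_lookup_url(identifier, source_id, base=UNICHEM_REST):
    """Build the URL for looking up an identifier from a numbered data source."""
    # Percent-encode the identifier so characters such as "/" are safe in a URL.
    safe_identifier = quote(str(identifier), safe="")
    return "{}/src_compound_id/{}/{}".format(base, safe_identifier, int(source_id))


def lookup(identifier, source_id):
    """Fetch and decode the JSON response for one identifier (needs network)."""
    url = build_lookup_url(identifier, source_id)
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

The only things that change between data sources are the identifier and the numeric source id, which is what makes a single flexible script possible.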

Read more …



Getting UniChem data from ChEMBL

 

UniChem is a web resource provided by the EBI. It is a 'Unified Chemical Identifier' system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between multiple databases. UniChem currently contains data from 27 different data sources and provides links to 108,941,995 structures.

Chambers, J., Davies, M., Gaulton, A., Hersey, A., Velankar, S., Petryszak, R., Hastings, J., Bellis, L., McGlinchey, S. and Overington, J.P. UniChem: A Unified Chemical Structure Cross-Referencing and Identifier Tracking System. Journal of Cheminformatics 2013, 5:3 (January 2013). DOI: http://dx.doi.org/10.1186/1758-2946-5-3

ChEMBL also provide a RESTful Web service that users can use to retrieve data from the UniChem database in a programmatic fashion.

Read more…



Installing Checkmol/Matchmol under Mac OSX

 

Checkmol is a command-line utility program which reads molecular structure files in different formats and analyzes the input molecule for the presence of various functional groups and structural elements. At present, approx. 200 different functional groups are recognized. Output can be either clear text (English or German), a bitstring or its ASCII representation, or a set of special 8-character codes. This output can be easily placed into a database table, permitting the creation of chemical databases with a functional group search option. It was written by Norbert Haider, Department of Pharmaceutical Chemistry (now: Department of Drug and Natural Product Synthesis), University of Vienna, Austria.

The software is available both as source code and as a binary compiled for Linux (x86 architecture). It is entirely written in Pascal and it was compiled with Free Pascal 1.0.11 or Free Pascal 2.4.0 (starting from v0.4c). So to install it we first need a Pascal compiler, which can be downloaded from SourceForge.

Full details are here.


Importing Open Source Malaria Project data

 

The Open Source Malaria project is trying a different approach to curing malaria. Guided by open source principles, everything is open and anyone can contribute. To date a lot of people around the world have made contributions and the project is at a very exciting stage. Whilst everyone can see the compounds that have been made and the biological data, it is often spread over multiple web pages and can be tricky to link molecule with identifier with data. Over the last couple of months a significant effort has been put into populating a spreadsheet with all the information.

Whilst this is useful for viewing results, it is not ideal for trying to build predictive models. Vortex is a chemically intelligent data analysis and visualisation platform. This script provides one-click access to the OSM data and creates a workspace containing all the data. Since it is linked to the live spreadsheet, you will always have access to the latest data.


Scripting Vortex 25

 

Whilst most of the Vortex scripts mentioned on this site to date involve chemical structures we should not forget that Vortex is an excellent general data analytics tool and the data set does not have to include any molecular structures. Recently I was asked about the number of publications associated with a particular potential therapeutic target and it struck me that Vortex might actually be an excellent tool to investigate this.

Read More.


Cheminformatics on a Mac

 

A little while back I wrote a detailed tutorial for getting a wide variety of cheminformatics tools running on a Mac.

Someone just let me know about an issue with OSRA, a utility designed to convert graphical representations of chemical structures, as they appear in journal articles, patent documents, textbooks, trade magazines etc., into SMILES.

It turns out that OSRA requires Ghostscript to process PDF images; this can be installed using brew.

brew install ghostscript
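
If you call OSRA from a script, it can be worth checking first that Ghostscript is actually on the PATH. This is only a small sketch using the Python 3 standard library; `find_tool` is a hypothetical helper name, and `gs` is the usual name of the Ghostscript executable.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on the PATH, or None if absent."""
    return shutil.which(name)


ghostscript_path = find_tool("gs")
if ghostscript_path is None:
    print("Ghostscript not found; try: brew install ghostscript")
else:
    print("Ghostscript found at " + ghostscript_path)
```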

MedChem Wizard KNIME workflow

 

The MedChemWizard is a KNIME workflow designed to assist medicinal chemists with idea generation, ligand design and lead optimization using a number of common functional group transformations and medchem rules-of-thumb. This tutorial, provided by Dr. Alastair Donald, gives a detailed description of its use.


BBEdit tutorial

 

I'm a long time BBEdit user but I still enjoy reading tips for making your use of BBEdit more efficient.

This blog post offers some tips for the various "Find" options within BBEdit.

I'd certainly agree with the final comment.

Text editors with limited capabilities keep you at a beginner level, no matter how long you've been using them. Serious text editors have a depth that rewards their users.


Bringing Open Source to Drug Discovery

 

I gave a talk at the RSC 25th Symposium on Medicinal Chemistry in Eastern England meeting last week entitled “Bringing Open Source to Drug Discovery”.

The slides and pages of links are available here.

I also captured the laptop screen of the demo which I’ve now put on YouTube.

https://www.youtube.com/watch?v=sG9vDIfp0NE&feature=youtu.be

The aim was to show which open source tools are available and how they can be integrated into proprietary tools using scripting. Many of the scripts are available on the hints and tutorials page.



Scripting Vortex 12

In the previous tutorial we made use of the Virtual Computational Chemistry Laboratory web service to calculate aLogP and LogS; both these results were returned in a simple text format. More recently there has been an increased use of the JSON format for data exchange.

JSON, or JavaScript Object Notation, is a text-based open standard designed for easy human-readable data interchange. It is derived from the JavaScript scripting language for representing simple data structures and associative arrays, called objects. Despite its relationship to JavaScript, it is language-independent, with parsers available for many languages including C, C++, C#, Java, JavaScript, Perl and Python.

Molinspiration provide a number of cheminformatics tools and also a RESTful web service that can be used to calculate a range of molecular properties and bioactivity predictions.

The output from both web services is available either as a JSON string or plain text, and each web service can be accessed by submitting a URL.
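
To give a feel for why JSON is convenient, here is a short sketch of turning a JSON string into a flat row of values. The sample payload and its field names are entirely made up for illustration and are not the actual Molinspiration response format; `flatten_properties` is a hypothetical helper.

```python
import json

# Hypothetical JSON payload; a real web service will use its own field names.
sample = '{"smiles": "CCO", "properties": {"logP": -0.14, "TPSA": 20.23}}'


def flatten_properties(json_text):
    """Parse a JSON string and merge the nested properties into one flat row."""
    record = json.loads(json_text)
    row = {"smiles": record["smiles"]}
    # Use .get so a response without a "properties" block still yields a row.
    row.update(record.get("properties", {}))
    return row


row = flatten_properties(sample)
for name in sorted(row):
    print(name, row[name])
```

Each key in the flat row could then become a column name, with no need to split and count fields as with plain text output.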

Full details of the script are here.


Scripting Vortex:- Accessing a web service

I’ve just added the latest script for Vortex.

In previous scripts we have generated data using a local Java program, C program, Perl script, and SVL program. In this tutorial, rather than have a local application generate the data, we will use a web service.


There are more scripts on the Hints and Tutorial pages.




ChemDoodle, WebGL and Protein Ribbons

A tutorial demonstrating the use of ChemDoodle and WebGL to display ribbons on proteins. Read More...