Macs in Chemistry

Insanely Great Science

Indexing the internet in a chemically intelligent manner

Some time ago I described a Safari extension that uses chemicalize.org to index a web page for chemical content.

For an example of a “chemicalized” page have a look at this

As you can see below, all molecules mentioned in the page become links that reveal the structure on mouseover. The page also gets a handy ribbon of structures across the top, which is useful for quick scanning and navigation.

[Screenshot: a chemicalized web page]
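To make the idea concrete, here is a minimal sketch of this kind of annotation in Python. It is not how chemicalize.org itself works: the real service converts names and identifiers to structures, whereas this toy version assumes a small hand-written lookup table (`KNOWN_MOLECULES`) and a hypothetical `chemicalize` function. It wraps each recognised name in a link carrying the structure as a SMILES string, and collects the unique structures for a "ribbon".

```python
import html
import re

# Hypothetical lookup table (name -> SMILES), for illustration only.
# A real service would use name-to-structure conversion instead.
KNOWN_MOLECULES = {
    "aspirin": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "caffeine": "CN1C=NC2=C1C(=O)N(C)C(=O)N2C",
    "ethanol": "CCO",
}

# Match longer names first so a short name never shadows a longer one.
_NAMES = sorted(KNOWN_MOLECULES, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(n) for n in _NAMES) + r")\b",
    re.IGNORECASE,
)


def chemicalize(text):
    """Return (annotated_html, ribbon) for a plain-text passage.

    annotated_html: the text with each known molecule name wrapped in a link
    whose title attribute holds the SMILES string (shown on mouseover).
    ribbon: unique SMILES strings in order of first appearance.
    """
    parts = []
    ribbon = []
    last = 0
    for match in _PATTERN.finditer(text):
        # Escape the plain text between matches so the output is valid HTML.
        parts.append(html.escape(text[last:match.start()]))
        name = match.group(0)
        smiles = KNOWN_MOLECULES[name.lower()]
        if smiles not in ribbon:
            ribbon.append(smiles)
        parts.append(
            '<a class="molecule" title="{}">{}</a>'.format(
                html.escape(smiles, quote=True), html.escape(name)
            )
        )
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts), ribbon


if __name__ == "__main__":
    page, ribbon = chemicalize("Aspirin and caffeine in ethanol; more aspirin.")
    print(page)
    print(ribbon)
```

Note that a name mentioned twice produces two links but only one ribbon entry, which is the same reason a database built this way holds fewer structures than names.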

A recent publication by Southan and Stracz, "Extracting and connecting chemical structures from text sources using chemicalize.org", Journal of Cheminformatics 2013, 5:20, describes how this information is being used to index the internet in a chemically intelligent manner. The authors demonstrate the approach on a number of web pages and document sources, including PDFs from the patent office.

Chemicalize.org now has 15,000 unique visitors a month, a huge increase compared to spring 2012. These users contribute to the database every day, keeping it up to date and ensuring it reflects new interests as well. The database today contains 327,000 structures that were converted from 545,000 names and identifiers coming from 367,000 web pages.

These structures and links have now been uploaded to PubChem. If you are interested in what sort of molecules have been registered via chemicalize.org, you can browse them on the PubChem website here