Way back in the distant past, when I first joined the pharma industry, I remember working at a dumb terminal running substructure queries on a remote mainframe. They seemed to take forever on our relatively modest corporate database, and returning the results would then bring our network to a crawl, much to the annoyance of my colleagues. I’ve just downloaded ElementalDB, an iPad application that runs a substructure search of a 1,200,000-structure database in less than a second. I know people have described the iPad as a handheld computer, but this stunning demonstration really brings it home.
Dotmatics have downloaded the ChEMBL 15 database of 1.2 million structures in 2D SDF format.
ChEMBL is a database of bioactive drug-like small molecules. It contains 2D structures, calculated properties (e.g. logP, molecular weight, Lipinski parameters, etc.) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). The data are abstracted and curated from the primary scientific literature, and cover a significant fraction of the SAR and discovery of modern drugs.
The SDF records have been compressed to reduce size and put into a database. They then use classic path-based fingerprints, similar to the way Pinpoint does in Oracle. They have also ordered the compounds by ascending heavy atom count, so you find matches in smaller molecules first, and the molecular weight and XLogP have been calculated. By default only the first 6 matches are returned; however, you can change the number of records returned by selecting the help button in the top right corner. You can then click on a record to see the full entry in ChEMBL.
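To give a feel for how a fingerprint screen of this kind works, here is a minimal Python sketch. It is my own illustration and not Dotmatics' implementation: the toy graph representation (element symbols plus bonded atom pairs, hydrogens implicit, bond orders ignored), the function names, the 256-bit fingerprint size and the record names are all assumptions. The idea is that every linear path of atoms is hashed to a bit, a molecule can only contain the query if it has every bit the query has, and only the survivors go on to the slower atom-by-atom match. The records are sorted by heavy atom count so the smallest hits come back first, and the search stops once the requested number of matches is found.

```python
import zlib

def neighbours(bonds, n_atoms):
    """Adjacency list for a toy molecular graph."""
    adj = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj

def path_fingerprint(atoms, bonds, max_len=4, n_bits=256):
    """Hash every linear path of up to max_len atoms into one bit of an int."""
    adj = neighbours(bonds, len(atoms))
    fp = 0

    def walk(path):
        nonlocal fp
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        label = min(forward, backward)  # same bit whichever way the path is read
        fp |= 1 << (zlib.crc32(label.encode()) % n_bits)
        if len(path) < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return fp

def is_substructure(q_atoms, q_bonds, m_atoms, m_bonds):
    """Backtracking atom-by-atom match on element and connectivity only."""
    q_adj = neighbours(q_bonds, len(q_atoms))
    m_adj = neighbours(m_bonds, len(m_atoms))

    def extend(mapping):  # mapping[i] is the target atom matched to query atom i
        qi = len(mapping)
        if qi == len(q_atoms):
            return True
        for mi in range(len(m_atoms)):
            if mi in mapping or m_atoms[mi] != q_atoms[qi]:
                continue
            # Each already-mapped query neighbour must be bonded in the target too.
            if all(mapping[qn] in m_adj[mi] for qn in q_adj[qi] if qn < qi):
                if extend(mapping + [mi]):
                    return True
        return False

    return extend([])

def build_index(records):
    """Order by heavy atom count (smallest first) and precompute fingerprints."""
    ordered = sorted(records, key=lambda r: len(r["atoms"]))
    return [(path_fingerprint(r["atoms"], r["bonds"]), r) for r in ordered]

def substructure_search(index, q_atoms, q_bonds, limit=6):
    """Screen on fingerprint bits, confirm with the full match, stop at limit."""
    q_fp = path_fingerprint(q_atoms, q_bonds)
    hits = []
    for fp, rec in index:
        if (fp & q_fp) != q_fp:  # a query bit is missing, so no match is possible
            continue
        if is_substructure(q_atoms, q_bonds, rec["atoms"], rec["bonds"]):
            hits.append(rec["id"])
            if len(hits) == limit:
                break
    return hits

# Hypothetical toy records, not real ChEMBL entries.
RECORDS = [
    {"id": "propanol", "atoms": ["C", "C", "C", "O"], "bonds": [(0, 1), (1, 2), (2, 3)]},
    {"id": "acetic acid", "atoms": ["C", "C", "O", "O"], "bonds": [(0, 1), (1, 2), (1, 3)]},
    {"id": "ethylamine", "atoms": ["C", "C", "N"], "bonds": [(0, 1), (1, 2)]},
    {"id": "ethanol", "atoms": ["C", "C", "O"], "bonds": [(0, 1), (1, 2)]},
    {"id": "methanol", "atoms": ["C", "O"], "bonds": [(0, 1)]},
]
INDEX = build_index(RECORDS)

# Search for a C-O fragment and keep only the two smallest hits.
print(substructure_search(INDEX, ["C", "O"], [(0, 1)], limit=2))
```

Hash collisions can let a non-matching molecule through the screen, which is why the atom-by-atom check is still needed, but the screen never rejects a true match. Stopping early on a size-ordered list is also a large part of why returning only the first few hits can be so quick.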
The search speed is really impressive on what is a very substantial database. At the moment only substructure search is enabled, but a variety of fingerprints could be used to enable similarity searching.
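As a rough idea of what similarity searching on fingerprints could look like, here is a small Python sketch using the Tanimoto coefficient, the number of bits two fingerprints share divided by the number of bits set in either. This is purely an assumption on my part: nothing here says which measure or fingerprint ElementalDB would use, and the function names and the tiny bit patterns are made up for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints stored as Python ints."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return bin(fp_a & fp_b).count("1") / union

def most_similar(query_fp, library, top_n=3):
    """Rank a {name: fingerprint} library by similarity to the query."""
    scored = [(tanimoto(query_fp, fp), name) for name, fp in library.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by name
    return scored[:top_n]

# Hypothetical 8-bit fingerprints, far smaller than anything used in practice.
LIBRARY = {
    "compound A": 0b00001111,
    "compound B": 0b00000111,
    "compound C": 0b11110000,
}

for score, name in most_similar(0b00001111, LIBRARY):
    print(f"{name}: {score:.2f}")
```

Unlike the substructure screen, every fingerprint has to be scored before the ranking is known, so the early exit on a size-ordered list would not carry over directly.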
Here are a few screenshots.
The query is created using the intuitive Elemental chemical drawing application. Once the query structure is drawn, simply touch the “Search” button.
The first 12 results are returned in a 3 by 4 grid showing the structure, ID, MWt and XLogP for each hit.
If you click on the ChEMBL ID, a link is opened in the browser displaying the ChEMBL report card.
At the bottom of the results table is an option to email the search results.
The email actually contains the CHEMBLID, molecular weight, XLogP and the SMILES string as shown below.
The best way to get an idea of the application is to see it in action.
The app has been regularly updated with the latest versions of ChEMBL, and the rendering has improved with the latest versions of iOS.
I guess the real question is what the user would want from an iPad application. This could be a useful tool for downloading screening results and exploring them over a coffee, and it would be interesting to see if the data visualisation capabilities of Vortex could be ported to the iPad. It might also offer the opportunity to explore different ways to visualise or explore data. It could become the basis of a structure-searchable electronic notebook, perhaps an electronic textbook or your own personal reference section. This is a very impressive technical demonstration of the computing power of the iPad, and it really opens up opportunities for the development of novel chemistry applications on mobile platforms.
Page updated 29 April 2015