MZmine is an open-source project for mass-spectrometry data processing
Just came across this project. MZmine is an open-source project delivering graphical, interactive software for mass-spectrometry data processing.
SPINOUT DEVELOPS OPEN-SOURCE SOFTWARE FOR THE PHARMACEUTICAL INDUSTRY
An interesting initiative, OpenBioSim
We work with industrial partners to facilitate integration of open-source software solutions as components of technology platforms. We also provide ongoing software support and maintenance to open-source projects to ensure compatibility with our customer’s technology infrastructure.
Does this sound familiar?
We’ve had companies come along wanting to use one of our research software, but the academic who developed it had moved on, and we couldn’t offer robust technical support. So OpenBioSim allows us to hire staff for the purpose of building on top of existing OSS and providing long-term support.
Ongoing support is the Achilles heel of open source scientific software. It is often written to address a particular scientific problem, not with the idea that other scientists (often non-programmers) will use it.
OPSIN 2.7.0 has been released
OPSIN - Open Parser for Systematic IUPAC Nomenclature, has been updated https://github.com/dan2097/opsin/releases/tag/2.7.0.
OPSIN is a Java library for IUPAC name-to-structure conversion offering high recall and precision on organic chemical nomenclature.
Java 8 (or higher) is required for OPSIN 2.7.0.
Supported outputs are SMILES, CML (Chemical Markup Language) and InChI (IUPAC International Chemical Identifier).
Convert a chemical name to SMILES
java -jar opsin-cli-2.7.0-jar-with-dependencies.jar -osmi input.txt output.txt
where input.txt contains chemical name/s, one per line
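If you want to drive the command above from a script, a minimal Python sketch might look like the following. It only prepares the one-name-per-line input file and builds the same argument list; the helper names are my own, and it assumes the jar sits in the working directory and that Java is on your path.

```python
import subprocess
from pathlib import Path

# Assumption: the jar from the 2.7.0 release is in the current directory.
OPSIN_JAR = "opsin-cli-2.7.0-jar-with-dependencies.jar"


def write_names(names, path):
    """Write one chemical name per line, skipping blank entries."""
    cleaned = [name.strip() for name in names if name.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


def opsin_command(input_path, output_path, jar=OPSIN_JAR):
    """Build the command line shown above as an argument list."""
    return ["java", "-jar", jar, "-osmi", str(input_path), str(output_path)]


def run_opsin(input_path, output_path, jar=OPSIN_JAR):
    """Run OPSIN; raises CalledProcessError if Java reports a failure."""
    subprocess.run(opsin_command(input_path, output_path, jar), check=True)
```

Calling write_names(["benzene", "acetic acid"], "input.txt") followed by run_opsin("input.txt", "output.txt") should then be equivalent to typing the command by hand.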
WebMolKit: switched to Apache 2.0
Just saw this.
WebMolKit is a cheminformatics library that I’ve been working on for a long time: it runs on all kinds of JavaScript engines (browsers, desktop via Electron, command line via NodeJS). Its flagship feature is a powerful chemical sketcher, but it also has many supporting functions for handling molecules. As of now, the licensing terms have been switched to Apache 2.0, which basically means you are allowed to use it for non-open projects, as long as proper credit is given
I've updated the Open Source Cheminformatics Toolkits page
Open-Source Tools workshops
Registration for the next batch of Open-Source Tools workshops run by the RSC Chemical Information and Computer Applications Group is now open.
https://www.eventbrite.com/e/open-source-tools-for-chemistry-tickets-294585512197
These workshops have been enormously popular and the interactions with the instructors have been especially valuable. Details of the next 3 workshops are described below.
All meetings start at 2 pm UK time, with a 5 minute break after 1 hour. All are run using Zoom Webinar.
21 April 2022 PDBe Knowledge Base (David Armstrong)
This workshop explores the Protein Data Bank in Europe Knowledge Base (PDBe-KB https://www.ebi.ac.uk/pdbe/) resource and its tools for the investigation, analysis, and interpretation of biomacromolecular structures. PDBe-KB brings together data from all PDB entries and displays this data as aggregated information for individual proteins, including ligand binding sites, macromolecular interactions and more. Furthermore, this community-led resource brings together structural and functional information from a host of other related resources. In this workshop, you will learn how to use the PDBe-KB aggregated views for proteins to investigate structural and function information for proteins and their associated ligands. We will also demonstrate effective use of novel visualisation components of large-scale structural data on these pages, including 3D visualisation of superposed protein structures with their bound ligands.
19 May 2022 KLIFS database (Albert Jelke Kooistra, Andrea Volkamer)
Over the past three decades, six thousand structures of the catalytic kinase domain have been made publicly available via the Protein Data Bank. But to what extent are we making use of this wealth of information? In order to harness this data in a better way and to make it readily available for all to use in their research, KLIFS (https://klifs.net) was constructed. KLIFS, i.e. the Kinase–Ligand Interaction Fingerprints and Structures database, is a structural kinase database that systematically collects and processes all structures of the catalytic kinase domain. With the database, you can - for example - easily get a complete overview of all structures, search for ligands with a specific binding mode, identify analogs of your ligands of interest, and collect data for your data mining and machine learning applications.
For this workshop, the developers of KLIFS have teamed up with the Volkamer Lab and therefore the workshop will be divided into two segments. First, Albert J. Kooistra will give an introduction to KLIFS and demonstrate different functionalities of the KLIFS website and the integration of KLIFS in KNIME via the 3D-e-Chem nodes. In the second half, Andrea Volkamer and Dominique Sydow will demonstrate, based on their new kinase-focused TeachOpenCADD workflow, how to assess kinase similarity from different data perspectives. They will emphasize their Python package KiSSim – a KLIFS-based kinase structural similarity fingerprint, and OpenCADD-KLIFS – a Python module to facilitate the integration of KLIFS data into kinase research workflows.
23 June 2022 Scoring of shape and ESP similarity (Esther Heid)
Electrostatic effects along with volume restrictions play a major role in enzyme and receptor recognition. Evaluating electrostatic and shape similarities of pairs of molecules such as proposed versus known ligands can therefore be valuable indicators of prospective binding affinities. This workshop will demonstrate how to compute electrostatic and shape similarities using the open-source tool ESP-Sim (github.com/hesther/espsim, doi.org/10.26434/chemrxiv-2021-sqvv9-v3). Available options for comparing electrostatics will be discussed interactively on selected examples of public datasets, along with advice on embedding and aligning molecules prior to computing similarities.
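To make the idea of a shape similarity score concrete, here is a toy Python sketch of a Tanimoto overlap between two molecules represented as atom spheres on a grid. This is not how ESP-Sim works internally, and it assumes the molecules are already aligned; the input format (x, y, z, radius tuples in angstroms) and the function names are my own inventions for illustration.

```python
from itertools import product


def occupied_cells(atoms, spacing=0.5):
    """Return the grid cells whose centre lies inside any atom sphere.

    atoms: iterable of (x, y, z, radius) tuples (illustrative format).
    """
    cells = set()
    for x, y, z, radius in atoms:
        # Index range of cells that could intersect this sphere on each axis.
        lows = [int((c - radius) // spacing) for c in (x, y, z)]
        highs = [int((c + radius) // spacing) + 1 for c in (x, y, z)]
        ranges = [range(lo, hi + 1) for lo, hi in zip(lows, highs)]
        for i, j, k in product(*ranges):
            cx = (i + 0.5) * spacing
            cy = (j + 0.5) * spacing
            cz = (k + 0.5) * spacing
            if (cx - x) ** 2 + (cy - y) ** 2 + (cz - z) ** 2 <= radius ** 2:
                cells.add((i, j, k))
    return cells


def shape_tanimoto(atoms_a, atoms_b, spacing=0.5):
    """Tanimoto coefficient: shared volume divided by combined volume."""
    cells_a = occupied_cells(atoms_a, spacing)
    cells_b = occupied_cells(atoms_b, spacing)
    union = len(cells_a | cells_b)
    return len(cells_a & cells_b) / union if union else 0.0
```

Identical shapes score 1 and non-overlapping shapes score 0, which is why the prior alignment step mentioned in the workshop description matters so much.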
New additions to MayaChemTools
There have been a couple of new additions to the fabulous list of tools and scripts on MayaChemTools.
MayaChemTools is a growing collection of Perl and Python scripts, modules, and classes to support a variety of day-to-day computational discovery needs.
o Psi4GenerateConstrainedConformers.py http://www.mayachemtools.org/docs/scripts/html/Psi4GenerateConstrainedConformers.html
o Psi4PerformConstrainedMinimization.py http://www.mayachemtools.org/docs/scripts/html/Psi4PerformConstrainedMinimization.html
o Psi4PerformTorsionScan.py http://www.mayachemtools.org/docs/scripts/html/Psi4PerformTorsionScan.html
These scripts rely on the presence of Psi4 https://psicode.org/ and RDKit in your environment. In addition, the script RDKitPerformTorsionScan.py
MayaChemTools is free software; you can redistribute it and/or modify it under the terms of the GNU LGPL as published by the Free Software Foundation.
Open Source Antibiotics Docking and Scoring
There is an interesting thread on the Open Source Antibiotics GitHub repository.
Jan Jensen is looking at docking and scoring molecules into Mur Ligase C (MurC). In preparation for looking at a genetic algorithm to score docked poses, they have docked 260K ligands using Glide. All details are in a Google Colab notebook: https://github.com/opensourceantibiotics/murligase/issues/46
As with all the work on OSA everything is in the public domain.
If you want an opportunity to test out your docking algorithm, scoring function or binding affinity prediction tool this looks like a great opportunity.
Importing Open Source Antibiotics Data into DataWarrior
A couple of DataWarrior macros to import data directly from an online spreadsheet. Full details are here
The Open Source Antibiotics Consortium is a group of researchers working together to discover new antibiotics in an open source manner. All data is freely available and in the open. There is also an effort to develop open source solutions to the computational needs of the project. The CompChem Tools page provides links to a variety of Open-Source tools, scripts, Jupyter notebooks.
Open-Source Tools for Chemistry Workshops
Over the years RSC CICAG have held workshops providing tuition for various software packages. Because of physical limitations these workshops had to be limited to around 50 attendees, leaving us with a waiting list of folks who missed out. Because of the COVID-19 pandemic the meetings have been moved online, allowing much greater access. The recent Open Chemical Science workshops had attendees from 45 different countries. There was considerable interest in including further workshops, so CICAG have started a monthly series in which each workshop highlights a particular package/resource.
The first four workshops are now organised; you can register here. These look to be very popular, with registrations already up to 150 for the first workshop, which is three times more than we could have accommodated in a physical workshop.
There are four two-hour sessions in this series which will be run on Zoom.
25 March ChimeraX (https://www.rbvi.ucsf.edu/chimerax/) (Tom Goddard) Intro to ChimeraX for visualizing proteins, ligands and X-ray and electron microscopy maps. Install ChimeraX beforehand (https://www.rbvi.ucsf.edu/chimerax/) and follow along as we look at nanobody binding to SARS-CoV-2 spikes (https://www.rbvi.ucsf.edu/chimerax/data/nanobody-feb2021/nanobody.html).
21 April Chemical Structure validation/standardisation (Greg Landrum) Possibly the most important step in model or database building is data curation. This workshop will deal with chemical structure validation and standardisation. It will use Python and Jupyter notebooks, so delegates will need to install Jupyter and RDKit, which is best installed using conda: https://www.rdkit.org/docs/Install.html#cross-platform-under-anaconda-python-fastest-install.
27 May GNINA 1.0 (https://chemrxiv.org/articles/preprint/GNINA10MolecularDockingwithDeep_Learning/13578140) (David Koes) The use of docking to predict ligand binding to a receptor is now well established. This workshop will cover docking and structure-based virtual screening, with an introduction to the theory followed by practical examples.
24 June Advanced DataWarrior (http://www.openmolecules.org/datawarrior/) (Isabelle Giraud) The previous very popular introductory workshop brought DataWarrior to a new, wider audience. This workshop will highlight advanced features, macros and other topics that were brought up by users, so feel free to submit requests.
Registration is free and you will be sent login details at a later date.
Register here https://cicag-open-source-tools-for-chemistry.eventbrite.com/.
These workshops are sponsored by LiverpoolChiroChem.
The previous workshops are now available on the RSC CICAG YouTube channel.
Open-Source Tools for Chemistry Workshops
Last year RSC CICAG held a five-day conference entitled Open Chemical Science. This online event proved to be enormously popular, with attendees from 45 different countries. One feature of the meeting was a series of workshops highlighting a number of open-source software tools; these workshops are now available on the RSC CICAG YouTube channel.
There was considerable interest in including further workshops, so CICAG have started a monthly series in which each workshop highlights a particular package/resource.
The first four workshops are now organised; you can register here.
Registration is free and you will be sent login details at a later date.
Register here https://cicag-open-source-tools-for-chemistry.eventbrite.com/.
OpenChem
OpenChem is a deep learning toolkit for Computational Chemistry with PyTorch backend. The goal of OpenChem is to make Deep Learning models an easy-to-use tool for Computational Chemistry and Drug Design Researchers.
You can read about it in this publication DOI.
All code is available on GitHub https://github.com/Mariewelt/OpenChem.
Requires
- Modern NVIDIA GPU, compute capability 3.5 or newer.
- Python 3.5 or newer (we recommend Anaconda distribution)
- CUDA 9.0 or newer
numpy, pyyaml, scipy, ipython, mkl, scikit-learn, six, pytest, pytest-cov
The software is licensed under the MIT license
Open Chemical Sciences Workshops
The RSC CICAG Open Chemical Sciences meeting was a great success with attendees from around the world able to attend this online event. You can read more about the meeting on the CICAG website http://www.rsccicag.org/previous%20meetings.htm.
One component of the meeting was a series of workshops highlighting key Open-Source software applications for chemists. These sessions were recorded and are now available on the CICAG YouTube channel.
DataWarrior workshop by Isabelle Giraud. DataWarrior combines dynamic graphical views and interactive row filtering with chemical intelligence. Scatter plots, box plots, bar charts and pie charts not only visualize numerical or category data, but also show trends of multiple scaffolds or compound substitution patterns. This workshop is an introductory tutorial. DataWarrior can be downloaded here http://www.openmolecules.org/datawarrior/download.html.
PyMOL workshop by Garrett Morris. PyMOL is a comprehensive software package for rendering and animating 3D structures, in particular biomolecules. Website: https://pymol.org/2/. You can install PyMOL via Conda (https://www.anaconda.com/distribution/) or Miniconda (https://docs.conda.io/en/latest/miniconda.html), using https://anaconda.org/psi4/pymol or https://omicx.cc/2019/05/26/install-pymol-windows/, or get PyMOL from GitHub: https://github.com/schrodinger/pymol-open-source
Using Google Colab workshop by Jan Jensen. Colaboratory, or "Colab" for short, allows you to write and execute Python in your browser, with zero configuration required.
- Initial notebook: https://colab.research.google.com/drive/1t2ED9woH_cTOhCiiUZZzrTIUUfwsSds5?usp=sharing
- Final notebook: https://colab.research.google.com/drive/1rJOE6RNTCjByMqR-L3B7tZtwML3nIQVf?usp=sharing
ChEMBL workshop by Anna Gaulton. ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs. This workshop is an introductory tutorial. Website: https://www.ebi.ac.uk/chembl/
Fragalysis workshop by Rachael Skyner. Fragalysis (fragment analysis) is a web-based platform for fragment-based drug discovery (https://fragalysis.diamond.ac.uk/viewer/react/landing/). Its initial use case is focussed around the fragment screening experiment at Diamond. This workshop is an introductory tutorial.
KNIME workshop by Greg Landrum. KNIME Analytics Platform is the open source software for creating data science. Intuitive, open, and continuously integrating new developments, KNIME makes understanding data and designing data science workflows and reusable components accessible to everyone. This workshop is an introductory tutorial. KNIME can be downloaded here https://www.knime.com/downloads. Data used in the tutorial are here
Workshop on Open-Source Tools for Chemistry
Just a couple of notes for software installs prior to the event for those attending the free online Workshop on Open-Source Tools for Chemistry 9-13 November 2020.
Monday 13-30 to 15-30 Cheminformatics and Data Analysis using DataWarrior (Isabelle Giraud)
DataWarrior can be downloaded from here http://www.openmolecules.org/datawarrior/download.html
The training files can all be downloaded from here
Monday 16-00 to 18-00 Molecular visualisation using PyMOL (Garrett Morris)
Software to install:
PyMOL via Conda:
Conda: https://www.anaconda.com/distribution/
or Miniconda: https://docs.conda.io/en/latest/miniconda.html
https://anaconda.org/psi4/pymol or https://omicx.cc/2019/05/26/install-pymol-windows/
PyMOL via MacPorts:
http://www.ub.edu/cbdd/?q=content/installing-pymol-macports
% sudo port install tcl -corefoundation
% sudo port install tk -quartz
% sudo port install pymol
PyMOL from GitHub:
https://github.com/schrodinger/pymol-open-source
Tuesday 11-00 to 13-00 Chemistry in the cloud: leveraging Google Colab for quantum chemistry (Jan Jensen)
Participants should download Chrome and have a Google account
Participants should make sure they can access this page: https://bit.ly/37fIYbp.
Some basic degree of Python proficiency is required for the course
It would be great if participants could fill out this survey https://forms.gle/pjwsnJTb4X6QpiHK9 early enough to help me design the course
Wednesday 13-30 to 15-30 Accessing biological and chemical data in ChEMBL (Anna Gaulton)
Requires a modern web browser (with JavaScript not blocked) such as Chrome/Safari
Thursday 16-00 to 18-00 Fragment based screening, XChem at Diamond (Rachael Skyner)
Requires the Chrome web browser. If there is time Rachael would like to give an introduction to the new Python API. We can go through the installation at the workshop, but you must have Anaconda installed.
Friday 11-00 to 13-00 An introduction to KNIME workflows (Greg Landrum)
KNIME can be downloaded here https://www.knime.com/downloads
Registration This event will be free to attend but registration is required.
More details and registration can be found here https://www.rsc.org/events/detail/43180/workshop-on-open-source-tools-for-chemistry.
Last Updated 28 October 2020
Workshop on Open-Source Tools for Chemistry
All scientists working in chemistry need software tools for accessing, handling and storing chemical information, or performing molecular modelling and computational chemistry. There is now a wealth of open-source tools to help in these activities; however, many are not as well-known as commercial offerings. This workshop offers a unique opportunity for attendees to try out a range of open-source software packages for themselves with expert tuition in different aspects of chemistry.
The software packages will be presented over six two-hour sessions as follows:
09 November: 13.30 - 15.30 Cheminformatics and data analysis using DataWarrior (Isabelle Giraud)
09 November: 16.00 - 18.00 Molecular visualization using PyMOL (Garrett M Morris)
10 November: 11.00 - 13.00 Chemistry in the cloud: leveraging Google Colab for quantum chemistry (Jan Jensen)
11 November: 13.30 - 15.30 Accessing biological and chemical data in ChEMBL (Anna Gaulton)
12 November: 16.00 - 18.00 Fragment-based screening, XChem at Diamond (Rachael Skyner)
13 November: 11.00 - 13.00 Interactive and automated chemical data analysis with KNIME (Greg Landrum)
Registration This event will be free to attend but registration is required.
More details and registration can be found here https://www.rsc.org/events/detail/43180/workshop-on-open-source-tools-for-chemistry.
Online Events
The current global pandemic means that more events are moving online. Here are details of a few that have been sent to me.
Dotmatics User Symposium | Cambridge 2020 14th & 15th October Details and Registration.
KNIME Introduction to Working with Chemical Data October 12 - 16, 2020 details and registration.
Virtual RDKit UGM 6-8 October 2020 details and registration.
16th German Conference on Cheminformatics and EuroSAMPL Satellite Workshop 2-3 November 2020 details
Open Chemical Science 9 - 13 November 2020 details.
An open source chemical structure curation pipeline using RDKit
I just thought I'd flag a recent paper in the Journal of Cheminformatics entitled "An open source chemical structure curation pipeline using RDKit" DOI. As anyone who has had to curate a molecular dataset knows, standardising the chemical structures is one of the absolutely key elements of the process.
A chemical curation pipeline has been developed using the open source toolkit RDKit. It comprises three components: a Checker to test the validity of chemical structures and flag any serious errors; a Standardizer which formats compounds according to defined rules and conventions and a GetParent component that removes any salts and solvents from the compound to create its parent. This pipeline has been applied to the latest version of the ChEMBL database as well as uncurated datasets from other sources to test the robustness of the process and to identify common issues in database molecular structures.
The ChEMBLStructurePipeline is freely available on GitHub https://github.com/chembl/ChEMBLStructurePipeline/releases/tag/1.0.0.
Or using Anaconda
conda install -c conda-forge chembl_structure_pipeline
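To illustrate what the GetParent step is trying to achieve, here is a deliberately crude Python sketch that keeps the largest fragment of a multi-component SMILES string. This is not the ChEMBL pipeline's method, which applies defined rules for salts and solvents using RDKit; the function names and the count-the-atom-tokens heuristic are my own assumptions, and it will get things wrong for anything beyond simple salts.

```python
import re

# One atom token: a bracket atom, a two-letter halogen, or an organic-subset symbol.
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def atom_count(fragment):
    """Rough atom count for one SMILES fragment (each bracket atom counts once)."""
    return len(ATOM_TOKEN.findall(fragment))


def largest_fragment(smiles):
    """Return the dot-separated component with the most atom tokens."""
    fragments = [part for part in smiles.split(".") if part]
    if not fragments:
        raise ValueError("empty SMILES string")
    # On a tie, max() keeps the first fragment encountered.
    return max(fragments, key=atom_count)
```

For sodium acetate written as CC(=O)[O-].[Na+] this keeps the acetate, but note that it does not neutralise the charge, which a proper standardiser would also consider.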
fpocket a very fast open source protein pocket detection algorithm
The fpocket suite of programs is a very fast open source protein pocket detection algorithm based on Voronoi tessellation. The platform is suited for the scientific community willing to develop new scoring functions and extract pocket descriptors on a large scale level.
What's new compared to fpocket 2.0 (old sourceforge repo)
fpocket:
- is now able to consider explicit pockets when you want to calculate properties for a known binding site
- cli changed a bit
- pocket flexibility using temperature factors is better considered (less very flexible pockets on very solvent exposed areas)
- druggability score has been reoptimized vs original paper. Yields now slightly better results than the original implementation.
- compiler bug on newer compilers fixed
mdpocket:
- can now read Gromacs XTC, netcdf and dcd trajectories
- can also read prmtop topologies
- if topology provided, interaction energy grids can be calculated for transient pockets and channels (experimental)
The GitHub page https://github.com/Discngine/fpocket contains detailed instructions for installation. This project is licensed under the MIT License
250 downloads of iBabel4
I just noticed there have been 250 downloads of iBabel4. I can count these like citations, right? 😁 You can download iBabel here.
iBabel is a graphical user interface (GUI) to the open-source cheminformatics toolkit Open Babel
I'm starting to compile a list of updates/additions, if you have any suggestions let me know.
AmberTools20 is now available!
New and updated forcefields:
- FF19SB for proteins
- Amber-Dyes parameters for fluorescent dyes and linkers
- SIRAH model for coarse-grained simulations
- Amoeba and GEM polarizable force fields, via gem.pmemd
Also
Now builds with cmake; conversion to python3
AmberTools consists of several independently developed packages that work well by themselves, and with Amber20 itself. The suite can also be used to carry out complete molecular dynamics simulations, with either explicit water or generalized Born solvent models.
The AmberTools suite is free of charge, and its components are mostly released under the GNU General Public License (GPL)
Open Chemical Science – three-day series of events
The Royal Society of Chemistry’s Chemical Information and Computer Applications Group (CICAG) is organising a series of three one-day events with the theme “Open Chemical Science” on 11-13 November 2020. Venue: RSC, Burlington House, London, United Kingdom. To find out more and to register, visit the web pages for the events:
- Open Access Publishing for Chemistry: https://www.rsc.org/events/detail/43178/open-access-publishing-for-chemistry
- Open Data for Chemistry: https://www.rsc.org/events/detail/43179/open-data-for-chemistry
- Workshop on Open-Source Tools for Chemistry: https://www.rsc.org/events/detail/43180/workshop-on-open-source-tools-for-chemistry
Submissions of abstracts for oral or poster presentations are invited for the "Open Access Publishing for Chemistry" and "Open Data for Chemistry" days. The submission forms, information about deadlines and the submission process are on the web pages above. Please contact Gillian Bell, cicageventsmanager@gmail.com, with any enquiries.
A number of bursaries are available to support registration and travel costs for postgraduate students, the retired and those on career breaks. An application form is available on this page and further information regarding travel grants available for RSC members can be found at: www.rsc.org/awards-funding/funding/#funding-options.
* If it is not possible to run the meetings at Burlington House due to Covid-19, they will be held as webinars or online events.
MyChem cheminformatics extension for MySQL and MariaDB
After being dormant for a while this project seems to have come back to life.
Mychem is a chemoinformatics extension for MySQL and MariaDB released under the GNU GPL license. It provides a set of functions for handling chemical data within the database, including searching, analyzing and converting chemical data. It is based on Open Babel.
Complete documentation for Mychem is available online and gives a good overview of its capabilities.
https://mychem.github.io/docs/
I'd be interested to hear if anyone has installed under Mac OSX.
RDKit 2019_09_1 (Q3 2019) Release
A new version of RDKit has been released https://github.com/rdkit/rdkit/releases/tag/Release_2019_09_1.
Highlights:
- The substructure matching code is now about 30% faster. This also improves the speed of reaction matching and the FMCS code.
- A minimal JavaScript wrapper has been added as part of the core release.
- It's now possible to get information about why molecule sanitization failed.
- A flexible new molecular hashing scheme has been added.
There are however a number of backward incompatible changes detailed in the documents.
Also the old MolHash code should be considered deprecated. This release introduces a more flexible alternative.
Binaries have been uploaded to anaconda.org (https://anaconda.org/rdkit). The available conda binaries for this release are:
- Linux 64bit: python 3.6, 3.7
- Mac OS 64bit: python 3.6, 3.7
- Windows 64bit: python 3.6, 3.7
Some things that will be finished over the next couple of days:
- The conda build scripts will be updated to reflect the new version
- The homebrew script
A handy guide to financial support for open source
I've previously written about my thoughts on the sustainability of scientific software and tried to help publicise by compiling a listing of open-source cheminformatics toolkits and Open Source Python Data Science Libraries.
I've just come across this document that may be of interest: A handy guide to financial support for open source.
This document aims to provide an exhaustive list of all the ways that people get paid for open source work. Hopefully, projects and contributors will find this helpful in figuring out the best options for them.
Well worth a read and please share.
Fortran on a Mac
I've done another update to the Fortran on a Mac page.
Added a number of open-source comp chem packages.
Many thanks to Sebastian Ehlert for highlighting dftd4.
Durrant Lab Software
A reader recently pointed out BlendMol, part of a suite of software tools developed by the Jacob Durrant Lab.
BlendMol is a Blender plugin that can easily import VMD 'Visualization State' and PyMOL 'Session' files. BlendMol empowers scientific researchers and artists by marrying molecular visualization and industry-standard rendering techniques. The plugin works seamlessly with popular analysis programs (i.e., VMD/PyMOL). Users can import into Blender the very molecular representations they set up in VMD/PyMOL.
This looks like a very interesting open-source project available on GitHub. However, looking at the software page https://durrantlab.pitt.edu/durrant-lab-software/ I see there are a number of other interesting packages.
Dimorphite-DL adds hydrogen atoms to molecular representations, as appropriate for a user-specified pH range. It is a fast, accurate, accessible, and modular open-source program for enumerating small-molecule ionization states.
Gypsum-DL is a free, open-source program that converts 1D and 2D small-molecule representations (SMILES strings or flat SDF files) into 3D models. It outputs models with alternate ionization, tautomeric, chiral, cis/trans isomeric, and ring-conformational states.
PCAViz is an open-source Python/JavaScript toolkit for sharing and visualizing MD trajectories via a web browser. To encourage use, an easy-to-install PCAViz-powered WordPress plugin enables ‘plug-and-play’ trajectory visualization.
Scoria is a Python package for manipulating three dimensional molecular data. Unlike similar packages, Scoria is written in pure Python and so requires no dependencies or installation. One can incorporate the Scoria source code directly into their own programs. But Scoria is not designed to compete with other similar packages. Rather, it complements them. Our package leverages others (e.g., NumPy, SciPy, MDAnalysis), if present, to speed and extend its own functionality.
Looks like a great resource.
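For anyone wondering what enumerating ionization states over a pH range involves at its simplest, here is a small Python sketch of the textbook Henderson-Hasselbalch arithmetic for a single ionizable site. I have not reproduced how Dimorphite-DL assigns pKa values or handles multiple sites; the function names and the 1% threshold are my own assumptions.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of a monoprotic site in its deprotonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def populated_states(pka, ph_min, ph_max, threshold=0.01):
    """List the protonation states that reach `threshold` within the pH range."""
    states = []
    # The deprotonated fraction rises steadily with pH, so the protonated form
    # is most abundant at ph_min and the deprotonated form at ph_max.
    if 1.0 - fraction_deprotonated(pka, ph_min) >= threshold:
        states.append("protonated")
    if fraction_deprotonated(pka, ph_max) >= threshold:
        states.append("deprotonated")
    return states
```

A site with a pKa well below the pH range yields only the deprotonated form, one well above yields only the protonated form, and a pKa inside or near the range yields both.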
Novartis Open Source tools for Drug Discovery
I'm sure most readers of this site are aware of the Open-Source cheminformatics toolkit RDKit that was first developed in Novartis. However I wonder how many are aware of the other Open-Source tools that Novartis have supported.
You can read more about them here
The Novartis Institutes for BioMedical Research (NIBR) is pioneering new informatics tools for drug discovery. We believe in the power of open-sourced, global collaboration for the greater good. Join us to help patients worldwide.
They are available on GitHub here.
They include Habitat, an object management system; OntoBrowser, a tool to manage ontologies and controlled terminologies; YAP, an extensible parallel framework written in Python using OpenMPI libraries; and GridVar, a jQuery plugin that visualises multi-dimensional datasets as layers organised in a row-column format.
An interactive RDKit widget for Jupyter: a first pass
This looks like it could be very interesting.
A blog post by Greg Landrum describes a widget for displaying molecules in a Jupyter Notebook, where you can select atoms and have the selection propagated back to Python.
This is basic, but I think it's a decent start towards something that could be really useful. Interested? Have suggestions (ideally accompanied by code!) on how to improve it? If it looks like this is actually likely to be used, I will figure out how to create a standalone nbwidget out of this and create a separate github repo for it.
Looks like a useful tool for selecting bonds for conformational analysis, selecting bonds for creating a Ramachandran plot, selecting groups for bioisosteric replacement……
Sounds like Greg is looking for input.
Open Force Field Toolkit update
The Open Force Field Consortium is an industry-funded effort to develop small molecule force fields.
0.4.1 - Bugfix Release: This update fixes several toolkit bugs that have been reported by the community. Details of these bugfixes are provided here. It also refactors how ParameterType and ParameterHandler store their attributes, by introducing ParameterAttribute and IndexedParameterAttribute. These new attribute-handling classes provide a consistent backend which should simplify manipulation of parameters and implementation of new handlers.
Openforcefield
The Open Force Field Initiative is an open source, open science, and open data approach to better force fields. All the code is on GitHub and they also provide highly curated datasets.
The idea is to enable molecular mechanics on small molecules and macromolecules jointly, using open and freely available software.
A recent blog post from Peter Schmidtke caught my eye.
Recently a few updates of the openforcefield toolkit came out … a game changer, as you’ll see.
The work investigated whether the 768 fragments from the XChem fragment library at Diamond can be parametrised with the new version of Open Force Field (0.4) and how they behave after a simple minimisation.
In short, all fragments technically pass the parametrisation and minimisation steps, and this was supported by visual inspection.
All the code is on GitHub.
Open Source Python Data Science Libraries
When I wrote the article entitled A few thoughts on scientific software, one of the responses I got was that people did not know about the existence of open-source chemistry toolkits, so I thought I'd publish a page that hopefully stops people reinventing the wheel. Here are a few open-source cheminformatics toolkits that I'm aware of.
As a follow-up I thought I'd put together a list of useful Python libraries for data science.
As always, I'm happy to hear comments or suggestions for additions.
Open Source Cheminformatics Toolkits
When I wrote the article entitled A few thoughts on scientific software, one of the responses I got was that people did not know about the existence of open-source chemistry toolkits, so I thought I'd publish a page that hopefully stops people reinventing the wheel. Here are four open-source toolkits that I'm aware of; if I've missed any, my apologies, and please send me details. Listing of Open-source cheminformatics toolkits
NWChem updated
Just catching up.
NWChem 6.8 is now available on GitHub: https://github.com/nwchemgit/nwchem.
NWChem provides many methods for computing the properties of molecular and periodic systems using standard quantum mechanical descriptions of the electronic wavefunction or density. Its classical molecular dynamics capabilities provide for the simulation of macromolecules and solutions, including the computation of free energies using a variety of force fields. These approaches may be combined to perform mixed quantum-mechanics and molecular-mechanics simulations.
Instructions for compiling NWChem on various platforms, including Mac OSX, are available at https://github.com/nwchemgit/nwchem/wiki/Compiling-NWChem.
Manuscriptsapp is now open source
Manuscriptsapp is a great writing tool designed from the ground up for creating scientific publications. This week we heard of an interesting development: it is now free and it will be open source.
There is a detailed blog post here giving the background.
It integrates nicely with a variety of reference managers (Mendeley, Zotero, Papers 3, Bookends and EndNote) with a couple of clicks, and you can cite directly with specially supported reference managers: F1000Workspace New, Papers (Magic Citations) or Bookends. It has a simple table editor with header, body and footer styles that are built-in and customizable. Tables can be imported from and exported to Word, Markdown, even LaTeX. You can create equations in LaTeX markup, or paste from MathType. Chemistry support is limited but is certainly on their todo list, and they would love to have interested chemists to work with.
If you have not used it before, now would be a good time to download and try it out: http://updates.manuscriptsapp.com/apps/manuscripts/download.
SketchEl 2
As highlighted recently, SketchEl2, a chemical drawing package, is now open source.
The SketchEl 2 project is underway as a desktop app, based on web technology and delivered as an Electron package. The GitHub repository is now public, on account of there being enough functionality to be arguably useful. This is a very early release, so do be ready to give some useful feedback if you feel so inclined to try it out.
The repository can be found here https://github.com/aclarkxyz/web_sketchel2
Psi4 1.1: An Open-Source Electronic Structure Program
A recent paper describes Psi4 1.1: An Open-Source Electronic Structure Program Emphasizing Automation, Advanced Libraries, and Interoperability DOI
Psi4 is an ab initio electronic structure program providing methods such as Hartree–Fock, density functional theory, configuration interaction, and coupled-cluster theory. The 1.1 release represents a major update meant to automate complex tasks, such as geometry optimization using complete-basis-set extrapolation or focal-point methods. Conversion of the top-level code to a Python module means that Psi4 can now be used in complex workflows alongside other Python tools.
Psi4 1.1 can be downloaded from here with versions supporting Python 2.7, 3.5 and 3.6.
Note the installation instructions for Mac: install Xcode via the App Store, and make sure you open Xcode and accept the license agreement after you install it.
Scaffold Hunter update
Scaffold Hunter is a chemical data organization and analysis tool that has been continuously enhanced since the start of its development in 2007. The platform-independent open-source tool was first released in 2009 and provided an interactive visualisation of the so-called scaffold tree, which is a hierarchical classification scheme for molecules based on their common scaffolds. A recent publication describes extensions that significantly increase the applicability for a variety of tasks DOI.
When I first opened the application I did not find it particularly intuitive. Fortunately there is an online tutorial, and sample datasets are available.
aRMSD: A Comprehensive Tool for Structural Analysis
aRMSD is an open toolbox for structural comparison between two molecules with various capabilities to explore different aspects of structural similarity and diversity. Crystallographic data provided from cif files is fully supported and the results can be rendered with the help of the vtk package.
A. Wagner, H.-J. Himmel, J. Chem. Inf. Model, 2017, 57, 428-438 DOI
MayaChemTools: An Open Source Package for Computational Drug Discovery
Just noticed this paper.
MayaChemTools: An Open Source Package for Computational Drug Discovery, DOI: 10.1021/acs.jcim.6b00505.
MayaChemTools is a growing collection of Perl scripts, modules, and classes to support a variety of computational drug discovery needs, such as manipulation and analysis of data, generation of two-dimensional (2D) fingerprints, similarity searching, and calculation of physicochemical properties.
MayaChemTools is freely available online at www.MayaChemTools.org, under the terms of the GNU LGPL, as published by the Free Software Foundation.
It is possible to access them using a Vortex script.
Darwin source code released
It is sometimes difficult to remember that the heart of Mac OSX is the open-source Darwin source code.
Apple have recently released the latest update, the OS X 10.12 source.
In addition, Apple have made Swift open source, and it now supports a wider variety of platforms.
OpenBabel 2.4.0 released
A major new update to OpenBabel has been released. Version 2.4.0 is a significant change and is highly recommended.
New file formats
- DALTON output files (read only) and DALTON input files (read/write) (Casper Steinmann)
- JSON format used by ChemDoodle (read/write) (Matt Swain)
- JSON format used by PubChem (read/write) (Matt Swain)
- LPMD's atomic configuration file (read/write) (Joaquin Peralta)
- The format used by the CONTFF and POSFF files in MDFF (read/write) (Kirill Okhotnikov)
- ORCA output files (read only) and ORCA input files (write only) (Dagmar Lenk)
- ORCA-AICCM's extended XYZ format (read/write) (Dagmar Lenk)
- Painter format for custom 2D depictions (write only) (Noel O'Boyle)
- Siesta output files (read only) (Patrick Avery)
- Smiley parser for parsing SMILES according to the OpenSMILES specification (read only) (Tim Vandermeersch)
- STL 3D-printing format (write only) (Matt Harvey)
- Turbomole AOFORCE output (read only) (Mathias Laurin)
- A representation of the VDW surface as a point cloud (write only) (Matt Harvey)
New file format capabilities and options
- AutoDock PDBQT: Options to preserve hydrogens and/or atom names (Matt Harvey)
- CAR: Improved space group support in .car files (kartlee)
- CDXML: Read/write isotopes (Roger Sayle)
- CIF: Extract charges (Kirill Okhotnikov)
- CIF: Improved support for space-groups and symmetries (Alexandr Fonari)
- DL_Poly: Cell information is now read (Kirill Okhotnikov)
- Gaussian FCHK: Parse alpha and beta orbitals (Geoff Hutchison)
- Gaussian out: Extract true enthalpy of formation, quadrupole, polarizability tensor, electrostatic potential fitting points and potential values, and more (David van der Spoel)
- MDL Mol: Read in atom class information by default and optionally write it out (Roger Sayle)
- MDL Mol: Support added for ZBO, ZCH and HYD extensions (Matt Swain)
- MDL Mol: Implement the MDL valence model on reading (Roger Sayle)
- MDL SDF: Option to write out an ASCII depiction as a property (Noel O'Boyle)
- mmCIF: Improved mmCIF reading (Patrick Fuller)
- mmCIF: Support for atom occupancy and atom_type (Kirill Okhotnikov)
- Mol2: Option to read UCSF Dock scores (Maciej Wójcikowski)
- MOPAC: Read z-matrix data and parse (and prefer) ESP charges (Geoff Hutchison)
- NWChem: Support sequential calculations by optionally overwriting earlier ones (Dmitriy Fomichev)
- NWChem: Extract info on MEP(IRC), NEB and quadrupole moments (Dmitriy Fomichev)
- PDB: Read/write PDB insertion codes (Steffen Möller)
- PNG: Options to crop the margin, and control the background and bond colors (Fredrik Wallner)
- PQR: Use a stored atom radius (if present) in preference to the generic element radius (Zhixiong Zhao)
- PWSCF: Extend parsing of lattice vectors (David Lonie)
- PWSCF: Support newer versions, and the 'alat' term (Patrick Avery)
- SVG: Option to avoid addition of hydrogens to fill valence (Lee-Ping)
- SVG: Option to draw as ball-and-stick (Jean-Noël Avila)
- VASP: Vibration intensities are calculated (Christian Neiss, Mathias Laurin)
- VASP: Custom atom element sorting on writing (Kirill Okhotnikov)
Other new features and improvements
- 2D layout: Improved the choice of which bonds to designate as hash/wedge bonds around a stereo center (Craig James)
- 3D builder: Use bond length corrections based on bond order from Pyykko and Atsumi (http://dx.doi.org/10.1002/chem.200901472) (Geoff Hutchison)
- 3D generation: "--gen3d", allow user to specify the desired speed/quality (Geoff Hutchison)
- Aromaticity: Improved detection (Geoff Hutchison)
- Canonicalisation: Changed behaviour for multi-molecule SMILES. Now each molecule is canonicalized individually and then sorted. (Geoff Hutchison/Tim Vandermeersch)
- Charge models: "--print" writes the partial charges to standard output after calculation (Geoff Hutchison)
- Conformations: Confab, the systematic conformation generator, has been incorporated into Open Babel (David Hall/Noel O'Boyle)
- Conformations: Initial support for ring rotamer sampling (Geoff Hutchison)
- Conformer searching: Performance improvement by avoiding gradient calculation and optimising the default parameters (Geoff Hutchison)
- EEM charge model: Extend to use additional params from http://dx.doi.org/10.1186/s13321-015-0107-1 (Tomáš Raček)
- FillUnitCell operation: Improved behavior (Patrick Fuller)
- Find duplicates: The "--duplicate" option can now return duplicates instead of just removing them (Chris Morley)
- GAFF forcefield: Atom types updated to match Wang et al. J. Comp. Chem. 2004, 25, 1157 (Mohammad Ghahremanpour)
- New charge model: EQeq crystal charge equilibration method (a speed-optimized crystal-focused charge estimator, http://pubs.acs.org/doi/abs/10.1021/jz3008485) (David Lonie)
- New charge model: "fromfile" reads partial charges from a named file (Matt Harvey)
- New conversion operation: "changecell", for changing cell dimensions (Kirill Okhotnikov)
- New command-line utility: "obthermo", for extracting thermochemistry data from QM calculations (David van der Spoel)
- New fingerprint: ECFP (Geoff Hutchison/Noel O'Boyle/Roger Sayle)
- OBConversion: Improvements and API changes to deal with a long-standing memory leak (David Koes)
- OBAtom::IsHBondAcceptor(): Definition updated to take into account the atom environment (Stefano Forli)
- Performance: Faster ring-finding algorithm (Roger Sayle)
- Performance: Faster fingerprint similarity calculations if compiled with -DOPTIMIZE_NATIVE=ON (Noel O'Boyle/Jeff Janes)
- SMARTS matching: The "-s" option now accepts an integer specifying the number of matches required (Chris Morley)
- UFF: Update to use traditional Rappe angle potential (Geoff Hutchison)
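For readers unfamiliar with the fingerprint similarity calculations mentioned in the performance items above, here is a minimal sketch of the Tanimoto coefficient, a common similarity measure for bit-vector fingerprints. It is plain Python with hypothetical names, holds each fingerprint as an integer bit mask, and is not Open Babel's implementation. Counting the set bits is the core operation.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints stored as integer bit masks.

    Hypothetical helper for illustration; not an Open Babel function.
    """
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")   # bits set in either fingerprint
    # Assumption: two empty fingerprints are treated as having zero similarity.
    return common / total if total else 0.0
```

Identical fingerprints score 1.0, and fingerprints with no bits in common score 0.0.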
Language bindings
- Bindings: Support compiling only the bindings against system libopenbabel (Reinis Danne)
- Java bindings: Add example Scala program using the Java bindings (Reinis Danne)
- New bindings: PHP (Maciej Wójcikowski)
- PHP bindings: BaPHPel, a simplified interface (Maciej Wójcikowski)
- Python bindings: Add 3D depiction support for Jupyter notebook (Patrick Fuller)
- Python bindings, Pybel: calccharges() and convertdbonds() added (Patrick Fuller, Björn Grüning)
- Python bindings, Pybel: compress output if filename ends with .gz (Maciej Wójcikowski)
- Python bindings, Pybel: Residue support (Maciej Wójcikowski)
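The "compress output if filename ends with .gz" behaviour listed above is easy to picture with the Python standard library. The sketch below is an assumption about how such a switch could look, using a hypothetical `open_output` helper; it is not Pybel's actual code.

```python
import gzip


def open_output(filename):
    """Open a text file for writing, gzip-compressed if the name ends in .gz.

    Hypothetical helper for illustration; not part of Pybel.
    """
    if filename.endswith(".gz"):
        # gzip.open in text mode compresses transparently as lines are written.
        return gzip.open(filename, "wt", encoding="utf-8")
    return open(filename, "w", encoding="utf-8")
```

The caller writes lines in the same way in both cases, so the choice of compression is driven entirely by the file name.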
Development/Build/Install Improvements
- Version control: move to git and GitHub from subversion and SourceForge
- Continuous integration: Travis for Linux builds and Appveyor for Windows builds (David Lonie and Noel O'Boyle)
- Python installer: Improvements to the Python setup.py installer and "pip install openbabel" (David Hall, Matt Swain, Joshua Swamidass)
- Compilation speedup: Speed up compilation by combining the tests (Noel O'Boyle)
- MacOSX: Support compiling with libc++ on MacOSX (Matt Swain)
Importing Open Source Malaria Data into DataWarrior
Thomas Sander from openmolecules.org has provided a version of DataWarrior that can directly import the Open Source Malaria Data.
The new version can be downloaded from http://www.openmolecules.org/datawarrior. Once it has downloaded you will need to temporarily adjust your security settings to open it the first time, because DataWarrior is not from the Mac App Store or an identified developer. Once it is open, make sure you reset your security settings.
Once installed and opened select the macro as shown below to retrieve the Open Source Malaria Data.
The import only takes a few seconds and pulls the data directly from the Open Source Malaria spreadsheet, so it will contain the latest information.
There are now a variety of options for accessing the Open Source Malaria data: you can use the Cheminfo spreadsheet, a Vortex script or even an iPython notebook.
Open Source Molecular Modeling
A great publication on Open Source Molecular Modeling.
The success of molecular modeling and computational chemistry efforts are, by definition, dependent on quality software applications. Open source software development provides many advantages to users of modeling applications, not the least of which is that the software is free and completely extendable. In this review we categorize, enumerate, and describe available open source software packages for molecular modeling and computational chemistry. An updated online version of this catalog can be found at https://opensourcemolecularmodeling.github.io.
From toolkits to desktop applications a fantastic and comprehensive listing.
The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development
I just came across an interesting paper on cross-platform OpenCL programming. The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development. In particular it highlights a number of issues and offers workarounds. These include Framework bugs, Specification limitations and Program bugs.
There are an increasing number of scientific applications taking advantage of GPU acceleration.
Parkinson disease mobile data collected using ResearchKit
ResearchKit is an open-source framework that allows researchers and developers to create powerful apps for medical research.
The Parkinson app is one of the first five apps built using ResearchKit.
mPower is a unique iPhone application that uses a mix of surveys and tasks that activate phone sensors to collect and track health and symptoms of Parkinson Disease (PD) progression - like dexterity, balance or gait. The goal of this app is to learn more about the variations of PD, and to improve the way we describe these variations and to learn how mobile devices and sensors can help us to measure PD and its progression to ultimately improve the quality of life for people with PD.
The initial results have now been published in Scientific Data 3, Article number: 160011 (2016) DOI, with around 15,000 people contributing data to the study.
Open Science prize applications

The applications for the Open Science Prize are now in, 92 proposals, all look brilliant. They include apps for mobile devices, machine learning from public datasets, linking science and scientists, tracking disease and much, much more. Well worth popping over and having a look.
FreeSASA: An open source C library for solvent accessible surface area calculations
Calculating the solvent accessible surface area (SASA) is important in the study of protein structure, and whilst there are many tools for this, FreeSASA represents the first free-standing open-source tool for the calculation. FreeSASA is an open source C library for SASA calculations that provides both command-line and Python interfaces.
Source code is available for download here and building the FreeSASA library and command-line interface only requires standard C and GNU libraries and a C99-compliant compiler, and should be straightforward on any UNIX system (has been tested in Mac OS X 10.8 and Debian 8).
Mitternacht S. FreeSASA: An open source C library for solvent accessible surface area calculations [version 1; referees: awaiting peer review]. F1000Research 2016, 5:189 DOI
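To make the idea of a SASA calculation concrete, here is a minimal pure-Python sketch of the Shrake-Rupley point-sampling approach, one of the classic SASA algorithms. It is an illustration under simplifying assumptions (every atom is checked against every other atom, a 1.4 Å water probe, hypothetical function names) and is not FreeSASA's code or API.

```python
import math


def sphere_points(n):
    """Return n roughly evenly spaced points on the unit sphere (golden spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n       # z runs from just below 1 to just above -1
        r = math.sqrt(1.0 - z * z)          # radius of the circle at height z
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def sasa(atoms, probe=1.4, n_points=200):
    """Per-atom solvent accessible surface area in square angstroms.

    atoms is a list of (x, y, z, radius) tuples. Hypothetical helper for
    illustration only.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe                  # radius of the probe-expanded sphere
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                limit = rj + probe
                # A test point is buried if it lies inside a neighbour's expanded sphere.
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < limit * limit:
                    buried = True
                    break
            if not buried:
                exposed += 1
        # The exposed fraction of test points scales the full sphere area.
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas
```

For a single isolated atom every test point is exposed, so the result is simply the area of the probe-expanded sphere; bringing a second atom close reduces the value for both.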
Tabula is awesome!
I recently needed to download the supplementary information provided with a publication, and my heart sank when I saw it was provided as a PDF file. My worst fears were justified when I tried to simply copy and paste SMILES strings together with 5 columns of data into a spreadsheet: no chance of it copying across in an ordered manner!
Then I tried Tabula, a tool for "liberating data tables locked inside PDF files". It worked perfectly: nearly 2000 rows of data spread over 11 pages were converted to a csv file in a couple of mouse clicks. This is wonderful and should be part of any data scientist's toolkit.
It is included on the Data Analysis Tools page but really deserves a special mention.
Apple and Open Source
Whilst the decision to make Swift open source certainly captured the headlines, it is worth noting that Apple contributes to many more open source projects. There are more details about these projects on the developer and main Apple websites.
Swift Open Source
As I previously highlighted after the WWDC, Apple have announced that Swift is now open source.
More details are on the Swift blog
Swift is now open source. Today Apple launched the open source Swift community, as well as amazing new tools and resources including:
- Swift.org – a site dedicated to the open source Swift community
- Public source code repositories at github.com/apple
- A new Swift package manager project for easily sharing and building code
- A Swift-native core libraries project with higher-level functionality above the standard library
- Platform support for all Apple platforms as well as Linux
Swift.org is an entirely new site dedicated to open source Swift. This site hosts resources for the community of developers that want to help evolve Swift, contribute fixes, and most importantly, interact with each other. It also provides development snapshots for Apple and Linux platforms; these require OS X 10.11 (El Capitan) or Ubuntu 14.04 or 15.10 (64-bit).
Source code is available on GitHub
Polyphony
Polyphony is an open source software suite written in Python. Its purpose is the superimposition-free analysis and comparison of multiple 3D structures of the same or closely related protein molecules.
Absolute Requirements
- python 2.6 or later
- scipy
- numpy
- Biopython, especially the Bio.PDB module
Highly recommended
All following documentation assumes that you have these installed.
- ipython, for interactive python scripting
- matplotlib, for graph plotting
- PyMOL, for interactive 3D visualisation. Open source version available on SourceForge
William R Pitt, Rinaldo W Montalvão and Tom L Blundell, BMC Bioinformatics, 2014, 15:324 doi
Importing Open Source Malaria Project data
The Open Source Malaria project is trying a different approach to curing malaria. Guided by open source principles, everything is open and anyone can contribute. To date a lot of people around the world have made contributions and the project is at a very exciting stage. Whilst everyone can see the compounds that have been made and the biological data, it is often spread over multiple web pages and can be tricky to link molecule with identifier with data. Over the last couple of months a significant effort has been put into populating a spreadsheet with all the information.
Whilst this is useful for viewing results, it is not ideal for trying to build predictive models. Vortex is a chemically intelligent data analysis and visualisation platform. This script provides one-click access to the OSM data and creates a workspace containing all the data, and since it is linked to the live spreadsheet you will always have access to the latest data.
Installing Open Drug Discovery Toolkit (ODDT)
A recent paper in J Cheminformatics, Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field DOI, describes a free and open source tool for both computer aided drug discovery (CADD) developers and researchers. Open Drug Discovery Toolkit is released under a permissive 3-clause BSD license for both academic and industrial use. ODDT’s source code, additional examples and documentation are available on GitHub.
To install ODDT on a Mac you first need to install the appropriate toolkits. The easiest way is to use Homebrew, and I've written a page detailing how to do this here.
Once they are installed you can install ODDT using pip as described here.
Swift 2.0
More news on Swift 2.0 on the Swift Blog
Today at WWDC, we announced Swift 2.0. This new version has even better performance, a new error handling API, and first-class support for availability checking. And platform APIs feel even more natural in Swift with enhancements to the Apple SDKs.
Open Source
In addition to new features, the big news is that Apple will be making Swift open source later this year. We are all incredibly excited about this, and look forward to giving you a lot more information as the open source release gets nearer. Here is what we can tell you so far:
- Swift source code will be released under an OSI-approved permissive license.
- Contributions from the community will be accepted and encouraged.
- At launch we intend to contribute ports for OS X, iOS, and Linux.
- Source code will include the Swift compiler and standard library.
- We think it would be amazing for Swift to be on all your favorite platforms.
We are excited about the opportunities an open source Swift creates for our industry. Baked-in safety features combined with excellent speed mean it has the chance to dramatically improve software versus using C-based languages. Swift is packed with modern features, it’s fun to write, and we believe it will get used in a lot of places. Together, we have an exciting road ahead.
Swift Open Source
SpeckTackle
The latest issue of Journal of Cheminformatics has a paper that might be of interest to a variety of people involved in spectroscopy or data visualisation: SpeckTackle: JavaScript charts for spectroscopy.
We present SpeckTackle, a custom-tailored JavaScript charting library for spectroscopy in life sciences. SpeckTackle is cross-browser compatible and easy to integrate into existing resources, as we demonstrate for the MetaboLights database. Its default chart types cover common visualisation tasks following the de facto ‘look and feel’ standards for spectra visualisation.
SpeckTackle is an open-source JavaScript library to create custom-tailored charts for spectroscopy in life sciences. Implemented charts exist for mass spectrometry, one- and two-dimensional NMR, UV/VIS, IR, and general continuous data use cases such as chromatograms.
The authors kindly supply a demo web page demonstrating different chart types and functions of the SpeckTackle library. Example data is embedded in the web page (800 kb file size). Click on the buttons at the top of the page to see the data displayed. For the Chromatogram, Difference Chart and Spectral Match click the button then the Add Data button.
Highlighting a section of the spectra expands the view, and mousing over the 2D NMR spectra provides a tooltip giving chemical shifts.
I've added this to the spectroscopy resources page
HackaMol: An Object-Oriented Modern Perl Library for Molecular Hacking on Multiple Scales
To be honest I can't remember when I last used Perl but this publication brought back a few memories DOI.
HackaMol is an open source, object-oriented toolkit written in Modern Perl that organizes atoms within molecules and provides chemically intuitive attributes and methods.
Source code and example scripts are available online at http://github.com/demianriccardi/HackaMol. There is also a description of an IPerl Notebook in the supporting information.
There is also a very interesting extension HackaMol::X::Vina, a structured class that provides an interface with the AutoDock Vina docking program
Open Phacts API update
The Open PHACTS API has been updated to include two new data sets and the corresponding API calls.
1) DisGeNet target-disease associations. These API calls use URI inputs that correspond to either diseases or targets (proteins or genes). The disease identifiers correspond to UMLS CUIs, MeSH IDs or ConceptWiki and can use several namespaces, e.g. http://linkedlifedata.com/resource/umls/id/C0004238, http://purl.bioontology.org/ontology/MSH/D001281, or http://www.conceptwiki.org/concept/index/095cb66f-76ef-41b5-a8ae-c39352e6007e
2) neXtProt nanopublications for tissue expression (PREVIEW mode) These API calls use URIs that correspond to either tissues or targets. The tissue identifiers correspond to the Caloha tissue ontology from neXtProt. These identifiers can use either the namespace from the neXtProt database (e.g. http://www.nextprot.org/db/term/TS-0564, will be operational next week) or the Caloha ontology (ftp://ftp.nextprot.org/pub/currentrelease/controlledvocabularies/caloha.obo#TS-0564, operational now).
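Because the disease URIs can come from several namespaces, client code may want to work out which identifier scheme a given URI uses before passing it on. The sketch below is a hypothetical helper, not part of the Open PHACTS API; it assumes the namespace is everything up to the final identifier in the three example disease URIs quoted above.

```python
# Namespace prefixes inferred from the example disease URIs (an assumption).
DISEASE_NAMESPACES = {
    "http://linkedlifedata.com/resource/umls/id/": "UMLS CUI",
    "http://purl.bioontology.org/ontology/MSH/": "MeSH",
    "http://www.conceptwiki.org/concept/index/": "ConceptWiki",
}


def split_disease_uri(uri):
    """Return (scheme, identifier) for a disease URI in a known namespace.

    Hypothetical helper for illustration; raises ValueError for other URIs.
    """
    for prefix, scheme in DISEASE_NAMESPACES.items():
        if uri.startswith(prefix):
            return scheme, uri[len(prefix):]
    raise ValueError("unrecognised disease namespace: " + uri)
```

For instance, the UMLS example URI splits into the scheme "UMLS CUI" and the identifier "C0004238".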
To reduce the barriers to drug discovery in industry, academia and for small businesses, the Open PHACTS Discovery Platform provides tools and services to interact with multiple integrated and publicly available data sources. To integrate this data, extensive cross-referencing of scientific concepts is needed across all databases.
Canonical SMILES
I’m a great fan of SMILES notation (simplified molecular-input line-entry system) as a compact means of storing chemical structures, and whilst there are many tools for creating SMILES strings, they often give different (but acceptable) results for the same molecule. Various algorithms for generating Canonical SMILES have been developed, including those by Daylight Chemical Information Systems, OpenEye Scientific Software, MEDIT, Chemical Computing Group and MolSoft LLC, but all use proprietary code. In the latest issue of Journal of Cheminformatics, Noel O’Boyle describes the development of Universal SMILES and Inchified SMILES as implemented in Open Babel, an open source cheminformatics toolkit. DOI