Macs in Chemistry

Insanely Great Science

Programming

Extending Jupyter

I'm a great fan of Jupyter notebooks and I'm always looking for ways to get more out of them. I recently came across this blog post, which is packed with useful tips.

99 ways to extend the Jupyter ecosystem

Whenever someone says ‘You can do that with an extension’ in the Jupyter ecosystem, it is often not clear what kind of extension they are talking about. The Jupyter ecosystem is very modular and extensible, so there are lots of ways to extend it. This blog post aims to provide a quick summary of the most common ways to extend Jupyter, and links to help you explore the extension ecosystem.

I've also published some notebooks under Tips and Tutorials, Jupyter notebooks


Autocompletion with deep learning

This looks really interesting

TabNine is an autocompleter that helps you write code faster by adding a deep learning model which significantly improves suggestion quality. You can see videos at the link above.

There has been a lot of hype about deep learning in the past few years. Neural networks are state-of-the-art in many academic domains, and they have been deployed in production for tasks such as autonomous driving, speech synthesis, and adding dog ears to human faces. Yet developer tools have been slow to benefit from these advances

Deep TabNine is trained on around 2 million files from GitHub. During training, its goal is to predict each token given the tokens that come before it. To achieve this goal, it learns complex behaviour, such as type inference in dynamically typed languages.

An interesting idea, my only concern is the quality of code in the training set.

Swift for TensorFlow Models

This repository contains TensorFlow models written in Swift.

Swift for TensorFlow is a next-generation platform for machine learning, incorporating the latest research across machine learning, compilers, differentiable programming, systems design, and beyond. This is an early-stage project: it is not feature-complete nor production-ready, but it is ready for pioneers to try in projects, give feedback, and help shape the future!

This is the second public release of Swift for TensorFlow, available across Google Colaboratory, Linux, and macOS.


Chemfiles 0.9

Just got this message

We are very happy to announce the release of the 0.9 version of Chemfiles. Chemfiles is a C++ library providing write and read access to chemistry file formats. Chemfiles also has bindings to other languages and can be used from C, Fortran, Python, Julia and Rust.

The source code is available on GitHub, and the library is described in detail here DOI.

It can be installed using Conda

conda install -c conda-forge chemfiles

There are other libraries for file conversion, in particular OpenBabel, a C++ library providing conversions between more than 110 formats.


Python Collection

I was sent this recently.

Python Collection

This collection publishes articles describing new Python modules and libraries, as well as applications developed in Python. Python is a free, open source programming language with an emphasis on readability which is widely used in science due to its ease of use and high-performance. Python’s usefulness in research is further bolstered by scientific libraries and tools such as Numpy, Scipy, Pandas, IPython and MatPlotlib. As for example demonstrated by Biopython, Python libraries can be incredibly valuable to other researchers. Publishing a citable, peer reviewed article outlining a new package boosts its visibility and enables its creators to receive proper credit for their contribution.

Very little there at present but I'll keep an eye on it for the future.


Swift 5 Released

Swift 5 is a major milestone in the evolution of the language.

Swift 5 makes shipping apps dramatically better. The Swift runtime is now built right in to iOS, macOS, watchOS and tvOS. Your app no longer needs to bundle this library for these latest OS releases. And with great App Store support, your users will get faster downloads and smaller apps.

Additional Features in Swift 5

  • String reimplemented with UTF-8 encoding which can often result in faster code
  • Exclusive access to memory is now enforced by default on debug and release builds
  • SIMD Vector and Result types added to the Standard Library
  • Performance improvements to Dictionary and Set
  • Support for dynamically callable types to improve interoperability with dynamic languages such as Python, JavaScript and Ruby


BBEdit Updated

BBEdit 12.6 has been released and this is a very significant update. BBEdit is now a sandboxed application, which means there are a number of changes to the way permissions are handled.

It is well worth reading the Release Notes which offer a very detailed explanation of the situation.

Without unrestricted access to your files and folders, many of BBEdit’s most useful features, from the basic to the most powerful, won't work at all; or they may misbehave in unexpected ways. At the very least, this hinders your ability to get work done.

In order to resolve this fundamental conflict between security and usability, we have devised a solution in which BBEdit requests that you permit it the same sort of access to your files and folders that would be available to a non-sandboxed version.

There are also many additions, changes and fixes.

Programming Languages for Chemical Information

This looks like it should be well worth bookmarking.

https://www.biomedcentral.com/collections/programming-languages

This thematic series comprises a set of invited papers, each one describing the use of a single language for the development of cheminformatics software that implement algorithms and analyses and aims to cover a variety of language paradigms. The issue will be rolling, such that as papers on new languages are submitted they will be automatically added to this issue.

The first article DOI is by Kevin Theisen (of ChemDoodle fame), reviewing HTML5/JavaScript. Apparently there have been more lines of JavaScript written than of all other programming languages combined, so it seems appropriate as a kick-off article.


A few thoughts on scientific software

When I wrote "A few thoughts on scientific software" I was somewhat surprised by the interest and the amount of feedback I got. I've since added two more pages based on the feedback:

A listing of open-source cheminformatics toolkits and Open Source Python Data Science Libraries.

If you have any other suggestions feel free to let me know.


Making a Random Selection

Sometimes it is the simplest scripts that prove to be the most useful; the most downloaded AppleScript on the site is the one that simply prints the text on the clipboard.

I regularly need to select a specified number of molecules at random, and this script does just that. Import an SD file containing structures into Vortex and run the script to make a random selection.

Full details here….
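For readers without Vortex, the same idea can be sketched in plain Python. This is only an illustration and not the Vortex script itself: the function names and the tiny stand-in records are hypothetical, and the only thing it relies on is that each record in an SD file ends with a "$$$$" line.

```python
import random


def split_sd_records(sd_text):
    """Split the text of an SD file into one string per molecule record."""
    records = []
    current = []
    for line in sd_text.splitlines():
        current.append(line)
        if line.strip() == "$$$$":  # "$$$$" terminates each record in an SD file
            records.append("\n".join(current) + "\n")
            current = []
    return records


def random_selection(records, n, seed=None):
    """Return n records chosen at random, without replacement."""
    rng = random.Random(seed)  # passing a seed makes the selection reproducible
    return rng.sample(records, n)


# Made-up stand-in for an SD file: three skeletal records, names only
sd_text = "mol_A\n\n\n$$$$\nmol_B\n\n\n$$$$\nmol_C\n\n\n$$$$\n"
records = split_sd_records(sd_text)
picked = random_selection(records, 2, seed=42)
```

Sampling without replacement means no molecule is picked twice, and asking for more molecules than the file contains raises a ValueError rather than silently returning fewer.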


How to contribute to RDKit

I just noticed that Greg Landrum has posted a page on how to contribute to RDKit. https://github.com/rdkit/rdkit/wiki/HowToContribute.

There are many ways to contribute, and you don't have to be a Python or C++ developer: simply being an active user, asking questions and contributing solutions helps other users. Improving the documentation is always a great place for newcomers to start, particularly highlighting things that are not as clear as they could be.

I've also added the link to the Toolkits page.


Open Source Cheminformatics Toolkits

When I wrote the article entitled A few thoughts on scientific software, one of the responses I got was that people did not know about the existence of open-source chemistry toolkits, so I thought I'd publish a page that hopefully stops people reinventing the wheel. Here are four open-source toolkits that I'm aware of; if I've missed any, my apologies, and please send me details. Listing of Open-source cheminformatics toolkits


Deep Replay

This looks rather neat, Deep Replay

Deep Replay is a package designed to allow you to replay in a visual fashion the training process of a Deep Learning model in Keras.

To install Deep Replay just type:

pip install deepreplay

Fortran on a Mac update

As I've noted on several occasions I'm not a big Fortran user but looking at the website stats the Fortran on a Mac page is now the third most regularly read page on the site and page views seem to be increasing.

I was recently sent a new link and I have added it to the Fortran on a Mac page.

Sourcery Institute provides a variety of resources for Fortran programmers: the Sourcery Institute tap for Homebrew formulae not in homebrew/homebrew-core, a Coarray Fortran Jupyter notebook kernel, forks of flang and gcc, and OpenCoarrays, a transport layer for coarray Fortran compilers.

A few thoughts on scientific software

Whilst this website is aimed at providing a resource for Mac-using chemists, regular readers will know that much of the content is platform agnostic and includes much code and software that will be of interest to all scientists.

I recently got a rather sad email

It seems that Third Street Software quietly disappeared, breaking the syncing for Sente (reference management).

I've also heard about a couple of other smaller software developers who are finding life very tough, and it started me thinking about the status of scientific software. After exchanging emails with a number of people in the industry (many thanks for their input) I thought I'd collect a few thoughts on my blog.

You can read it here https://www.macinchem.org/reviews/scientificsoftware/software.php.

Accessing a Jupyter Notebook HERG model from Vortex

A recent paper "The Catch-22 of Predicting hERG Blockade Using Publicly Accessible Bioactivity Data" DOI described a classification model for HERG activity. I was delighted to see that all the datasets used in the study, including the training and external datasets, and the models generated using these datasets were provided as individual data files (CSV) and Python Jupyter notebooks, respectively, on GitHub (https://github.com/AGPreissner/Publications).

The models were downloaded and the Random Forest Jupyter Notebooks (using RDKit) modified to save the generated model using pickle to store the predictive model, and then another Jupyter notebook was created to access the model without the need to rebuild the model each time. This notebook was exported as a python script to allow command line access, and Vortex scripts created that allow the user to run the model within Vortex and import the results and view the most significant features.
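The save-once, load-many pattern described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code from the notebooks: the dictionary is a hypothetical stand-in for the fitted Random Forest, and the helper names and file name are made up for the example.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialise a fitted model to disk so it need not be rebuilt each time."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load a previously pickled model. Only unpickle files you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict(model, values):
    """Label each value 1 if it reaches the model's threshold, else 0."""
    return [1 if v >= model["threshold"] else 0 for v in values]


# Hypothetical stand-in for a trained classifier: any picklable object works
model = {"name": "stand-in model", "threshold": 0.5}

model_path = Path(tempfile.mkdtemp()) / "model.pkl"
save_model(model, model_path)          # done once, after training
restored = load_model(model_path)      # done in every later session or script
predictions = predict(restored, [0.2, 0.5, 0.9])
```

A real scikit-learn model would be saved and loaded the same way, provided the same library versions are available when the file is read back.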

All models and scripts are available for download.

Full details are here…

Scaling Python with Dask webinar

This looks to be an interesting webinar on Dask

https://know.anaconda.com/Scaling-Python-Dask-Webinar.html Wednesday, May 30th at 2:00PM CDT.

Dask is a flexible parallel computing library for analytic computing.

Dask is composed of two components:

  • Dynamic task scheduling optimized for computation. This is similar to Airflow, Luigi, Celery, or Make, but optimized for interactive computational workloads.
  • “Big Data” collections like parallel arrays, dataframes, and lists that extend common interfaces like NumPy, Pandas, or Python iterators to larger-than-memory or distributed environments. These parallel collections run on top of the dynamic task schedulers.

Implementing AB-MPS scoring

Whilst the rule of 5 (Ro5) has provided a useful way to describe small molecule drug space, it is also clear that there are a significant number of molecular classes that exist beyond the rule of 5 boundaries (bRo5). In a review of the AbbVie compound collection DOI, the authors identified key findings that might explain the success (or failure) of bRo5 projects. From an analysis of a variety of calculated physicochemical properties they devised a simple multiparametric scoring function (AB-MPS) that correlated preclinical PK results with cLogD, number of rotatable bonds (NRB), and number of aromatic rings (NAR).

AB-MPS = Abs(cLogD-3) + NAR + NRB

Now implemented as a Vortex script.
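The formula is simple enough to write out directly. The sketch below is a plain Python illustration rather than the Vortex script; it assumes cLogD, NAR and NRB have already been calculated elsewhere, and the function name and example values are made up.

```python
def ab_mps(clogd, n_aromatic_rings, n_rotatable_bonds):
    """AB-MPS = Abs(cLogD - 3) + NAR + NRB."""
    return abs(clogd - 3) + n_aromatic_rings + n_rotatable_bonds


# Made-up property values for a single hypothetical compound
score = ab_mps(clogd=4.5, n_aromatic_rings=3, n_rotatable_bonds=8)
```

For these values the three terms are 1.5, 3 and 8, giving a score of 12.5.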


Intel® Distribution for Python

Anyone fancy taking this for a test drive and providing some information on performance?

Get real performance results and download the free Intel Distribution for Python that includes everything you need for blazing-fast computing, analytics, machine learning, and more. Use Intel Python with existing code, and you’re all set for a significant performance boost.

The core computing packages, Numpy, SciPy, and scikit-learn, are accelerated under the hood with powerful, multithreaded native performance libraries such as Intel® Math Kernel Library, Intel® Data Analytics Acceleration Library, and others, to deliver native code-like performance results to Python. We leverage Intel® hardware capabilities using multiple cores and the latest Intel® Advanced Vector Extensions (Intel® AVX) instructions, including Intel® AVX-512. The Intel Python team reimplemented select algorithms to dramatically improve their performance. Examples include NumPy FFT and random number generation, SciPy FFT, and more.

Available for Windows, Linux and macOS.

Minimum System Requirements

  • Processors: Intel Atom® processor or Intel® Core™ i3 processor
  • Disk space: 1 GB
  • Operating systems: Windows* 7 or later, macOS, and Linux
  • Python* versions: 2.7.X, 3.5.X, 3.6
  • Included development tools: Conda, conda-env, Jupyter Notebook (IPython)

Google Summer of Code, Open Chemistry Projects

The details of some of the projects taking part in the Google Summer of Code are now online here https://summerofcode.withgoogle.com/organizations/6513013473935360/ under the Open Chemistry header.

Really interesting work includes 3-D coordinate generation, standardising fingerprint APIs, a framework for molecular validation and standardization, and molecular dynamics in Avogadro.

Good luck to all that are taking part!!


Jupyter and Fortran

Well, after my last post about Swift and Jupyter, a reader sent me a link to the use of both the Julia and Fortran programming languages in a Jupyter Notebook.

More information in this lecture Project Jupyter: Architecture and Evolution of an Open Platform for Modern Data Science by Fernando Perez.

Project Jupyter, evolved from the IPython environment, provides a platform for interactive computing that is widely used today in research, education, journalism and industry. The core premise of the Jupyter architecture is to provide tools for human-in-the-loop interactive computing. It provides protocols, file formats, libraries and user-facing tools optimized for the task of humans interactively exploring problems with the aid of a computer, combining natural and programming languages in a common computational narrative.


Swift 4.1 in a Jupyter Notebook

I'm a great fan of Jupyter Notebooks but I only ever use Python.

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text

A recent post by Ray Yamamoto Hilton caught my eye: he put together a little experiment to demonstrate using Swift 4.1 from within Jupyter Notebooks.

You can download a demo notebook here.

RDKit code changes

I just saw this on the RDKit email circulation list and since I know a number of readers use RDKit I thought I'd mention it.

When we do the beta for the 2018.03.1 release we're going to switch the C++ backend to use modern C++ (=C++11). For people who can't switch to use that code, we will continue to provide bug fixes for the 2017.09 release for at least another 6 months.

This should only affect people who need to build the RDKit C++ code themselves. If you use a binary version of the RDKit like the ones available inside of Anaconda Python or KNIME, this change should have no impact upon you.

It looks like we're almost there. Hopefully we will be able to do a beta of the 2018.03 release by the end of the week.


Updated Literature search script

I've updated the Vortex script to run text based queries of PubMed.

If you regularly use the E-utilities API you might want to read this.

After May 1, 2018, NCBI will limit your access to the E-utilities unless you have one of these keys. Obtaining an API key is quick, and simple, and will allow you to access NCBI data faster. If you don’t have an API key, E-utilities will still work, but you may be limited to fewer requests than allowed with an API key.

After May 1, 2018, any computer (IP address) that submits more than 3 E-utility requests per second will receive an error message. This limit applies to any combination of requests to EInfo, ESearch, ESummary, EFetch, ELink, EPost, ESpell, and EGquery.

If you write software or scripts that access the E-utilities API then the users will need to get their own API key. Calls will have this format:

https://www.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=pubmed&api_key=ABCD123

I've updated this script to reflect this change, and I've highlighted where you need to add your api key in the script. I've also tried to ensure that any query string should be encoded to make it URL safe and I've extended the search range up to 2018.
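As an illustration of the URL-safe encoding, here is a minimal Python sketch that builds an ESearch call in the format shown above. It is not the Vortex script: the function name and the query are made up, ABCD123 is the placeholder key from the example URL, and the only E-utilities parameters used are db, term and api_key.

```python
from urllib.parse import urlencode

# Base address taken from the example call above
EUTILS_BASE = "https://www.ncbi.nlm.nih.gov/entrez/eutils/"


def build_esearch_url(term, api_key, db="pubmed"):
    """Return an ESearch URL with every parameter percent-encoded."""
    params = {"db": db, "term": term, "api_key": api_key}
    # urlencode turns spaces into "+" and escapes characters such as "(" and ")"
    return EUTILS_BASE + "esearch.fcgi?" + urlencode(params)


url = build_esearch_url("hERG AND (blockade OR inhibitor)", api_key="ABCD123")
```

Letting urlencode handle the escaping avoids hand-built query strings breaking when a search term contains spaces, brackets or ampersands.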

Rodeo: A Python IDE for Data Scientists

Just added Rodeo, a Python IDE built for analysing data, to the page of data analysis tools.

Flagging Potential Kinase Inhibitors

Most kinase inhibitors bind in the region of the ATP binding site using the hydrogen bonding interactions of the hinge region shown in the schematic below. We can use the knowledge of these hinge binding motifs to flag potential kinase inhibitors.

[Schematic of the ATP binding site and hinge region]

Read more ….


Top 20 programming languages

RedMonk have published their Programming Language Rankings. The data source used for these queries is the GitHub Archive.

  1. JavaScript
  2. Java
  3. Python
  4. PHP
  5. C#
  6. C++
  7. CSS
  8. Ruby
  9. C
  10. Swift
  11. Objective-C
  12. Shell
  13. R
  14. TypeScript
  15. Scala
  16. Go
  17. PowerShell
  18. Perl
  19. Haskell
  20. Lua

Swift (+1): Finally, the apprentice is now the master. Technically, this isn’t entirely accurate, as Swift merely tied the language it effectively replaced – Objective C – rather than passing it. Still, it’s difficult to view this run as anything but a changing of the guard. Apple’s support for Objective C and the consequent opportunities it created via the iOS platform have kept the language in a high profile role almost as long as we’ve been doing these rankings. Even as Swift grew at an incredible rate, Objective C’s history kept it out in front of its replacement. Eventually, however, the trajectories had to intersect, and this quarter’s run is the first occasion in which this has happened. In a world in which it’s incredibly difficult to break into the Top 25 of language rankings, let alone the Top 10, Swift managed the chore in less than four years. It remains a growth phenomenon, even if its ability to penetrate the server side has not met expectations.


Script Debugger 7 released

A new version of Script Debugger has been released.

Script Debugger is an integrated development environment focused entirely on AppleScript. This focus allows it to deliver a suite of tools that make AppleScript development amazingly productive. You can use it to write and edit code, analyze target applications, debug scripts, and more.

Microsoft Quantum Development Kit Samples and Libraries under MacOSX

Well this is well out of my comfort zone but I thought I'd mention it.

Welcome to the Microsoft Quantum Development Kit! This repository contains the libraries and samples provided with the Quantum Development Kit https://github.com/microsoft/quantum.

The Microsoft Quantum Development Kit has been tested under MacOSX, Ubuntu Linux, but may work on other distributions. The Python interoperability feature has been developed for the Anaconda distribution of Python 3.6. Please see the README file provided with the Python sample for more details

Thank you for your interest in Microsoft Quantum Development Kit preview. The development kit contains the tools you'll need to build your own quantum computing programs and experiments.

So off you go…..


Google Summer of Code:- Open Chemistry

There are a number of interesting projects being undertaken in this year's Google Summer of Code.

If you know of any students that might be interested then perhaps point them to the Open Chemistry Project.

The Open Chemistry project is a collection of open source, cross platform libraries and applications for the exploration, analysis and generation of chemical data. The organization is an umbrella of leading projects developed by long-time collaborators and innovators in open chemistry such as the Avogadro, Open Babel, and cclib projects. These three alone have been downloaded over 700,000 times and cited in over 2,000 academic papers. Our goal is to improve the state of the art, and facilitate the open exchange of chemical data and ideas while utilizing the best technologies from quantum chemistry codes, molecular dynamics, informatics, analytics, and visualization.

There is a list of the GSoC Ideas 2018 here but of course students can add their own.


Awesome Python Chemistry

A curated list of awesome Python frameworks, libraries, software and resources related to Chemistry.

https://github.com/lmmentel/awesome-python-chemistry

A blog post giving more details http://lukaszmentel.com/blog/awesome-python-chemistry/index.html.


Google summer of code chemistry ideas

The Open Chemistry project has collected together project ideas for GSoC 2018. The ideas cover a wide range of topics in chemistry.

The full listing is available here and includes projects that make use of a number of open source toolkits such as Open Babel, RDKit and cclib.


MayaChem Tools

MayaChemTools is a fabulous collection of Perl and Python scripts, modules, and classes to support a variety of day-to-day computational discovery needs.

The core set of command line Perl scripts available in the current release of MayaChemTools has no external dependencies and provide functionality for the following tasks:

  • Manipulation and analysis of data in SD, CSV/TSV, sequence/alignments, and PDB files
  • Listing information about data in SD, CSV/TSV, Sequence/Alignments, PDB, and fingerprints files
  • Calculation of a key set of physicochemical properties, such as molecular weight, hydrogen bond donors and acceptors, logP, and topological polar surface area
  • Generation of 2D fingerprints corresponding to atom neighborhoods, atom types, E-state indices, extended connectivity, MACCS keys, path lengths, topological atom pairs, topological atom triplets, topological atom torsions, topological pharmacophore atom pairs, and topological pharmacophore atom triplets
  • Generation of 2D fingerprints with atom types corresponding to atomic invariants, DREIDING, E-state, functional class, MMFF94, SLogP, SYBYL, TPSA and UFF
  • Similarity searching and calculation of similarity matrices using available 2D fingerprints
  • Listing properties of elements in the periodic table, amino acids, and nucleic acids
  • Exporting data from relational database tables into text files

The command line Python scripts based on RDKit provide functionality for the following tasks:

  • Calculation of molecular descriptors
  • Comparison of 3D molecules based on RMSD and shape
  • Conversion between different molecular file formats
  • Enumeration of compound libraries and stereoisomers
  • Filtering molecules using SMARTS, PAINS, and names of functional groups
  • Generation of graph and atomic molecular frameworks
  • Generation of images for molecules
  • Performing structure minimization and conformation generation based on distance geometry and forcefields
  • Picking and clustering molecules based on 2D fingerprints and various clustering methodologies
  • Removal of duplicate molecules

These invaluable scripts can be used in other applications; I've written a Vortex Script that uses them.


Fortran on a Mac

I was sent a few updates over the Christmas break and so I've updated the Fortran on a Mac page.


Behind the Scenes in Real-Life Software Design By Stephen_Wolfram · 48 videos

I just stumbled across a fascinating series of lectures. These are recordings of the live discussions behind the ongoing software development led by Stephen Wolfram.

Of particular interest might be the discussion on incorporating chemistry into the Wolfram language.

https://www.twitch.tv/videos/181269427?collection=F82InZg17BQFzw.


Python and compchem

Python seems to be becoming the lingua franca for scientific scripting/programming and it is perhaps not surprising that we now see increasing support for computational chemistry.

Chemtools is a set of modules that is intended to help with more advanced computations using common electronic structure methods/programs. Currently there is some limited support for the Gamess-US and MolPro program packages but other codes can be easily interfaced. It requires:

  • Python (works with 2.7.x and 3.x)
  • numba
  • numpy
  • mendeleev
  • scipy
  • setuptools

Chemtools is NOT hosted on PyPI yet but it can be installed by pip from the Bitbucket repository with:

pip install https://bitbucket.org/lukaszmentel/chemtools/get/tip.tar.gz

Pygamess is a GAMESS wrapper for Python. It requires:

  • Python 2.6 or later (3.x is not supported)
  • RDKit
  • GAMESS

It can be installed using pip

pip install pygamess

Usage

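Single point calculation with RDKit:

from pygamess import Gamess
from rdkit import Chem
from rdkit.Chem import AllChem

# Build ethane from SMILES and add explicit hydrogens
m = Chem.MolFromSmiles("CC")
m = Chem.AddHs(m)

# Generate 3D coordinates, then relax them with the UFF force field
AllChem.EmbedMolecule(m)                      # returned 0 in the original session
AllChem.UFFOptimizeMolecule(m, maxIters=200)  # returned 0 in the original session

# Run GAMESS on the molecule and read back the total energy
g = Gamess()
nm = g.run(m)
nm.GetProp("total_energy")                    # '-78.302511990200003'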

PyQuante (Python Quantum Chemistry) is an open-source suite of programs for developing quantum chemistry methods. It currently supports:

  • Hartree-Fock: Restricted closed-shell HF and unrestricted open-shell HF;
  • DFT: LDA (SVWN, Xalpha) and GGA (BLYP) functionals;
  • Optimized-effective potential DFT;
  • Two electron integrals computed using Huzinaga, Rys, or Head-Gordon/Pople techniques; C and Python interfaces to all of these programs;
  • MINDO/3 semiempirical energies and forces;
  • CI-Singles excited states;
  • DIIS convergence acceleration;
  • Second-order Moller-Plesset (MP2) perturbation theory.

cclib is an open source library, written in Python, for parsing and interpreting the results of computational chemistry packages. The goals of cclib are centered around the reuse of data obtained from these programs and contained in output files. The supported programs are:

  • ADF (versions 2007 and 2013)
  • DALTON (versions 2013 and 2015)
  • Firefly, formerly known as PC GAMESS (version 8.0)
  • GAMESS (US) (version 2012)
  • GAMESS-UK (version 7.0)
  • Gaussian (versions 03 and 09)
  • Jaguar (versions 7.0 and 8.3)
  • Molpro (versions 2006 and 2012)
  • NWChem (versions 6.0 and 6.5)
  • ORCA (versions 2.9 and 3.0)
  • Psi (versions 3.4 and 4.0)
  • Q-Chem (version 4.2)

FragBuilder is a tool to create, set up and analyse QM calculations on peptides. DOI.

Update

And of course there is OpenBabel, which can be used to create input files for a variety of computational chemistry packages.

If I've missed anything please feel free to let me know.


Scripting PubMed searches

PubMed comprises more than 24 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites. A number of programming tools are also provided that allow access to the information: the E-utilities are a set of server-side programs that provide a stable interface into the Entrez query and database system.

To access these data, a piece of software first posts an E-utility URL to NCBI, then retrieves the results of this posting, after which it processes the data as required. The software can thus use any computer language that can send a URL to the E-utilities server and interpret the XML response; examples of such languages are Perl, Python, Java, and C++.
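The "interpret the XML response" step can be sketched in Python with the standard library. This is an illustration only: the XML below is a made-up sample laid out like an ESearch result (a Count element and an IdList of Id elements), the PubMed IDs are invented, and no request is actually sent.

```python
import xml.etree.ElementTree as ET

# Made-up sample laid out like an ESearch response; the IDs are not real records
sample_response = """<eSearchResult>
  <Count>2</Count>
  <RetMax>2</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
  </IdList>
</eSearchResult>"""


def parse_esearch(xml_text):
    """Return the total hit count and the list of IDs from an ESearch result."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count"))
    ids = [node.text for node in root.findall("./IdList/Id")]
    return count, ids


count, ids = parse_esearch(sample_response)
```

In a real script the text passed to parse_esearch would be the body returned by the E-utilities server rather than a hard-coded string.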

A while back I wrote a Vortex script that helps with this sort of search if you have multiple terms you want to search. I've updated this script to incorporate the changes requiring API keys to allow multiple requests to the E-utilities API, and I've highlighted where you need to add your own API key in the script. I've also tried to ensure that any query string is encoded to make it URL safe.

The update is detailed more fully here….

Downloading from the RCSB Protein Data Bank using Python

The RCSB Protein Data Bank is an absolutely invaluable resource that provides archive-information about the 3D shapes of proteins, nucleic acids, and complex assemblies that helps scientists understand all aspects of biomedicine and agriculture, from protein synthesis to health and disease. Currently the PDB contains over 134,000 data files containing structural information on 42547 distinct protein sequences of which 37600 are human sequences. They also provide a series of tools to search, view and analyse the data.

Downloading an individual PDB file is pretty trivial and can be done from the web page. They also provide a Download Tool, launched as a stand-alone application using the Java Web Start protocol. The tool is downloaded locally and must then be opened. I've found this a little temperamental and had issues with Java versions and security settings.

Since I've been making extensive use of the web services to interact with RCSB I decided to explore the use of Python to download multiple files. I started off creating a Jupyter notebook using the web services provided by RCSB.

I've also used variations on this code to create a python script and a Vortex script.

Full details are here …
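To give a flavour of the approach, here is a minimal Python sketch for fetching several files in one go. It is not the notebook code: the function names are made up, and it assumes the RCSB file download address pattern https://files.rcsb.org/download/ followed by the four-character ID and ".pdb", so check the current RCSB documentation before relying on it.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumed download address pattern; not taken from the original notebook
RCSB_DOWNLOAD_BASE = "https://files.rcsb.org/download/"


def pdb_file_url(pdb_id):
    """Build the download URL for one PDB entry, e.g. '4hhb' gives .../4HHB.pdb."""
    return f"{RCSB_DOWNLOAD_BASE}{pdb_id.strip().upper()}.pdb"


def download_pdb_files(pdb_ids, target_dir="."):
    """Download each entry into target_dir and return the saved file paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for pdb_id in pdb_ids:
        destination = target / f"{pdb_id.strip().upper()}.pdb"
        urlretrieve(pdb_file_url(pdb_id), destination)  # needs network access
        saved.append(destination)
    return saved


# Building the URLs needs no network access; these IDs are just example input
urls = [pdb_file_url(code) for code in ["1abc", " 4hhb "]]
```

Calling download_pdb_files with a list of IDs then replaces clicking through the web page for each structure.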


New API Keys for the E-utilities

If you regularly use the E-utilities API you might want to read this.

After May 1, 2018, NCBI will limit your access to the E-utilities unless you have one of these keys. Obtaining an API key is quick, and simple, and will allow you to access NCBI data faster. If you don’t have an API key, E-utilities will still work, but you may be limited to fewer requests than allowed with an API key.

After May 1, 2018, any computer (IP address) that submits more than 3 E-utility requests per second will receive an error message. This limit applies to any combination of requests to EInfo, ESearch, ESummary, EFetch, ELink, EPost, ESpell, and EGquery.

If you write software or scripts that access the E-utilities API then the users will need to get their own API key. Calls will have this format:

https://www.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=pubmed&api_key=ABCD123
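A script that loops over many queries can stay under the 3 requests per second limit by pausing between calls. The sketch below is one simple way to do that in Python, assuming evenly spaced requests are acceptable; the class name is made up, nothing here comes from NCBI, and the fake clock is only there so the example runs instantly.

```python
import time


class RequestThrottle:
    """Space calls evenly so that roughly max_per_second are started each second."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


# Demonstration with a fake clock, so no real waiting takes place
fake_now = [0.0]
sleeps = []


def fake_sleep(seconds):
    sleeps.append(seconds)
    fake_now[0] += seconds


throttle = RequestThrottle(max_per_second=3, clock=lambda: fake_now[0], sleep=fake_sleep)
for _ in range(4):
    throttle.wait()  # a real script would send one E-utilities request here
```

In real use the clock and sleep arguments are left at their defaults, and choosing a slightly lower max_per_second leaves some headroom below the limit.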

2017 Wolfram Technology Conference

An interesting blog entry on the recent 2017 Wolfram Technology Conference. This is a unique experience where researchers and professionals interacted directly with those who build each component of the Wolfram technology

I particularly like this comment.

It was not uncommon for software engineers or physicists to glean new tricks and tools from a social scientist or English teacher—or vice versa—a testament to the diversity and wide range of cutting-edge uses Wolfram technologies provide.

The blog entry is well worth a read.

Delighted to see the ubiquitous presence of MacBook Pros!


Comments

Interacting with the RCSB Protein Data Bank

 

The RCSB Protein Data Bank is an absolutely invaluable resource that provides archive-information about the 3D shapes of proteins, nucleic acids, and complex assemblies that helps scientists understand all aspects of biomedicine and agriculture, from protein synthesis to health and disease. Currently the PDB contains over 134,000 data files containing structural information on 42547 distinct protein sequences of which 37600 are human sequences. They also provide a series of tools to search, view and analyse the data.

The latest addition to the Hints and Tutorials page is a couple of Vortex scripts for interacting with the RCSB Protein Data Bank. Specifically, they search for PDB structures associated with a list of UniProt codes, and then search for associated information. Read more here…

Comments

Xcode 9.0 is available for download

 

The new version of Xcode is available for download. Xcode 9.0 includes Swift 4 and SDKs for iOS 11, watchOS 4, tvOS 11 and macOS High Sierra 10.13.

  • The source code editor has been completely rebuilt for amazing speed. It scrolls at a constantly smooth rate, no matter the file size, and it also supports Markdown.
  • Refactoring to easily select and modify structure of code
  • Swift 4 compiler can also compile Swift 3 to aid transition
  • Xcode 9 makes working with source control – and with GitHub – easier and more tightly integrated.
  • Simulator app updated.

Comments

Fortran on a Mac

 

First let me say I’m not a big Fortran user but any blog posts about Fortran always seem to be very popular, and I do get asked regularly about how to compile Fortran applications.

So I've decided to gather together all the Fortran news, tips and resources onto a dedicated Fortran on a Mac page.

If you know of anything else that would be useful to include please let me know.


Comments

Weekend Reading

 

A couple of things for your weekend reading ;-)

When not to use deep learning

What makes Python super popular

Google's online Python Class

Machine Learning in Python tutorial


Comments

Clustering Update

 

I previously mentioned a comparison of various tools to cluster large datasets. I've now updated the Vortex script to allow the user to select the centroid of each cluster. I tried it on a clustered dataset of 4.3 million structures and the script only took a few seconds to run.

The page on clustering is here and the Vortex script can be downloaded here http://macinchem.org/reviews/vortex_scripts/ChoseCentreFromClusters.vpy.zip.


Comments

Getting PDB information

 

A while back I published two scripts that use UniChem, a web resource provided by the EBI. It is a 'Unified Chemical Identifier' system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between multiple databases.

Chambers, J., Davies, M., Gaulton, A., Hersey, A., Velankar, S., Petryszak, R., Hastings, J., Bellis, L., McGlinchey, S. and Overington, J.P. UniChem: A Unified Chemical Structure Cross-Referencing and Identifier Tracking System. Journal of Cheminformatics 2013, 5:3 (January 2013). DOI: http://dx.doi.org/10.1186/1758-2946-5-3

The first script uses the ChEMBL ID to search for other identifiers; the second script allows more flexible searching using any of the identifiers available within UniChem. One of the identifiers returned is from the PDBe (Protein Data Bank in Europe) and represents the ID of the ligand in the PDB. Whilst this is interesting, it would also be very useful to have the identity of the crystal structures that contain the ligand. Fortunately PDBe provide a series of web services that can be used to interrogate the database, together with a really useful page to help build the calls.

Full details of the script are here.
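To give a flavour of what such a call might look like outside Vortex, here is a minimal Python sketch. The endpoint path and the shape of the response (a JSON object keyed by ligand ID with a list of entry IDs) are assumptions on my part and should be checked against the PDBe documentation; the function names and the sample payload are made up for illustration.

```python
import json
import urllib.request

# Assumed PDBe REST endpoint listing the PDB entries that contain a ligand;
# confirm the path against the PDBe API documentation.
PDBE_IN_PDB = "https://www.ebi.ac.uk/pdbe/api/pdb/compound/in_pdb/{ligand_id}"


def pdbe_in_pdb_url(ligand_id):
    """Build the lookup URL for a ligand (three-letter style) identifier."""
    return PDBE_IN_PDB.format(ligand_id=ligand_id.strip().upper())


def entries_from_response(payload, ligand_id):
    """Extract a sorted list of PDB entry IDs from a decoded response.

    Assumes the response is a JSON object keyed by ligand ID whose value
    is a list of entry IDs.
    """
    return sorted(payload.get(ligand_id.strip().upper(), []))


def fetch_entries(ligand_id):
    """Call the web service (needs network access) and return the entry IDs."""
    with urllib.request.urlopen(pdbe_in_pdb_url(ligand_id)) as response:
        payload = json.load(response)
    return entries_from_response(payload, ligand_id)


# Made-up payload in the assumed shape, used here instead of a live call
sample = json.loads('{"ATP": ["1b0u", "1a0i"]}')
print(entries_from_response(sample, "atp"))
```

Separating the parsing from the network call means the parsing can be checked against a saved response without hitting the server.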

There is a comprehensive listing of scripts, tips, jupyter notebooks etc here.


Comments

App Development with Swift

 

Apple Education has just announced a new app development curriculum designed to teach students how to start using Swift to create fully functional iPhone apps.

The course is available for free on iBooks and can be read on iPad, iPhone, and Mac.

This course is designed to teach you the skills needed to be an app developer capable of bringing your own ideas to life. Whether you’re new to coding or want to expand your skills, by the end of this course you should be able to build a fully functioning app of your own design.

The 900 page book is available here https://itunes.apple.com/gb/book/app-development-with-swift/id1219117996?


Comments

Functional Swift Conference 2017 slides available

 

The slides for the Functional Swift Conference 2017 are now online here http://2017.funswiftconf.com.

Comments

Working with MOL2 Structures in DataFrames

 

A great tutorial describing how to use BioPandas MOL2 DataFrames to analyze molecules conveniently.

The Tripos MOL2 format is a common format for working with small molecules.


Comments

Several ways of scripting Name to Structure

 

Too often I come across datasets that contain chemical names or identifiers but no actual molecular structure. Recently Dan at Dotmatics suggested I look at OPSIN. There are also several web services for converting names to structure, and I've highlighted a couple of options here and described three scripts that allow them to be used from within Vortex.
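As a small illustration of the web service route, the sketch below builds a request to the public OPSIN service and reads back a SMILES string. The service address and the ".smi" suffix are assumptions based on how I understand the service to work, the function names are hypothetical, and I assume an unparseable name is reported as an HTTP error.

```python
import urllib.error
import urllib.request
from urllib.parse import quote

# Assumed address of the public OPSIN web service; the suffix selects the output format.
OPSIN_BASE = "https://opsin.ch.cam.ac.uk/opsin/"


def opsin_url(name, output="smi"):
    """Build the request URL, percent-encoding spaces, commas and brackets in the name."""
    return f"{OPSIN_BASE}{quote(name, safe='')}.{output}"


def name_to_smiles(name):
    """Return the SMILES for a chemical name, or None if the request fails."""
    try:
        with urllib.request.urlopen(opsin_url(name)) as response:
            return response.read().decode("utf-8").strip()
    except urllib.error.HTTPError:
        # Assumption: names that cannot be parsed come back as an HTTP error
        return None


print(opsin_url("acetic acid"))
```

Percent-encoding the whole name matters because systematic names are full of characters such as commas and parentheses that are not safe in a URL path.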


There are many more scripts on the Hints and Tutorials Page.


Comments

Simply Fortran added to Fortran Resources

 

I just came across Simply Fortran, which includes the GNU Fortran compiler, an advanced development environment including project management, and an integrated debugger for fast and easy installation.

Simply Fortran is tested on OS X Snow Leopard through macOS Sierra.

Added to the Fortran on a Mac page.


Comments

Apple publish AI research

 

Apple have published some of their artificial intelligence research, arXiv:1612.07828

With recent progress in graphics, it has become more tractable to train models on synthetic images, potentially avoiding the need for expensive annotations. However, learning from synthetic images may not achieve the desired performance due to a gap between synthetic and real image distributions. To reduce this gap, we propose Simulated+Unsupervised (S+U) learning, where the task is to learn a model to improve the realism of a simulator's output using unlabeled real data, while preserving the annotation information from the simulator. We develop a method for S+U learning that uses an adversarial network similar to Generative Adversarial Networks (GANs), but with synthetic images as inputs instead of random vectors. We make several key modifications to the standard GAN algorithm to preserve annotations, avoid artifacts and stabilize training: (i) a 'self-regularization' term, (ii) a local adversarial loss, and (iii) updating the discriminator using a history of refined images. We show that this enables generation of highly realistic images, which we demonstrate both qualitatively and with a user study. We quantitatively evaluate the generated images by training models for gaze estimation and hand pose estimation. We show a significant improvement over using synthetic images, and achieve state-of-the-art results on the MPIIGaze dataset without any labeled real data.


Comments

Expressions

 

I've just been sent details of an app to aid generating regular expressions, Expressions. I use BBEdit for most of my regular expression searching but this looks a brilliant way to build the query.



Comments

gXXfortran

 

Every time I mention Fortran there is an uptick in the site views so I know there are plenty of readers with an interest in Fortran on a Mac.

There is an interesting post on the NextMove Software blog, Just what you wanted for Christmas – a compiler for Gaussian.

This package provides a “pgf77” script that emulates the Portland Group’s PGI fortran 77 compiler, instead using the Free Software Foundation’s GNU gfortran compiler instead. This emulation is sufficient to allow packages such as Gaussian03, that would otherwise require a commercial compiler, to be built using open source tools. In addition, this package also allows Gaussian03 to be built on a case-insensitive file system (such as when using Mac OS X, cygwin or a FAT32 drive) by overriding the behaviour of “cp” and “gau-cpp” such that they don’t cause problems when used by Gaussian’s build scripts on non case-sensitive file systems.

gXXfortran is available on GitHub. In theory, it should work for a standard Linux or Mac system.

However, as they don’t have access to the Gaussian source code they can't check it. Anyone out there care to try it?

See also the Fortran on a Mac page.


Comments

Swift Playgrounds

 

Swift Playgrounds is a revolutionary new app for iPad that makes learning Swift interactive and fun. Solve puzzles to master the basics using Swift, ideal for keeping occupied over the Christmas break.

Learning to code with Swift Playgrounds is incredibly engaging. The app comes with a complete set of Apple-designed lessons. Play your way through the basics in “Fundamentals of Swift” using real code to guide a character through a 3D world. Then move on to more advanced concepts.

There is a video here

I also notice that there have been a few updates to the Swift Algorithm Club; there are now 79 contributors and an ever-increasing list of algorithms.

All content is licensed under the terms of the MIT open source license.



Comments

Scriptarian: Scripting Studio for macOS

 

Scriptarian allows you to easily automate macOS using the Swift programming language, providing an alternative to AppleScript.

Scriptarian is built using Swift, the new open-source programming language developed by Apple. Scriptarian analyzes all of your installed applications for AppleScript support and dynamically generates native Swift interfaces for them.

In addition to full support for the Swift Standard Library, Scriptarian includes ScriptingKit, a scripting framework we built from the ground up with Swift in mind. It lets you communicate with any AppleScript-enabled app and even provides various utility functions for speech synthesis, sound playback, file management, process management, and more.


Comments

Objective-C id as Swift Any

 

The latest update on the Swift Blog describes one of the important changes in Swift 3.

In Swift 3, the id type in Objective-C now maps to the Any type in Swift, which describes a value of any type, whether a class, enum, struct, or any other Swift type.


Comments

Swift Algorithm Club

 

The Swift Algorithm Club is a new site that describes implementations of popular algorithms and data structures in Swift. As an added bonus, there are also detailed explanations of how they work. The list below gives an idea of what is available or under construction, and I’m sure they would be delighted to receive contributions.

The algorithms

Searching

  • Linear Search. Find an element in an array.
  • Binary Search. Quickly find elements in a sorted array.
  • Count Occurrences. Count how often a value appears in an array.
  • Select Minimum / Maximum. Find the minimum/maximum value in an array.
  • k-th Largest Element. Find the k-th largest element in an array, such as the median.
  • Selection Sampling. Randomly choose a bunch of items from a collection.
  • Union-Find. Keeps track of disjoint sets and lets you quickly merge them.

String Search

  • Brute-Force String Search. A naive method.
  • Boyer-Moore. A fast method to search for substrings. It skips ahead based on a look-up table, to avoid looking at every character in the text.
  • Knuth-Morris-Pratt
  • Rabin-Karp
  • Longest Common Subsequence. Find the longest sequence of characters that appear in the same order in both strings.

Sorting

It's fun to see how sorting algorithms work, but in practice you'll almost never have to provide your own sorting routines. Swift's own sort() is more than up to the job. But if you're curious, read on...

Basic sorts:

  • Insertion Sort
  • Selection Sort
  • Shell Sort

Fast sorts:

  • Quicksort
  • Merge Sort
  • Heap Sort

Special-purpose sorts:

  • Counting Sort
  • Radix Sort
  • Topological Sort

Bad sorting algorithms (don't use these!):

  • Bubble Sort

Compression

  • Run-Length Encoding (RLE). Store repeated values as a single byte and a count.
  • Huffman Coding. Store more common elements using a smaller number of bits.

Miscellaneous

  • Shuffle. Randomly rearranges the contents of an array.
  • Comb Sort. An improvement upon the Bubble Sort algorithm.

Mathematics

  • Greatest Common Divisor (GCD). Special bonus: the least common multiple.
  • Permutations and Combinations. Get your combinatorics on!
  • Shunting Yard Algorithm. Convert infix expressions to postfix.
  • Statistics

Machine learning

  • k-Means Clustering. Unsupervised classifier that partitions data into k clusters.
  • k-Nearest Neighbors
  • Linear Regression
  • Logistic Regression
  • Neural Networks
  • PageRank

Comments

AppleScript & ASObjC ’Things to watch out for’ list

 

Brian Christmas has compiled an absolutely invaluable list of tips and samples of code for those using AppleScript or AppleScriptObjC. He has kindly allowed me to host a page containing all these tips.

AppleScript & ASObjC ’Things to watch out for’ list.

This list is a great resource for those just starting out but will also be invaluable for more experienced scripters.

If you would like to contribute, probably the best way is to subscribe to one of the Apple mailing lists:

https://lists.apple.com/mailman/listinfo/applescriptobjc-dev

https://lists.apple.com/mailman/listinfo/applescript-users

There is a page of other AppleScript Resources here.


Comments

nteract a desktop-based, interactive computing application.

 

This blog post looks very interesting: a notebook environment for coding and data visualisation based on Jupyter (formerly IPython) notebooks.

With nteract, you can create documents, that contain executable code, textual content, and images, and convey a computational narrative. Unlike Jupyter, your documents are stand-alone, cross-platform desktop applications, providing a seamless desktop experience and offline usage.

nteract can run your existing Jupyter notebooks without any modification, and supports multiple Jupyter kernels: Python, R, Julia, and JavaScript. Being a native Jupyter notebook, nteract applications can be easily saved to Domino, versioned, shared, and if needed, run on high-performance machines in the cloud, in your VPC, or on-premise.

More details are on GitHub.


Comments

ASObjC Explorer has just been updated

 

ASObjC Explorer has just been updated to version 4.1.17, with fixes relating to Sierra. Choose 'Check for Updates...' to update.

Note also that ASObjC Explorer is no longer available for general sale. Barring bug-fixes, this is the last release -- there will be no further development. You can read more here:

http://www.macosxautomation.com/applescript/apps/explorer.html

Registered users are eligible for a 50% discount on Script Debugger 6 until mid-November as compensation, and should email Shane Stanley for details.

Development of Script Debugger 6

Its AppleScriptObjC code completion will be familiar to users of ASObjC Explorer, but goes further. Script Debugger 6 also has the ability to step through scripts and explore Cocoa results, to the point of being able to explore the contents of collection classes.



Comments

Scripting Vortex 34, analysis of categorical information

 

I often need to tag individual molecules within a dataset with a specific property, perhaps the results of clustering algorithms, the results of PAINS filtering, or liver toxicity filters. Alternatively, if you have a drug discovery project with multiple chemotypes you might want to tag particular groups of compounds as belonging to a named series to aid analysis.

A question that might then arise is “How many molecules belong to each category?”. Whilst you can see the numbers in the sidebar there is not an easy way to export the results.

Hopefully this script can help.
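The counting itself is simple to sketch in plain Python. The example below is not the Vortex script and does not use the Vortex API; it just shows the idea with a made-up list of tags, and the function names and the "(untagged)" label are my own choices.

```python
import csv
import io
from collections import Counter


def count_categories(values, missing_label="(untagged)"):
    """Count rows per category, grouping empty cells under missing_label."""
    return Counter(v.strip() if v and v.strip() else missing_label for v in values)


def counts_to_csv(counts):
    """Render the counts as CSV text, most frequent category first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "count"])
    for category, count in counts.most_common():
        writer.writerow([category, count])
    return buffer.getvalue()


# Made-up column of tags, including one molecule with no tag
tags = ["Series A", "Series B", "Series A", "", "PAINS", "Series A"]
counts = count_categories(tags)
print(counts_to_csv(counts))
```

Writing the result as CSV text means it can be pasted straight into a spreadsheet or saved to a file for a report.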



Comments

Swift 3.0 released

 

The latest version of Swift has been released.

Swift 3 is a source-breaking release, largely due to the changes in SE-0005 and SE-0006. These changes not only impact the names of the Standard Library APIs, but also completely change how Objective-C APIs (particularly from Cocoa) import into Swift. Many of the changes are largely mechanical, but they can be numerous in a typical Swift project. To help with moving to Swift 3, Xcode 8.0 contains a code migrator that can automatically handle many of the needed source changes. There is also a migration guide available to guide you through many of the changes, especially through the ones that are less mechanical and require more direct scrutiny.

Thus there is better translation of Objective-C APIs into Swift, meaning that code imported from Objective-C and translated into Swift will be more readable and Swift-like. The bad news is any code previously imported from Objective-C into Swift will not work in Swift 3; it will need to be re-imported.

There was also a blog post describing how to work with JSON in Swift. This is particularly important if your app communicates with a web application, since information returned from the server is more often than not formatted as JSON.

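import Foundation

// Stand-in for data received from a network request, for example
let data = Data("{\"name\": \"aspirin\"}".utf8)
// try? yields nil instead of throwing if the data is not valid JSON
let json = try? JSONSerialization.jsonObject(with: data, options: [])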

Comments

Xcode 8 released

 

With the release of iOS 10 comes an update to Xcode. Xcode 8.0 is a free download for OS X 10.11 or later. (An Apple ID is required for iOS development, and App Store submissions require registration in the Apple Developer Program.) The latest version brings Swift 3 and SDKs for iOS 10, watchOS 3, tvOS 10, macOS Sierra, Siri extensions, iMessage apps, and sticker packs for Messages, along with many other changes.

Xcode 8 includes everything you need to create amazing apps for iPhone, iPad, Mac, Apple Watch, and Apple TV. This radically faster version of the IDE features new editor extensions that you can use to completely customize your coding experience. New runtime issues alert you to hidden bugs by pointing out memory leaks, and a new Memory Debugger dives deep into your object graph. Swift 3 includes more natural and consistent API naming, which you can experiment with in the new Swift Playgrounds app for iPad.

Swift 3 is the first major release of the innovative programming language built completely in the open with the community of developers at Swift.org. This release unifies core API naming rules under a new public API Naming Guidelines document that makes writing Swift code feel even more natural. Popular system APIs such as Core Graphics and Grand Central Dispatch are more expressive and harmonize well with Swift.


Comments

Mathematica 11 Is Now Available

 

Mathematica 11 has been released.

We are pleased to announce that Mathematica 11 has arrived, with over 500 new functions! Continuing on the path of aggressive innovation that Stephen Wolfram first embarked on 30 years ago, Version 11 embraces new areas of modern technology and introduces cutting-edge functionality to match. With Mathematica, you can now print 3D models and plots directly through either local or cloud-based 3D printers. Or instead, identify over 10,000 objects, and classify and extract features in your data with the customizable suite of enhanced machine learning tools. You can also construct, train and evaluate high-performance neural networks with both CPU and GPU support, enabling powerful deep learning in just a few lines of code. Integrated support for audio, from trimming and filters to synthesizing sounds and measuring audio, makes Mathematica 11 a flexible platform for digital audio processing and analysis.

You can read more about it in Stephen Wolfram’s blog post.


Comments

End to End Swift

 

This looks interesting, Perfect

Perfect is an application server for Linux or OS X which provides a framework for developing web and other REST services in the Swift programming language. Its primary focus is on facilitating mobile apps which require backend server software, enabling you to use one language for both front and back ends.

Perfect relies on Homebrew for installing dependencies on OS X. Once that is done you are up and running and can follow the Perfect tutorials.


Comments

Accessing ZINC supplier information

 

ZINC is a free database of commercially-available compounds for virtual screening. ZINC contains over 100 million purchasable compounds in ready-to-dock, 3D formats. Sterling and Irwin, J. Chem. Inf. Model, 2015. This is an invaluable resource for any type of virtual screening or for anyone looking to create a physical screening or fragment collection.

Once you have done the virtual screening you will rapidly realise that the really time-consuming and tedious part now lies ahead: finding out which vendors stock a particular molecule and then ordering them. Looking up the vendor details for individual compounds is extremely tedious and so this Vortex script may be very useful.

Many more scripts, iPython notebooks and tutorials can be found here.


Comments

Swift 3

 

Swift Blog Update

Swift 3 beta was just released as part of Xcode 8 beta and includes numerous enhancements, many contributed by the open source community. The primary goal of Swift 3 is to implement the last major source changes necessary to allow Swift to coalesce as a consistent language throughout, resulting in a much more stable syntax for future releases.


Comments

17th annual KDnuggets Software Data Analysis Poll

 

The results of the annual data analysis poll are in and show some interesting trends, in particular the dramatic increase in Python use.

R remains the leading tool, with 49% share (up from 46.9% in 2015), but Python usage grew faster and it almost caught up to R with 45.8% share (up from 30.3%).

Actually looking down the list I notice there is also an entry for scikit-learn, which is Python based, and if you add that in Python is now the most commonly used data analysis tool.

There was a 10% drop in the use of KNIME, and a 36% drop in the use of TIBCO Spotfire, two products used in cheminformatics.

In terms of programming languages Python is by far the most extensively used.

Python 45.8% share (was 30.3%) 51% increase
Java 16.8% share (was 14.1%) 19% increase
Unix shell/awk/gawk 10.4% share (was 8.0%) 30% increase
C/C++ 7.3% share (was 9.4%) 23% decrease
Other programming languages 6.8% share (was 5.1%) 34.1% increase
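The increases and decreases quoted in the list are relative changes in share, not differences in percentage points. A minimal sketch of the arithmetic is below; it uses the rounded shares quoted above, so the results can differ by a point or so from the published figures, which were presumably calculated from unrounded data.

```python
def relative_change(old_share, new_share):
    """Percentage change of new_share relative to old_share."""
    return (new_share - old_share) / old_share * 100.0


# (2015 share, 2016 share) as quoted in the list above
shares = {
    "Python": (30.3, 45.8),
    "Java": (14.1, 16.8),
    "Unix shell/awk/gawk": (8.0, 10.4),
    "C/C++": (9.4, 7.3),
}

for language, (old, new) in shares.items():
    # A negative value means the share fell
    print(f"{language}: {relative_change(old, new):+.1f}%")
```

So Python gaining 15.5 percentage points corresponds to roughly a 51% relative increase, because the starting share was 30.3%.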

In the Big Data area Hadoop (22.1%) and Spark (21.6%) dominate.

There is a listing of data analysis tools for MacOSX here.


Comments

Fortran Modernisation Workshop

 

I’m not a Fortran user but every time I mention it there is a sharp uptick in page views so there are obviously a significant number of users who might be interested in this workshop.

Fortran modernisation workshop at Culham Centre for Fusion Energy

Date: 24 - 25 August 2016. Time: 09:30 - 17:30. Location: Learning Resource and Development Centre (Library), E building, Culham Centre for Fusion Energy, Culham Science Centre, Abingdon, OX14 3DB

TOPICS WILL INCLUDE:

  • Software engineering for computational science;
  • Modern Fortran standards and how to write optimized and efficient Fortran;
  • NetCDF and HDF5 scientific file formats for data sharing in Fortran;
  • GNU Automake to automate the build process;
  • pFUnit unit testing framework for testing Fortran codes;
  • Doxygen for Fortran code documentation;
  • Git version control for collaborative code development;
  • In-memory visualisation using PLplot in Fortran;
  • IEEE Floating Point Exception Handling;
  • Software verification and portability using the NAG Fortran compiler;
  • Fortran interoperability with C, Python and R;
  • Introduction to parallelism for Fortran.

There is a page of Fortran Resources here.


Comments

SAR visualization with RDKit

 

One of the issues with using machine learning models to help understand structure activity relationships (SAR) is providing a nice chemist-friendly visualisation. This excellent blog post provides a description of how to colour code the parts of molecules that are predicted to contribute to an activity.



Comments

Dealing with Greek characters in column names

 

This is just a very quick tip for dealing with Greek characters in Vortex column names when creating a script. It may be obvious to many but I struggled for several hours before finding the problem and a solution.

Read more…


Comments

Apple’s Worldwide Developers Conference

 

Apple today announced that it will hold its 27th annual Worldwide Developers Conference (WWDC) from June 13 through 17 in San Francisco.

Developers can apply for tickets via the WWDC website (developer.apple.com/wwdc/register/) now through Friday, April 22 at 10:00 a.m. PDT. Tickets will be issued to attendees through a random selection process, and developers will be notified on the status of their application by Monday, April 25 at 5:00 p.m. PDT. For the second consecutive year, there will be up to 350 WWDC Scholarships available, giving students and STEM organization members from around the world an opportunity to earn a ticket to meet and collaborate with some of the most talented developers of Apple’s ever-growing app ecosystem (developer.apple.com/wwdc/scholarships/). Additionally, this year, we will provide travel assistance to up to 125 scholarship recipients to ensure aspiring developers with financial limitations have an opportunity to participate.


Comments

Molecular visualization in the Jupyter Notebook with nglview

 

I'm making increasing use of iPython notebooks and this package looks like it will be very useful.

nglview is a Python package that makes it easy to visualize molecular systems, including trajectories, directly in the Jupyter Notebook. The recent 0.4.0 release of nglview brings a convenient interface for visualizing MDAnalysis Universe and AtomGroup objects directly:

More details here…

The notebook widget allows you to rotate and zoom the molecule and lets you select atoms by clicking on the molecule.

Easily installed using pip:

pip install nglview

Update

There have been a number of comments and responses via twitter highlighting this superb demo.

http://arose.github.io/ngl/


The project is on GitHub, feel free to contribute!


Comments

The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development

 

I just came across an interesting paper on cross-platform OpenCL programming, The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development. In particular it highlights a number of issues and offers workarounds. These include framework bugs, specification limitations and program bugs.

There are an increasing number of scientific applications taking advantage of GPU acceleration.


Comments

ChEMBL Models iPython Notebook

 

With the release of ChEMBL 21 has come a set of updated target prediction models.

The good news is that, besides the increase in terms of training data (compounds and targets), the new models were built using the latest stable versions of RDKit (2015.09.2) and scikit-learn (0.17). The latter was upgraded from the much older 0.14 version, which was causing incompatibility issues while trying to use the models.

I've been using the models and I thought I'd share an iPython Notebook I have created. This is based on the ChEMBL notebook with code tidbits taken from the absolutely invaluable Stack Overflow. I'm often in the situation where I actually want to know the predicted activity at specific targets, and specifically want to confirm lack of predicted activity at potential off-targets. I could have a notebook for each target but actually the speed of calculation means that I can calculate all the models and then just cherry pick those of interest.

Read on…


Comments

In which journals should I publish my software?

 

I just stumbled across this page recently:

In which journals should I publish my software?

Looks like a very comprehensive listing of journals that accept submissions about software.


Comments

Stack Overflow Developers Survey

 

Stack Overflow is a community of nearly 5 million developers who ask and answer programming questions. If I ever have questions about programming or scripting it is the first site I look to for answers.

They also run an annual survey looking at current trends within the developer community, and the results probably represent the most comprehensive survey of its type. The results make interesting reading and I'd certainly suggest that you go and look at them in detail, but I've pulled out a couple of interesting points.

JavaScript is the most commonly used programming language, but that is probably because the web is now the most popular front-end to applications and services. If you look at the developers involved in Math and Data the profile is rather different with Python now the dominant language.


The most loved programming languages are Rust and Swift, and the most dreaded Visual Basic, interestingly none of which appear in the most used technologies.

Among the desktop operating systems MacOSX is increasingly popular rising from 18% in 2013, largely at the expense of Windows.



Comments

Swift Programming Language Evolution

 

If you want to stay on top of the proposed changes to the Swift programming language then this repository is worth a browse https://github.com/apple/swift-evolution

This repository tracks the ongoing evolution of Swift. It contains:

  • Goals for upcoming Swift releases (this document).
  • The Swift evolution review schedule tracking proposals to change Swift.
  • The Swift evolution process that governs the evolution of Swift.
  • Commonly Rejected Changes, proposals which have been denied in the past.

This document describes goals for the Swift language on a per-release basis, usually listing minor releases adding to the currently shipping version and one major release out. Each release will have many smaller features or changes independent of these larger goals, and not all goals are reached for each release.


Comments

Summer of Code

 

I was just sent details of this:

Interested in doing some chemistry programming this summer? Have students that might be interested?

Open Chemistry has been accepted into the Google Summer of Code for 2016 - including Open Babel, Avogadro, cclib and 3DMol.js.

If you are a student and interested in doing open chemistry software development this summer (or know of someone who is), we're definitely up for good proposal ideas. Take a look at our suggestions or come up with one on your own:

https://summerofcode.withgoogle.com/organizations/6290185763422208/

http://wiki.openchemistry.org/GSoCIdeas2016

Student proposals can be submitted between March 14th and March 25th. Instructions are at the Summer of Code website.




Comments

Swift @ IBM

 

For those interested in learning more about Swift there are a couple of blogs that are worth keeping an eye on. Swift is a modern open-source programming language for iOS, OS X, tvOS, watchOS and Linux.

There is the official Swift blog from Apple which has updates and code snippets; however, more recently there has been a lot of activity on the Swift@IBM blog.

In particular Bring Swift to the Cloud and the IBM Swift Sandbox that allows you to write Swift code in a browser and then execute it. You can try it out here https://swiftlang.ng.bluemix.net/#/repl.

Introduced in 2014, Swift is one of the fastest growing and most widely used programming languages. In just over two months since Apple open sourced the Swift language and IBM released its Swift Sandbox for early exploration of server-side programming in Swift, more than 100,000 developers from around the world have used the IBM Swift Sandbox and more than half a million code runs have been executed in the Sandbox to date

Press release


Comments

Flexible UniChem Search

 

UniChem is a web resource provided by the EBI. It is a 'Unified Chemical Identifier' system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between multiple databases. UniChem currently contains data from 27 different data sources and provides links to 108,941,995 structures.

Chambers, J., Davies, M., Gaulton, A., Hersey, A., Velankar, S., Petryszak, R., Hastings, J., Bellis, L., McGlinchey, S. and Overington, J.P. UniChem: A Unified Chemical Structure Cross-Referencing and Identifier Tracking System. Journal of Cheminformatics 2013, 5:3 (January 2013). DOI: http://dx.doi.org/10.1186/1758-2946-5-3

The previous script showed how to search using a ChEMBL ID; however, one of the attractions of UniChem is that you can search with any molecule identifier if you know the corresponding data source. This script allows the user to supply any molecule identifier and then search a specified data source using a common web service.
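As a rough illustration of the idea (this is not the Vortex script itself), the lookup boils down to combining an identifier with a data source number in a web service URL. The base URL, the path layout and the `src_id`/`src_compound_id` field names below are assumptions based on UniChem's REST interface and should be checked against the current UniChem documentation before use.

```python
from urllib.parse import quote

# Assumed base URL and path layout for the UniChem REST service; verify before use.
UNICHEM_BASE = "https://www.ebi.ac.uk/unichem/rest"


def unichem_lookup_url(compound_id, src_id, base=UNICHEM_BASE):
    """Build a lookup URL for an identifier that comes from data source src_id."""
    return "{}/src_compound_id/{}/{}".format(
        base, quote(str(compound_id), safe=""), int(src_id)
    )


def group_by_source(records):
    """Group decoded JSON records by data source.

    Assumes each record is a dict with 'src_id' and 'src_compound_id' keys.
    """
    grouped = {}
    for record in records:
        grouped.setdefault(record["src_id"], []).append(record["src_compound_id"])
    return grouped
```

The JSON returned from such a URL could be decoded with the standard `json` module and passed to `group_by_source` to see which other databases hold the same structure.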

Read more …


Comments

Getting UniChem data from ChEMBL

 

UniChem is a web resource provided by the EBI. It is a 'Unified Chemical Identifier' system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between multiple databases. UniChem currently contains data from 27 different data sources and provides links to 108,941,995 structures.

Chambers, J., Davies, M., Gaulton, A., Hersey, A., Velankar, S., Petryszak, R., Hastings, J., Bellis, L., McGlinchey, S. and Overington, J.P. UniChem: A Unified Chemical Structure Cross-Referencing and Identifier Tracking System. Journal of Cheminformatics 2013, 5:3 (January 2013). DOI: http://dx.doi.org/10.1186/1758-2946-5-3

ChEMBL also provide a RESTful Web service that users can use to retrieve data from the UniChem database in a programmatic fashion.

Read more…


Comments

ObjCConverter

 

If you want to convert your Objective-C code to Swift, ObjCConverter could be very useful.

There is also an online converter if you want to try before you buy

The latest post on the Swift blog highlights interactive playgrounds.

Xcode 7.3 beta 3 adds interactive iOS and OS X playgrounds that allow you to click, drag, type, and otherwise interact with the user interfaces you code into your playground. These interfaces react just as they would within a full application. Interactive playgrounds help you to quickly prototype and build your applications, and simply provide another great way to interact with your code.


Comments

BBEdit 11.5

 

BBEdit 11.5 has been released.

BBEdit 11.5 adds a variety of new features, and includes changes to existing features and behaviors as well as fixes for reported issues.

Of particular note are

BBEdit now supports the use of iCloud Drive for sharing application support and setup items. This works similarly to the existing Dropbox support: in your "iCloud Drive" folder, create a folder named "Application Support", and then within that create a folder named "BBEdit". You can populate that folder with the contents of your /Users/USERNAME/Library/Application Support/BBEdit/ folder.

BBEdit now supports the use of iCloud Drive for a shared backup folder: in your "iCloud Drive" folder, create a folder named "BBEdit Backups", and if you have turned on Use Historical Backups, BBEdit will use this folder.


Comments

NASA Needs a FORTRAN Programmer

 

Any posts on Fortran always seem to attract attention so I thought I'd flag this opportunity.

Larry Zottarelli, the last original Voyager engineer still on the project, is retiring after a long and storied history at JPL. While there are still a few hands around who worked on the original project, now the job of keeping this now-interstellar spacecraft going will fall to someone else. And that someone needs to have some very specific skills …Know Cobol? Can you breeze through Fortran? Remember your Algol

More details here

Comments

Apple and Open Source

 

Whilst the decision to make Swift open source certainly captured the headlines, it is worth noting that Apple contributes to many more open source projects. There are more details about these projects on the developer and main Apple websites.

Comments

Swift Open Source

 

As I previously highlighted after the WWDC Apple have announced that Swift is now open source.

More details are on the Swift blog

Swift is now open source. Today Apple launched the open source Swift community, as well as amazing new tools and resources including:

Swift.org – a site dedicated to the open source Swift community
Public source code repositories at github.com/apple
A new Swift package manager project for easily sharing and building code
A Swift-native core libraries project with higher-level functionality above the standard library
Platform support for all Apple platforms as well as Linux

Swift.org is an entirely new site dedicated to open source Swift. This site hosts resources for the community of developers that want to help evolve Swift, contribute fixes, and most importantly, interact with each other. It also provides development snapshots for Apple and Linux platforms, which require OS X 10.11 (El Capitan) or Ubuntu 14.04 or 15.10 (64-bit).

Source code is available on Github

Comments

Flagging potential aggregators in Vortex

 

Promiscuous inhibition caused by small molecule aggregation is a major source of false positive results in high-throughput screening. A particularly valuable recent publication, Irwin, Duan, Torosyan, Doak, Ziebart, Sterling, Tumanian and Shoichet, J Med Chem, 2015, 58(17), 7076-7087 DOI, has collated over 12,000 organic molecules known to act as aggregators at concentrations used in screening campaigns, and provides a resource, Aggregation Advisor, that can be used to try to predict possible false positives. However, in many instances it would be unwise to submit proprietary information to a public web service. Potential aggregators are flagged based on calculated LogP >3 and/or similarity >0.85 to a known aggregator (using a path-based fingerprint). This script calculates xLogP using the algorithm provided by Dotmatics and then uses OpenBabel fast search to calculate the closest similarity to a known aggregator.
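The flagging rule itself is simple enough to sketch in a few lines of Python. This is only an illustration of the two cut-offs quoted above, not the Vortex script: fingerprints are represented here as plain sets of "on" bits, Tanimoto similarity is assumed as the similarity measure, and the function names are hypothetical. In practice the LogP and fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


def flag_aggregator(logp, fingerprint, known_fingerprints,
                    logp_cutoff=3.0, sim_cutoff=0.85):
    """Apply the LogP > 3 and/or similarity > 0.85 rule to one molecule."""
    # Closest similarity to any known aggregator (0.0 if the list is empty).
    best = max((tanimoto(fingerprint, known) for known in known_fingerprints),
               default=0.0)
    return {
        "high_logp": logp > logp_cutoff,
        "max_similarity": best,
        "similar_to_known": best > sim_cutoff,
        "flagged": logp > logp_cutoff or best > sim_cutoff,
    }
```

Keeping the two criteria as separate fields makes it easy to see why a given molecule was flagged.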

Full details of the Vortex script are here.

xlogpaggscore

Comments

CodeSwitch

 

This is probably the application that many programmers have been waiting for: CodeSwitch is the first Objective-C to Swift code converter for Mac OS X.

CodeSwitch will convert all of your Objective-C code into Swift code instantly. Simply copy-and-paste any Objective-C code into this program, and it will translate it to Swift in milliseconds.

So those legacy programs won't need a complete rewrite!

The latest post on the Swift blog highlights a new feature in Xcode 7.1, the ability to embed file, image, and color literals into your playground’s code.

For instance, there’s no need to type “myImage.jpg” in the editor – just drag your image from the Finder and the actual image will appear in-line with your code.

Useful if you tend to name images, image1, image 2 etc ;-)

Comments

Scripting Openbabel

 

@MatToddChem recently tweeted

Chemdraw file containing lots of molecules --> separate png/jpg images of each molecule. Anyone got a script that automates that? #headache

Whilst it is possible to convert a ChemDraw file to an image, the problem is that you get a single png file containing all the structures. In order to get individual image files it is first necessary to separate the individual structures. The easiest way to do this is to convert from cdx to SMILES format. We can then take each of the individual SMILES strings and generate an image using OpenBabel, all controlled by an AppleScript.
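The original post drives OpenBabel from AppleScript; the same two-step idea can be sketched in Python. The sketch below only builds the command lines, one per SMILES string, so it can be inspected before anything is run. The file names are hypothetical, and it assumes the `obabel` command-line tool is installed with PNG output support.

```python
def split_command(cdx_file="structures.cdx", smi_file="structures.smi"):
    """Command to convert a ChemDraw file into a multi-line SMILES file."""
    return ["obabel", cdx_file, "-O", smi_file]


def image_commands(smiles_lines, prefix="mol", ext="png"):
    """Build one obabel command per SMILES record, writing numbered image files."""
    # Each non-empty line of a .smi file starts with the SMILES string,
    # optionally followed by whitespace and a title.
    records = [line.split()[0] for line in smiles_lines if line.strip()]
    commands = []
    for index, smiles in enumerate(records, start=1):
        output = "{}_{:03d}.{}".format(prefix, index, ext)
        commands.append(["obabel", "-:" + smiles, "-O", output])
    return commands
```

Each command list could then be passed to `subprocess.run(command, check=True)` once the output names look right.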

Comments

Swift

 

I've mentioned the Swift blog before and I note there is a new entry describing the additions to Xcode 7.

I've also come across a couple of blogs that may also be of interest: Inessential by Brent Simmons, Kickingbear, and Weheartswift, which helps you learn Swift from scratch.

Comments

KNIME 2.12 released

 

The latest update to KNIME has been released.

The KNIME Analytics Platform incorporates hundreds of processing nodes for data I/O, preprocessing and cleansing, modeling, analysis and data mining as well as various interactive views, such as scatter plots, parallel coordinates and others. It integrates all of the analysis modules of the well known Weka data mining environment and additional plugins allow R-scripts to be run, offering access to a vast library of statistical routines.

What's New in KNIME 2.12

Analytics
- Decision Tree to Rule Set (New node)
- Rule Handling (New node)
- Statistics measure as aggregation methods in GroupBy node
- Extended PMML Support (New node)
- Data Generation (New node)
- More Statistics Nodes (New set of nodes)

Tool Integration
- New MongoDB Integration (New set of nodes)
- Javascript Integration (New set of nodes)
- Extended JSON Processing (New set of nodes)
- XML XPath Interactive extraction (New node)
- Extended Python Integration (New node)

Comments

FTranProjectBuilder Updated

 

Fortran has a long history with scientific programming and so it is perhaps not surprising that every time I mention Fortran there is an uptick in readers. FTranProjectBuilder is the only IDE specifically designed for Fortran programming on the Mac and has recently been updated to version 2.0.

This update brings improvements to the interface, context-sensitive autocompletes and, in addition, derived type, intrinsic procedure, module and procedure popovers.

showingautocomplete_med

In addition, the Nocturnal Aviation Software website has been updated with screenshots of the updated FTranProjectBuilder, the Mac Fortran Blog and a section of free Fortran code snippets.

A while back I compiled a list of resources for Fortran on a Mac.

Comments

Strings in Swift 2

 

The latest entry on the Swift Blog discusses a change in the way Strings are handled in Swift.

Read more here

Comments

ChEMBL python update.

 

Excellent blog post on the ChEMBL python update.

http://chembl.blogspot.co.uk/2015/07/chembl-python-client-update.html

Comments

Accessing Open Source Malaria Data using an iPython Notebook

 

The Open Source Malaria project is trying a different approach to curing malaria. Guided by open source principles, everything is open and anyone can contribute. To date a lot of people around the world have made contributions and the project is at a very exciting stage. Whilst everyone can see the compounds that have been made and the biological data, it is often spread over multiple web pages and can be tricky to link molecule with identifier with data. Over the last couple of months a significant effort has been put into populating a spreadsheet with all the information.

I've recently published a Vortex script to access the information, and I've now published an iPython notebook that also shows how to import the data. Why not give it a try and then contribute your findings and suggestions to the Open Source Malaria project?

Comments

Script to remove duplicates in Vortex

 

When working with multiple data sets of molecules, particularly if combining them from multiple sources, one of the most common tasks is removal of duplicates. This can be a time-consuming and error prone process if carried out manually and this script should hopefully make this a much easier task.
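The underlying logic is independent of Vortex and can be sketched in plain Python (this is not the script linked below). The sketch assumes each molecule is a dictionary and that some canonical identifier, here a hypothetical `inchikey` field, is available to decide whether two records describe the same structure.

```python
def remove_duplicates(records, key="inchikey"):
    """Split records into first occurrences and later duplicates.

    Records that lack the key are kept, since they cannot be compared.
    """
    seen = set()
    unique, duplicates = [], []
    for record in records:
        value = record.get(key)
        if value is not None and value in seen:
            duplicates.append(record)
            continue
        if value is not None:
            seen.add(value)
        unique.append(record)
    return unique, duplicates
```

Returning the duplicates as a separate list, rather than discarding them, makes it possible to check what was removed and from which source it came.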

http://macinchem.org/reviews/vortex/tut27/scripting_vortex27.php.

There are many more Hints, scripts and tutorials here.

Comments

InChI, the IUPAC International Chemical Identifier

 

InChI is the International Chemical Identifier developed under the auspices of IUPAC. InChIs are intended to be unique identifiers; they are freely usable and non-proprietary; they can be computed from structural information and do not have to be assigned by some organization; and most of the information in an InChI is human readable (in theory!).

A recent paper in J Cheminformatics DOI describes the design, layout and algorithms of InChI. If you want to understand or implement the code this is a great starting point.

The paper is organized as follows. First, we discuss the general concepts associated with chemical identifiers. Then we outline the design goals of InChI and our general approach, focussing on the InChI model of chemical structure and the hierarchical layered structure of the Identifier; the concept of Standard InChI is introduced. This is followed by a detailed description of each of the possible major InChI layers, accounting for molecular connectivity, charge, stereochemistry, isotopic enrichment, position of hydrogen atoms and bonding in metal compounds, and the sublayers associated with these layers. We then describe the workflow of InChI generation (normalization, canonicalization, and serialization stages), as well as generation of the compact hashed code derived from InChI (InChIKey); the related algorithms and implementation details are briefly discussed. Finally, we provide information about InChI Software, licensing, known problems/limitations, and future prospects for InChI.
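To make the layered structure a little more concrete, here is a minimal Python sketch that splits an InChI string into its version, formula and prefixed layers. It is purely a string-splitting illustration and does no chemistry or validation; it assumes a formula layer is present, and the function name is hypothetical.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula and (prefix, body) layers."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi[len("InChI="):].split("/")
    version = parts[0]
    formula = parts[1] if len(parts) > 1 else ""
    # Layers after the formula start with a one-letter prefix, e.g. c, h, q.
    # A list keeps their order and allows a prefix to appear more than once.
    layers = [(part[0], part[1:]) for part in parts[2:] if part]
    return {
        "version": version,
        "standard": version.endswith("S"),
        "formula": formula,
        "layers": layers,
    }
```

For ethanol, `split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")` gives the formula `C2H6O`, a connectivity layer (`c`) and a hydrogen layer (`h`).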

The source code and documentation can also be downloaded from here http://www.inchi-trust.org/downloads/

Comments

Importing Open Source Malaria Project data

 

The Open Source Malaria project is trying a different approach to curing malaria. Guided by open source principles, everything is open and anyone can contribute. To date a lot of people around the world have made contributions and the project is at a very exciting stage. Whilst everyone can see the compounds that have been made and the biological data, it is often spread over multiple web pages and can be tricky to link molecule with identifier with data. Over the last couple of months a significant effort has been put into populating a spreadsheet with all the information.

Whilst this is useful for viewing results it is not ideal for trying to build predictive models. Vortex is a chemically intelligent data analysis and visualisation platform. This script provides a one-click access to the OSM data and creates a workspace containing all the data, and since it is linked to the live spreadsheet you will always have access to the latest data.

osmvortex

Comments

Installing Open Drug Discovery Toolkit (ODDT)

 

A recent paper in J Cheminformatics, Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field DOI, described a free and open source tool for both computer aided drug discovery (CADD) developers and researchers. Open Drug Discovery Toolkit is released on a permissive 3-clause BSD license for both academic and industrial use. ODDT’s source code, additional examples and documentation are available on GitHub.

To install ODDT on a Mac you first need to install the appropriate toolkits. The easiest way is to use Homebrew, and I've written a page detailing how to do this here.

Once these are installed you can install ODDT using pip as described here.

Comments

Swift 2.0

 

More news on Swift 2.0 on the Swift Blog

Today at WWDC, we announced Swift 2.0. This new version has even better performance, a new error handling API, and first-class support for availability checking. And platform APIs feel even more natural in Swift with enhancements to the Apple SDKs.

Open Source

In addition to new features, the big news is that Apple will be making Swift open source later this year. We are all incredibly excited about this, and look forward to giving you a lot more information as the open source release gets nearer. Here is what we can tell you so far:

Swift source code will be released under an OSI-approved permissive license.
Contributions from the community will be accepted — and encouraged.
At launch we intend to contribute ports for OS X, iOS, and Linux.
Source code will include the Swift compiler and standard library.
We think it would be amazing for Swift to be on all your favorite platforms.

We are excited about the opportunities an open source Swift creates for our industry. Baked-in safety features combined with excellent speed mean it has the chance to dramatically improve software versus using C-based languages. Swift is packed with modern features, it’s fun to write, and we believe it will get used in a lot of places. Together, we have an exciting road ahead.

Comments

Swift Open Source

 

Perhaps one of the more unexpected news items from WWDC2015.

Swift is now Open Source!

Comments

OLCF’s second OpenACC hackathon

 

The GPU Science page is always pretty popular so I thought I'd mention an upcoming event.

OLCF’s second OpenACC hackathon will take place the week of October 19th, 2015

The goal of each hackathon is for current or prospective user groups of large hybrid CPU-GPU systems to send teams of 3-6 developers along with either (1) a (potentially) scalable application that needs to be ported to GPU accelerators, or (2) an application running on accelerators which needs optimization. There will be intensive mentoring during this 5-day hands-on workshop, with the goal that the teams leave with applications running on GPUs, or at least with a clear roadmap of how to get there. Our mentors come from national laboratories, universities and vendors, and besides having extensive experience in programming with OpenACC, many of them develop the OpenACC-capable compilers and help define the OpenACC standard.

The application period is now open and closes on 3 July, 2015. Space will be limited to a maximum of eight teams, with two mentors for each team. Groups will be notified about acceptance or rejection of their application by Friday, July 31, 2015. See below how to apply. Prior GPU experience is not required! Those groups whose application successfully passes the selection process will receive further information regarding registration.

Comments

Replacing Photoshop With NSString

 

A really clever way to create icons using ascii art. It is open-source and released under the MIT license on GitHub.

I think one of the real advantages of this is you can actually see the image you want to create in the code.

asciiart

is rendered as..

asciart2

You can read more here

Comments

HackaMol: An Object-Oriented Modern Perl Library for Molecular Hacking on Multiple Scales

 

To be honest I can't remember when I last used Perl but this publication brought back a few memories DOI.

HackaMol is an open source, object-oriented toolkit written in Modern Perl that organizes atoms within molecules and provides chemically intuitive attributes and methods.

Source code and example scripts are available online at http://github.com/demianriccardi/HackaMol. There is also a description of an IPerl Notebook in the supporting information.

There is also a very interesting extension HackaMol::X::Vina, a structured class that provides an interface with the AutoDock Vina docking program

Comments

The 3rd International Workshop on OpenCL

 

The 3rd IWOCL (International Workshop on OpenCL) takes place at Stanford University, California from Monday 11 to Wednesday 13 May 2015, and includes the addition of an Advanced Hands-On OpenCL course to the schedule on Monday.

More details

Acceleware will be offering two 4-day training courses in Canada. The first course will be in Calgary Alberta from May 26-29, 2015. The second course will be offered in Montreal, June 9-12, 2015. These professional four day courses are designed for programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage data parallel processing capabilities of GPUs.

More details

Also

The Khronos Group has released revision 30 of the SPIR-V specification. This revision of SPIR-V includes multiple corrections and synchronizes all token spellings to the official headers. These official C/C++ headers are available along with the specification in the registry.

Comments

Swift Updates

 

The latest update to Xcode 6.3 includes Swift 1.2, and the latest reports suggest a significant improvement in speed.

Rather than feeling like using a computer that’s 10 years obsolete, algorithms that were borderline rate limiting running in the main UI thread just happen like they ought to. As a reality check, I re-ran the horrendously underperforming algorithm that I complained about awhile back, and rather than taking 320 seconds to calculate 7 log P values, it now gets the job done in 30 seconds.

The latest entry on the Swift blog also describes means to improve performance.

Comments

New playgrounds in Swift

 

The latest entry on the Swift blog highlights additions to the playgrounds within Xcode 6.3 beta 3.

Playgrounds are now represented within Xcode as a bundle with a disclosure triangle that reveals Resources and Sources folders when clicked. These folders contain additional content that is easily accessible from your playground’s main Swift code. To see these folders, choose View > Navigators > Show Project Navigator (or just hit Command-1)

There is an example playground that calculates the Mandelbrot set, which looks like fun to play with.
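The playground itself is written in Swift, but the calculation behind it is short enough to sketch in Python: iterate z = z*z + c and count how many steps it takes for |z| to exceed 2. The function names, the plotted region and the character-based rendering below are illustrative choices, not taken from Apple's example.

```python
def escape_count(c, max_iter=50):
    """Number of iterations of z = z*z + c before |z| exceeds 2 (capped)."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render(width=60, height=20, max_iter=50):
    """Return rows of text where '#' marks points that never escaped."""
    rows = []
    for j in range(height):
        im = -1.2 + 2.4 * j / (height - 1)
        row = ""
        for i in range(width):
            re = -2.0 + 3.0 * i / (width - 1)
            inside = escape_count(complex(re, im), max_iter) == max_iter
            row += "#" if inside else "."
        rows.append(row)
    return rows
```

Printing the result with `print("\n".join(render()))` gives a rough text picture of the set.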

mandlebrot

Comments

CotEditor is an open source text editor

 

I use Markdown extensively on my websites. “Markdown” is two things: (1) a plain text formatting syntax; and (2) a software tool, written in Perl, that converts the plain text formatting to HTML allowing you to build HTML documents in an easily readable form. As the use of Markdown has increased in popularity there is now a wide choice of plain text/Markdown editors available for Mac OS X and recently iOS, so I’ve updated the listing of Markdown Editors.

The latest addition is CotEditor, an open source text editor designed specifically for Mac OS X. There is syntax highlighting for 40 different languages, and it is scriptable using Python, Ruby, Perl, PHP, UNIX shell or AppleScript (and even JavaScript on Yosemite). It also supports find and replace using regular expressions.

Comments

Swift course on iTunes U

 

I just noticed that there are new Swift programming courses available on iTunes U.

The outstanding Stanford University iOS development course on iTunes U has been updated to use Swift. The first two lectures from the new “Developing iOS 8 Apps with Swift” class are now live and additional lessons will be added as they are taught. In addition there is also a new course from Plymouth University in the UK.

Comments

MedChem Wizard KNIME workflow

 

The MedChemWizard is a KNIME workflow designed to assist medicinal chemists with idea generation, ligand design and lead optimization using a number of common functional group transformations and medchem rules-of-thumb. This tutorial provided by Dr. Alastair Donald gives a detailed description of its use.

mcwizard

Comments

Swift's unprecedented growth

 

The latest RedMonk Programming Language Rankings are now available. These rankings have been run since 2010 using essentially the same methodology, so we can compare with historical data. The plot below compares language popularity on Stack Overflow and GitHub.

lang.rank_.plot_.q1151

Whilst the top ten is pretty much as expected

1 JavaScript
2 Java
3 PHP
4 Python
5 C#
5 C++
5 Ruby
8 CSS
9 C
10 Objective-C

Perhaps the most important change is just outside the top 20.

the growth that Swift experienced is essentially unprecedented in the history of these rankings. When we see dramatic growth from a language it typically has jumped somewhere between 5 and 10 spots, and the closer the language gets to the Top 20 or within it, the more difficult growth is to come by. And yet Swift has gone from our 68th ranked language during Q3 to number 22 this quarter, a jump of 46 spots.

Swift is an innovative new programming language for Cocoa and Cocoa Touch and you can find out more about Swift on the Apple Developer site.



Comments

ASObjC Explorer, version 4.1

 

Just got this email.

A major update of ASObjC Explorer, version 4.1, is now available, just in time for the holiday season. This version incorporates a new and improved logging engine, incorporating extended AppleScript syntax styling and now resolving Cocoa objects -- no more will you have to deal with «class ocid» id «data optr000...» entries. ASObjC Explorer is the editor for Mavericks and Yosemite users wishing to write ASObjC code. You can read more here: http://www.macosxautomation.com/applescript/apps/explorer.html. You can also download it and try it out free for 30 days. To celebrate, you can use the coupon code NEWLOG to receive a $US10 discount between now and the end of January. Ho, ho.

Comments

CodeRunner 2

 

I just noticed that an update to CodeRunner has been released. CodeRunner 2 is a complete overhaul of the original app and in addition to a host of new features also brings support for Yosemite. CodeRunner can run code in 20 languages out-of-the-box, and can be easily extended to run code in any other language. Adding a language is as easy as entering its terminal command.

Comments

Learn to code. Change the world

 

As part of Computer Science Education Week (Dec 8-14th) Apple are hosting workshops in their retail stores.

Join us on December 11 for the Hour of Code, a free one-hour introduction to the basics of computer programming.

Comments

PyCharm

 

I must admit I’m a big fan of BBEdit for all my text editing, Markdown and Python programming but I still keep an eye out for interesting alternatives. I was recently sent a link to PyCharm, a Python IDE. PyCharm's code editor provides support for Python, JavaScript, CoffeeScript, TypeScript, CSS, and a number of other languages. What caught my eye was the recently added support for IPython notebooks: with PyCharm 4 you can perform all the usual IPython Notebook actions with *.ipynb files. Everything you're used to doing with the ordinary IPython Notebook is now supported inside PyCharm.

ipythonnotebook

Another very useful feature for scientific programming is the NumPy array viewer, which makes it easy to get a graphical view of a NumPy array, together with support for matplotlib.

There is a really comprehensive support section that includes demos and screencasts.

Comments

Vortex scripts to access ChEMBL

 

ChEMBL is a manually curated chemical database of bioactive molecules. It is maintained by the European Bioinformatics Institute (EBI), of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK. The database currently contains over 1.4 million unique structures with the associated activity at 10,579 different targets. It also acts as a repository for Open Access primary screening and medicinal chemistry data directed at neglected diseases.

Whilst the database can be downloaded, the data can also be accessed via a web interface (shown below) and a series of web services. These Vortex scripts show how it is possible to pull data from ChEMBL into Vortex.

As usual I’ve written it as a tutorial to try and offer some explanation of how the script works: Scripting Vortex 23:- Accessing ChEMBL using Web Services

I think this rather nicely shows the power of web services and json.
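For readers who have not used a JSON web service before, the pattern is roughly: build a URL for the record you want, fetch it, and pick the fields you need out of the decoded JSON. The Python sketch below shows the first and last steps without touching the network. The base URL and field names are assumptions based on the current ChEMBL web services (which may differ from those used in the tutorial), and the sample payload is a hand-written, heavily trimmed stand-in for a real response.

```python
import json

# Assumed base URL for the ChEMBL web services; check the ChEMBL documentation.
CHEMBL_BASE = "https://www.ebi.ac.uk/chembl/api/data"


def molecule_url(chembl_id, base=CHEMBL_BASE):
    """Build the URL for a single molecule record in JSON format."""
    return "{}/molecule/{}.json".format(base, chembl_id)


def summarise(payload_text):
    """Pull a few fields out of a JSON molecule record (field names assumed)."""
    record = json.loads(payload_text)
    structures = record.get("molecule_structures") or {}
    return {
        "id": record.get("molecule_chembl_id"),
        "name": record.get("pref_name"),
        "smiles": structures.get("canonical_smiles"),
    }


# Hand-written sample payload for illustration only.
SAMPLE = (
    '{"molecule_chembl_id": "CHEMBL25", "pref_name": "ASPIRIN", '
    '"molecule_structures": {"canonical_smiles": "CC(=O)Oc1ccccc1C(=O)O"}}'
)
```

In a real script the text passed to `summarise` would come from fetching `molecule_url(...)`, for example with `urllib.request.urlopen`.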

There is a list of other Vortex scripts on the Hints and Tutorials page

Comments

Scripting is a fundamental lab skill

 

I was very much struck by this blog post on the Cryptogenomicon blog, in particular the comment that biologists need to be able to script.

The most important thing I want you to take away from this talk tonight is that writing scripts in Perl or Python is both essential and easy, like learning to pipette. Writing a script is not software programming. To write scripts, you do not need to take courses in computer science or computer engineering. Any biologist can write a Perl script. A Perl or Python script is not much different from writing a protocol for yourself. The way you get started is that someone gives you one that works, and you learn how to tweak it to do more of what you need. After a while, you'll find you're writing your own scripts from scratch.

I'd probably expand that to say that in most scientific disciplines being able to script is extremely useful. Whilst Perl has been (and maybe still is) a preferred scripting language for biologists I suspect that python is now the “lingua franca” for scientific scripting. I've given a couple of talks recently on the use of open source tools for drug discovery and I asked the audience for a show of hands for different programming/scripting languages and Python seems to be by far the most widespread. Indeed one of the key features that determines the selection of a program is that it provides a scripting interface that allows it to be integrated into a scientific workflow.

I'd also add that scripting does not have to result in some major software program; often the most useful scripts are those that do very simple things that just make life easier. The most popular download on this site is the AppleScript that simply prints the clipboard. It allows you to select a piece of text from a page, copy, then print, with no need to open another application to paste it into.

Another example of everyday scripts is the Safari extensions. Not sure what the structure of a named drug on a web page is? Select the name and right click (or control click) and an option appears to search for the highlighted drug on ChemSpider, and as if by magic a small window opens displaying the structure. Pretty neat, and the beauty is that it only involves a small amount of JavaScript programming; all the heavy lifting has been done by Apple, who provide the Safari extension framework, and the ChemSpider folks, who provide the database and web service.

Pasted Graphic

Similarly the Chemical Identifier Resolver script makes use of the Chemical Identifier Resolver (CIR) by the CADD Group at the NCI/NIH to convert a huge variety of chemical identifiers into a structure.
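The resolver is a good example of how little code a web service needs on the client side: the identifier and the desired output format are simply placed in the URL. The script on this site is not reproduced here; the following is a minimal Python sketch of the URL construction, with a hypothetical helper name, and the URL pattern should be checked against the CIR documentation.

```python
from urllib.parse import quote

# URL pattern for the NCI/CADD Chemical Identifier Resolver.
CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier, representation="smiles", base=CIR_BASE):
    """Build a resolver URL for a name, SMILES, InChIKey or other identifier."""
    # Percent-encode the identifier so spaces and slashes survive in the path.
    return "{}/{}/{}".format(base, quote(identifier, safe=""), representation)
```

For example, `cir_url("aspirin")` asks for a SMILES string, while `cir_url("acetic acid", "stdinchikey")` asks for a Standard InChIKey. Fetching the URL returns the answer as plain text.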

I’d also echo the view that you don’t need to have attended a computer science course; I very firmly believe in the “cut and paste” school of programming. Take an existing script and modify it to see how it works, then customise it to solve a problem that you have.

In addition to the AppleScripts section, the “Hints and Tutorials” section of this site includes many more scripts that use Python, JavaScript etc. You are free to use any as starting points, and if you feel they may be generally useful I’d be happy to add them to the site.

Comments

OS X Source code

 

Apple has released the Darwin source code for OS X 10.10 here.

Comments

New Swift website

 

In addition to the Swift Blog there is now a dedicated Swift website

Swift is an innovative new programming language for Cocoa and Cocoa Touch. Writing code is interactive and fun, the syntax is concise yet expressive, and apps run lightning-fast. Swift is ready for your next iOS and OS X project — or for addition into your current app — because Swift code works side-by-side with Objective-C.

The latest entry on the Swift blog deals with “Failable Initialisers” a new feature in Swift 1.1 part of Xcode 6.1.

Comments

Impressions of Apple’s Swift, after a bit of practice

 

Swift is a new programming language from Apple for iOS and OS X apps that builds on the best of C and Objective-C, without the constraints of C compatibility. I’m delighted to hear that people are starting to explore its use in scientific applications. Dr. Alex M. Clark has posted his early impressions on the Cheminformatics blog, well worth a read.

There is also the Swift blog for more interesting tips.

Comments

Cheminformatics iPython notebook

 

George Papadatos, from the ChEMBL group, has produced a superb iPython notebook tutorial demonstrating the use of RDKit.

ipypng

Comments

Swift now at 1.0

 

It has just been announced that Swift has reached 1.0.

Today is the GM date for Swift on iOS. We have one more GM date to go for Mac. Swift for OS X currently requires the SDK for OS X Yosemite, and when Yosemite ships later this fall, Swift will also be GM on the Mac. In the meantime, you can keep developing your Mac apps with Swift by downloading the beta of Xcode 6.1.

Swift is an innovative new programming language for Cocoa and Cocoa Touch. Writing code is interactive and fun, the syntax is concise yet expressive, and apps run lightning-fast. Swift is ready for your next iOS and OS X project — or for addition into your current app — because Swift code works side-by-side with Objective-C.



Comments

Open Phacts API update

 

The OpenPhacts API has been updated to include two new data sets and the corresponding API calls.

1) DisGeNET target-disease associations. These API calls use URI inputs that correspond to either diseases or targets (proteins or genes). The disease identifiers correspond to UMLS CUIs, MeSH IDs or ConceptWiki and can use several namespaces, e.g. http://linkedlifedata.com/resource/umls/id/C0004238, http://purl.bioontology.org/ontology/MSH/D001281, or http://www.conceptwiki.org/concept/index/095cb66f-76ef-41b5-a8ae-c39352e6007e

2) neXtProt nanopublications for tissue expression (PREVIEW mode). These API calls use URIs that correspond to either tissues or targets. The tissue identifiers correspond to the Caloha tissue ontology from neXtProt. These identifiers can use either the namespace from the neXtProt database (e.g. http://www.nextprot.org/db/term/TS-0564, will be operational next week) or the Caloha ontology (ftp://ftp.nextprot.org/pub/currentrelease/controlledvocabularies/caloha.obo#TS-0564, operational now).
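
Since the disease calls accept identifiers from several namespaces, it can help to check which kind of URI you have before building a request. The following is only a minimal sketch: the prefixes are copied from the example URIs above, while the function name and the labels are my own choices for illustration and are not part of the Open PHACTS API.

```python
# Namespace prefixes copied from the example URIs in the announcement.
# The labels are descriptive names chosen for this sketch only.
DISEASE_NAMESPACES = {
    "http://linkedlifedata.com/resource/umls/id/": "UMLS CUI",
    "http://purl.bioontology.org/ontology/MSH/": "MeSH ID",
    "http://www.conceptwiki.org/concept/index/": "ConceptWiki",
}


def classify_disease_uri(uri):
    """Return (identifier_type, local_id), or None if the namespace is not recognised."""
    for prefix, label in DISEASE_NAMESPACES.items():
        # Require something after the prefix so a bare namespace is rejected.
        if uri.startswith(prefix) and len(uri) > len(prefix):
            return label, uri[len(prefix):]
    return None


for example in (
    "http://linkedlifedata.com/resource/umls/id/C0004238",
    "http://purl.bioontology.org/ontology/MSH/D001281",
    "http://www.conceptwiki.org/concept/index/095cb66f-76ef-41b5-a8ae-c39352e6007e",
):
    print(classify_disease_uri(example))
```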

To reduce the barriers to drug discovery in industry, academia and for small businesses, the Open PHACTS Discovery Platform provides tools and services to interact with multiple integrated and publicly available data sources. To integrate this data, extensive cross-referencing of scientific concepts is needed across all databases.

Comments

BBEdit 10.5.11 Released

 

Everybody’s favourite text editor has been updated. The BBEdit 10.5.11 release consists entirely of fixes for reported issues and contains no new features.

I also saw this note about future Yosemite support.

OS X Yosemite (10.10) We cannot fully support the use of our products on beta versions of OS X. If you encounter difficulty when using the latest version of BBEdit, TextWrangler, or Yojimbo on any pre-release version of OS X Yosemite, please first verify whether the problem also occurs when running on the latest public release of OS X Mavericks (10.9); and then file a bug report with Apple using an appropriate feedback channel. They will contact us as necessary.

Comments

Latest update on Swift blog

 

The latest post on the Swift developers blog concerns Value and Reference Types

Types in Swift fall into one of two categories: first, “value types”, where each instance keeps a unique copy of its data, usually defined as a struct, enum, or tuple. The second, “reference types”, where instances share a single copy of the data, and the type is usually defined as a class. In this post we explore the merits of value and reference types, and how to choose between them.
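
For readers who, like me, spend more time in Python than Swift, the distinction can be illustrated by analogy. This is a sketch and not Swift code: Python objects are always shared by reference, so the "value type" behaviour has to be requested with an explicit copy, whereas Swift decides it from the type declaration (struct versus class). The `Point` class is a made-up example.

```python
import copy
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


original = Point(1.0, 2.0)

# Reference behaviour: both names refer to the same instance,
# so a change made through one name is visible through the other.
alias = original
alias.x = 10.0
print(original.x)  # 10.0

# Value-like behaviour: an explicit copy has its own data,
# so changing the copy leaves the original untouched.
independent = copy.copy(original)
independent.x = 99.0
print(original.x)  # 10.0
print(independent.x)  # 99.0
```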

Comments

Balloons playground

 

An update on the Swift blog:

Many people have asked about the Balloons playground we demonstrated when introducing Swift at WWDC. Balloons shows that writing code can be interactive and fun, while presenting several great features of playgrounds. Now you can learn how the special effects were done with this tutorial version of ‘Balloons.playground’, which includes documentation and suggestions for experimentation.

This playground uses new features of SpriteKit and requires the latest beta versions of Xcode 6 and OS X Yosemite

Comments

OpenEye Toolkits Updated

 

OEToolkits 2014.Jun. This release of the OpenEye toolkits is focused on stability and new platform support. The last release, 2014.Feb, was a major feature release introducing numerous new features. This release focused on fixing many bugs and improving the overall stability of the OpenEye toolkits.

There is still a major new feature being added in this release:

FreeForm API added to Szybki TK

Mac users should note that this will be the last release to support OS X 10.7.

Comments

RegExRX

 

I don’t use regular expressions often enough to become an expert but I do use them often enough to know how valuable they are. I always seem to spend more time than I’d like sorting out the regular expression and I often feel that I’ve done something similar before.

I came across an application that I think will make my life a lot easier: RegExRX is a regular expression development tool.

A complete regular expression development tool meant for novices and professionals alike, this editor has many features designed to help in the development and storage of regular expressions. Based on the PCRE library, RegExRX will allow a user to craft patterns that are compatible with most regular expression flavors and will let them easily copy those patterns to other languages like Objective-C, Perl, Ruby, PHP and Xojo.
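
As an illustration of the kind of pattern worth developing once and storing for reuse, here is a minimal Python sketch that recognises InChIKeys. The layout assumed here is 14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen and a single uppercase letter. Note that Python's `re` module is not PCRE, although a simple pattern like this one is written the same way in both. The function names are my own.

```python
import re

# Assumed InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY = r"[A-Z]{14}-[A-Z]{10}-[A-Z]"
INCHIKEY_PATTERN = re.compile(INCHIKEY)


def is_inchikey(text):
    """True only if the whole string has the InChIKey layout."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


def find_inchikeys(text):
    """Return every InChIKey-shaped token found in free text."""
    return re.findall(r"\b" + INCHIKEY + r"\b", text)


sample = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # aspirin's standard InChIKey
print(is_inchikey(sample))
print(is_inchikey(sample.lower()))
print(find_inchikeys("Key for aspirin: " + sample + "."))
```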

Comments

Swift blog

 

Apple have started a new blog to help developers interested in Swift, a new programming language for Cocoa and Cocoa Touch.

Get started with Swift by downloading Xcode 6 beta, now available to all Registered Apple Developers for free. The Swift Resources tab has a ton of great links to videos, documentation, books, and sample code to help you become one of the world's first Swift experts. There's never been a better time to get coding!

The first blog entry deals with the issue of compatibility, and how to ensure your app will continue to function as the language evolves.

Comments

Fortran on a Mac

 

Last month I compiled a page of Fortran resources for the Mac. At the time I was hoping it would be a useful resource but thought it would draw a limited audience. In fact it turned out to be very popular: the page has been accessed nearly 1000 times, with readers spending between 3 and 4 minutes on the page. I've also been contacted by a couple of Fortran developers who have suggested additional resources and tips for compiling.

Comments

ASObjC Explorer has been updated

 

ASObjC Explorer for Mavericks

Changes from version 3.2.0 to 3.2.2

Code-completion enhancements. Code-completion has been enhanced for relevance. As part of this, Explorer supports new variable-naming conventions. Please read the section entitled 'Contextual Completion' in the Help file for more details.

Bug fix. Choosing 'Look Up in AppKiDo' from the contextual menu in the Library Pane erroneously entered the resulting script in the log. This no longer happens.

Sparkle update. A newer version of the Sparkle update framework is included. If you choose automatic updates, they should now happen automatically.

Updated scripts and user shortcuts. You can extract these from the application's bundle, or remove the existing ones and relaunch the application to have them installed automatically.

Updated example scripts. These have been updated to reflect the new variable naming conventions

Comments

Fortran on a Mac

 

First let me say I’m not a big Fortran user but any blog posts about Fortran always seem to be very popular, and I do get asked regularly about how to compile Fortran applications. So I’ve put together a page summarising all the resources that I’m aware of, together with some installation instructions.

If you know of anything I’ve missed feel free to email me or add them to the comments.


Comments

Scientific Programming

 

Whenever I post anything about Fortran there is a noticeable uptick in page views, so I thought I’d post a link to this review on Ars Technica:

Scientific computing’s future: Can any coding language top a 1950s behemoth? This is a very interesting review and discussion of scientific programming, and it is no surprise that for the more computationally intensive number-crunching applications Fortran still rules.

“I don't know what the language of the year 2000 will look like, but I know it will be called Fortran.” —Tony Hoare, winner of the 1980 Turing Award, in 1982.

Comments

Porting of BUDE (Bristol University Docking Engine) to OpenCL.

 

A recent publication, “High Performance in silico Virtual Drug Screening on Many-Core Processors” (DOI), describes porting BUDE (Bristol University Docking Engine) to OpenCL.

Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single NVIDIA GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from NVIDIA and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets.
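
The two figures quoted also tell you roughly what peak throughput the authors are measuring against. A quick back-of-the-envelope check, using only the numbers in the quote (the result is approximate because the 46% is itself rounded):

```python
# Figures taken from the quoted abstract.
sustained_tflops = 1.43   # sustained throughput on one GTX 680
fraction_of_peak = 0.46   # stated as 46% of peak

# If sustained = fraction * peak, then peak = sustained / fraction.
implied_peak_tflops = sustained_tflops / fraction_of_peak

print(f"Implied peak: about {implied_peak_tflops:.2f} TFLOP/s")
```

So the quoted efficiency corresponds to a theoretical peak of a little over 3 TFLOP/s for that card.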

BUDE is now one of the fastest HPC applications ever developed and nicely demonstrates the portability of OpenCL across different architectures.

There is a list of GPU accelerated applications here.

Comments

Absoft Pro Fortran 2014 v14 Compiler Suite

 

I’ve just been told that the Absoft Pro Fortran 2014 v14 Compiler Suite For Mac OS X is available.

Pro Fortran 2014 v14 - A few of the new features for this release are: AWE-Chart, AWE-Plot, AWE-Form & Enhanced AVX Instruction set performance. Pro Fortran 2014 builds faster code with Absoft's Exclusive Dynamic AP Load Balancing Technology, OpenMP 3.0 support, SMP Analyzer, Tools Plug-in, New HPC Scientific & Engineering Math Library and more.

The Absoft IDE is the only commercial Fortran/C++ development environment designed by Fortran experts. It includes: programmer's editor, Absoft's SMP and Vector analyzer, Fx3 graphical debugger, SMP and MPI control features, optimized math libraries and 2D/3D graphics.

Comments

UniChem Vortex script updated.

 

The Vortex script that accesses UniChem has been updated. This is a minor bug fix.

UniChem is a new web resource provided by the EBI. It is a 'Unified Chemical Identifier' system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between databases. UniChem currently contains data from 21 different data sources.



Comments

CLFORTRAN – Pure Fortran Interface to OpenCL

 

I know that Fortran is still very important in scientific computing so this may be of interest.

CLFORTRAN is an open source (LGPL) Fortran module, designed to provide direct access to GPU, CPU and accelerator based computing resources available by the OpenCL standard.

Added to the GPU Science page.

Comments

Graph Builder Updated

 

Graph Builder has been updated to version 10.9.16.

  • Made the heat map (aka: image map, point fill) and 3D scatter, surface and volume color mapping editor significantly better.
  • Added a palette that shows how to script a multi-level animated pie chart.
  • Removed deprecated system calls.
  • Adjusted many items under the hood in preparation for v11.
  • Special Note: The v11 build is being worked on and your feedback to support@vvi.com is very welcome.

Graph Builder is a powerful application rich in graphic editing, creation and programming to facilitate the visualization of information. It has a good complement of 2D and 3D graph features, a full-fledged user interface and is programmable. Paste data into table editors, write scripts to generate data, or load an Xcode plugin you write for data generation and to retrieve data from external sources.

There is a comprehensive list of data analysis tools for Mac OSX here.

Comments

Vortex runs on Raspberry Pi

 

A while back, in a very neat demonstration of the portable coding approach, Dotmatics released ElementalDB for the iPad, an application that does a substructure search of a 1,200,000-structure ChEMBL database in less than a second. Now they have gone even further and ported their data visualisation tool Vortex to the Raspberry Pi.

Raspberry Pi is a $35 credit-card sized computer that plugs into your TV and keyboard. It is used in electronics projects and for many of the functions usually assigned to a desktop PC such as spreadsheets, word-processing and games.  It features a 700MHz ARM processor and can run a Debian Linux derived operating system. Compiling Vortex on this platform took just a few minutes as it involved building upon Oracle’s JDK 7 which was recently released for the Pi.

Comments

Scripting Vortex to access UniChem

 

UniChem is a new web resource provided by the EBI. It is a 'Unified Chemical Identifier' system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between databases. UniChem currently contains data from 21 different data sources.

This script, originally created by Sune Askjær, first calculates the InChIKey for molecules in a workspace and then uses UniChem to search for information in multiple databases. It then provides a summary and a link to a locally generated summary table.
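
The summary step can be sketched in plain Python. This is not the Vortex script itself, and it does not call UniChem: it assumes the lookups have already been done and that each hit has been reduced to a dictionary with hypothetical `source` and `compound_id` fields. The keys and identifiers in the example are placeholders, not real data.

```python
from collections import defaultdict


def summarise_cross_references(results):
    """Group lookup hits by data source.

    `results` maps an InChIKey to a list of hits, each a dict with the
    hypothetical fields "source" and "compound_id".
    Returns {source: sorted list of (inchikey, compound_id)}.
    """
    by_source = defaultdict(list)
    for inchikey, hits in results.items():
        for hit in hits:
            by_source[hit["source"]].append((inchikey, hit["compound_id"]))
    return {source: sorted(pairs) for source, pairs in sorted(by_source.items())}


def format_summary(summary):
    """One line per source with the number of matching structures."""
    return "\n".join(
        f"{source}: {len(pairs)} match(es)" for source, pairs in summary.items()
    )


# Placeholder data standing in for real lookup results.
example_results = {
    "KEY-A": [
        {"source": "source_1", "compound_id": "ID-001"},
        {"source": "source_2", "compound_id": "ID-900"},
    ],
    "KEY-B": [{"source": "source_1", "compound_id": "ID-002"}],
    "KEY-C": [],
}

example_summary = summarise_cross_references(example_results)
print(format_summary(example_summary))
```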

Full details are here: Scripting Vortex 18.

Comments

SVGgh

 

I’ve just added SVGgh from GenerallyHelpfulSoftware to the MobileScience website.

SVGgh is a collection of classes for using SVG as artwork in iOS Apps. Includes a UIView and a button class.

No excuse for using bit mapped images!

Comments

Scripting Vortex 17 tutorial

 

In the tutorial Scripting Vortex 15 I showed how to create a contextual script for Vortex that downloads a specific PDB file. A FlexAlign Vortex script then identifies the structure column, gets the SMILES string of the selected molecule, generates a 3D structure and uses Flex Align to do a one-shot flexalign between the ligand in the system in MOE and the incoming ligand.

While this is useful if you have similar structures (perhaps analogues in a series) there will certainly be situations where it may be preferable to dock the new ligand into the binding site. The Scripting Vortex 17 tutorial describes how to achieve this.

Comments

The Julia Language

 

Whilst I’m not a full time programmer I do keep an eye out for tools that are suited for scientific computing on a Mac. Julia is a high-level, high-performance dynamic programming language for technical computing with an extensive mathematical function library.

The library, largely written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing. In addition, the Julia developer community is contributing a number of external packages through Julia’s built-in package manager

Julia’s LLVM-based just-in-time (JIT) compiler combined with the language’s design allow it to approach and often match the performance of C. The table below taken from the Julia website gives an idea of the relative performance for a number of simple benchmarks.

julia

Julia was designed to support multiple cores and cloud computing. IJulia is a Julia-language backend combined with the IPython interactive environment. This combination allows you to interact with the Julia language using IPython's powerful graphical notebook, which combines code, formatted text, math, and multimedia in a single document

For the Mac a Julia-version.dmg file is provided, which contains Julia.app. Installation is the same as any other Mac software – copy the Julia.app to your hard-drive (anywhere) or run from the disk image. The core of the Julia implementation is licensed under the MIT license.

Comments

iOS7 Tech talks

 

The iOS7 tech talk videos are now online.

Registered Apple Developers can watch full sessions from the iOS 7 Tech Talks to get in-depth guidance on developing for iOS 7. Discover how to create innovative, flexible, and intuitive apps that integrate the latest Apple technologies.

For science apps for iOS have a look at the mobile science page.

Comments

Graph Builder

 

I just got a message about an update to Graph Builder, a very popular and powerful application from VVimaging, Inc, rich in graphic editing, creation and programming to facilitate the visualization of information. It has an excellent complement of 2D and 3D graph features, a full-fledged user interface and is programmable. Paste data into table editors, write scripts to generate data, or load an Xcode plugin you write for data generation and to retrieve data from external sources. It also supports dynamic graphs.


Comments

iOS programming

 

As the MobileScience site expands I’ve started to add other resources in addition to the applications. There are now a couple of useful additions for iOS programmers if you are looking for a training course, a plotting library or a chemistry toolkit.

I’d be delighted to add any more useful resources.

Comments

FTranProjectBuilder

 

I’ve just updated the listing of Scientific Applications under Mavericks and I thought I’d highlight one application. The page I have on Tools for Fortran Programmers is consistently one of the top accessed blog entries. I’m sure one of the reasons for this popularity is FTranProjectBuilder, the only Mac-native Fortran development environment (IDE). It works with the gfortran, g95, ifort, Absoft Pro Fortran, NAG nagfor and PGI pgfortran compilers. Since I last mentioned it, FTranProjectBuilder has been updated six more times, in April, May, July, August and October, with new features like the ability to build static libraries, trackpad interaction and compiler errors now being marked in the source code. Full details are in the project notes on the Nocturnal Aviation Software website, and yes, all of the tools, including FTranProjectBuilder, are compatible with 10.6.8+ and run fine on 10.9 Mavericks.

Fortran still has a significant user base in the scientific community where the speed, portability, array handling and access to shared memory make it a very powerful option.

There is a nice comparison of programming languages from the viewpoint of scientists here

Comments

Scripting Vortex 16

 

OCHEM is a free open access site of annotated models and chemical data. OCHEM contains 1831772 experimental records for about 477 properties collected from 12457 sources. You are free to upload your own data and also build predictive models using existing or your own data.

There are also a number of already built models that the public can access. These include:

  • Ames test
  • CYP1A2 inhibition
  • LogP and Solubility

You can run predictions on OCHEM using simple REST-like web services. These Vortex scripts submit tasks to the various models and then retrieve the resulting prediction.
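
The submit-then-retrieve pattern can be sketched independently of the actual web service. In this minimal Python sketch the `submit` and `fetch` callables are hypothetical stand-ins for whatever HTTP calls the service needs, so no OCHEM URLs or parameters are assumed. Polling until a result appears is also my assumption about how a client would wait for a prediction. The fake service exists only so the example runs without a network connection.

```python
import time


def run_prediction(submit, fetch, smiles, poll_interval=1.0, max_polls=30,
                   sleep=time.sleep):
    """Submit one structure, then poll until a prediction is available.

    submit(smiles) -> task id; fetch(task_id) -> result, or None if not ready.
    Both are supplied by the caller (hypothetical wrappers around the service).
    """
    task_id = submit(smiles)
    for _ in range(max_polls):
        result = fetch(task_id)
        if result is not None:
            return result
        sleep(poll_interval)
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")


class FakeService:
    """In-memory stand-in that returns a result on the Nth poll."""

    def __init__(self, polls_needed):
        self.polls_needed = polls_needed
        self.polls = 0

    def submit(self, smiles):
        return "task-1"

    def fetch(self, task_id):
        self.polls += 1
        if self.polls >= self.polls_needed:
            return {"task": task_id, "prediction": 0.5}  # placeholder value
        return None


service = FakeService(polls_needed=3)
# Pass a no-op sleep so the demonstration does not actually wait.
demo_result = run_prediction(service.submit, service.fetch, "CCO",
                             sleep=lambda seconds: None)
print(demo_result, "after", service.polls, "polls")
```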

Comments

SeqAn

 

I’ve recently been sent details of SeqAn, an open source C++ library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data. It is released under BSD/3-clause license and is supported under Mac OS X: GNU/g++, LLVM/Clang (3.0+).

Andreas Döring, David Weese, Tobias Rausch and Knut Reinert. SeqAn an efficient, generic C++ library for sequence analysis. BMC Bioinformatics, 9:11, 2008. DOI

Comments

OpenCL training course

 

Simon McIntosh-Smith has just released a new OpenCL training course, “HandsOnOpenCL”, via GitHub. It includes a comprehensive set of exercises and solutions in C, C++ and Python.

There is a list of GPU-accelerated scientific applications here.