Macs in Chemistry

Insanely Great Science

Khronos Releases OpenCL 2.2


I see that OpenCL 2.2 has been released and reading through the press release there are a couple of notes that might be of wider interest.

“By finalizing OpenCL 2.2, Khronos has delivered on its promise to make C++ a first-class kernel language in the OpenCL standard,” said Neil Trevett, OpenCL chair and Khronos president. “The OpenCL working group is now free to continue its work with SYCL, to converge the power of single source parallel C++ programming with standard ISO C++, and to explore new markets and opportunities for OpenCL — such as embedded vision and inferencing. We are also working to converge with, and leverage, the Khronos Vulkan API — merging advanced graphics and compute into a single API.”

Vulkan is a new generation graphics and compute API that provides high-efficiency, cross-platform access to modern GPUs used in a wide variety of devices from computers and consoles to mobile phones and embedded platforms.

There is a page of GPU-accelerated science applications here.


Updated the GPU Science page


I've just updated the GPU Science page, adding a couple of new applications including cryoSPARC.


The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development


I just came across an interesting paper on cross-platform OpenCL programming, The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development. In particular it highlights a number of issues and offers workarounds. These include framework bugs, specification limitations and program bugs.

There are an increasing number of scientific applications taking advantage of GPU acceleration.


International Workshop on OpenCL


Now in its fourth year, the International Workshop on OpenCL will be held during April at the C3 Convention Center in Vienna, Austria. The majority of sessions will run April 20–21 and will consist of a mix of keynotes, academic papers, technical presentations, tutorials and poster sessions. The workshop kicks off on April 19 with an Advanced Hands On OpenCL tutorial.

There is a page of scientific applications taking advantage of the GPU here.


GPU-Accelerated Scientific Applications


There has been a discussion about GPU-Accelerated Scientific Applications on CCL so I thought it was a good time to update the GPU Science page.

Many thanks to all those who contributed.


The 3rd International Workshop on OpenCL


The 3rd IWOCL (International Workshop on OpenCL) takes place at Stanford University, California from Monday 11 to Wednesday 13 May 2015, and includes the addition of an Advanced Hands-On OpenCL course to the schedule on Monday.

More details

Acceleware will be offering two 4-day training courses in Canada. The first course will be in Calgary, Alberta from May 26-29, 2015. The second course will be offered in Montreal, June 9-12, 2015. These professional four-day courses are designed for programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the data parallel processing capabilities of GPUs.

More details


The Khronos Group has released revision 30 of the SPIR-V specification. This revision of SPIR-V includes multiple corrections and synchronizes all token spellings to the official headers. These official C/C++ headers are available along with the specification in the registry.


Vulkan: New graphics API


The Khronos Group has announced a series of lectures describing Vulkan, the new cross-platform graphics API: “Vulkan: High-Efficiency GPU Graphics and Compute”. Vulkan is also known as the Next Generation OpenGL Initiative.

There are more details on the Khronos website.


Porting of BUDE (Bristol University Docking Engine) to OpenCL.


A recent publication “High Performance in silico Virtual Drug Screening on Many-Core Processors” DOI describes porting BUDE (Bristol University Docking Engine) to OpenCL.

Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single NVIDIA GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from NVIDIA and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets.

BUDE is now one of the fastest HPC applications ever developed and nicely demonstrates the portability of OpenCL across different architectures.
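As a quick sanity check on the figures quoted above, sustaining 1.43 TFLOP/s at 46% of peak implies a peak of roughly 3.1 TFLOP/s for the card. The snippet below just does that division; the variable names are mine and nothing here comes from the BUDE code itself.

```python
# Figures quoted in the paper's abstract
sustained_tflops = 1.43   # sustained performance on a single GTX 680
fraction_of_peak = 0.46   # stated as 46% of peak

# Peak performance implied by those two numbers
implied_peak_tflops = sustained_tflops / fraction_of_peak

print(f"Implied peak: {implied_peak_tflops:.2f} TFLOP/s")
```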

There is a list of GPU accelerated applications here.


CLFORTRAN – Pure Fortran Interface to OpenCL


I know that Fortran is still very important in scientific computing so this may be of interest.

CLFORTRAN is an open source (LGPL) Fortran module, designed to provide direct access to GPU, CPU and accelerator based computing resources available by the OpenCL standard.

Added to the GPU Science page.


OpenCL training course


Simon McIntosh-Smith has just released a new OpenCL training course “HandsOnOpenCL” via GitHub. It includes a comprehensive set of exercises and solutions in C, C++ & Python.

There is a list of GPU-accelerated scientific applications here.



I was at the Cresset UGM last week and had a chance to hear more about BlazeGPU. The original CPU application Blaze uses the shape and electrostatic character of known ligands to rapidly search large chemical collections for molecules with similar properties. The latest version BlazeGPU runs at 40 times the speed of the CPU version of Blaze but loses nothing in accuracy. At a fraction of the hardware cost, BlazeGPU delivers the same effective, ligand based virtual screening as Blaze, based on the shape and electrostatic nature of molecules.

BlazeGPU is written in OpenCL, and OpenCL libraries are available from NVIDIA and AMD for their graphics cards, and also from Intel for the CPU and for their new Xeon Phi coprocessor cards. BlazeGPU is currently designed to run only on the GPU. For CPU-only clusters the original code is just as fast, and on a machine with a reasonably fast GPU or two the CPU tends to run flat out just feeding data to the graphics card, so there is not much gain from running on the CPU as well as the GPU.

Currently the conformer generation still runs on the CPU, but they are looking at the possibility of porting that to OpenCL as well in the future.

The relative performance is shown in the plot below. It is worth noting that these are relatively inexpensive graphics cards that you can pick up on Amazon or eBay for a few hundred pounds. Also note that for a $2.10/hour GPU instance on Amazon EC2 you can process 2m conformations.


There are more examples of GPU science here.


International Workshop on OpenCL (IWOCL)

This might be of interest to those involved in developing scientific applications that take advantage of the GPU.

The International Workshop on OpenCL (IWOCL) is an annual meeting of vendors, researchers and developers to promote the evolution and advancement of the OpenCL standard. The meeting is open to anyone who is interested in contributing to, and participating in the OpenCL community. IWOCL is the premier forum for the presentation and discussion of new designs, trends, algorithms, programming models, software, tools and ideas for OpenCL. Additionally, IWOCL provides a formal channel for community feedback to OpenCL promoters and contributors.

May 13-14, 2013, Georgia Institute of Technology; more details here.


OpenCL conference

I just noticed the announcement of the first International Workshop on OpenCL.

The International Workshop on OpenCL (IWOCL) is an annual meeting of vendors, researchers and developers to promote the evolution and advancement of the OpenCL standard. The meeting is open to anyone who is interested in contributing to, and participating in the OpenCL community. IWOCL is the premier forum for the presentation and discussion of new designs, trends, algorithms, programming models, software, tools and ideas for OpenCL. Additionally, IWOCL provides a formal channel for community feedback to OpenCL promoters and contributors.

The closing date for submitting papers is Feb 8th.

We solicit the submission of unpublished technical papers detailing innovative, original research related to OpenCL. All topics related to OpenCL are of interest, including OpenCL applications from any domain (e.g., scientific computing, video games, computer graphics, multimedia, information retrieval, optimization, text processing, data mining, finance, signal and image processing and numerical solvers), OpenCL performance analysis and modeling, OpenCL performance and correctness tools and proposed OpenCL extensions.

There is a listing of GPU accelerated scientific applications here.


GPU-FS-kNN: A Software Tool for Fast and Scalable kNN Computation Using GPUs

A recent publication describes the development of a software tool GPU-FS-kNN (GPU-based Fast and Scalable k-Nearest Neighbour) for CUDA-enabled GPUs. The basic approach is simple and adaptable to other available GPU architectures. They observed speed-ups of 50–60 times compared with a CPU implementation on a well-known breast microarray study and its associated data sets.

The source code of the proposed GPU-based fast and scalable k nearest neighbor search technique (GPU-FS-kNN) is available at under GNU Public License (GPL).
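For anyone unfamiliar with the underlying problem, here is a minimal sketch of a brute-force k-nearest-neighbour search on the CPU. This is not the GPU-FS-kNN algorithm, just the plain all-pairs calculation that tools like it aim to speed up; the function name and the toy data are my own.

```python
import heapq
import math


def knn_brute_force(points, k):
    """For each point, return the indices of its k nearest other points."""
    neighbours = []
    for i, p in enumerate(points):
        # Euclidean distance from point i to every other point
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        # Keep only the k smallest distances
        nearest = heapq.nsmallest(k, dists)
        neighbours.append([j for _, j in nearest])
    return neighbours


# Toy data set: four points in two dimensions
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
result = knn_brute_force(points, k=2)
print(result)
```

The work grows with the square of the number of points, and every distance can be computed independently, which is why the problem maps well onto a GPU.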

There is a listing of GPU accelerated scientific applications here.


UK Many-Core developer conference 2012 (UKMAC 2012)

Based on the weblogs there appears to be significant interest in GPU accelerated scientific applications so I thought I’d highlight this meeting that was mentioned to me.

Registration is now open. Only a limited number of spaces are available, so please register early (£70 for the conference only, £99.50 including the conference dinner). The UK Many-Core developer conference 2012 (UKMAC 2012) follows on from the UK GPU developer conference and is now in its fourth year. Previous conferences have been held at Oxford, Cambridge and Imperial College.


GPU accelerated applications in science

Last week I highlighted a couple of GPU accelerated applications and based on the number of views of the page I thought it might be worth looking to see how many GPU accelerated science applications are available.

It should be noted from the start that there are two main programming frameworks for writing programs that can execute on the GPU. OpenCL, originally developed by Apple, is an open standard supported by a wide variety of graphics card vendors. The other major framework is CUDA, which was developed by NVIDIA and is specific to NVIDIA graphics units. Whilst it is true that for several years CUDA gave higher performance, recent developments with OpenCL have probably closed the gap; see “A Comprehensive Performance Comparison of CUDA and OpenCL” DOI.

I’ve compiled a listing of GPU-accelerated science applications here.


Lumo:- Molecular Orbital Visualisation

I’ve recently noticed an increasing interest in harnessing the computational power of the graphics card to accelerate scientific calculations.

The latest application is Lumo, which accelerates the visualization of molecular orbitals from electronic structure calculations by harnessing the power of the graphics processing unit in modern Macs. Lumo currently reads formatted checkpoint files from Gaussian03/09 calculations and there is preliminary support for Orca output files. Lumo was designed to speed up the slow part of looking at molecular orbitals and making molecular orbital diagrams. Lumo eliminates several steps along the process by reading in the output of programs like Gaussian, quickly visualizing the orbitals, and creating pictures of the essential orbitals in seconds.

Lumo requires Mac OS X 10.6 or higher, a 64-bit processor, and an OpenCL-capable compute device. Lumo is routinely run on MacBook Pros and MacBook Airs. For analysis of larger systems, it is recommended to have at least 4 GB of system RAM.

There is a movie of Lumo in action on the website.


Try out GPU-accelerated code

NVIDIA invites you to take a free and exclusive test drive to experience running your computational chemistry applications 5x faster with GPUs. The test drive is hosted on a remote cluster loaded with the latest GPU-accelerated applications so you don’t need to set up any hardware or software. Simply log on and run your application as usual, no GPU programming expertise required. Try it now and see how you can reduce simulation time from days to hours.

Try any of the following GPU-accelerated applications: AMBER, NAMD, LAMMPS, TeraChem, Quantum Espresso (coming soon) and GROMACS (coming soon). Or try your self-developed code.

Sign up today:


FastROCS updated

FastROCS is an extremely fast shape comparison application, based on the idea that molecules have similar shape if their volumes overlay well and any volume mismatch is a measure of dissimilarity. Running on the latest high-performance graphics cards, it can process 2 million conformations per second on a Quad Fermi box.
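To make the "volume overlap" idea concrete, here is a rough sketch of a shape Tanimoto score. It assumes each molecule's volume has already been reduced to a set of occupied grid cells and that the two molecules are already aligned, which is a simplification of my own for illustration and not how FastROCS itself represents or overlays shapes.

```python
def shape_tanimoto(cells_a, cells_b):
    """Overlap volume divided by combined volume, for two sets of grid cells."""
    overlap = len(cells_a & cells_b)   # cells occupied by both molecules
    union = len(cells_a | cells_b)     # cells occupied by either molecule
    return overlap / union if union else 0.0


# Two hypothetical, pre-aligned molecules as occupied (x, y, z) grid cells
mol_a = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)}
mol_b = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}

similarity = shape_tanimoto(mol_a, mol_b)   # 3 shared cells out of 5
mismatch = 1.0 - similarity                 # volume mismatch as dissimilarity
print(similarity, mismatch)
```

Identical shapes score 1 and shapes with no overlap score 0, so the mismatch term is the measure of dissimilarity.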

If you want to find out more about the use of GPUs in scientific computing take a look at this podcast.


OpenCL in scientific computing

I was at the Cresset UGM yesterday and saw an excellent talk by Simon McIntosh-Smith describing their work porting the docking software BUDE (originally written in a mixture of Fortran and C++) over to OpenCL.

The performance gains were very impressive. Equally striking were the efficiency gains as measured by electricity usage: it looks like several thousand pounds will be saved for every million-compound docking run.

He also showed the portability of OpenCL code, allowing efficient use of both the GPU and CPU.

He has a report on “The GPU Computing Revolution” available online.

If you would like to learn more, Apple have an OpenCL section in the Developer library, Simon’s website is an invaluable resource, and there are a couple of recommended books (links to Amazon).


Teraflop-desktop Apple system

OpenMM released

Version 1.0 beta of OpenMM, a library which provides tools for modern molecular modeling simulation, has been released.


The latest OpenCL tutorial is on MacResearch

OpenCL Q & A

Latest lecture on OpenCL is on MacResearch.

OpenCL lecture 3

The third of a series of lectures on programming with OpenCL is available on MacResearch.