# GPU accelerated applications in science

I recently highlighted a couple of GPU-accelerated applications and, based on the number of views of the page, I thought it might be worth looking to see how many GPU-accelerated science applications are available.

It should be noted from the start that there are two main programming frameworks for writing programs that execute on the GPU. OpenCL, originally developed by Apple, is an open standard, now maintained by the Khronos Group, that is supported by a wide variety of graphics card vendors. The other major framework is CUDA, which was developed by Nvidia and is specific to Nvidia graphics units. Whilst it is true that for several years CUDA gave higher performance, recent developments with OpenCL have probably closed the gap; see "A Comprehensive Performance Comparison of CUDA and OpenCL" (DOI).

## GPU-accelerated Applications

Abalone a general purpose molecular modeling program focused on molecular dynamics of biopolymers.

Abinit a package whose main program allows one to find the total energy, charge density and electronic structure of systems made of electrons and nuclei (molecules and periodic solids) within Density Functional Theory (DFT), using pseudopotentials and a planewave or wavelet basis.

ACEMD bio-molecular dynamics (MD) software specially optimized to run on graphics processing units

ADF a suite of software to model chemical and physical properties. Its density functional theory programs are described as accurate and efficient, and the approximate quantum-based codes offer fast insight into complex systems. The programs run in parallel, out of the box, on any popular system (Windows, Mac, Linux/UNIX).

AMBER One of the new features of AMBER 11 was the ability to use NVIDIA GPUs to accelerate PMEMD for both explicit solvent PME and implicit solvent GB simulations. This has been further extended in AMBER 12.

AMIRA viewer for clinical or preclinical image data, nuclear data, optical or electron microscopy imagery, molecular models, vector and flow data, simulation data on finite element models, and all types of multidimensional image, vector, tensor, and geometry data

Arioc high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space DOI.

Ascalaph molecular modelling suite.

BarraCUDA a sequence mapping software that utilizes the massive parallelism of graphics processing units (GPUs) to accelerate the inexact alignment of short sequence reads to a particular location on a reference genome.

BigDFT a massively parallel DFT electronic structure code using a wavelet basis set, available as part of Abinit or standalone.

BlazeGPU. The original CPU application Blaze uses the shape and electrostatic character of known ligands to rapidly search large chemical collections for molecules with similar properties. The latest version, BlazeGPU, runs at 40 times the speed of the CPU version of Blaze but loses nothing in accuracy, delivering the same ligand-based virtual screening at a fraction of the hardware cost.

BUDE a generic molecular docking program.

CASINO a code for performing quantum Monte Carlo (QMC) electronic structure calculations for finite and periodic systems.

CHARMM The most recent release of CHARMM offers significant performance enhancements for conventional molecular dynamics calculations, e.g. MD with explicit solvent and periodic boundary conditions using PME. The enhanced performance comes from the DOMDEC module, developed by Antti-Pekka Hynninen and Michael Crowley, for simulations on parallel architectures, and from the CHARMM/OpenMM interface for GPU-accelerated molecular dynamics.

CLFORTRAN is an open source (LGPL) Fortran module designed to provide direct access to GPU, CPU and accelerator-based computing resources available through the OpenCL standard.

CP2K a program to perform atomistic and molecular simulations of solid state, liquid, molecular, and biological systems. It provides a general framework for different methods, such as density functional theory (DFT) using a mixed Gaussian and plane waves approach (GPW) and classical pair and many-body potentials.

CUDA-BLAST designed to accelerate NCBI BLASTP for scanning protein sequence databases on GPUs.

CUDA-MEME a motif discovery software based on the MEME (version 3.5.4) algorithm.

CUSHAW - fast short read alignment for CUDA-enabled GPUs

CryoSPARC is an easy to use software tool that enables rapid, unbiased structure discovery of proteins and molecular complexes from cryo-EM data. DOI.

Core Hopping In addition to more conventional ligand-based methods, Core Hopping offers receptor-based scaffold hopping, exploiting information about the active site and known binding poses to guide the search for novel cores.

FastROCS a fast shape comparison application, based on the idea that molecules have similar shape if their volumes overlay well and any volume mismatch is a measure of dissimilarity. It uses a smooth Gaussian function to represent the molecular volume, so it is possible to routinely minimize to the best global match.

Fen Zi (yun dong de Fen Zi = Moving MOLECULES) is a CUDA code that enables large-scale, GPU-based MD simulations. The code of Fen Zi is now available in Google Code at http://code.google.com/p/fen-zi/.

FieldScreen virtual screening using molecular fields.

GAMESS-US a program for ab initio molecular quantum chemistry. Briefly, GAMESS can compute SCF wavefunctions ranging from RHF, ROHF, UHF, GVB, and MCSCF. Correlation corrections to these SCF wavefunctions include Configuration Interaction, second-order perturbation theory, and Coupled-Cluster approaches, as well as the Density Functional Theory approximation.

GAMESS-UK a general purpose ab initio molecular electronic structure program for performing SCF-, DFT- and MCSCF-gradient calculations, together with a variety of techniques for post Hartree-Fock calculations.

GPAW a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). It uses real-space uniform grids and multigrid methods or atom-centered basis-functions.

GPU-BLAST an accelerated version of the popular NCBI-BLAST (www.ncbi.nlm.nih.gov). In comparison to the sequential NCBI-BLAST, GPU-BLAST is nearly four times faster, while producing identical results.

GPU-FS-kNN A recent publication describes the development of GPU-FS-kNN (GPU-based Fast and Scalable k-Nearest Neighbour), a software tool for CUDA-enabled GPUs. The basic approach is simple and adaptable to other available GPU architectures. The authors observed speed-ups of 50–60 times compared with a CPU implementation on a well-known breast microarray study and its associated data sets.

GPU-HMMER an open source GPU-accelerated implementation of the HMMER protein sequence analysis suite.

GROMACS a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.

HALMD is a high-precision molecular dynamics package for the large-scale simulation of simple and complex liquids. HALMD supports acceleration through CUDA-enabled graphics processors.

HOOMD-blue performs general purpose particle dynamics simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to many processor cores on a fast cluster.

IPV an interactive protein visualizer based on a ray-tracing engine. Targeting high quality images and ease of interaction, IPV uses the latest GPU computing acceleration techniques.

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.

LSDalton for computing Hartree-Fock and DFT wave functions, energies, and molecular properties.

LUMO accelerates the visualization of molecular orbitals from electronic structure calculations by harnessing the power of the graphics processing unit in modern Macs. LUMO currently reads formatted checkpoint files from Gaussian 03/09 calculations, with more formats coming soon.

Matlab a high-level language and interactive environment for numerical computation, visualization, and programming.

Mathematica 8 a sophisticated development environment that combines a flexible programming language with a wide range of symbolic and numeric computational capabilities. It is also a development platform that fully integrates computation into complete workflows.

Mendel-GPU Haplotyping and genotype imputation on Graphics Processing Units

Molcas methods that will allow an accurate ab initio treatment of very general electronic structure problems for molecular systems in both ground and excited states.

Molegro Virtual Docker 5 makes it possible to perform virtual screening runs on a graphics processing unit (GPU).

Molpro a complete system of ab initio programs for molecular electronic structure calculations.

MOPAC is a semiempirical quantum chemistry program based on Dewar and Thiel's NDDO approximation.

MUMmerGPU a high-throughput DNA sequence alignment program that runs on nVidia G80-class GPUs.

NAMD a molecular dynamics program that excels at simulating, in atomic detail, the complex molecular machinery of living cells.

NWChem computational chemistry software for studying the kinetics and dynamics of chemical transformations, and chemistry at interfaces and in the condensed phase.

Parstream database engine designed and optimized for processing extreme amounts of data on highly parallel processor architectures.

PASHA a parallel short read assembler for large genomes using de Bruijn graphs.

Octopus a density functional theory code for ground-state calculations.

OpenGE an open source project for analyzing and interpreting high-throughput sequencing data.

OpenMM a library which provides tools for modern molecular modeling simulation. As a library it can be hooked into any code, allowing that code to do molecular modeling with minimal extra coding.

PIPER an FFT-based protein docking program with pairwise potentials

PyMOL a molecular visualisation system.

Q-Chem ab initio quantum chemistry package for accurate predictions of molecular structures, reactivities, and vibrational, electronic and NMR spectra.

QMCPack Quantum Monte Carlo techniques provide some of the most accurate solutions to quantum mechanical problems.

Quantum Espresso an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.

QUICK is a GPU-enabled ab initio quantum chemistry software package.

SOAP3 a GPU-based software for aligning short reads with a reference sequence.

SeqNFind addresses the need for complete and accurate alignments of many small sequences against entire genomes.

Smoldyn on Graphics Processing Units: Massively Parallel Brownian Dynamics Simulations

TeraChem general purpose quantum chemistry software designed to run on NVIDIA GPU architectures.

Torch a scientific computing framework with wide support for machine learning algorithms

Unipro UGENE a unified bioinformatics toolkit.

VASP the Vienna Ab initio Simulation Package, a computer program for atomic scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.

VMD a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting. VMD supports computers running MacOS X, Unix, or Windows, is distributed free of charge, and includes source code.

WideLM is envisioned as a fast, streamlined version of R's venerable linear modelling command, "lm", tailored specifically for the case of numerous modestly-sized submodels taken from a large under-determined (p >> n) system.
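To make the WideLM use case concrete, here is a minimal CPU-only sketch of the underlying idea: fitting many small least-squares submodels, each drawn from a wide predictor matrix with more columns than rows. It is not WideLM's code or API. The function names (`solve_linear_system`, `fit_submodel`, `fit_all_pairs`) are hypothetical, and the choice of two-predictor submodels with an intercept is an assumption made purely for illustration. Each submodel is independent of the others, which is what makes this kind of workload a natural fit for a GPU.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_submodel(X, y, cols):
    """Ordinary least squares with an intercept on the chosen columns.

    Returns [intercept, coefficient for cols[0], coefficient for cols[1], ...].
    """
    design = [[1.0] + [float(row[c]) for c in cols] for row in X]
    k = len(cols) + 1
    # Normal equations: (D^T D) beta = D^T y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def fit_all_pairs(X, y):
    """Fit one small submodel per pair of predictor columns (illustrative choice)."""
    p = len(X[0])
    return {cols: fit_submodel(X, y, cols) for cols in combinations(range(p), 2)}
```

Calling `fit_all_pairs(X, y)` with `X` as a list of rows and `y` as a list of responses returns a dictionary keyed by column pair. With p predictors there are p(p-1)/2 such pairs, so the number of submodels grows quickly as the system gets wider.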

## A few useful sites

GPUscience aims to offer a single point of access to all advances of graphics processing units (GPUs) in all areas of science, technology and medicine.

Khronos Group a not for profit industry consortium creating open standards for the authoring and acceleration of parallel computing, graphics, dynamic media, computer vision and sensor processing on a wide variety of platforms and devices.

A survey of computational molecular science using graphics processing units

Simon McIntosh-Smith website Senior Lecturer in High Performance Computing and Architectures, Simon McIntosh-Smith has also released a new OpenCL training course, "HandsOnOpenCL", via GitHub. It includes a comprehensive set of exercises and solutions in C, C++ and Python.

OpenCL University Training Courses

Podcast with Andrew Dalke, Brian Cole and Imran Haque about their experiences with GPU computing and their success at implementing the ROCS and Lingo algorithms on the GPU.

List of quantum chemistry and solid-state physics software A table illustrating the capabilities of the most versatile software packages and whether or not they support GPU acceleration.

Last Updated 4 May 2017