Macs in Chemistry

Insanely Great Science

DataWarrior update

 

A new version of DataWarrior has been released

v05.05.00: April 2021

  • 3D-Structure alignment considering shape and pharmacophoric features (PheSA)
  • Google Patent search and results in DataWarrior (keyword, structure, date, ...)
  • Link to Spaya synthesis planning server
  • Searchable and navigable user manual
  • Macro to retrieve and visualize world-wide Corona virus spreading
  • Lots of new features, range filter animations, smarter labels, ...


MacVector on Apple Silicon

 

The very popular bioinformatics tool MacVector 18.1 is now available to download. MacVector 18.1 is a Universal Binary application, which means it runs natively on both Apple Silicon M1 Macs and Intel Macs. MacVector 18.1 matches the “Big Sur” look and feel. …and for the first time in many, many years the MacVector icon has changed to match the square look of macOS Big Sur icons.


We ran some benchmarks to see how much faster MacVector now runs on an Apple Silicon MacBook Pro. We compared this against MacVector 18.0, which runs using Rosetta 2 emulation. In some cases you can see that the native Apple Silicon MacVector 18.1 runs 200% faster than the emulated MacVector 18.0.

More on benchmarks here.
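
As a rough guide to reading figures like this, the relationship between run times and a "percent faster" number can be sketched as below. This assumes "X% faster" means the increase in speed relative to the slower run; the function name and the timings are made-up illustration values, not MacVector's measurements.

```python
def percent_faster(old_seconds, new_seconds):
    """Percentage increase in speed when a job drops from old_seconds to new_seconds."""
    if old_seconds <= 0 or new_seconds <= 0:
        raise ValueError("run times must be positive")
    return (old_seconds / new_seconds - 1.0) * 100.0


# Hypothetical timings: 30 s under emulation, 10 s natively
print(percent_faster(30.0, 10.0))  # prints 200.0
```

Under that reading, "200% faster" corresponds to a job finishing in one third of the time.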


XQuartz updated

 


XQuartz 2.8.0 has been released for macOS 10.9 or later. I've been in touch with a couple of users and they report no issues so far. This is the first version with Apple Silicon support.

The XQuartz project is an open-source effort to develop a version of the X.Org X Window System that runs on OS X. Together with supporting libraries and applications, it forms the X11.app that Apple shipped with OS X versions 10.5 through 10.7.

Changes in 2.8.0

  • Adds native support for Apple Silicon Macs.
  • Removes support for versions of macOS older than 10.9
  • Uses system libXplugin
  • Removes build-time support for deprecated X11 libraries:
    • libXaw8
    • libXevie
    • libXfontcache
    • libxkbui
    • libXp
    • libXTrap
    • libXxf86misc
  • Removes deprecated commands:
    • gccmakedep
    • makedepend
    • xdmshell
    • xfindproxy
    • Xfake
  • Removes xpyb
  • Removes older libpng

Full release notes are here https://www.xquartz.org/releases/XQuartz-2.8.0.html


ChemDoodle 2D v11.4 Update Available

 

A new version of ChemDoodle is available; this is a free update for all subscribers.

Whilst there are a few bug fixes and stability improvements, the big news is the new feature enabling generation of a chemical structure from an image.

ChemDoodle now has the ability to recreate a chemical drawing from an image of a molecule, recovering the original chemical data. This function is performed using the File>Recover from Image... menu item. This function is different from inserting the image as inserting an image provides you with just a graphic, while recovering the chemical drawing allows you to regain access to the chemical information to use or edit further. We call this function Chemical Image Recovery (CIR). Some may also refer to this function as Optical Structure Recognition (OSR).

I can see this being a very popular feature.


RSC Emerging Technologies Competition

 

The Emerging Technologies Competition is the Royal Society of Chemistry’s annual initiative for early stage companies and academic entrepreneurs who want to commercialise their technologies to make a societal impact.

The competition is a European programme seeking to identify start-ups and spin outs who are developing the most novel, innovative and promising chemistry tech.

More details here https://www.rsc.org/competitions/emerging-technologies/

Closing date for applications April 18th.

Winners will:

  • Receive a cash prize of £20,000 (per competition category)
  • Be assigned a Royal Society of Chemistry mentor who will provide ongoing support for one year
  • Through discussion with their mentor, winners, who are an incorporated company, can be provided a business acceleration grant up to the value of £20,000
  • Gain wider support and advice from our competition partners and judges, during the final

WWDC21

 

I just got an email about the Apple Worldwide Developers Conference (WWDC21), June 7 - 11.

The Apple Worldwide Developers Conference is coming to a screen near you, June 7 to 11. Join the worldwide developer community for an all-online program with exciting announcements, sessions, and labs at no cost. You’ll get a first look at the latest Apple platforms, tools, and technologies — so you can create your most innovative apps and games yet.

What caught my eye was the Swift Student Challenge https://developer.apple.com/wwdc21/swift-student-challenge/.

We continue our long-standing tradition of supporting students who love to code with this year’s exciting Swift Student Challenge. Showcase your passion for coding by creating an incredible Swift playground on the topic of your choice. Winners will receive exclusive WWDC21 outerwear, a customized pin set, and one year of membership in the Apple Developer Program. This challenge is open to students around the world.

  • Submissions open on Tuesday, March 30, 2021, at 6:00 a.m. PDT.
  • Deadline for submissions is Sunday, April 18, 2021, at 11:59 p.m. PDT.
  • Applicants can view their status by end of business day on Tuesday, June 1, 2021.

RSC CICAG YouTube channel

 

The video of the ChimeraX workshop is now online https://youtu.be/M2K72Kgk718.

The RSC CICAG YouTube channel is building up a very useful collection of videos https://www.youtube.com/c/RSCCICAG.


Importing Open Source Antibiotics Data into DataWarrior

 

A couple of DataWarrior macros to import data directly from an online spreadsheet. Full details are here


The Open Source Antibiotics Consortium is a group of researchers working together to discover new antibiotics in an open source manner. All data is freely available and in the open. There is also an effort to develop open source solutions to the computational needs of the project. The CompChem Tools page provides links to a variety of Open-Source tools, scripts and Jupyter notebooks.


MarvinSketch freezes on macOS Big Sur

 

Apparently MarvinSketch is freezing under Big Sur. This problem is caused by a change in tab behavior in Big Sur: newly opened dialogs are now handled as tabs by default. The issue can be solved by changing the tab preferences of the system.

Details are here https://chemaxon.freshdesk.com/support/solutions/articles/43000613604-marvinsketch-freezes-on-macos-big-sur.

There are more details of Scientific Applications Under Big Sur here.


Open-Source Tools for Chemistry Workshops

 

Over the years RSC CICAG have held workshops providing tuition for various software packages. Because of physical limitations these workshops had to be limited to around 50 attendees, leaving us with a waiting list of folks who missed out. Because of the COVID-19 pandemic the meetings have been moved online, allowing much greater access. The recent Open Chemical Science workshops had attendees from 45 different countries. There was considerable interest in further workshops, so CICAG have started a monthly workshop series; each month a workshop will be held highlighting a particular package/resource.

The first four workshops are now organised; you can register here. These look to be very popular, with registrations already up to 150 for the first workshop, which is three times what we could have accommodated in a physical workshop.

There are four two-hour sessions in this series which will be run on Zoom.

Registration is free and you will be sent login details at a later date.

Register here https://cicag-open-source-tools-for-chemistry.eventbrite.com/.

These workshops are sponsored by Liverpool ChiroChem.

The previous workshops are now available on the RSC CICAG YouTube channel.


Open Chemistry and Google Summer of Code

 

OpenChem is once again participating in the Google Summer of Code.

Avogadro 2 Project Ideas

  • Project: Python-based Compute and Data Server
  • Project: Biological Data Visualization
  • Project: Scripting Bindings
  • Project: Integrate with RDKit
  • Project: Tools for Interactive Molecular Dynamics

Open Babel Project Ideas

  • Project: Integrate CoordGen library
  • Project: Implement MMTF format
  • Project: Test Framework Overhaul
  • Project: Develop a JavaScript version of Open Babel
  • Project: Develop a validation and standardization filter

cclib Project Ideas

  • Project: Support for QCSchema JSON output
  • Project: Implement new parsers
  • Project: Discovering computational chemistry content online

QC-Devs Project Ideas

  • Project: Visualization of Molecular Structure and Reactivity
  • Project: Extended Interoperability of ChemTools and Quantum Chemistry Software
  • Project: Visualize Chemical Reactions
  • Project: Extended interoperability of GOpt and Quantum Chemistry Software
  • Project: Implement Workflows for Calculation and Usage of Databases of Isolated Atom Densities
  • Project: Orthogonal Procrustes for Rectangular Matrices
  • Project: Faster Molecular Integrals with Density-Fitting

3Dmol.js Project Ideas

  • Project: Improve 3Dmol.js

gnina Project Ideas

  • Project: Improve gnina

NWChem Project Ideas

  • Project NWChem-JSON
  • Project NWChem-Python-Jupyter Interface
  • JSON-LD for Chemical Data

DeepChem Project Ideas

  • Project: PyTorch Lightning Implementation
  • Project: Semiconductor Modeling Support
  • Project: Protein Language Models

Miscellaneous Project Ideas

  • Project: OneMol: Google Docs & YouTube for Molecules

There are more details of the potential ideas here https://wiki.openchemistry.org/GSoCIdeas2021 or contribute your own idea.


CSD Software Portfolio from the CCDC upgraded for Big Sur

 

The full CSD software portfolio, including Mercury, ConQuest, Mogul, GOLD, CSD-CrossMiner, the CSD Python API and other components, has now been upgraded and tested for compatibility with Big Sur. We are pleased to report that the newly available 2020.3.1 CSD Release (only available on macOS) is fully supported on macOS Big Sur at point of release, both for Intel-based macs, as well as the newer M1 Apple silicon based macs. At this point we are aware of just two specific known issues for the newer silicon hardware machines:

  • The POV-Ray integration in Mercury for high-resolution graphics generation does not work on M1 Apple silicon based macs
  • The Aromatics Analyser component in the CSD-Materials menu of Mercury does not work on M1 Apple silicon based macs

We expect that these final remaining issues will be addressed in the next CSD software release.

Full details are here https://www.ccdc.cam.ac.uk/solutions/whats-new/.

More details on scientific applications under Big Sur are here https://www.macinchem.org/blog/files/1fd84c61d3f91608c1b9c413c8064cd4-2692.php


Open-Source Tools for Chemistry Workshops

 

Last year RSC CICAG held a five-day conference entitled Open Chemical Science. This online event proved to be enormously popular, with attendees from 45 different countries. One feature of the meeting was a series of workshops highlighting a number of open-source software tools; these workshops are now available on the RSC CICAG YouTube channel.

There was considerable interest in further workshops, so CICAG have started a monthly workshop series; each month a workshop will be held highlighting a particular package/resource.

The first four workshops are now organised; you can register here.

There are four two-hour sessions in this series which will be run on Zoom.

Registration is free and you will be sent login details at a later date.

Register here https://cicag-open-source-tools-for-chemistry.eventbrite.com/.


Schrödinger Software Release 2021-1

 

Schrödinger have just announced the release of the latest update.

This release fully supports macOS 11, Big Sur, (with previous releases of the suite remaining unsupported on Big Sur).

Computers running ARM64-based processors such as the Apple M1 chip are unsupported.


AI4Proteins webinar series

 

Save the dates!!

AI3SD and RSC-CICAG (The Royal Society of Chemistry – Chemical Information and Computer Applications Group) have teamed up to run an #AI4Proteins Seminar Series in 2021. This series starts on Wednesday 14th April 2021 and is made up of a set of sessions of 1-2 talks, ending with an all-day virtual conference on Thursday 17th June 2021.

Full details are on the website here.


Fortran on a Mac page updated

 

I've just updated the Fortran on a Mac page.

In particular

gfortran for ARM Big Sur (macOS 11.0) and Apple Silicon.

NAG Fortran compiler: the Fortran compiler for Apple Silicon Macs is now available to download. Available on Linux, Windows and macOS, including Apple Silicon Macs.

Absoft Pro Fortran 2021 For macOS and OS X. Fully compatible with macOS Big Sur (11.0).


Scientific computing on Apple M1

 

We are just starting to see a few benchmarks on the new Apple M1 chip using scientific applications.

This blog post looks like it will be really interesting to follow.

Scientific computing on Apple M1, vol 1: ASE and GPAW.

In this post, which I expect will be the first in a series, I’ll share the code that got me running with a basic Python 3.9, scipy, and matplotlib environment. However, I immediately took it further, getting a working – and quite well-performing – installations of the Atomic Simulation Environment (ASE), used for building, manipulating and visualizing atomistic structure files, as well as a parallel installation of the density functional theory code GPAW.

Bottom line

not even having 10 high-performance Xeon cores in the iMac Pro instead of only 4 high-performance M1 cores in the MacBook Pro brought the two systems to parity: the M1 MacBook Pro handily wins this comparison.


Homebrew on Apple Silicon

 

Homebrew has been updated

Apple Silicon is now officially supported for installations in /opt/homebrew. formulae.brew.sh formula pages indicate for which platforms bottles (binary packages) are provided and therefore whether they are supported by Homebrew. Homebrew doesn’t (yet) provide bottles for all packages on Apple Silicon that we do on Intel x86_64 but we welcome your help in doing so. Rosetta 2 on Apple Silicon still provides support for Intel x86_64 in /usr/local.
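
The two install locations mentioned in the announcement can be told apart from a script. The following is a minimal sketch, assuming the architecture strings that Python's platform.machine() reports on macOS ("arm64" natively on Apple Silicon, "x86_64" on Intel or under Rosetta 2); the function and dictionary names are hypothetical and not part of Homebrew.

```python
import platform

# Default prefixes named in the Homebrew announcement
HOMEBREW_PREFIXES = {
    "arm64": "/opt/homebrew",  # native Apple Silicon
    "x86_64": "/usr/local",    # Intel, or Intel builds run through Rosetta 2
}


def expected_homebrew_prefix(machine=None):
    """Return the default Homebrew prefix for a CPU architecture string."""
    machine = machine or platform.machine()
    try:
        return HOMEBREW_PREFIXES[machine]
    except KeyError:
        raise ValueError(f"no default Homebrew prefix known for {machine!r}")


print(expected_homebrew_prefix("arm64"))  # prints /opt/homebrew
```

Calling expected_homebrew_prefix() with no argument uses the architecture of the running Python interpreter, so a Python started under Rosetta 2 would be pointed at /usr/local.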


Ammolite

 

This looks really interesting. Ammolite enables the transfer of structure related objects from Biotite to PyMOL for visualization, via PyMOL’s Python API:

  • Import AtomArray and AtomArrayStack objects into PyMOL - without intermediate structure files
  • Convert PyMOL objects into AtomArray and AtomArrayStack instances.
  • Use Biotite’s boolean masks for atom selection in PyMOL.
  • Display images rendered with PyMOL in Jupyter notebooks.

To install

conda install -c conda-forge ammolite

The Biotite package bundles popular tasks in computational molecular biology into a uniform Python library.


MOE and the XQuartz 2.8.0 Beta

 

I just heard that there seems to be a problem with the XQuartz 2.8.0 beta that stops MOE from starting with OpenGL graphics. They recommend that you keep using XQuartz 2.7.11 until there is a version of the 2.8 release that works properly.


Python on Apple Silicon

 

A lot of people have been asking me about running data analysis on the new laptops with M1 chips. It looks like we are starting to see a few benchmarks appearing.

A recent blog post, Are The New M1 Macbooks Any Good for Data Science? Let’s Find Out, suggests that the performance of the M1 chip continues to impress.

All benchmarks come with caveats (some use "native" installations, others require Rosetta), but the reported results are that Python is approximately three times faster when run natively on a new M1 chip, NumPy looks to be slightly slower, pandas is twice as fast and scikit-learn is twice as fast.
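
Since the numbers depend on whether Python is running natively or through Rosetta, it can help to check which one you have before benchmarking. This is a minimal sketch, assuming the macOS sysctl key sysctl.proc_translated (which reports 1 for a translated process) and the architecture strings returned by platform.machine(); the helper names are hypothetical.

```python
import platform
import subprocess


def rosetta_translated():
    """Return True/False from macOS's sysctl.proc_translated flag, or None if unavailable."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # not macOS, or the key is not present on this system
    return result.stdout.strip() == "1"


def describe_python(machine, translated):
    """Summarise how the interpreter is running from its architecture and Rosetta flag."""
    if machine == "arm64":
        return "native Apple Silicon (arm64) build"
    if machine == "x86_64" and translated:
        return "Intel (x86_64) build running under Rosetta 2"
    return f"{machine} build"


if __name__ == "__main__":
    print(describe_python(platform.machine(), rosetta_translated()))
```

Running this with the same interpreter used for the benchmark shows which of the two cases a given timing belongs to.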

Instructions for installing TensorFlow 2.4 on Apple Silicon M1: installation under Conda environment have also been reported.

PyCharm, JetBrains’ IDE for Python development, now supports Apple Silicon M1 processors.


Pro Fit supports Big Sur

 

pro Fit 7 is now at version 7.0.18, supporting dark mode, Catalina, and Big Sur.

pro Fit is a macOS application for data/function analysis, plotting, and curve fitting. It is used by scientists, engineers and students to analyze their measurements and the mathematical models they use to describe them.

Highlights:

  • Data windows for storing and analyzing data
  • Drawing windows for plots and other graphics
  • Function windows for user defined functions
  • Write your own functions and scripts using Python or Pascal
  • Numerous Curve Fitting Algorithms: Levenberg-Marquardt, Robust, Multi-dimensional
  • High resolution, high quality drawings and graphs
  • Full PDF support for exporting figures
  • Big Sur and Retina Support

There is a comprehensive list of scientific applications under Big Sur here


Mapping chemical reaction space

 

Really nice paper looking at reaction classification based on text description, and visualisation using reaction fingerprints. Mapping the space of chemical reactions using attention-based neural networks. DOI

Can be installed using conda

All code is on GitHub https://github.com/rxn4chemistry/rxnfp/tree/master/.


The official release of GROMACS 2021 is now available

 

You can find the code, manual, release notes, installation instructions and test suite at the links below.

Code: ftp://ftp.gromacs.org/pub/gromacs/gromacs-2021.tar.gz.

Documentation: http://manual.gromacs.org/2021/index.html. (includes install guide, user guide, reference manual, and release notes)

Test Suite: ftp://ftp.gromacs.org/regressiontests/regressiontests-2021.tar.gz.


Unix tips for dealing with very large files

 

I've updated the page describing a variety of Unix commands that can be helpful when dealing with very large files. In particular I've added details of how to split very large files into more manageable chunks.

Dividing SDF files can be problematic since each division needs to fall at the end of a record, which is marked by "$$$$". I've spent a fair amount of time searching for a high-performance tool that will work for very, very large files. Many people suggest using awk.

AWK (awk) is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it's a filter, and is a standard feature of most Unix-like operating systems.

I've never used awk but with much cut and pasting from the invaluable Stack Overflow this script seems to work.

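awk -v RS='\\$\\$\\$\\$\n' -v nb=1000 -v c=1 '
# RS: records end at a "$$$$" line; nb: records per chunk; c: chunk counter
{
  file=sprintf("%s%s%06d.sdf",FILENAME,".chunk",c)
  # RT (a gawk extension) holds the text that matched RS, so "$$$$" is written back
  printf "%s%s",$0,RT > file
}
# after every nb records, close the finished chunk and move to the next
NR%nb==0 {close(file); c++}
' /Users/username/Desktop/SampleFiles/HitFinder_V11.sdf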

The result is shown in the image below. There are a couple of caveats: this script only works with the version of awk shipped with Big Sur (you should be able to install gawk using Homebrew and use that on older systems), and it requires that the file has Unix line endings. The resulting file names are not ideal, and if there are any awk experts out there who could tidy them up I'd be delighted to hear from you.
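
For anyone who would rather not depend on a particular awk, the same record-based split can be sketched in Python using only the standard library. This is an illustration rather than the script used on the page: the function name split_sdf and the output naming scheme are assumptions, and it reads the file line by line so memory use should stay small even for very large files.

```python
from pathlib import Path


def split_sdf(path, records_per_chunk=1000):
    """Split an SD file into numbered chunk files, each ending on a "$$$$" line."""
    src = Path(path)
    chunk_paths = []
    out = None
    records_in_chunk = 0
    # newline="" keeps the file's original line endings on read and write
    with src.open("r", newline="") as handle:
        for line in handle:
            if out is None:
                # Hypothetical naming scheme: name.chunk000001.sdf next to the input
                number = len(chunk_paths) + 1
                chunk_path = src.with_name(f"{src.stem}.chunk{number:06d}.sdf")
                chunk_paths.append(chunk_path)
                out = chunk_path.open("w", newline="")
            out.write(line)
            # A record is complete once its "$$$$" terminator line has been written
            if line.rstrip("\r\n") == "$$$$":
                records_in_chunk += 1
                if records_in_chunk == records_per_chunk:
                    out.close()
                    out = None
                    records_in_chunk = 0
    if out is not None:
        out.close()
    return chunk_paths
```

A call such as split_sdf("HitFinder_V11.sdf", records_per_chunk=1000) would write HitFinder_V11.chunk000001.sdf, HitFinder_V11.chunk000002.sdf and so on alongside the input file, and return the list of paths.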

awkresults


OpenMM on Apple silicon

 

The ARM OSX Migration seems to be quite active :-)

Update

And here it is

OpenMM is now available on condaforge for osx-64 to support new Apple hardware based on the Mac M1 chip!

To install

conda install -c conda-forge openmm

I'd be really interested in hearing about any benchmarking activities.


AI in Chemistry literature

 

Some of the more popular pages on the site are the compilation of resources, a listing of Open-Source Cheminformatics toolkits and a list of useful Python libraries for data science, so I thought I'd flag a recent post by Pat Walters listing some interesting machine learning publications in 2020 on his Practical Cheminformatics blog.

His post AI in Drug Discovery 2020 is certainly an invaluable starting point.


ChemDoodle 2D v11.3 Update

 

I just heard version 11.3 of ChemDoodle 2D software has been released.

ChemDoodle 2D v11.3 is a feature update. The main new feature is expert IUPAC naming support for free radicals. Other major features include new image output sizing options with previews, and stoichiometry table options.

ChemDoodle 2D is a popular and extensively featured Chemical Drawing and cheminformatics software tool.


Mathematica System requirements

 

A couple of folks have asked me about running Mathematica on Apple Silicon. I don't use Mathematica but the system compatibility is on their website.

Mathematica 12.2 is optimized for the latest operating systems and hardware.


Analyzing the runtime, energy usage, and performance of Tensorflow training on a M1 Mac Mini and Nvidia V100

 

An interesting comparison of Intel and M1 Macs with the Nvidia V100 when using TensorFlow.

https://wandb.ai/vanpelt/m1-benchmark/reports/Can-Apple-s-M1-help-you-train-models-faster-cheaper-than-NVIDIA-s-V100---VmlldzozNTkyMzg

We ran a sweep of 8 different configurations of our training script and show that the Apple M1 offers impressive performance within reach of much more expensive and less energy efficient accelerators such as the Nvidia V100 for smaller architectures and datasets.

Code is available on Colab https://colab.research.google.com/drive/1RvZBpzJRW9MNPWQ2rZG8HIyHJbaCwnTI.

They also include tips on setting up a Mac mini to run Tensorflow.

These are only initial results on modest data sets; it will be interesting to see the performance when Apple releases more Pro hardware.


Latest RSC CICAG newsletter

 

The latest RSC CICAG newsletter (Winter 2020) is now available http://rsccicag.org/newsletters.htm. It includes:

Chemical Information & Computer Applications Group Chair's Report
CICAG Planned and Proposed Future Meetings
Memories of Dr Angus McDougall, 1934-2020
CASP14: DeepMind’s AlphaFold 2 – an Assessment
Meeting Report: 3rd RSC BMCS & CICAG AI in Chemistry Meeting
ReadMe and HowTo for Lightning Poster Presentations
As Conferences went Online: What do we miss the most?
Alan F Neville, 1943-2020, BSc, PhD
Parallel Processing for Molecular Modeling in ChemDoodle 3D
Catalyst Science Discovery Centre & Museum Trust: A Year in Review
The 6th Tony Kent Strix Annual Memorial Lecture 2020
Open Chemical Science Meetings and Workshops - Introduction
Open Access Publishing for Chemistry – Meeting Reports
Open Data for Chemistry – Meeting Reports
Open Source Tools for Chemistry – Workshop Reports
RSC Open Access Journals and Future Plans
RSC’s Journal Archives Available for Text and Data Mining
Chemical Information / Cheminformatics and Related Books
News from AI3SD
Other Chemical Information Related News


Why Apple's M1 Chip is So Fast

 

A technical but still very accessible (15 min) analysis of the latest Apple M1 chip. Well worth spending a coffee break viewing.


JupyterLab 3.0 released

 

JupyterLab is the next-generation web-based user interface for Project Jupyter.

JupyterLab 3.0 includes a number of new features and enhancements that are described on the Jupyter blog. Full details are in the ChangeLog.

To install using conda

conda install -c conda-forge jupyterlab=3

However, note that some extensions may not yet have been updated.


OpenChem

 

OpenChem is a deep learning toolkit for Computational Chemistry with PyTorch backend. The goal of OpenChem is to make Deep Learning models an easy-to-use tool for Computational Chemistry and Drug Design Researchers.

You can read about it in this publication DOI.

All code is available on GitHub https://github.com/Mariewelt/OpenChem.

Requires

  • Modern NVIDIA GPU, compute capability 3.5 or newer.
  • Python 3.5 or newer (we recommend Anaconda distribution)
  • CUDA 9.0 or newer

numpy, pyyaml, scipy, ipython, mkl, scikit-learn, six, pytest, pytest-cov

The software is licensed under the MIT license


RDKit blog

 

If you are an RDKit user then you should bookmark Greg Landrum's RDKit blog https://greglandrum.github.io/rdkit-blog/about/. This is a new site and all the old content will be migrated in due course.


Annual Site Review

 

At the end of each year I have a look at the website analytics to see which items were the most popular.

Over the year there were 90,006 visitors, an increase of 28% over 2019, spending an average of 2.5 minutes per session. Looking at the regular visitors, there are around 4000 who visited 20-200 times per year. The US provided 28% of the visitors and the UK 8%, with Germany, India, China and Japan around 5%. As might be expected, 57% of the visitors were using a Mac, but 23% were Windows users, 9% iOS, 6% Android and 4% Linux. There has been a gradual increase in the number of visitors using mobile devices.

Again the most popular page was Fortran on a Mac which has been updated a couple of times this year with reader suggestions. Other popular pages include the Reviews and the Hints and Tutorials. The page describing the update to iBabel was particularly popular.


The post about Scientific Applications under Catalina made it to number 4 in the year's listing and elicited a significant amount of reader feedback.

The Mobile Science site has seen increased visitor numbers.

The most popular apps viewed were:

Merck PTE
IBM Micromedex Drug Info
PocketCAS: Mathematics Toolkit
The Periodic Table Project
Periodic Table

Also popular were

Python3IDE
Human Anatomy Atlas 2019
ChemTube3D
Molecular Constructor

The Twitter feed @macinchem has steadily attracted new followers and currently has 993 followers.

The most popular tweets were

IOData: A python library for reading, writing, and converting computational chemistry file formats

and

Google Colab is very cool.

