Macs in Chemistry

Insanely Great Science

Setting up ML and AI tools on Apple Silicon

I've had a number of questions about setting up a machine learning/artificial intelligence environment on an Apple Silicon Mac, so I've tried to write a step-by-step guide.

The guide, Setting up ML and AI tools on Apple Silicon, uses Homebrew and conda to install the tools and manage compatibility and dependencies.

I've also created a .yml file that you can use instead of going through all the steps.

There are a couple of example Jupyter notebooks that give a starting point for trying things out.

I'm very much aware that this is a bit of a moving target at the moment so comments/suggestions are much appreciated.

MayaChemTools update

The awesome MayaChemTools http://www.mayachemtools.org/index.html has a couple of new additions and updates.

Two new command line scripts:

In addition, the Psi4PerformTorsionScan.py and RDKitPerformTorsionScan.py scripts have been updated to optionally filter matched torsions by atom indices for performing torsion scans. A number of enhancements have been made to the PyMOLVisualizeMacromolecules.py script, including visualization of docked poses.

All scripts are listed here http://www.mayachemtools.org/docs/scripts/html/index.html.

alvaMolecule updated

Alvascience have just released an update to alvaMolecule https://www.alvascience.com/alvamolecule/.

Here is the list of the main changes of alvaMolecule:

  • Import XYZ cartesian coordinates format (*.xyz)
  • Export dataset as Excel file (*.xlsx)
  • Delete molecules
  • Find molecular structures in Google Patents/Scholar
  • Molecule grid: filter molecules by substructure (SMARTS)
  • Molecule grid: column value can be set as footer of a molecule cell
  • Charts: added molecule hint on charts
  • Charts: added context menu to save data of charts
  • Charts: added 'Lasso selection'
  • Highlight substructures (SMARTS)
  • Highlight Bemis-Murcko features
  • Standardizers: standardize nitro group as [N+]([O-])(=O)
  • Standardizers: remove radicals preserving usual valence
  • Standardizers: neutralize atoms
  • Standardizers: neutralize molecule
  • Standardizers: custom standardizer with SMIRKS
  • Checkers: modified the non-standard atom set which now includes the following atoms: H, B, C, N, O, P, S, F, Cl, Br, I
  • Duplicate analysis: identify duplicated structures
  • Duplicate analysis: automatically delete duplicated structures
  • Duplicate analysis: identify molecules with the same value for a specific column
  • Scaffold analysis: identify the Bemis-Murcko scaffolds

alvaMolecule is free for academic and non-commercial use.

1st EUOS/SLAS Joint Challenge: Compound Solubility

The latest Kaggle challenge is up.

Develop new methods to predict compound solubility based on chemical structure.

EU-OPENSCREEN ERIC and SLAS challenge you to develop a reliable algorithm that can predict the solubility of a small molecule, an essential feature of all biologically active compounds. EU-OPENSCREEN ERIC provides a high-quality data set of experimentally measured aqueous solubility of about 100,000 small molecules which was produced at an EU-OPENSCREEN ERIC high throughput screening partner site. 70,000 of these molecules will be available for download on Kaggle, and the residual 30,000 compounds will be withheld for prediction.

Full details are here https://www.kaggle.com/competitions/euos-slas/overview.

Asahi Linux

Asahi Linux is a project and community with the goal of porting Linux to Apple Silicon Macs, starting with the 2020 M1 Mac Mini, MacBook Air, and MacBook Pro. More details here https://asahilinux.org/about/.

Asahi Linux is still in very early alpha stages. Lots of hardware components are not functional yet! Check out the Feature Support page first, and if you still want to give it a go, see the blog post for the alpha installer:

The alpha release https://asahilinux.org/2022/03/asahi-linux-alpha-release/.

All the code is on GitHub https://github.com/AsahiLinux.

Cambridge Cheminformatics Meeting

The next Cambridge Cheminformatics Network meeting is on 7th September at 4pm UK time at CCDC on Union Road, Cambridge https://www.ccdc.cam.ac.uk.

Programme

Systematic Evaluation of Local and Global Models for ADMET Prediction Elena Di Lascio, Novartis Institutes for BioMedical Research (remote) https://www.linkedin.com/in/elena-di-lascio/ https://www.novartis.com/research-development/novartis-institutes-biomedical-research

Adventures in AI Dan Ormsby, Dotmatics (in person) https://www.linkedin.com/in/danormsby https://www.dotmatics.com/

Face Value – Analysing Surfaces and Properties with CSD-Particle Andy Maloney, CCDC (in person) https://www.linkedin.com/in/andy-maloney-53b351227/ https://www.ccdc.cam.ac.uk/

Spotlight Talk: The Skills Alliance - Connecting People and Opportunities James Thompson, The Skills Alliance (in person) https://www.linkedin.com/in/james-thompson-b2b70139/ https://www.skillsalliance.com/

Details are here http://c-inf.net.

OPSIN 2.7.0 has been released

OPSIN - Open Parser for Systematic IUPAC Nomenclature, has been updated https://github.com/dan2097/opsin/releases/tag/2.7.0.

OPSIN is a Java library for IUPAC name-to-structure conversion offering high recall and precision on organic chemical nomenclature.

Java 8 (or higher) is required for OPSIN 2.7.0.

Supported outputs are SMILES, CML (Chemical Markup Language) and InChI (IUPAC International Chemical Identifier).

To convert a chemical name to SMILES:

java -jar opsin-cli-2.7.0-jar-with-dependencies.jar -osmi input.txt output.txt

where input.txt contains the chemical names, one per line.
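If you would rather drive this from a script, the command above can be wrapped along these lines. This is only a sketch: the helper names (write_names, opsin_command, names_to_smiles) are hypothetical, and it assumes Java is on your PATH and that the jar sits in the working directory under the file name used above.

```python
import subprocess
from pathlib import Path

# Assumption: jar file name as used in the command above, located in the working directory.
OPSIN_JAR = "opsin-cli-2.7.0-jar-with-dependencies.jar"


def write_names(names, path="input.txt"):
    """Write chemical names to a text file, one per line."""
    path = Path(path)
    path.write_text("\n".join(names) + "\n", encoding="utf-8")
    return path


def opsin_command(input_path, output_path, jar=OPSIN_JAR):
    """Build the same command line as shown above (-osmi requests SMILES output)."""
    return ["java", "-jar", str(jar), "-osmi", str(input_path), str(output_path)]


def names_to_smiles(names, workdir="."):
    """Run OPSIN on a list of names and return the lines of output.txt.

    Needs Java 8+ and the OPSIN jar in workdir; raises CalledProcessError
    if the java process exits with an error.
    """
    workdir = Path(workdir)
    infile = write_names(names, workdir / "input.txt")
    outfile = workdir / "output.txt"
    subprocess.run(opsin_command(infile.name, outfile.name), cwd=workdir, check=True)
    return outfile.read_text(encoding="utf-8").splitlines()
```

Calling names_to_smiles(["ethanol", "acetonitrile"]) should then give one output line per input name.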

ChemDoodle 3D v6.6 Update Available

Just got this message

We are pleased to introduce version 6.6 of our ChemDoodle 3D software. This update is free for all ChemDoodle subscriptions, Lifetime and Site licenses. Licenses are as little as $15, and we have a free trial available at: https://www.chemdoodle.com/3d

Major new features include the ability to perform full scene modeling simulations, an implementation of the FIRE optimizer, 3D molecular structure alignment, improved bond deduction with new bond order perception algorithms, and more.

Comparing the M2 MacBook Air

I've updated the pages comparing the new Apple Silicon machines with those with the older Intel chips https://www.macinchem.org/reviews/MacBooks/m1macbookpromax.php. In addition to the MacBook Pro M1 Max I've now added the M2 MacBook Air.

5th Artificial Intelligence in Chemistry Symposium

The lineup for the 5th Artificial Intelligence in Chemistry Symposium (Thursday-Friday, 1st-2nd September 2022) is now complete for both oral and poster presentations. It really is a fantastic selection of topics and speakers and it is clear this event is now a highlight of the scientific calendar.

The details are here https://www.rscbmcs.org/events/aichem22/

Registration is now open, register here https://www.eventsforce.net/hg3/221/home. In person registration deadline: Monday 29th August 17:00 (BST)

WebMolKit: switched to Apache 2.0

Just saw this.

WebMolKit is a cheminformatics library that I’ve been working on for a long time: it runs on all kinds of JavaScript engines (browsers, desktop via Electron, command line via NodeJS). Its flagship feature is a powerful chemical sketcher, but it also has many supporting functions for handling molecules. As of now, the licensing terms have been switched to Apache 2.0, which basically means you are allowed to use it for non-open projects, as long as proper credit is given.

I've updated the Open Source Cheminformatics Toolkits page.

ChimeraX on Apple M1 CPUs

News just in from the ChimeraX team https://www.rbvi.ucsf.edu/chimerax/data/czi-nov2021/apple_m1.html.

We are making a version of our ChimeraX molecular graphics program that runs natively on Apple's new M1 CPUs for faster interactive calculations. We'll report some speed-up timings and describe difficulties porting from Intel to the Apple M1 CPU. A native Apple M1 version of ChimeraX is not yet available, but we expect to release it within 6 months.

Difficulties porting ChimeraX to Apple M1 CPUs

  • ChimeraX Python and C++ code needs no changes.
  • ChimeraX uses 90 packages developed by others.
  • 60 are pure Python from the PyPi repository.
  • 30 are binary packages that need Apple M1 versions.
  • 6 binary packages do not have Apple M1 distributions: ambertools, h5py, imagecodecs, netcdf4, pytables, scipy.
  • Qt 6 window toolkit is distributed for Apple M1 but not Qt 5.
  • ChimeraX uses Qt 5, the stable Qt version from 2012 - 2021.
  • Qt 6 with html support was released September 2021.
  • Apple M1 applications must be either all native M1 binaries or all Intel binaries, no mixing.
  • Need to distribute either a large universal package that includes both Intel and M1 binaries, or two separate ChimeraX versions.

Potential advantages of native Apple M1 ChimeraX

  • Better OpenGL driver stability with Apple M1 GPU.
  • No graphics driver crashes among 43 ChimeraX bug reports in 2021 with Apple M1.
  • About 100 ChimeraX graphics driver crashes reported on Intel Macs in past 2 years.
  • Better C++ crash stack traces with native M1 app than with Intel emulation.
  • Intel ChimeraX crashes on M1 often give no C++ stack trace.

MOE 2022.02 released

The 2022.02 release of Chemical Computing Group's Molecular Operating Environment (MOE) software includes a variety of new features, including support for Apple Silicon!

This update also includes

  • Browser-based Combinatorial Library Enumeration with on-the-fly reagent search and library generation

  • MOEsaic Docking calculations with real-time visualization of results

  • scFv and custom antibody homology models

  • GPU-accelerated protein modeling and protein-protein docking

  • Hydrogen Mass Repartitioning for accelerating MD and Thermodynamic Integration

  • Database Viewer SNFG carbohydrate display, graphic objects, and enhanced plotting

If you want to read more about the performance gains using MOE on an M1 Mac have a look at this page https://www.macinchem.org/reviews/MacBooks/moe.php.

ChemDoodle 2D update

An updated version of this very popular chemical drawing package is available. ChemDoodle 2D v11.10 is a free update.

https://www.ichemlabs.com/news/read?post=cd2d1110released.

ChemDoodle 2D v11.10 includes further advancements to our cheminformatics functions along with some new features. The stereochemistry engine has seen a complete rewrite, with a brand new system for evaluating stereogeometries in 0D/2D/3D, leading to accurate generation and interpretation of wedge drawings and full support for stereochemistry in SMILES protocol. Further advancements in our industry-leading 2D coordinate layout algorithm are provided. New features include ring coloring for highlights, support for group objects in ChemDraw files, checking bonds for drawing warnings, new periodic table options and more. For macOS users, the QuickLook plugin is now functional on the latest macOS versions.

RSC CICAG Summer newsletter

The RSC CICAG newsletter is now available

http://www.rsccicag.org/indexhtmfiles/CICAG%20Newsletter%20Summer%202022%20FINAL.pdf

Chemical Information and Computer Applications Group Chair’s Report 4
Your CICAG Committee - Introducing Our New Members 5
CICAG Planned and Proposed Future Meetings 6
Free Workshops on Open-Source Tools for Chemistry 7
The COVID Moonshot 7
Practical Cheminformatics with Open-Source Software 11
The Catalyst Science and Discovery Centre Archives 12
Chemical Data Recovery 3: Legacy Chemical Data Recovery 15
Svante Wold, 1941-2022 22
Cheminformatics: a Digital History - Part 1 Early days at Sheffield: a Personal Perspective 23
Being #CompChemURG: Forging New pathways 28
UKeiG Call for Nominations for the Prestigious Tony Kent Strix Award 2022 29
Welcome to the New Era of Scientific Publishing 30
Greg Landrum Receives the Mike Lynch Award 37
Bioinformatics in the Post-AlphaFold 2 Era 38
Diana Leitch – Reflections on her Life in Chemistry, Chemical Information and Librarianship 45
Meeting Report: AI in Drug Discovery 50
AI4SD News 53
The IUPAC Green Book 55
RSC Historical Group – Women in Chemistry Symposium 56
ACS CINF Report for July 2022 57
News from CAS 57
Chemical Information / Cheminformatics and Related Books 59
Other Chemical Information News 61

We have already started compiling content for the winter newsletter, if you have suggestions or would like to contribute please get in touch.

Mnova 14.3 released

 

Mnova has just been updated and it runs on Apple Silicon.

Just to highlight a couple of new features

New Product! Mnova Screen 2D. Efficient Batch Processing Tools for Lead Discovery using Protein-Observed 2D NMR

This product is related to Mnova Screen, which can be used to process ligand-observed 1D NMR spectra (STD, T1rho, CPMG, WaterLogsy, etc.). Screen 2D processes the protein-observed 2D 1H-15N, 1H-13C, or 1H-13C/15N dual heteronuclear correlation (HSQC or HMQC) spectra to find binding ligands based on chemical shift perturbations.

Save the Whole Document as JCAMP - Mnova General

An enhancement to the way we handle JCAMP files. We have implemented a new file filter, "JCAMP-DX Document" (*.jdx *.dx *.jcm *.cs *.jcamp), which is analogous to the other JCAMP-DX except for the fact that when saving, it saves the entire document.

SIMCA Model Classification - Chemometrics. Soft independent modelling by class analogy (SIMCA) is a statistical method for supervised classification of data.

In order to build the classification models, the samples belonging to each class need to be analyzed using principal component analysis (PCA), from which only the significant components are retained.

Full details and download link are here

https://resources.mestrelab.com/top-highlights-in-mnova-14-3/

RSC CICAG Open Source tools for Chemistry Workshops

 

The latest of the RSC CICAG workshops is now online https://youtu.be/Ka08REoGYvI.

This is the latest of 20 workshops that are available on the RSC CICAG YouTube channel https://www.youtube.com/c/RSCCICAG. These workshops have now been viewed over 21,000 times and they are a fabulous way to find out about some of the Open-Source tools and resources that are available to chemists.

RSC CICAG are also organising a number of meetings

Details of the RSC CICAG meeting on Ultra-large Chemical Libraries are available on the CICAG website http://www.rsccicag.org/ultra-large%20chemical%20libraries.htm.

This one-day meeting will be held on 10 August 2022 10:00-17:00, at Burlington House, London. Registration is open https://www.rsc.org/events/detail/73675/ultra-large-chemical-libraries. The speakers have been finalised and it looks a great line-up. Remember bursaries are also available.

RSC CICAG and BMCS are organising the 5th Artificial Intelligence in Chemistry Symposium. #AIChem22

This two day meeting (Thursday-Friday, 1st-2nd September 2022) will be held at Churchill College Cambridge UK. Details of the meeting and registration are on the conference page website https://www.rscbmcs.org/events/aichem22/.

RSC CICAG in collaboration with the SCI are also organising the SCI-RSC Workshop on Computational Tools for Drug Discovery 2022.

This will be held on 23 November 2022 at The Studio Birmingham. Full details and registration are on the website https://www.soci.org/events/fine-chemicals-group/2022/scirsc-workshop-on-computational-tools-for-drug-discovery-2022

RSC CICAG Open Source Tools for Chemistry: Scoring of shape and ESP similarity (Esther Heid)

 

The latest of the RSC CICAG workshops is now online https://youtu.be/Ka08REoGYvI.

Electrostatic effects along with volume restrictions play a major role in enzyme and receptor recognition. Evaluating electrostatic and shape similarities of pairs of molecules such as proposed versus known ligands can therefore be valuable indicators of prospective binding affinities. This workshop will demonstrate how to compute electrostatic and shape similarities using the open-source tool ESP-Sim github.com/hesther/espsim, doi.org/10.26434/chemrxiv-2021-sqvv9-v3. Available options for comparing electrostatics will be discussed interactively on selected examples of public datasets, along with advice on embedding and aligning molecules prior to computing similarities.

Whilst comparing molecules using 1D or 2D descriptors is well known, most molecules are three dimensional, as are biomolecule binding sites. The comparison of molecular shapes and electrostatics is particularly challenging and this workshop is a perfect introduction. Come along and you have a chance to ask questions directly.

All materials are available on GitHub https://github.com/hesther/espsim/tree/master/workshop

Chemfp 4.0 has been released

 

Chemfp 4.0 was recently released, with support for several diversity selection algorithms, and an improved API for interactive use in a notebook environment.

Chemfp is an analytics package for cheminformatics fingerprints. It contains command-line tools and an extensive Python library for fingerprint generation, high-performance similarity search, diversity selection, and exploratory research.

The new diversity selection algorithms are MaxMin, sphere exclusion (both random and directed), and HeapSweep.
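To give a feel for what MaxMin does, here is a minimal pure-Python sketch of the greedy idea, using Tanimoto distance between fingerprints stored as sets of on-bit indices. It is an illustration only and not chemfp's API or implementation; the function names tanimoto and maxmin_pick are made up for this example, and chemfp's own routines will be far faster on real datasets.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def maxmin_pick(fingerprints, n_picks, first=0):
    """Greedy MaxMin selection (illustrative, not chemfp's implementation).

    Starting from fingerprints[first], repeatedly pick the candidate whose
    nearest already-picked neighbour is furthest away (distance = 1 - Tanimoto).
    Returns the list of picked indices in the order they were chosen.
    """
    if not fingerprints or n_picks <= 0:
        return []
    picked = [first]
    # min_dist[i] is the distance from fingerprint i to its closest picked fingerprint
    min_dist = [1.0 - tanimoto(fp, fingerprints[first]) for fp in fingerprints]
    while len(picked) < min(n_picks, len(fingerprints)):
        candidates = [i for i in range(len(fingerprints)) if i not in picked]
        best = max(candidates, key=lambda i: min_dist[i])
        picked.append(best)
        # Update each fingerprint's distance to the picked set with the new pick
        for i, fp in enumerate(fingerprints):
            d = 1.0 - tanimoto(fp, fingerprints[best])
            if d < min_dist[i]:
                min_dist[i] = d
    return picked


# Toy example: four small fingerprints as sets of on-bit indices
fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
print(maxmin_pick(fps, 3))  # [0, 2, 1]
```

In the toy example the third fingerprint shares no bits with the first, so it is picked second; the near-duplicate {1, 2, 3, 4} is left until last.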

People who live in the Jupyter notebook will likely enjoy the new chemfp user experience. Most long-term actions support progress bars, chemfp's Python objects have more informative repr()s, search results added Pandas integration, and there are new high-level APIs that let you express a lot of functionality compactly.

The Base License covers most in-house use of chemfp, though a few features are either limited or disabled and require a license key to unlock. For alternative licenses, including source code and no-cost academic licensing, see https://chemfp.com/license/ -- or try one of the re-formatted ChEMBL datasets at https://chemfp.com/datasets/ which include an embedded authorization key.

UCSF ChimeraX version 1.4 has been released

 

ChimeraX includes user documentation and is free for noncommercial use. Download for Windows, Linux, and macOS from https://www.rbvi.ucsf.edu/chimerax/

Updates since version 1.3 (Dec 2021) include:

  • search/retrieve from EBI AlphaFold DB 3rd release, ~1 million structures
  • display AlphaFold predicted aligned error (PAE) plots
  • can BLAST UniRef100,90,50 (+ previous choices AlphaFold, PDB, NR)
  • AlphaFold prediction of multimers (of limited size) on Google Colab
  • independent centers of rotation available as mouse mode
  • join models with Build Structure tool or command
  • define axes for display and/or use in measurements
  • increase/decrease VDW radii relative to their current values
  • switch PDB residue numbering scheme (author/canonical/uniprot)
  • align sequences with Clustal Omega or MUSCLE
  • calculate % identity in sequence alignments
  • window toolkit updated to Qt 6.2.3 from 5.15.2

There is a ChimeraX workshop here https://youtu.be/M2K72Kgk718.

Electrostatic and shape similarity workshop

 

The latest RSC CICAG Open-Source Tools for Chemistry workshop

23 June 2022 Scoring of shape and ESP similarity (Esther Heid)

Electrostatic effects along with volume restrictions play a major role in enzyme and receptor recognition. Evaluating electrostatic and shape similarities of pairs of molecules such as proposed versus known ligands can therefore be valuable indicators of prospective binding affinities. This workshop will demonstrate how to compute electrostatic and shape similarities using the open-source tool ESP-Sim github.com/hesther/espsim, doi.org/10.26434/chemrxiv-2021-sqvv9-v3. Available options for comparing electrostatics will be discussed interactively on selected examples of public datasets, along with advice on embedding and aligning molecules prior to computing similarities.

Whilst comparing molecules using 1D or 2D descriptors is well known, most molecules are three dimensional, as are biomolecule binding sites. The comparison of molecular shapes and electrostatics is particularly challenging and this workshop is a perfect introduction. Come along and you have a chance to ask questions directly.

Registration

https://www.eventbrite.com/e/open-source-tools-for-chemistry-tickets-294585512197?.

PyTorch on Apple Silicon

 

Latest nightly build.

Apple event 2022

 

Just in case you missed it.

Cambridge Cheminformatics Network Meeting

 

Next Meeting: 4pm (UK time) 8 June 2022, via Zoom and in person (hybrid!) details are here http://c-inf.net.

The IN-PERSON meeting will be held at the Cambridge Crystallographic Data Centre on Union Road, and be capped at 30 attendees. For IN-PERSON attendance please email andreas AT drugdiscovery.net for registration. Afterwards, at 6pm, we will go to the Panton Arms, and everyone is welcome to join there!

HYBRID MEETING - please use this registration link for VIRTUAL attendance: https://zoom.us/meeting/register/tJwlde-gpz4iG9NZ60YXrbGfGgvDWeozG-QK

Programme

Efficient algorithms for fingerprint similarity search and diversity selection Andrew Dalke, Dalke Scientific (remote) http://www.dalkescientific.com

Chemical substructure and similarity search at scale on a Graph computing platform Andrew Stolman, Abbvie (remote) https://www.abbvie.com/

Automated determination of optimal λ schedules for free energy calculations Sofia Bariami and Mark Mackey, Cresset (in-person) https://www.cresset-group.com/

The Cambridge Cheminformatics Network Meetings are free to attend and open to all. We start our meetings in the afternoon at 4pm (UK time) with a series of short scientific talks, either on Zoom, or in person. If the latter, we will continue the evening with a mixer at the local pub.

Performance of PyTorch on Apple Silicon

 

A really useful blog post on PyTorch on Apple Silicon

https://sebastianraschka.com/blog/2022/pytorch-m1-gpu.html.

PyTorch on Apple Silicon

 

PyTorch is now available on Apple Silicon https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/.

In collaboration with the Metal engineering team at Apple, we are excited to announce support for GPU-accelerated PyTorch training on Mac. Until now, PyTorch training on Mac only leveraged the CPU, but with the upcoming PyTorch v1.12 release, developers and researchers can take advantage of Apple silicon GPUs for significantly faster model training. This unlocks the ability to perform machine learning workflows like prototyping and fine-tuning locally, right on Mac.

To get started, just install the latest Preview (Nightly) build on your Apple silicon Mac running macOS 12.3 or later with a native version (arm64) of Python https://pytorch.org/get-started/locally/
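Once a build is installed, a quick check along these lines shows whether the Apple GPU backend is visible from Python. It is a sketch that assumes PyTorch 1.12 or later: pick_device is a hypothetical helper name, while torch.backends.mps.is_available() is PyTorch's own check for the Metal (MPS) backend.

```python
import importlib.util


def pick_device():
    """Return "mps" when the Apple GPU backend is usable, otherwise "cpu".

    Returns None if PyTorch is not installed in the current environment.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    # torch.backends.mps does not exist in releases before PyTorch 1.12
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

If this prints "mps", tensors and models can be moved to the GPU in the usual way, for example with torch.ones(3, device="mps").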

asitop

 

Whilst Activity Monitor gives a nice graphical display, it perhaps lacks granularity.

A Python-based nvtop-inspired command line tool for Apple Silicon (aka M1) Macs. Code is available on GitHub https://github.com/tlkh/asitop

  • Utilization info: CPU (E-cluster and P-cluster), GPU, Frequency and utilization, ANE utilization (measured by power)

  • Memory info: RAM and swap, size and usage, Memory bandwidth (CPU/GPU/total), Media engine bandwidth usage

  • Power info: Package power, CPU power, GPU power, Chart for CPU/GPU power, Peak power, rolling average display

asitop uses the built-in powermetrics utility on macOS, which allows access to a variety of hardware performance counters. Note that it requires sudo to run due to powermetrics needing root access to run. asitop is lightweight and has minimal performance impact.

asitop only works on Apple Silicon Macs running macOS Monterey.
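Before installing, it may be worth confirming that your Python is actually running natively on Apple Silicon, since a Python started under Rosetta reports an Intel architecture. The snippet below is a small sketch using only the standard library; the function names are my own and are not part of asitop.

```python
import platform


def is_native_apple_silicon(system=None, machine=None):
    """True when running as a native arm64 process on macOS.

    The arguments default to the current platform; they can be passed
    explicitly to check other values.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"


def macos_major(version=None):
    """Major macOS version as an int (Monterey is 12), or None if not on macOS."""
    if version is None:
        version = platform.mac_ver()[0]  # empty string on non-macOS systems
    if not version:
        return None
    return int(version.split(".")[0])


print(is_native_apple_silicon(), macos_major())
```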

Install using pip

pip3 install asitop

To activate

sudo asitop

Enter your password and the asitop display should appear in the terminal.

iBabel updated

 

The latest version of iBabel is now available. The big change is iBabel is now a universal application.

More details here https://www.macinchem.org/ibabel/version5/ibabel5.php.

Ultra Large Chemical Libraries Conference

 

More details of the RSC CICAG meeting on Ultra-large Chemical Libraries are available.

This one-day meeting will be held on 10 August 2022 10:00-17:00, at Burlington House, London.

Registration is open https://www.rsc.org/events/detail/73675/ultra-large-chemical-libraries. A number of the speakers have been finalised and it looks a great line-up:

Roger Sayle, NextMove Software Limited, United Kingdom
Carol Mulrooney, GSK, United States
Jan H Jensen, University of Copenhagen, Denmark
Noah Harrison, Evariste Technologies, United Kingdom
Peter Pogany, GSK, United Kingdom

There is still time to submit poster abstracts. A limited number of bursaries are available, the application form should be submitted to the organisers. A maximum of £300 will be reimbursed on submission of receipts.

If you would like to exhibit, sponsor or support this meeting please contact the organisers.

This meeting is supported by

DD logo 1600 x 325 RSC MedChem_no border

Tips for writing Vortex scripts/plugins

 

A Noel O'Blog post giving a few useful tips for writing scripts or plugins for Vortex https://baoilleach.blogspot.com/2022/04/threading-time-through-vortex.html.

Vortex (a chemical spreadsheet/visualisation software from Dotmatics) has a plugin system built around Jython. Simply drop a .vpy file into a specific scripts folder, and a menu item immediately appears in the application. Here are some notes on using this to communicate with a webserver.

The tutorials page on this site also includes many examples of Vortex scripts.  

The IUPAC Green Book, Quantities, Units, and Symbols in Physical Chemistry

 

The IUPAC Green Book, Quantities, Units, and Symbols in Physical Chemistry, provides a readable compilation of widely used terms and symbols from many sources together with brief understandable definitions. For more details see https://iupac.org/what-we-do/books/greenbook/

We are currently working on a potentially major revision of the Green Book for the 5th edition IUPAC Project https://iupac.org/projects/project-details/?project_nr=2019-001-2-100 and we would like to consult as widely as possible about the content of the Green Book. The 4th edition which has been updated with the new SI will be published soon along with an abridged edition but we are considering more major changes for the 5th edition.

Please fill in our survey to help us gather community views on what should be included in the 5th edition. The survey will remain open until the end of the summer but early replies will be very useful in planning the 5th edition project.

We have received full ethics approval from the University of Southampton Ethics and Research Governance Team to run this survey under ERGO no 72139. Please read the following participant information to make sure that you understand and agree to the terms of this study: https://www.ai3sd.org/ai3sd/wp-content/uploads/sites/374/2022/03/72139_ParticipantInformationSheet.pdf.

To contribute to this discussion please click on the Survey link below or copy and paste the URL into your browser.

Survey https://forms.office.com/Pages/ResponsePage.aspx?id=-XhTSvQpPk2-iWadA62p2AwsLs39xipFhWNBcEZwVyBURDk4RzBaRUIwWkhSQlU3QUgxTkROWVZGNyQlQCN0PWcu

Print clipboard script

 

One of the most popular downloads on the site is the "Print Clipboard" script. This AppleScript prints any text copied to the clipboard without the need to paste the text into a text editor or word processor prior to printing. I was recently asked if it might be possible to have an option to save the text to a pdf file rather than printing. So I've added the option to print to pdf.

Copy any text to the clipboard and run the script. The first dialog shows the content of the clipboard

Click OK, and then the second dialog gives the option to Print directly (default option) or you can save as a pdf on your desktop. You can of course choose your own file name.

You can download the AppleScript here.

http://macinchem.org/applescript/Applescripts/Print_clipboard.scpt.zip

If you want quick access to scripts I recommend selecting "Show Script menu in menu bar" in the Script Editor preferences.

Comparing an M1 MacBook with Intel MacBook Pro for Cheminformatics/CompChem Updated

 

I'm slowly working through a variety of cheminformatics toolkits and computational chemistry applications, running some "real world" workflows so you can see what kind of performance improvement you might expect.

The index page is here https://www.macinchem.org/reviews/MacBooks/m1macbookpromax.php and I'll update it as I test more applications.

Update on the next RSC CICAG Open Source Tools for Chemistry workshops

 

PDBe Knowledge Base

This workshop explores the Protein Data Bank in Europe Knowledge Base (PDBe-KB https://www.ebi.ac.uk/pdbe/) resource and its tools for the investigation, analysis, and interpretation of biomacromolecular structures. PDBe-KB brings together data from all PDB entries and displays this data as aggregated information for individual proteins, including ligand binding sites, macromolecular interactions and more. Furthermore, this community-led resource brings together structural and functional information from a host of other related resources. In this workshop, you will learn how to use the PDBe-KB aggregated views for proteins to investigate structural and function information for proteins and their associated ligands. We will also demonstrate effective use of novel visualisation components of large-scale structural data on these pages, including 3D visualisation of superposed protein structures with their bound ligands.

2.00 PDBe-KB talk (30 min) - David Armstrong
2.30 Intro to tutorials (15 min) - David Armstrong
2.45 Break (10 min)
2.55 Intro to Biochemgraph project & ligand pages (25 min) - Preeti Choudhary
3.20 PDBe-KB ligand page design session (20 min) - Preeti Choudhary
3.40 Q&A (20 min) - David Armstrong and Preeti Choudhary
4.00 End of session

Still time to register https://www.eventbrite.com/e/open-source-tools-for-chemistry-tickets-294585512197?.

WWDC22, June 6-10

 

Apple have announced that WWDC22 will take place on June 6-10 2022, all online and at no cost. More details here https://developer.apple.com/wwdc22/.

Also check out the Swift Student Challenge.

We continue our long-standing support of students around the world who love to code with this year’s exciting Swift Student Challenge. Showcase your passion for coding by creating an incredible Swift Playgrounds app project on the topic of your choice. Winners will receive exclusive WWDC22 outerwear, a customized pin set, and one year of membership in the Apple Developer Program.

AMS2022 release

 

The SCM team proudly announce our new AMS2022 release, with many new features and improvements.

New Parametrization ReaxFF & DFTB, reaction mapping, OLED tools

Full details are here

I also noted

We have also gotten reports of AMS2022 working correctly on the new Apple processors (M1), but we currently do not offer technical support for this platform.

Open-Source Tools workshops

 

Registration for the next batch of Open-Source Tools workshops run by the RSC Chemical Information and Computer Applications Group is now open.

https://www.eventbrite.com/e/open-source-tools-for-chemistry-tickets-294585512197?.

These workshops have been enormously popular and the interactions with the instructors have been especially valuable. Details of the next 3 workshops are described below.

All meetings start at 2 pm UK time (with a 5 min break after 1 hour) and are run using Zoom Webinar.

21 April 2022 PDBe Knowledge Base (David Armstrong)

This workshop explores the Protein Data Bank in Europe Knowledge Base (PDBe-KB https://www.ebi.ac.uk/pdbe/) resource and its tools for the investigation, analysis, and interpretation of biomacromolecular structures. PDBe-KB brings together data from all PDB entries and displays this data as aggregated information for individual proteins, including ligand binding sites, macromolecular interactions and more. Furthermore, this community-led resource brings together structural and functional information from a host of other related resources. In this workshop, you will learn how to use the PDBe-KB aggregated views for proteins to investigate structural and functional information for proteins and their associated ligands. We will also demonstrate effective use of novel visualisation components of large-scale structural data on these pages, including 3D visualisation of superposed protein structures with their bound ligands.

19 May 2022 KLIFS database (Albert Jelke Kooistra, Andrea Volkamer)

Over the past three decades, six thousand structures of the catalytic kinase domain have been made publicly available via the Protein Data Bank. But to what extent are we making use of this wealth of information? In order to harness this data in a better way and to make it readily available for all to use in their research, KLIFS (https://klifs.net) was constructed. KLIFS, i.e. the Kinase–Ligand Interaction Fingerprints and Structures database, is a structural kinase database that systematically collects and processes all structures of the catalytic kinase domain. With the database, you can - for example - easily get a complete overview of all structures, search for ligands with a specific binding mode, identify analogs of your ligands of interest, and collect data for your data mining and machine learning applications.

For this workshop, the developers of KLIFS have teamed up with the Volkamer Lab and therefore the workshop will be divided into two segments. First, Albert J. Kooistra will give an introduction to KLIFS and demonstrate different functionalities of the KLIFS website and the integration of KLIFS in KNIME via the 3D-e-Chem nodes. In the second half, Andrea Volkamer and Dominique Sydow will demonstrate, based on their new kinase-focused TeachOpenCADD workflow, how to assess kinase similarity from different data perspectives. They will emphasize their Python package KiSSim – a KLIFS-based kinase structural similarity fingerprint, and OpenCADD-KLIFS – a Python module to facilitate the integration of KLIFS data into kinase research workflows.

23 June 2022 Scoring of shape and ESP similarity (Esther Heid)

Electrostatic effects along with volume restrictions play a major role in enzyme and receptor recognition. Evaluating electrostatic and shape similarities of pairs of molecules such as proposed versus known ligands can therefore be valuable indicators of prospective binding affinities. This workshop will demonstrate how to compute electrostatic and shape similarities using the open-source tool ESP-Sim (github.com/hesther/espsim, doi.org/10.26434/chemrxiv-2021-sqvv9-v3). Available options for comparing electrostatics will be discussed interactively on selected examples of public datasets, along with advice on embedding and aligning molecules prior to computing similarities.

Comments

Such a tease

 

Comments

Mac Studio

 

So the Apple event revealed the new Apple Mac Studio, a double-height Mac mini. When combined with the new M1 Ultra chip, this small enclosure appears to deliver really impressive performance.

MacStudio

The M1 Ultra is an evolution of the M1 Max chip that uses "UltraFusion" technology to fuse two M1 Max chips together, resulting in a huge processor that offers 16 high-performance CPU cores, 4 efficiency cores, a 48- or 64-core integrated GPU, a 32-core Neural Engine, support for up to 128GB of RAM and 800GB/s of memory bandwidth.

Whilst Apple gave the usual performance tests based on video editing, I'm not sure they give a realistic measure of performance for scientific applications.

I've been looking at a variety of different applications/toolkits/Python scripts etc. on my MacBook Pro M1 Max here, and if anyone has a chance to test scientific software on the M1 Ultra I'd be happy to include the results.

Comments

Apple Event March 8

 

Are you ready?

AppleEvent

https://www.apple.com/apple-events/.

Comments

Building combinatorial libraries using MOE on MacBook Pro M1 Max

 

I had a look at building combinatorial libraries using MOE on a MacBook Pro with an Apple M1 Max.

Bottom line: it is seriously fast.

Read more here...

Comments

GROMACS 2022 official release

 

The official release of GROMACS 2022 is now available.

  • Free-energy kernels are accelerated using SIMD, which makes free-energy calculations up to three times as fast when using GPUs
  • A new formulation of the soft-cored non-bonded interactions for free-energy calculations allows for a finer control of the alchemical transformation pathways
  • New transformation pull coordinate allows arbitrary mathematical transformations of one or more other pull coordinates
  • New interface for multi-scale Quantum Mechanics / Molecular Mechanics (QM/MM) simulations with the CP2K quantum chemistry package, supporting periodic boundary conditions.
  • grompp performance improvements
  • Cool quotes music playlist
  • Additional features were ported to modular simulator
  • Added AMD GPU support with SYCL via hipSYCL
  • More GPU offload features supported with SYCL (PME, GPU update).
  • Improved parallelization with GPU-accelerated runs using CUDA and extended GPU direct communication to support multi-node simulation using CUDA-aware MPI.

If you are a Spotify user the Cool quotes music playlist may be of interest!

Note:

If you are running on Mac OS X, the best option is gcc. The Apple clang compiler provided by MacPorts will work, but does not support OpenMP, so will probably not provide best performance.

Comments

Schrödinger Software Release 2022-1

 

The latest Schrödinger Software Release 2022-1 brings support for Apple M1 machines in addition to a range of updates and new features.

Hit Identification & Virtual Screening

Pharmacophore Modeling

  • New alignmulticores.py script to align 3D ligands to a reference ligand with multiple disconnected cores [2022-1]

Ligand Docking

  • Return SMARTS of the core used when running core constraint docking with MCS [2022-1]
  • Input file that generated a Glide grid is saved in the grid archive to improve ease of making changes [2022-1]

Target Validation & Structure Enablement

Protein Preparation

  • Sped-up hydrogen atom assignment to be o(n) by system size [2022-1]

Protein X-Ray Refinement

  • PHENIX/OPLS supports PHENIX 1.20 [2022-1]

Multiple Sequence Viewer/Editor

  • Automatically save MSV projects [2022-1]
  • Rapid selection of a subset of sequences based on user-defined percent identity or similarity relative to a reference sequence [2022-1]
  • Improved ability to save one or more sequences by ‘right clicking’ to export [2022-1]

Protein Homology Modeling

  • Selectively download only the PDB BLAST subset of the NR BLAST database for local homology modeling [2022-1]
  • New Workflow Action Menu prompts for homology modeling enables single click access to structure quality assessment, reliability reports, additional loop refinement, and sidechain refinement and localized minimization [2022-1]

Platform Environment

Maestro Graphical Interface

  • Apple M1 Support [2022-1]
  • New 2D Sketcher (beta) [2022-1]
  • New Workflow Action Menus [2022-1]
  • Antibody Modeling Homology Modeling [2022-1]

Force Field

  • Improved accuracy of histidine parameters, particularly in FEP+ prediction of histidine pka’s [2022-1]
  • Improved geometries for B-N bond containing compounds [2022-1]
  • Up to 10x faster execution of FFBuilder when parameterizing hundreds of ligands through greater job distribution [2022-1]

Workflows & Pipelining [KNIME Extensions]

  • New 2D Sketcher node [2022-1]
  • Run from LiveDesign [2022-1]:
      • Export to LiveDesign node can export all the structures so model results can be stored in new LiveReport(s)
      • Model output columns can contain files (eg with pdf)
      • Store an executed workflow in a LiveReport column

Medicinal Chemistry Design

Ligand Designer

  • Ability to specify a max number of enumerated compounds [2022-1]
  • Added access to “Vendor ID” details in the Project Table for purchasable compounds [2022-1]

Lead Optimization

FEP+

  • FEP+ Correlation Plot [2022-1]:
      • Display best fit line and equation of the line
      • Modified reporting to show confidence intervals instead of standard deviations
  • Web services [2022-1]: Improved performance when viewing map status

Solubility FEP (Beta)

  • Access to trajectory, representative structures, FEP classifiers in the analysis tab [2022-1]
  • Web Services will return fmp/fmpdb files instead of mae/csv for analysis [2022-1]

AutoQSAR

  • DeepChemAutoQSAR now supports Windows and Mac platforms [2022-1]

FPsim-GPU

  • New vendor column in similarity results [2022-1]

Comments

Installing AlphaFold2 on Apple Silicon

 

AlphaFold2 is an artificial intelligence (AI) program developed by Alphabet's/Google's DeepMind which performs predictions of protein structure. Despite the name, AlphaFold2 does not actually predict the folding mechanism; instead it predicts the final 3D structure of a protein from the protein sequence DOI.

Source code for the AlphaFold model, trained weights and inference script are available under an open-source license at https://github.com/deepmind/alphafold.

I've compiled step by step instructions for installing AlphaFold2 on a MacBook Pro M1 Max here https://www.macinchem.org/reviews/alphafold/installalphafold2.php.

Many thanks to Yoshitaka Moriwaki for help.

Comments

Matched molecular pair database generation and analysis

 

Matched molecular pair analysis (MMPA) is a popular structure-activity method in cheminformatics that compares the properties of two molecules that differ only by a single chemical transformation (e.g. substitution of a hydrogen atom by a chlorine atom). Because the structural difference between the two molecules is small, any experimentally observed change in a physical or biological property between the matched molecular pair could be associated with this particular molecular transformation.
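To make the idea concrete, here is a minimal sketch in plain Python. It assumes the molecules have already been split into a constant core and a variable substituent (real MMPA software does this fragmentation for you), and the strings and property values in RECORDS are invented purely for illustration. The helper names matched_pairs and summarise are also my own. The sketch groups records by core, forms every pair within a group, and averages the property change for each transformation.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical, pre-fragmented data: (constant core, variable substituent, property value).
# The values are made up for illustration and are not measurements.
RECORDS = [
    ("c1ccc(*)cc1", "[H]", 2.1),
    ("c1ccc(*)cc1", "Cl", 2.8),
    ("O=C(O)c1ccc(*)cc1", "[H]", 1.9),
    ("O=C(O)c1ccc(*)cc1", "Cl", 2.7),
    ("O=C(O)c1ccc(*)cc1", "F", 2.0),
]


def matched_pairs(records):
    """Yield (from_substituent, to_substituent, property_delta) for records sharing a core."""
    by_core = defaultdict(list)
    for core, substituent, value in records:
        by_core[core].append((substituent, value))
    for members in by_core.values():
        # Sorting orients every pair alphabetically, so the same transformation
        # found on different cores always points in the same direction.
        for (sub_a, val_a), (sub_b, val_b) in combinations(sorted(members), 2):
            yield sub_a, sub_b, val_b - val_a


def summarise(records):
    """Average the property change for each transformation across all cores."""
    deltas = defaultdict(list)
    for sub_a, sub_b, delta in matched_pairs(records):
        deltas[(sub_a, sub_b)].append(delta)
    return {pair: round(mean(values), 2) for pair, values in deltas.items()}


if __name__ == "__main__":
    for (sub_a, sub_b), delta in summarise(RECORDS).items():
        print(f"{sub_a} >> {sub_b}: mean change {delta:+.2f}")
```

The point of pooling is that a transformation seen on several different cores gives a more trustworthy average than a single pair.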

Andrew Dalke has recently published open source code to support this methodology https://github.com/adalke/mmpdb/tree/v3-dev.

To install

python -m pip install mmpdb

The package has been tested on Python 3.9.

You will need a copy of the RDKit cheminformatics toolkit, available from http://rdkit.org/, which in turn requires NumPy. You will also need SciPy, peewee, and click. The latter three are listed as dependencies in setup.cfg and should be installed automatically.
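A quick way to check whether those dependencies are visible to the Python you are using is sketched below. It only uses the standard library. The import names in REQUIRED are my assumption (RDKit imports as rdkit, the others under their package names), and missing_modules is just an illustrative helper, not part of mmpdb.

```python
from importlib.util import find_spec

# Assumed import names for the packages listed above.
REQUIRED = ["rdkit", "numpy", "scipy", "peewee", "click"]


def missing_modules(names):
    """Return the names that cannot be found by the current Python interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

Run it with the same interpreter (for example the same conda environment) that you used for the pip install, otherwise the answer may not reflect where mmpdb was installed.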

Full details are described in this publication.

A. Dalke, J. Hert, C. Kramer. mmpdb: An Open-Source Matched Molecular Pair Platform for Large Multiproperty Data Sets. J. Chem. Inf. Model., 2018, 58 (5), pp 902–910. DOI.

Comments

AI/ML on Apple Silicon

 

This GitHub repository, https://github.com/tcapelle/applem1pro_python, gives details of how to set up an Apple M1 machine for data science and includes a series of test scripts for benchmarking.

There is also an M1 Max vs RTX3070 TensorFlow performance test here.
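Before running any benchmarks it is worth confirming that the Python you are using is a native arm64 build rather than an Intel build running under Rosetta 2. The sketch below uses only the standard library and the function names are illustrative. As far as I know an Intel build under Rosetta reports x86_64 for the machine type, but treat that as an assumption to verify on your own setup.

```python
import platform
import sys


def describe_interpreter():
    """Collect basic facts about the running Python and the machine it reports."""
    return {
        "python": sys.version.split()[0],
        "system": platform.system(),    # "Darwin" on macOS
        "machine": platform.machine(),  # "arm64" for a native Apple Silicon build
    }


def is_native_apple_silicon(info):
    """True when Python itself is an arm64 build running on macOS."""
    return info["system"] == "Darwin" and info["machine"] == "arm64"


if __name__ == "__main__":
    info = describe_interpreter()
    print(info)
    print("Native Apple Silicon Python:", is_native_apple_silicon(info))
```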

Comments

5th RSC BMCS/CICAG Artificial Intelligence in Chemistry Meeting

 

The 5th Artificial Intelligence in Chemistry meeting is now open for both oral and poster abstract submission. This meeting will be held at Churchill College, 1-2 September 2022. #AIChem22

Confirmed Speakers Include

Charlotte Deane, Connor Coley, Kim Jelfs, Val Gillet, Adrian Roitberg.

You can submit your abstracts here https://hg3.co.uk/ai/.

croppedAI

The circular for the meeting is here https://www.rscbmcs.org/wp-content/uploads/2022/01/FINALAIinChemistry1st_Announcement.pdf

Comments

Python 2.7 removed from macOS Monterey 12.3

 

The developer notes for macOS Monterey 12.3 beta contain details of new features and also deprecations.

Python 2.7 was removed from macOS in this update. Developers should use Python 3 or an alternative language instead. (39795874)

I moved to Python 3.x a while back (I use conda Python), but if you are relying on the system Python and are still using version 2.7 this sounds like the time to move.
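If you are not sure whether any of your scripts still point at the system Python, a rough check is to look at the first line of each script. The sketch below flags shebang lines that call /usr/bin/python or /usr/bin/python2.x directly. The pattern and the helper names are my own assumptions about what is worth flagging, and it will not catch scripts launched via /usr/bin/env python.

```python
import re
from pathlib import Path

# Matches "#!/usr/bin/python" and "#!/usr/bin/python2[.x]", optionally followed by arguments.
LEGACY_SHEBANG = re.compile(r"^#!\s*/usr/bin/python(2(\.\d+)?)?(\s|$)")


def uses_legacy_system_python(first_line):
    """True if a shebang line calls the old system Python directly."""
    return bool(LEGACY_SHEBANG.match(first_line.strip()))


def scripts_to_update(folder):
    """Return the names of .py files in a folder whose first line is a legacy shebang."""
    flagged = []
    for path in sorted(Path(folder).glob("*.py")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            if uses_legacy_system_python(handle.readline()):
                flagged.append(path.name)
    return flagged


if __name__ == "__main__":
    # Checks the current directory only; pass another folder to look elsewhere.
    print(scripts_to_update("."))
```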

Comments

Multiple screens on a laptop

 

This looks very cool. With more people working away from the office, screen real estate sometimes becomes an issue. This adds a couple of extra screens to your MacBook.

xebec

https://www.thexebec.com/products/xebec-tri-screen-2.

Comments

Mathematica on Apple Silicon

 

A couple of readers have asked about the performance of Mathematica on the new Apple Silicon machines. I've heard second-hand reports that it runs 2-3 times faster on "real world" problems but no details.

Does anyone have any benchmarks that they would be willing to share?

Comments

ChemDoodle 2D updated

 

The very popular chemical drawing application ChemDoodle has been updated to version 11.8.

ChemDoodle 2D v11.8 includes significant improvements to many features, including excellent chain replacement nomenclature in IUPAC naming, support for allene/cumulene stereochemistry, improved bezier curve tools, current InChI support and more file options.

Full details of the update are here https://www.ichemlabs.com/news/read?post=cd2d118released.

You can get all the ChemDoodle applications (ChemDoodle 2D, 3D and mobile) for a single monthly ($15) or yearly ($100) subscription, or as a one-off lifetime purchase ($750).

More details on the store https://www.ichemlabs.com/store.

Comments

Icons for Mac Hard drive

 

Someone asked me recently where I got the image for the Macintosh HD on my desktop.

macHD icon

Actually they are all available on your Mac.

macIcons

They can be found in

/System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/

You will have to right-click (or control-click) on CoreTypes.bundle and choose "Show Package Contents".
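If you would rather list the icons from a script than browse in the Finder, something like the sketch below should work. It assumes the icons are stored as .icns files in that folder, which may vary between macOS versions, and list_icons is just an illustrative helper name. On a machine where the folder does not exist it simply returns an empty list.

```python
from pathlib import Path

# The folder given above.
ICON_DIR = Path("/System/Library/CoreServices/CoreTypes.bundle/Contents/Resources")


def list_icons(directory=ICON_DIR, pattern="*.icns"):
    """Return sorted icon file names, or an empty list if the folder is absent."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(path.name for path in directory.glob(pattern))


if __name__ == "__main__":
    for name in list_icons():
        print(name)
```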

Comments

MacVector on Apple Silicon

 

Someone just sent me an email mentioning that MacVector supports Apple Silicon.

MacVector 18.2 requires Mac OS X 10.12 or later. It will NOT work on Windows, Mac OS 9 or on Mac OS X 10.11 or earlier. MacVector 18.2 is a "Universal Binary", meaning it will run natively on both Intel and Apple Silicon based Macintosh computers.

Comments

MacBook M1 vs M1 Pro for Data Science and Machine Learning

 

When the first M1 MacBooks came out there were limited libraries available, but over the last year most of the libraries needed for data science have gained support for the new Apple Silicon architecture.

Includes details for installing TensorFlow and the test dataset.

Comments

Comparing energy usage between M1 Mac and Intel

 

Updated

Added a couple more comparisons.

The pages comparing cheminformatics/compchem apps on the MacBook Pro M1 Max are proving very popular. Several readers have asked me to compare energy usage, which is an excellent suggestion.

Based on a suggestion I purchased a Nevsetpo Power Meter UK Plug Power Monitor Watts Meter Plug and I've used it to test a selection of tasks. Once plugged into a socket it monitors the total energy consumption of any device plugged into it. Both machines were fully charged and "Optimised battery charging" was switched off.

I tried a few computationally intensive tasks and details of energy consumption are here.

M1chip

Comments

RSC CICAG Open Source Tools for Chemistry Workshops

 

In 2020 RSC CICAG ran a 5 day virtual meeting on Open Source Chemical Sciences. This event had three streams: Open Data, Open Publishing and Open Source Tools for Chemistry. The Open Source Tools for Chemistry workshops proved to be enormously popular and so CICAG held a series of monthly workshops through 2021. These workshops covered a variety of Open Source tools and resources ranging from visualisation tools, data analysis using cheminformatics toolkits, and online resources like the PDB.

These workshops were all recorded and are available on the CICAG YouTube channel. As we plan for this year's workshops I thought it might be timely to remind everyone what is now available, and also to thank all the presenters and developers who made the workshops possible. I've included links to all the workshops below.

PDB workshop 2 using Mol*
PDB workshop 1 Registration system
Clustering using KNIME
Web apps for fragment-based drug discovery
Introduction to Cheminformatics and Machine Learning
Oxford Protein Informatics Group antibody modelling tools
Advanced DataWarrior
GNINA
Chemical Structure validation/standardisation
ChimeraX
DataWarrior
ChEMBL
Using Google Colab workshop
Fragalysis workshop
KNIME workshop
PyMOL workshop

These workshops have now been viewed nearly 13,500 times and some of the comments are worth highlighting:

I am not sure why this software is not famous. This is presumably the best chemoinformatics software I have seen, Great presentation!

Two hours of distilled pure science.

These workshops were sponsored by Liverpool Chirochem.

Comments

Annual Site review

 

At the end of each year I have a look at the website analytics to see which items were the most popular.

Over the year there were 118,000 visitors spending an average of 1.5 minutes per session. The US and the UK were the two top countries but, as the table below shows, there has been a steady following around the world. As might be expected the majority are Mac users (57%) but there are a substantial number of Windows (23%) and Linux users (4%). The mobile platforms iOS (9%) and Android (6%) make up most of the remainder.

macinchemVisitors2

The most popular page was again the Fortran on a Mac page, followed by the M1chip category page. The iBabel page is in third place. Interestingly, the next most popular page was Python coding within Xcode, perhaps suggesting that Apple should consider making it a bit more seamless to use Xcode with Python.

The Mobile Science site has seen increased visitor numbers.

The most popular apps viewed were:

Merck PTE
IBM Micromedex Drug Info
Python3IDE
PocketCAS: Mathematics Toolkit
Molecular Constructor
Radiology 2.0

Also popular were

Human Anatomy Atlas 2019
ChemTube3D
The Periodic Table Project
Periodic Table

The Twitter feed @macinchem has steadily attracted new followers and currently has 1272 followers.

The most popular tweets were

Rutherford & Fry book
Malcolm Campbell award
Web apps for Fragment drug discovery
More Open Source workshops

Comments