Macs in Chemistry

Insanely great science

Setting up ML and AI tools on Apple Silicon

One of the questions I'm regularly asked is whether you can run data analysis, machine learning and artificial intelligence jobs on Apple Silicon machines. I'm not an expert in AI, but I thought I'd go through the process of setting up an Apple Silicon MacBook Pro (M1 Max) for machine learning using Python. I've tried to document every step, so apologies if it is too detailed.

The machine is running macOS Monterey version 12.5.1.


First Steps

I'm using Homebrew and conda to install packages and manage compatibility and dependencies. Detailed notes on installation are in the instructions for installing cheminformatics tools on a Mac.

Install Homebrew. You may need to install the Xcode Command Line Tools; details are in the link above.

Install Anaconda or Miniconda in the usual way (I used Miniconda), and let the installer add the conda installation of Python to your PATH environment variable. There is no need to set the PYTHONPATH environment variable.

Here is the link for Miniconda for Apple Silicon. Make sure you use the arm64 version.

Restart the Terminal

You can check the installation using these commands in the Terminal

(base) chrisswain@ChrisM1MBP ~ % echo $PATH

(base) chrisswain@ChrisM1MBP ~ % which python

(base) chrisswain@ChrisM1MBP ~ % python --version
Python 3.8.12

Check that you have the arm64 version installed

(base) chrisswain@ChrisM1MBP ~ % python
Python 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:13:55) 
[Clang 11.1.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.platform()

The Terminal prompt includes (base) because we are in the base python installation environment.
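
If you would rather run the architecture check as a script than type it at the prompt, here is a minimal sketch using only the standard library. The function names are my own, not part of any tool mentioned above. It relies on platform.machine() reporting arm64 for a native Apple Silicon build of Python, and x86_64 for an Intel build running under Rosetta.

```python
import platform
import sys


def describe_interpreter() -> dict:
    """Collect basic facts about the running Python interpreter."""
    return {
        "python_version": platform.python_version(),
        "machine": platform.machine(),  # 'arm64' for a native Apple Silicon build
        "system": platform.system(),    # 'Darwin' on macOS
        "executable": sys.executable,   # path of the interpreter that is running
    }


def is_native_apple_silicon(info: dict) -> bool:
    """True when the interpreter is a macOS build compiled for arm64."""
    return info["system"] == "Darwin" and info["machine"] == "arm64"


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
print("Native Apple Silicon build:", is_native_apple_silicon(info))
```

If the last line reports False on an Apple Silicon Mac, the most likely cause is that the x86_64 installer was used instead of the arm64 one.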

This is probably a good point to install the Xcode Command Line Tools. These include many useful tools, such as the Apple LLVM compiler, linker, and Make, for compiling executable software from source code.

xcode-select --install

Setting up a Machine Learning environment

We can now set up an environment for machine learning. I have a folder called projects and I created a sub-folder called ForAIML

Whilst the instructions below take you through the complete process, I've also created an environment file, myMLenv.yml, that can be downloaded from here. This can be used to create the conda environment using the following command.

conda env create -f myMLenv.yml -n myML

In the Terminal, cd to this folder and then set up a Python environment. At the moment I'm using Python 3.9.x for most of my work.

conda create -n myML python=3.9
conda activate myML

Collecting package metadata (current_repodata.json): done

Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.13.0
  latest version: 4.14.0

Please update conda by running

$ conda update -n base conda

You should see the following displayed in the Terminal; note that the packages are for conda-forge/osx-arm64.

## Package Plan ##

  environment location: /Users/chrisswain/Projects/ForAIML/myML

  added / updated specs:
    - python=3.9

The following packages will be downloaded:

package                    |            build
ca-certificates-2022.07.19 |       hca03da5_0         124 KB
libsqlite-3.39.2           |       h2c9beb0_1         825 KB  conda-forge
libzlib-1.2.12             |       ha287fd2_2          48 KB  conda-forge
openssl-3.0.5              |       h7aea29f_1         2.3 MB  conda-forge
pip-22.2.2                 |     pyhd8ed1ab_0         1.5 MB  conda-forge
readline-8.1.2             |       h46ed386_0         263 KB  conda-forge
setuptools-65.1.1          |   py38h10201cd_0         1.4 MB  conda-forge
sqlite-3.39.2              |       h40dfcc0_1         817 KB  conda-forge
xz-5.2.6                   |       h57fd34a_0         230 KB  conda-forge
                                       Total:         7.4 MB

The following NEW packages will be INSTALLED:

  bzip2              conda-forge/osx-arm64::bzip2-1.0.8-h3422bc3_4
  ca-certificates    pkgs/main/osx-arm64::ca-certificates-2022.07.19-hca03da5_0
  libffi             conda-forge/osx-arm64::libffi-3.4.2-h3422bc3_5
  libsqlite          conda-forge/osx-arm64::libsqlite-3.39.2-h2c9beb0_1
  libzlib            conda-forge/osx-arm64::libzlib-1.2.12-ha287fd2_2
  ncurses            conda-forge/osx-arm64::ncurses-6.3-h07bb92c_1
  openssl            conda-forge/osx-arm64::openssl-3.0.5-h7aea29f_1
  pip                conda-forge/noarch::pip-22.2.2-pyhd8ed1ab_0
  python             conda-forge/osx-arm64::python-3.8.13-hd3575e6_0_cpython
  python_abi         conda-forge/osx-arm64::python_abi-3.8-2_cp38
  readline           conda-forge/osx-arm64::readline-8.1.2-h46ed386_0
  setuptools         conda-forge/osx-arm64::setuptools-65.1.1-py38h10201cd_0
  sqlite             conda-forge/osx-arm64::sqlite-3.39.2-h40dfcc0_1
  tk                 conda-forge/osx-arm64::tk-8.6.12-he1e0b03_0
  wheel              conda-forge/noarch::wheel-0.37.1-pyhd8ed1ab_0
  xz                 conda-forge/osx-arm64::xz-5.2.6-h57fd34a_0

Proceed ([y]/n)? y

Type Y and the installation will proceed.

The Terminal prompt should now include (myML) because we are now in the myML python environment.
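
The prompt is a visual cue only. From inside Python you can confirm which environment is active with a small sketch like the one below. It reads the CONDA_DEFAULT_ENV variable that conda sets on activation. The helper name active_env_name is my own, and the expected name myML is simply the one used in this walkthrough.

```python
import os
import sys

EXPECTED_ENV = "myML"  # environment name used in this walkthrough


def active_env_name(environ=None):
    """Return the active conda environment's name, or None if none is active."""
    environ = os.environ if environ is None else environ
    value = environ.get("CONDA_DEFAULT_ENV")
    if not value:
        return None
    # The variable may hold a bare name or a full path, so keep the last component.
    return os.path.basename(value.rstrip(os.sep))


name = active_env_name()
if name == EXPECTED_ENV:
    print(f"Active conda environment: {name}")
else:
    print(f"Expected {EXPECTED_ENV!r} but found {name!r}; try 'conda activate {EXPECTED_ENV}'")
print(f"Interpreter prefix: {sys.prefix}")
```

The interpreter prefix printed on the last line should point inside the environment folder rather than the base installation.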

We can now start to install a variety of data science analysis and visualisation packages.

(myML) chrisswain@ChrisM1MBP  % conda install jupyter pip pandas numpy matplotlib seaborn scikit-learn tqdm scipy lxml version_information lightgbm yellowbrick rdkit=2022.03.4

Now add PyTorch

(myML) chrisswain@ChrisM1MBP  % pip3 install torch torchvision torchaudio

This will install the packages and any additional dependencies

Installing collected packages: urllib3, typing-extensions, pillow, numpy, idna, charset-normalizer, certifi, torch, requests, torchvision, torchaudio

Now you can start jupyter

(myML) chrisswain@ChrisM1MBP  % jupyter notebook

(myML) chrisswain@ChrisM1MBP pytorchtest % jupyter notebook
[I 12:51:09.557 NotebookApp] Serving notebooks from local directory: /Users/chrisswain/Projects/ForAIML
[I 12:51:09.557 NotebookApp] Jupyter Notebook 6.4.12 is running at:

Create a new notebook by clicking on the "New" button and selecting "Python 3 (ipykernel)". Type this code to verify that all the dependencies are available and to check the PyTorch version.

import version_information
%load_ext version_information
%reload_ext version_information
%version_information torch, numpy, scipy, pandas, scikit-learn, seaborn, matplotlib
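
The version_information magic only works inside a notebook. As a rough alternative that also runs from a plain Python prompt, the sketch below asks Python whether it can locate each package without actually importing it. The module list is my own selection from the packages installed above, and note that scikit-learn is imported as sklearn.

```python
from importlib import util

# Import names, which can differ from the install names (scikit-learn imports as sklearn).
MODULES = [
    "torch", "numpy", "scipy", "pandas", "sklearn",
    "seaborn", "matplotlib", "lightgbm", "rdkit",
]


def check_importable(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: util.find_spec(name) is not None for name in names}


for name, found in check_importable(MODULES).items():
    print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Anything reported as MISSING was either not installed or was installed into a different environment from the one running the notebook.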


Then check PyTorch version/GPU access.

import torch
import numpy as np
import pandas as pd
import sklearn
import matplotlib.pyplot as plt

print(f"PyTorch version: {torch.__version__}")

# Check PyTorch has access to MPS (Metal Performance Shaders, Apple's GPU compute framework)
print(f"Is MPS (Metal Performance Shaders) built? {torch.backends.mps.is_built()}")
print(f"Is MPS available? {torch.backends.mps.is_available()}")

# Set the device
device = "mps" if torch.backends.mps.is_available() else "cpu"
print(f"Using device: {device}")
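
Knowing that MPS is available is not quite the same as having run something on it. The following is a minimal smoke test, not a benchmark: it multiplies a random matrix by the identity on the selected device and checks that the result is unchanged. The function name matmul_check and the matrix size are my own choices, and the import is guarded so the script reports a missing PyTorch installation instead of failing.

```python
try:
    import torch
except ImportError:  # PyTorch is not installed in this environment
    torch = None


def matmul_check(size: int = 256):
    """Run a small matrix multiply on MPS if available, otherwise on the CPU.

    Returns (device, ok), or None when PyTorch is not installed.
    """
    if torch is None:
        return None
    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    device = "mps" if mps is not None and mps.is_available() else "cpu"
    a = torch.rand(size, size, device=device)
    identity = torch.eye(size, device=device)
    product = a @ identity  # multiplying by the identity should return a unchanged
    return device, bool(torch.allclose(product, a))


result = matmul_check()
if result is None:
    print("PyTorch is not installed in this environment")
else:
    device, ok = result
    print(f"Matrix multiply on {device}: {'passed' if ok else 'FAILED'}")
```

On a correctly configured Apple Silicon machine I would expect the device to be reported as mps; a fallback to cpu suggests the installed build does not have Metal support.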


Hopefully you now have a Python environment on your Apple Silicon machine for AI/ML.

You might want to also have a look at PyCaret

PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that exponentially speeds up the experiment cycle and makes you more productive.

PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, and a few more. You can install the PyCaret 3.0 release candidate with pip (this pre-release is not in the yml file).

pip install --pre pycaret

To use TensorFlow we need to add a few more packages, some of which are available from the Apple conda channel. To add the channel, type

conda config --add channels apple

Then we can install

conda install tensorflow-deps

Among other things, you should see

 tensorflow-deps    apple/osx-arm64::tensorflow-deps-2.9.0-0

Then install the following packages using pip

pip install tensorflow-macos
pip install tensorflow-metal
pip install bayesian-optimization
pip install gym
pip install kaggle

Register your Environment

The following command registers your myML environment and makes it available as a kernel in the Jupyter notebook.

python -m ipykernel install --user --name myML --display-name "Python 3.9 (myML)"

Checking TensorFlow

(myML) chrisswain@ChrisM1MBP ~ % python
>>> import tensorflow.keras
>>> import tensorflow as tf
>>> print(f"Tensor Flow version: {tf.__version__}")

Tensor Flow version: 2.10.0

>>> gpu = len(tf.config.list_physical_devices('GPU')) > 0
>>> print("GPU is", "available" if gpu else "NOT AVAILABLE")

GPU is available

Running simple Machine learning projects in a Jupyter Notebook.

To further confirm all is working correctly I've created a series of Jupyter notebooks exploring a variety of machine learning data analysis workflows.

The notebooks can be downloaded here: JupyterNotebooks

PLSmodel using
MLRmodel using
RFmodel using
Lightgbm using

All examples use a data set from the UCI machine learning repository.

The dataset contains 9568 data points collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work with full load. Features consist of the hourly average ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V), which are used to predict the net hourly electrical energy output (EP) of the plant. A combined cycle power plant (CCPP) is composed of gas turbines (GT), steam turbines (ST) and heat recovery steam generators. In a CCPP, the electricity is generated by gas and steam turbines, which are combined in one cycle, and is transferred from one turbine to another. While the Vacuum is collected from and has an effect on the steam turbine, the other three ambient variables affect the GT performance.

Pınar Tüfekci, Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods, International Journal of Electrical Power & Energy Systems, Volume 60, September 2014, Pages 126-140, ISSN 0142-0615

Last Updated 11 October 2022