Macs in Chemistry

Insanely Great Science

KNIME workshop on clustering molecules

 

The latest RSC CICAG Open-Source Tools for Chemistry Workshop is now on YouTube. RSC CICAG Open Source Tools for Chemistry :- Clustering using KNIME.

Clustering is an invaluable cheminformatics technique for subdividing a typically large compound collection into small groups of similar compounds. One advantage is that once a collection has been clustered you can store the cluster identifiers and refer to them later, which is particularly valuable when dealing with very large datasets. Clustering is often used in the analysis of high-throughput screening results, virtual screening, or docking studies.
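As a rough illustration of the idea (not the KNIME workflow used in the workshop), the sketch below clusters a handful of compounds with a simple leader (sphere-exclusion) algorithm on Tanimoto similarity. The fingerprints are hypothetical sets of feature identifiers, and the compound names and the 0.5 threshold are assumptions chosen for the example. The resulting cluster identifiers are kept in a dictionary so they can be looked up later.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint features."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(fingerprints, threshold=0.5):
    """Assign each compound to the first cluster whose leader is similar enough.

    fingerprints: dict mapping compound ID -> set of feature identifiers.
    Returns a dict mapping compound ID -> integer cluster identifier.
    """
    leaders = []      # fingerprints of the cluster leaders, indexed by cluster ID
    cluster_ids = {}
    for compound_id, fp in fingerprints.items():
        for cluster_id, leader_fp in enumerate(leaders):
            if tanimoto(fp, leader_fp) >= threshold:
                cluster_ids[compound_id] = cluster_id
                break
        else:
            # No existing leader is close enough, so start a new cluster.
            leaders.append(fp)
            cluster_ids[compound_id] = len(leaders) - 1
    return cluster_ids


# Hypothetical fingerprints: each integer stands for one structural feature.
example_fingerprints = {
    "cmpd_1": {1, 2, 3, 4},
    "cmpd_2": {1, 2, 3, 5},
    "cmpd_3": {10, 11, 12},
    "cmpd_4": {10, 11, 13},
    "cmpd_5": {20, 21},
}

cluster_ids = leader_cluster(example_fingerprints, threshold=0.5)
print(cluster_ids)
```

The result depends on the order in which compounds are processed, which is one reason to store the cluster identifiers rather than recompute them each time.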


There is a comparison of clustering options here.

The previous 14 workshops are available on a CICAG YouTube playlist.

You can still register for next month's workshops here https://www.eventbrite.com/e/open-source-tools-for-chemistry-workshops-tickets-156431429617.


RSC CICAG KNIME workshop materials

 

Just got this message from Greg Landrum

The workshop tomorrow will have a hands-on component. This isn't mandatory, but I think you'll get more from the workshop if you follow along with what we're doing in your own local copy of the workflow.

In order to get you started and make sure that you have all the KNIME pieces that you need installed, I created a space in the public KNIME hub for the workshop and have uploaded some introductory material there:

https://hub.knime.com/greglandrum/spaces/Public/latest/Presentations/2021/20211020RSCWorkshop/ClusteringIntro.

If you are logged into the KNIME hub (registration is free), you can download the workflow and data by simply using the download button:

KNIMEimage

Once you've downloaded the workflow package, you should be able to import everything into the KNIME Analytics Platform by double clicking the "ClusteringIntro.knar" file which is downloaded.

If for some reason you don't want to register, then you'll need to navigate to the page for the workflow and the Data folder, download everything individually, and import them into KNIME Analytics Platform manually. You should be able to find information online for how to do this, but I won't be able to help you with this during the workshop due to time constraints.

When you open the 01_Clustering workflow, you may be asked if you want to install missing extensions. Please do this in order to ensure that you have everything necessary to follow along during the workshop. Everything we install as part of this process is free and open source.

Shortly before the workshop starts I will share an additional workflow which we'll use for the main part of the workshop. I'll give you this link during the workshop.

Note that the sample workflows we are using were created with KNIME 4.4 (the version released this summer). For workshops like this we like to use recent versions of KNIME so that we can show you the newest features and capabilities. If you have an older version of KNIME things may or may not work correctly and you may have to replace one or more of the nodes with older equivalents.

You can download KNIME here https://www.knime.com/downloads


Workshop on Open-Source Tools for Chemistry

 

Just a couple of notes for software installs prior to the event for those attending the free online Workshop on Open-Source Tools for Chemistry 9-13 November 2020.

Monday 13-30 to 15-30 Cheminformatics and Data Analysis using DataWarrior (Isabelle Giraud)

DataWarrior can be downloaded from here http://www.openmolecules.org/datawarrior/download.html

The training files can all be downloaded from here

Monday 16-00 to 18-00 Molecular visualisation using PyMOL (Garrett Morris)

Software to install:

PyMOL via Conda:
Conda: https://www.anaconda.com/distribution/ or Miniconda: https://docs.conda.io/en/latest/miniconda.html
https://anaconda.org/psi4/pymol or https://omicx.cc/2019/05/26/install-pymol-windows/

PyMOL via MacPorts:
http://www.ub.edu/cbdd/?q=content/installing-pymol-macports
% sudo port install tcl -corefoundation
% sudo port install tk -quartz
% sudo port install pymol

PyMOL from GitHub: https://github.com/schrodinger/pymol-open-source

Tuesday 11-00 to 13-00 Chemistry in the cloud: leveraging Google Colab for quantum chemistry (Jan Jensen)

Participants should download Chrome and have a Google account
Participants should make sure they can access this page: https://bit.ly/37fIYbp.
Some basic degree of Python proficiency is required for the course

It would be great if participants could fill out this survey https://forms.gle/pjwsnJTb4X6QpiHK9 early enough to help me design the course

Wednesday 13-30 to 15-30 Accessing biological and chemical data in ChEMBL (Anna Gaulton)

Requires a modern web-browser (with javascript not blocked) such as Chrome/Safari

Thursday 16-00 to 18-00 Fragment based screening, XChem at Diamond (Rachael Skyner)

Requires the Chrome web browser. If there is time, Rachael would like to give an introduction to the new Python API. We can go through the installation at the workshop, but you must have Anaconda installed.

Friday 11-00 to 13-00 An introduction to KNIME workflows (Greg Landrum)

KNIME can be downloaded here https://www.knime.com/downloads

Registration: This event will be free to attend but registration is required.

More details and registration can be found here https://www.rsc.org/events/detail/43180/workshop-on-open-source-tools-for-chemistry.

Last Updated 28 October 2020


Workshop on Open-Source Tools for Chemistry

 

All scientists working in chemistry need software tools for accessing, handling and storing chemical information, or performing molecular modelling and computational chemistry. There is now a wealth of open-source tools to help in these activities; however, many are not as well-known as commercial offerings. This workshop offers a unique opportunity for attendees to try out a range of open-source software packages for themselves with expert tuition in different aspects of chemistry.


The software packages will be presented over six two-hour sessions as follows:

09 November: 13.30 - 15.30  Cheminformatics and data analysis using DataWarrior (Isabelle Giraud)

09 November: 16.00 - 18.00  Molecular visualization using PyMOL (Garrett M Morris)

10 November: 11.00 - 13.00  Chemistry in the cloud: leveraging Google Colab for quantum chemistry  (Jan Jensen)

11 November: 13.30 - 15.30  Accessing biological and chemical data in ChEMBL (Anna Gaulton)

12 November: 16.00 - 18.00  Fragment-based screening, XChem at Diamond (Rachael Skyner)

13 November: 11.00 - 13.00  Interactive and automated chemical data analysis with KNIME (Greg Landrum)

Registration: This event will be free to attend but registration is required.

More details and registration can be found here https://www.rsc.org/events/detail/43180/workshop-on-open-source-tools-for-chemistry.



Online Events

 

The current global pandemic means that more events are moving online. Here are details of a few that have been sent to me.

Dotmatics User Symposium | Cambridge 2020 14th & 15th October Details and Registration.

KNIME Introduction to Working with Chemical Data October 12 - 16, 2020 details and registration.

Virtual RDKit UGM 6-8 October 2020 details and registration.

16th German Conference on Cheminformatics and EuroSAMPL Satellite Workshop 2-3 November 2020 details

Open Chemical Science 9 - 13 November 2020 details.


3D-e-Chem NLeSC project

 

This looks interesting: the 3D-e-Chem NLeSC project.

This project will develop technologies to improve the integration of ligand and protein data for structure-based prediction of protein-ligand selectivity and polypharmacology.

The project will use KNIME Analytics Platform to integrate the different technologies and datasets.




Data curation workflow

 

One of the most time-consuming parts of any data analysis is curating the input data prior to any model building. This KNIME workflow is fully documented and described, and as such is an invaluable starting point.

A semi-automated procedure is made available to support scientists in data preparation for modelling purposes. The procedure addresses:

  • Automatic chemical data retrieval (i.e., SMILES) from different, orthogonal web-based databases, using two different identifiers, i.e. chemical name and CAS registration number. Records were scored based on the coherence of the information retrieved from the different web sources.
  • Data curation performed on the top-scored records. The procedure includes removal of inorganic and organometallic compounds and mixtures, neutralization of salts, removal of duplicates, and checking of tautomeric forms.
  • Standardization of chemical structures, yielding ready-to-use data for the development of QSARs.
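The steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the published workflow: the record layout, the scoring rule (the fraction of sources that agree on the most common SMILES), the 0.6 cut-off, and the function names are all hypothetical. The SMILES strings are assumed to have been canonicalised already so that plain string comparison is meaningful. The only chemistry it relies on is that a "." in a SMILES string separates disconnected components, which is used here to drop multi-component records. A real workflow would neutralise salts rather than discard them.

```python
from collections import Counter


def score_record(smiles_by_source):
    """Return (consensus SMILES, coherence score) for one compound.

    smiles_by_source: dict mapping a source name -> SMILES retrieved from it.
    The score is the fraction of sources that agree on the most common SMILES.
    """
    counts = Counter(smiles_by_source.values())
    consensus, votes = counts.most_common(1)[0]
    return consensus, votes / len(smiles_by_source)


def curate(records, min_score=0.6):
    """Keep well-supported, single-component, non-duplicate structures.

    records: dict mapping compound name -> {source name: SMILES}.
    Returns a dict mapping compound name -> consensus SMILES.
    """
    curated = {}
    seen = set()
    for name, smiles_by_source in records.items():
        smiles, score = score_record(smiles_by_source)
        if score < min_score:
            continue  # the sources disagree too much
        if "." in smiles:
            continue  # more than one component: mixture or salt
        if smiles in seen:
            continue  # duplicate structure
        seen.add(smiles)
        curated[name] = smiles
    return curated


# Hypothetical retrieval results; the source names are placeholders.
records = {
    "ethanol": {"source_a": "CCO", "source_b": "CCO", "source_c": "CCO"},
    "ethyl alcohol": {"source_a": "CCO", "source_b": "CCO", "source_c": "CCO"},
    "sodium acetate": {
        "source_a": "CC(=O)[O-].[Na+]",
        "source_b": "CC(=O)[O-].[Na+]",
        "source_c": "CC(=O)[O-].[Na+]",
    },
    "ambiguous": {"source_a": "CCN", "source_b": "CCC", "source_c": "CCCl"},
    "benzene": {"source_a": "c1ccccc1", "source_b": "c1ccccc1", "source_c": "C1CCCCC1"},
}

curated = curate(records)
print(curated)
```

Scoring before filtering means that a record is only curated once the sources broadly agree on what the structure is.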


LigandScout 4.3 released

 

Inte:Ligand have just announced the release of LigandScout 4.3.

The LigandScout software suite comprises the most user friendly molecular design tools available to chemists and modelers worldwide. The platform seamlessly integrates computational technology for designing, filtering, searching and prioritizing molecules for synthesis and biological assessment.


This is a significant update that expands LigandScout's molecular dynamics support and adds halogen binding as a new pharmacophoric element. In addition, plotting has received an upgrade.

Furthermore, LigandScout 4.3 Expert introduces a completely new set of features summarized under the term Remote Execution. It is now possible to screen large compound libraries on remote High Performance Computing directly from within the graphical LigandScout user interface.

It can be downloaded here http://www.inteligand.com/ligandscout4/downloads/LigandScout43macos20181012.dmg

You can read about the technology behind LigandScout here DOI and there is a review of an earlier version here.

In addition there are now over 40 LigandScout nodes for KNIME.

KNIME Analytics Platform is the open source software for creating data science applications, workflows and services. Intuitive, open, and continuously integrating new developments, KNIME makes understanding data and designing data science workflows and reusable components accessible to everyone.


REALizer KNIME workflow from BioSolveIT

 

BioSolveIT have added to their collection of KNIME workflows.

The "REALizer" helps you to post-process the results from searches in the REAL Space, leading you to those compounds of biggest interest.



Intelligently Automating Machine Learning, Artificial Intelligence, and Data Science

 

A timely tutorial and example workflow.

we have put together a more comprehensive workflow, serving as a blueprint for anyone to build her or his own version of a Guided Analytics application to combine just the right amount of automation and interaction for a specific set of problems.

Full details here



KNIME update

 

What’s New in KNIME Analytics Platform 3.6.

  • KNIME Deep Learning
  • Constant Value Column Filter
  • Numeric Outliers
  • Column Expressions
  • Scorer (JavaScript)
  • Git Nodes
  • Call Workflow (Table Based)
  • KNIME Server Connection
  • Text Processing
  • Usability Improvements
  • Connect/Unconnect nodes using keyboard shortcuts
  • Zooming
  • Replacing and connecting nodes with node drop
  • Node repository search
  • Usability improvements in the KNIME Explorer
  • Copy from/Paste to JavaScript Table view/editor
  • Miscellaneous
  • Performance: Column Store (Preview)
  • Making views beautiful: CSS changes
  • KNIME Big Data Extensions
  • Create Local Big Data Environment
  • KNIME H2O Sparkling Water Integration
  • Support for Apache Spark v2.3
  • Big Data File Handling Nodes (Parquet/ORC)
  • Spark PCA
  • Spark Pivot
  • Frequent Item Sets and Association Rules
  • Previews
  • Create Spark Context via Livy
  • Database Integration
  • Apache Kafka Integration
  • KNIME Server
  • Management (Client Preferences)
  • Job View (Preview)
  • Distributed Executors (Preview)
  • General release notes
  • JSON Path library update
  • Java Snippet Bundle Imports

I suspect it will be KNIME Deep Learning that catches the eye: the ability to set up deep learning models using drag and drop. You can use regular TensorFlow models within KNIME Analytics Platform and seamlessly convert from Keras to TensorFlow for efficient network execution.


The new Create Local Big Data Environment node creates a fully functional local big data environment including Apache Spark, Apache Hive and HDFS. It allows you to try out the nodes of the KNIME Big Data Extensions without a Hadoop cluster.



Tips & Tricks for Using KNIME

 

The KNIME blog has a post containing lots of user-submitted tips and tricks.

Ever sat next to a friend or colleague at the computer and were awed when you suddenly realised the way they do certain tasks is much better? We recently asked KNIME users to share their tips and tricks on using KNIME. In this series of posts we’ll be showing you how the experts use KNIME in the hopes that by sharing ideas you’ll discover some handy techniques.



How Do You Build and Validate 1500 Models and What Can You Learn from Them?

 

Greg Landrum's ICCS 2018 presentation is on SlideShare.



KNIME tutorial

 

Don't forget to sign up for your chance to hear a webinar by Greg Landrum, KNIME's VP for Life Sciences, this Wednesday. He will be talking about processing malaria HTS results using KNIME and will give a tutorial on workflows developed for ligand-based virtual screening, based on the results of a phenotypic HTS against malaria.

Wed, Feb 21, 2018 3:00 PM - 4:00 PM GMT

Register Here.



MedChem Wizard KNIME workflow

 

The MedChemWizard is a KNIME workflow designed to assist medicinal chemists with idea generation, ligand design and lead optimization using a number of common functional group transformations and medchem rules-of-thumb. This tutorial, provided by Dr. Alastair Donald, gives a detailed description of its use.



Workflow tools

 

Workflow tools are becoming increasingly common in science, and this publication by Wendy Warr gives an excellent comparison of the leading alternatives: "Scientific workflow systems: Pipeline Pilot and KNIME" (DOI).

Really well worth a read.


KNIME 2.7 released

KNIME 2.7 has been released.

KNIME now runs on Java 7 for Windows and Linux systems (Mac stays on Java 6). The Eclipse 3.7 update increases stability on Mac and some Linux systems. BIRT 3.7 brings Open Office support, among other new features.

JFreeChart nodes now have more settings in the “General Plot Options” tab of their configuration window.
In R -> Local there are a number of new nodes to import:

  1. “Table to R” can read a KNIME table into R and output the R workspace.  
  2. “R to Table” takes an R workspace and outputs a KNIME table.
  3. “R +Data to R” takes an R workspace and optional data input and outputs an R workspace.
  4. “R to R-View” takes an R workspace and outputs a KNIME view.

There is a KNIME tutorial here