The Chemfp Project


The Chemfp project started as a way to promote the FPS format for exchanging cheminformatics fingerprints and has evolved into a set of command-line tools and a Python library for fingerprint generation and high-performance similarity search. Ten years of work and research results from the project have now been described in an excellent publication.

I looked at Chemfp when comparing options for clustering large datasets. It was one of the highest-performing tools, and Andrew Dalke was very responsive to questions.
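To give a feel for what fingerprint similarity search involves, here is a minimal pure-Python sketch of Tanimoto similarity and a threshold search. It is not Chemfp's API and is nowhere near as fast: the function names `tanimoto` and `threshold_search` are my own, and the fingerprints are assumed to be hex-encoded bit strings of equal length with made-up identifiers.

```python
def tanimoto(hex_a: str, hex_b: str) -> float:
    """Tanimoto similarity between two hex-encoded bit fingerprints."""
    a = int(hex_a, 16)
    b = int(hex_b, 16)
    union = bin(a | b).count("1")
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return bin(a & b).count("1") / union


def threshold_search(query: str, targets: dict, threshold: float = 0.7) -> list:
    """Return (id, score) pairs with score >= threshold, best first."""
    scored = ((target_id, tanimoto(query, fp)) for target_id, fp in targets.items())
    hits = [(target_id, score) for target_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical 8-bit fingerprints, for illustration only.
    library = {"mol_a": "0f", "mol_b": "03", "mol_c": "f0"}
    for target_id, score in threshold_search("0f", library, threshold=0.5):
        print(target_id, round(score, 2))
```

A dedicated tool earns its keep by doing this same arithmetic over millions of fingerprints with optimised bit counting rather than Python integers.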


Workshop on Computational Tools for Drug Discovery

Workshop on Computational Tools for Drug Discovery (with SCI).
10 April 2019, The Studio, Birmingham.

Details of the workshops

Attendees will be able to choose 4 of the 6 sessions.

Optibrium Guided multi-parameter optimisation of 2D and 3D SAR

In this workshop, we will explore the concept of multi-parameter optimisation (MPO) and its application to quickly target high-quality compounds with a balance of potency and appropriate absorption, distribution, metabolism and excretion (ADME) properties. We will further illustrate how this concept can be combined with an understanding of 2D and 3D structure-activity relationships (SAR) to guide the design of new, improved compounds.

The workshop will be based on practical 'hands-on' examples using our StarDrop™ software and all participants will get a free 1-month trial license to use StarDrop following the workshop. For more information on StarDrop, please visit our website or watch some videos of StarDrop in action.

Cresset Next generation structure-based design with Flare

Learn how simple structure-based design can be within small molecule discovery projects. The workshop will cover ligand design in the protein active site, Electrostatic Complementarity™ maps and scores, ensemble docking of ligands with Lead Finder, calculations of water stability and locations using 3D-RISM, energetics of ligand binding using WaterSwap and use of Python extensions. Applications you will use: Flare™, Lead Finder™.

Dotmatics Data visualisation and analysis with Dotmatics

Dotmatics offers a comprehensive scientific software platform for knowledge management, data storage, enterprise searching and reporting. The focus of the workshop will be the Dotmatics visualisation and data analysis software in small molecule drug discovery workflows around compound selection from vendor catalogues and analysis of lead optimisation datasets as typically found in drug discovery.

BioSolveIT Fast – Visual – Easy – computer-aided drug design for all chemists

In this workshop you will learn - hands-on - to use modern software for hit-finding, hit-to-lead and lead optimization. We will walk you around the drug discovery cycle and show you: how to assess your protein and discover a binding site; how simple modifications to the bound molecule affect the binding affinity; how to replace a scaffold or explore sub-pockets for improved binding; how to keep all your key ADME parameters in check while you optimize your lead; and, last but not least, how to quickly find new starting points in a giant vendor catalog of 3.8 billion compounds ready for purchase.

Instead of dry theory, we will explain those use cases based on real-world scenarios and interesting targets such as Thrombin, BTK, Endothiapepsin and BRD4. Bring your own laptop to try this out for yourself right away and receive the software as well as a free trial license on top. The software tools are called:

SeeSAR – "modeling for all chemists" and REAL Space Navigator – "the world’s largest searchable catalog of compounds on demand".

KNIME An interactive workflow for hit list triaging

In this workshop I will introduce a workflow built using the open source KNIME Analytics Platform for doing hit-list triaging and selecting compounds for confirmatory assays or other followup testing. We will use a real-world HTS dataset and work through reading the data in, flagging molecules that are likely to have interfered with the assay, manual "rescue" of compounds removed by the filters, and selecting a compound subset that covers the chemical diversity of the hits yet still allows learning some SAR from subsequent experiments. Participants will be provided with both the dataset and the workflow used during the workshop so that they can adapt it to their own needs.

ChemAxon Computational intelligence driven drug design

The recent era of vast data sources, rapid data processing and model building enables drug designers to propose high-quality structures in the ideation phase of lean ("fail-early") discovery cycles. The goal of this workshop is to demonstrate an integrated system (Marvin Live) to:

  • freely create, store and manage ideas

  • utilize computational models such as phys-chem properties, 3D alignment and predictive models (created in KNIME)

  • exploit existing evidence (MMP, various data sources) during the design session.

The dynamic plugin system facilitates balancing attributes through comparison and triage of hypothetical compounds on a single interface.


Cambridge Structural Database 2019


The Cambridge Crystallographic Data Centre (CCDC) has announced its first CSD data and software update of 2019.

The 2019 CSD Release contains 957,868 unique structures and 973,630 entries (CSD version 5.40) – an increase of more than 57,000 entries. We are currently on course to reach a million structures by summer 2019.

The update includes an exciting new polyhedra display option in our visualisation software Mercury.



CICAG meetings 2019


Meetings for 2019 that CICAG is involved with.

A great opportunity to get hands-on training to get you started on a variety of important software tools. All software and training materials required for the workshop will be provided for attendees to install and run on their own laptops and use for a limited period afterwards.

Eighth Joint Sheffield Conference on Chemoinformatics, The Edge, University of Sheffield, UK, Monday 17th – Wednesday 19th June, 2019. CICAG are really delighted to be sponsoring this meeting.

AI in chemistry (with RSC-BMCS).
Two-day meeting to be held at Fitzwilliam College, Cambridge, on 2nd and 3rd September 2019. The first, very successful meeting in London was heavily oversubscribed. The closing date for oral abstracts is 31 March and for posters 5 July.

Post-grad Cheminformatics/CompChem symposium, Wednesday 4th Sept 2019 Cambridge Chemistry Dept.
Opportunity for post-grads to meet and present their work. Keep the date free; meeting details will be published soon. The Cambridge Cheminformatics Network meeting will follow immediately afterwards, so why not make a day of it?

20 years of Ro5 (with RSC-BMCS).
Wednesday, 20th November 2019, Sygnature Discovery, BioCity, Nottingham, UK.
It has been over 20 years since Lipinski published his work determining the properties of drug molecules associated with good solubility and permeability. Since then, there have been a number of additions and expansions to these “rules”. There has also been keen interest in the application of these guidelines in the drug discovery process and how they apply to newly emerging chemical structures such as macrocycles. This symposium will bring together researchers from a number of different areas of drug discovery and will provide a historical overview of the use of Lipinski’s rules as well as look to the future and how we use these rules in the changing drug compound landscape. Details will be available in the near future.
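For anyone who has not met the rules themselves, here is a minimal sketch of the Rule of Five in its commonly stated form: molecular weight no more than 500, logP no more than 5, no more than 5 hydrogen-bond donors and no more than 10 hydrogen-bond acceptors, with at most one violation tolerated. The class and function names are my own, and the property values are assumed to have been calculated elsewhere.

```python
from dataclasses import dataclass


@dataclass
class MolProperties:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float
    logp: float
    h_bond_donors: int
    h_bond_acceptors: int


def ro5_violations(p: MolProperties) -> int:
    """Count how many of the four Rule-of-Five limits are exceeded."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ])


def passes_ro5(p: MolProperties) -> bool:
    """Commonly used reading: at most one violation is tolerated."""
    return ro5_violations(p) <= 1


if __name__ == "__main__":
    # Made-up values, for illustration only.
    compound = MolProperties(mol_weight=350.0, logp=2.5, h_bond_donors=2, h_bond_acceptors=5)
    print(ro5_violations(compound), passes_ro5(compound))
```

Macrocycles and other "beyond Ro5" compounds routinely fail a check like this, which is part of why the guidelines are being revisited.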


Happy birthday World Wide Web


The Google Doodle today celebrates the birth of the World Wide Web. It is a shame, however, that they use a generic PC icon rather than the computer on which the web was first built, a NeXT Cube.


A NeXT Computer and its object-oriented development tools and libraries were used by Tim Berners-Lee and Robert Cailliau at CERN to develop the world's first web server software, CERN httpd, and also to write the first web browser, as shown in the image below.


CERN are running a webinar to celebrate the event.

Welcome and Introduction

  • Welcome by Anna Cook - master of ceremonies

  • Opening talk by Fabiola Gianotti - CERN Director General

Let’s Share What We Know - panel discussion

This session highlights the importance of sharing what we know in the context of the early days of the Web. The Web has had a huge influence on the way we collaborate and share knowledge in society as a whole. Collaboration and sharing knowledge were also core values at the heart of its early evolution.

Chair: Frédéric Donck

Speakers: Tim Berners-Lee, Robert Cailliau, Jean-François Groff, Lou Montulli, Zeynep Tufekci

For Everyone - conversation

The Web was designed For Everyone!

Conversation between Sir Tim Berners-Lee and Bruno Giussani

Towards the Future - panel discussion

This session will focus on what the evolution of technology can bring us.

Chair: Bruno Giussani

Speakers: Doreen Bogdan-Martin, Jovan Kurbalija, Monique Morrow, Zeynep Tufekci

Closing Remarks

  • Closing remarks by Charlotte Warakaulle - CERN Director for International Relations