Macs in Chemistry

Insanely Great Science

The Chemfp Project

 

The Chemfp project started as a way to promote the FPS format for cheminformatics fingerprint exchange and has evolved into a set of command-line tools and a Python library for fingerprint generation and high-performance similarity search. The 10 years of work and research results of the chemfp project have now been described in an excellent publication.

I looked at Chemfp when comparing various options for clustering large datasets and Chemfp was one of the highest performing, and Andrew Dalke was very responsive to questions.


Comments

Workshop on Computational Tools for Drug Discovery

Workshop on Computational Tools for Drug Discovery (with SCI).
10 April 2019, The Studio, Birmingham.
https://www.soci.org/events/scirsc-workshop-on-computational-tools-for-drug-discovery

Details of the workshops

Attendees will be able to choose 4 of the 6 sessions.

Optibrium: Guided multi-parameter optimisation of 2D and 3D SAR

In this workshop, we will explore the concept of multi-parameter optimisation (MPO) and its application to quickly target high-quality compounds with a balance of potency and appropriate absorption, distribution, metabolism and excretion (ADME) properties. We will further illustrate how this concept can be combined with an understanding of 2D and 3D structure-activity relationships (SAR) to guide the design of new, improved compounds.

The workshop will be based on practical 'hands-on' examples using our StarDrop™ software and all participants will get a 1-month free trial license to use StarDrop following the workshop. For more information on StarDrop, please visit our website or watch some videos of StarDrop in action at www.optibrium.com/community/videos.

Cresset: Next generation structure-based design with Flare

Learn how simple structure-based design can be within small molecule discovery projects. The workshop will cover ligand design in the protein active site, Electrostatic Complementarity™ maps and scores, ensemble docking of ligands with Lead Finder, calculations of water stability and locations using 3D-RISM, energetics of ligand binding using WaterSwap and use of Python extensions. Applications you will use: Flare™, Lead Finder™.

Dotmatics: Data visualisation and analysis with Dotmatics

Dotmatics offers a comprehensive scientific software platform for knowledge management, data storage, enterprise searching and reporting. The focus of the workshop will be the Dotmatics visualisation and data analysis software in small molecule drug discovery workflows around compound selection from vendor catalogues and analysis of lead optimisation datasets as typically found in drug discovery.

BioSolveIT: Fast – Visual – Easy – computer-aided drug design for all chemists

In this workshop you will learn - hands-on - to use modern software for hit-finding, hit-to-lead and lead optimization. We will walk you around the drug discovery cycle and show you: how to assess your protein and discover a binding site; how simple modifications to the bound molecule affect the binding affinity; how to replace a scaffold or explore sub-pockets for improved binding; how to keep all your key ADME-parameters in check, while you optimize your lead; and last but not least how to quickly find new starting points in a giant 3.8 billion vendor catalog of compounds ready for purchase.

Instead of dry theory, we will explain those use cases based on real-world scenarios and interesting targets such as Thrombin, BTK, Endothiapepsin and BRD4. Bring your own laptop to try this out for yourself right away and receive the software as well as a free trial license on top. The Software tools are called:

SeeSAR – "modeling for all chemists" and REAL Space Navigator – "the world’s largest searchable catalog of compounds on demand".

Knime: An interactive workflow for hit list triaging

In this workshop I will introduce a workflow built using the open source KNIME Analytics Platform for doing hit-list triaging and selecting compounds for confirmatory assays or other followup testing. We will use a real-world HTS dataset and work through reading the data in, flagging molecules that are likely to have interfered with the assay, manual "rescue" of compounds removed by the filters, and selecting a compound subset that covers the chemical diversity of the hits yet still allows learning some SAR from subsequent experiments. Participants will be provided with both the dataset and the workflow used during the workshop so that they can adapt it to their own needs.

ChemAxon: Computational intelligence driven drug design

The most recent era of vast data sources, rapid data processing and model building enables drug designers to propose high quality structures in the ideation phase in lean ("fail-early") discovery cycles. The goal of this workshop is to demonstrate an integrated system (Marvin Live) to:

  • freely create, store and manage ideas
  • utilize computational models such as phys-chem properties, 3D alignment, predictive models (created in KNIME)
  • exploit existing evidences (MMP, various data sources) during design session.

The dynamic plugin system facilitates balancing attributes through comparison and triage of hypothetical compounds on a single interface.

Comments

Cambridge Structural Database 2019

 

The Cambridge Crystallographic Data Centre (CCDC) has announced the first CSD data and software update of 2019.

The 2019 CSD Release contains 957,868 unique structures and 973,630 entries (CSD version 5.40) – an increase of more than 57,000 entries. We are currently on course to reach a million structures by summer 2019.

The update includes an exciting new polyhedra display option in our visualisation software Mercury.

Read more here….


Comments

CICAG meetings 2019

 

Meetings for 2019 that CICAG (http://www.rsccicag.org) is involved with.

Workshop on Computational Tools for Drug Discovery (with SCI).
10 April 2019, The Studio, Birmingham.
https://www.soci.org/events/scirsc-workshop-on-computational-tools-for-drug-discovery A great opportunity to get hands-on training to get you started on a variety of important software tools. All software and training materials required for the workshop will be provided for attendees to install and run on their own laptops and use for a limited period afterwards.

Eighth Joint Sheffield Conference on Chemoinformatics, The Edge, University of Sheffield, UK, Monday 17th – Wednesday 19th June, 2019.
https://cisrg.shef.ac.uk/shef2019/ CICAG are really delighted to be sponsoring this meeting.

AI in chemistry (with RSC-BMCS).
Two-day meeting to be held at Fitzwilliam College, Cambridge, on 2nd and 3rd September 2019.
https://www.maggichurchouseevents.co.uk/bmcs/AI-2019.htm The first meeting in London was very successful and heavily oversubscribed. The closing date for oral abstracts is 31 March and for posters 5 July.

Post-grad Cheminformatics/CompChem symposium, Wednesday 4th Sept 2019 Cambridge Chemistry Dept.
Opportunity for post-grads to meet and present their work. Keep the date free; meeting details will be published soon. The Cambridge Cheminformatics Network meeting will immediately follow, so why not make a day of it?

20 years of Ro5 (with RSC-BMCS).
Wednesday, 20th November 2019, Sygnature Discovery, BioCity, Nottingham, UK.
It has been over 20 years since Lipinski published his work determining the properties of drug molecules associated with good solubility and permeability. Since then, there have been a number of additions and expansions to these “rules”. There has also been keen interest in the application of these guidelines in the drug discovery process and how these apply to new emerging chemical structures such as macrocycles. This symposium will bring together researchers from a number of different areas of drug discovery and will provide a historical overview of the use of Lipinski’s rules as well as look to the future and how we use these rules in the changing drug compound landscape. Details will be on https://www.maggichurchouseevents.co.uk/bmcs/ in the near future.
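
As a reminder of what the "rules" refer to, here is a minimal Python sketch of the original rule-of-five thresholds (molecular weight no more than 500, calculated logP no more than 5, no more than 5 hydrogen bond donors and no more than 10 hydrogen bond acceptors). The `Properties` type and `ro5_violations` function are hypothetical names used only for illustration, and the property values are assumed to have been calculated elsewhere.

```python
from typing import List, NamedTuple


class Properties(NamedTuple):
    """Pre-calculated properties for one molecule (hypothetical container)."""
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def ro5_violations(p: Properties) -> List[str]:
    """Return a description of each rule-of-five threshold that is exceeded."""
    violations = []
    if p.mol_weight > 500:
        violations.append("molecular weight > 500")
    if p.logp > 5:
        violations.append("logP > 5")
    if p.h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if p.h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


if __name__ == "__main__":
    # Approximate, illustrative values for a small drug-like molecule.
    small = Properties(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)
    # Made-up values for a large, lipophilic molecule.
    large = Properties(mol_weight=800.0, logp=6.0, h_donors=2, h_acceptors=12)
    print(ro5_violations(small))
    print(ro5_violations(large))
```

The later additions and expansions mentioned above would be extra checks of the same shape.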

Comments

Happy birthday World Wide Web

 

The Google Doodle today celebrates the birth of the World Wide Web. It is a shame, however, that they use a generic PC icon rather than the computer on which the Web was first built, a NeXT Cube.

Screenshot 2019-03-12 at 10.25.03

A NeXT Computer and its object-oriented development tools and libraries were used by Tim Berners-Lee and Robert Cailliau at CERN to develop the world's first web server software, CERN httpd, and to write the first web browser, as shown in the image below.

First_Web_Server

CERN are running a webinar to celebrate the event.

Welcome and Introduction

  • Welcome by Anna Cook - master of ceremonies

  • Opening talk by Fabiola Gianotti - CERN Director General

Let’s Share What We Know - panel discussion

This session highlights the importance of sharing what we know in the context of the early days of the Web. The Web has had a huge influence on the way we collaborate and share knowledge in society as a whole. Collaboration and sharing knowledge were also core values at the heart of its early evolution.

Chair: Frédéric Donck

Speakers: Tim Berners-Lee, Robert Cailliau, Jean-François Groff, Lou Montulli, Zeynep Tufekci

For Everyone - conversation

The Web was designed For Everyone!

Conversation between Sir Tim Berners-Lee and Bruno Giussani

Towards the Future - panel discussion

This session will focus on the aspects that technology evolution can bring us.

Chair: Bruno Giussani

Speakers: Doreen Bogdan-Martin, Jovan Kurbalija, Monique Morrow, Zeynep Tufekci

Closing Remarks

  • Closing remarks by Charlotte Warakaulle - CERN Director for International Relations
Comments

Review of Flare version 2

 

Cresset provide a variety of software packages to support small molecule design, built on the foundation of their XED forcefield. When I first reviewed several Cresset products (FieldView, FieldAlign and Forge), the forcefield was only applicable to small molecules. However, the forcefield has been constantly developed and can now be applied to proteins.

Flare version 2 is a recent extension to the portfolio that introduces Electrostatic Complementarity (EC), i.e. a comparison of the electrostatics of the small molecule ligand and the target protein (DOI).

Electrostatic interactions between small molecules and their respective receptors are essential for molecular recognition and are also key contributors to the binding free energy. Assessing the electrostatic match of protein-ligand complexes therefore provides important insights into why ligands bind and what can be changed to improve binding. Ideally, ligand and protein electrostatic potentials at the protein-ligand interaction interface should maximize their complementarity while minimizing desolvation penalties.

In addition, Flare version 2 includes a new Python API that allows users to automate tasks by scripting and to integrate with other Python packages such as the RDKit cheminformatics toolkit, Python modules for graphing and statistics (NumPy, SciPy, Matplotlib), and Jupyter notebooks.

waterswapview

Flare gives access to a very powerful set of tools designed to aid ligand binding, docking, electrostatic modelling and WaterSwap, all within a well thought-out interface. The storyboard feature also allows the user to store snapshots of progress and coupled with the log acts like a notebook.

You can read the full review here.

Comments

Chembience updated

 

Update to RDKit 2018.09.2 and Postgres 10.7.

Chembience is a Docker based platform supporting the fast development of chemoinformatics-centric web applications and microservices. It creates a clean separation between your scientific web service implementation and any host-specific or infrastructure-related configuration requirements.


Comments

Chemical Structure Association Trust grants

 

Application deadline for the 2019 Grant is April 19, 2019. Successful applicants will be notified no later than May 24, 2019.

The Grant Program has been created to provide funding for the career development of young researchers who have demonstrated excellence in their education, research or development activities that are related to the systems and methods used to store, process and retrieve information about chemical structures, reactions and compounds. One or more Grants will be awarded annually up to a total combined maximum of ten thousand U.S. dollars ($10,000). Grantees have the option of payments being made in U.S. dollars or in British Pounds equivalent to the U.S. dollar amount. Grants are awarded for specific purposes, and within one year each grantee is required to submit a brief written report detailing how the grant funds were allocated. Grantees are also requested to recognize the support of the Trust in any paper or presentation that is given as a result of that support.

There are more details of the requirements on the website

The 2018 awards went to:

Stephen Capuzzi, Division of Chemical Biology and Medicinal Chemistry at the University of North Carolina Eshelman School of Pharmacy, Chapel Hill (USA), was awarded a Grant to attend the 31st ICAR in Porto, Portugal from 06/11/2018 to 06/15/2018, where he presented his research entitled “Computer-Aided Discovery and Characterization of Novel Ebola Virus Inhibitors.”

Christopher Cooper, Cavendish Laboratory, University of Cambridge, UK, was awarded a Grant to present his current research on systematic, high-throughput screening of organic dyes for co-sensitized dye-sensitized solar cells. He presented his work at the Solar Energy Conversion Gordon Research Conference and Seminar held June 16-22, 2018 in Hong Kong.

Mark Driver, Chemistry Department, University of Cambridge, UK, was awarded a Grant to offset costs to attend the 7th EUCheMS conference where he will present a poster on his research that focuses on the development and applications of a theoretical approach to model hydrogen bonding.

Genqing Wang, La Trobe Institute for Molecular Sciences, La Trobe University, Australia, was awarded a Grant to present his work at the Fragment-Based Lead Discovery Conference (FBLD2018) in San Diego, USA in October 2018. The current focus of his work is the development of novel anti-virulence drugs which potentially overcome the problems of antibiotic resistance of Gram-negative bacteria.

Roshan Singh, University of Oxford, UK, was awarded a Grant to conduct research within Dr. Marcus Lundberg’s Group at Uppsala University, Sweden, as part of a collaboration that he has set up between them and Professor Edward Solomon’s Group at Stanford University, California. He conducts research within Professor John McGrady’s group at the University of Oxford. The collaboration will look to consolidate the experimental studies on heme Fe(IV)=O complexes currently being carried out by Solomon’s Group with future multi-reference calculations to be conducted within Lundberg’s Group.


Comments

Counting Identical structures in two datasets

 

Sometimes I have two datasets and I just want to know the overlap of identical structures. This Vortex script counts the number of identical structures by comparing InChIKeys. It then displays a matrix showing how many unique molecules are in each dataset and how many molecules are in both datasets.

results
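
The idea behind the comparison can be sketched in a few lines of plain Python. This is not the Vortex script itself, just a minimal illustration that assumes the InChIKeys for each dataset have already been generated and are available as lists of strings; the function name `overlap_counts` is hypothetical.

```python
from typing import Dict, Iterable


def overlap_counts(keys_a: Iterable[str], keys_b: Iterable[str]) -> Dict[str, int]:
    """Count unique and shared structures in two datasets by InChIKey."""
    set_a, set_b = set(keys_a), set(keys_b)  # sets drop duplicates within a dataset
    return {
        "unique_in_a": len(set_a),
        "unique_in_b": len(set_b),
        "in_both": len(set_a & set_b),
        "only_in_a": len(set_a - set_b),
        "only_in_b": len(set_b - set_a),
    }


if __name__ == "__main__":
    # Illustrative InChIKeys; the first dataset contains one duplicate entry.
    dataset_a = [
        "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
        "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
        "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
    ]
    dataset_b = [
        "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
        "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
    ]
    for label, count in overlap_counts(dataset_a, dataset_b).items():
        print(f"{label}: {count}")
```

Because the comparison is on the full InChIKey, stereoisomers and different isotopic forms count as different structures; comparing only the first 14-character block would relax that.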

Comments

Data Extractor

 

Data Extractor has been updated to version 1.7.1 with a number of internal improvements.

Data Extractor allows you to extract data contained inside text documents and collect them in an internal organized table with fields and records. It can parse all the text files you specify and analyze them, using text tags to determine what to extract and where to put it.

Data Extractor requires Mac OS X 10.10 or later.

There are more Data Analysis tools here.

Comments

MOE 2019.1 released

 

The 2019.01 release of Chemical Computing Group's Molecular Operating Environment software includes a variety of new features and enhancements.

Full release notes are here.

The release includes:

  • Calculate and Analyze pH-Dependent Protein Properties
  • MOEsaic Session Sharing and Project Customization
  • Determine Conformation Population from NMR NOE Data
  • Predict Relative Binding Energies with AMBER Thermodynamic Integration

It is worth noting that there is an updated version of the Flexera License Manager.

MOE now uses an updated version of the Flexera license manager. The license manager server components lmgrd, chemcompd, and lmutil have all been updated to version 11.16.0.0. Note that older versions of MOE will continue to run with updated license manager servers.


Comments

Update to MayaChemTools

 

I just heard that the following command line scripts, available as part of the MayaChemTools package, now implement multiprocessing functionality.

o RDKitCalculateMolecularDescriptors.py

o RDKitCalculatePartialCharges.py

o RDKitGenerateConformers.py

o RDKitFilterChEMBLAlerts.py

o RDKitFilterPAINS.py

o RDKitPerformMinimization.py

o RDKitRemoveSalts.py

o RDKitSearchSMARTS.py


Comments

SCI-RSC Workshop on Computational Tools for Drug Discovery

 

I'm delighted to report this meeting seems to be filling up fast.

All scientists working in drug discovery need tools and techniques for handling chemical information. This workshop offers a unique opportunity to try out a range of software packages for themselves with expert tuition in different aspects of pre-clinical drug discovery. Attendees will be able to choose from sessions covering data processing and visualisation, ligand and structure-based design, or ADMET prediction, run by the software providers. All software and training materials required for the workshop will be provided for attendees to install and run on their own laptops and use for a limited period afterwards.

More details

Presentations from Optibrium / Cresset / Dotmatics /BioSolveIT/ Knime / ChemAxon

Comments

BBEdit Updated

 

BBEdit 12.6 has been released and this is a very significant update. BBEdit is now a sandboxed application, which means there are a number of changes to the way permissions are handled.

It is well worth reading the Release Notes which offer a very detailed explanation of the situation.

Without unrestricted access to your files and folders, many of BBEdit’s most useful features, from the basic to the most powerful, won't work at all; or they may misbehave in unexpected ways. At the very least, this hinders your ability to get work done.

In order to resolve this fundamental conflict between security and usability, we have devised a solution in which BBEdit requests that you permit it the same sort of access to your files and folders that would be available to a non-sandboxed version.

There are also many additions, changes and fixes.

Comments

Data curation workflow

 

One of the most time-consuming parts of any data analysis is curating the input data prior to any model building. This Knime workflow is fully documented and described and as such is an invaluable starting point.

A semi-automated procedure is made available to support scientists in data preparation for modelling purposes. The procedure addresses:

  • Automatic chemical data retrieval (i.e., SMILES) from different, orthogonal web-based databases, by using two different identifiers, i.e. chemical name and CAS registration number. Records were scored based on the coherence of information retrieved from different web sources.
  • Data curation procedure performed on top-scored records. The procedure includes removal of inorganic and organometallic compounds and mixtures, neutralization of salts, removal of duplicates, and checking of tautomeric forms.
  • Standardization of chemical structures, yielding ready-to-use data for the development of QSARs.

Comments

The official release of GROMACS 2016.6 is available

 

This release fixes remaining issues found since version 2016.5. All users of the 2016 series are encouraged to update to 2016.6. Please see the link to the release notes below for more details.

You can find the code, documentation, release notes, and test suite at the links below.

Code: ftp://ftp.gromacs.org/pub/gromacs/gromacs-2016.6.tar.gz
Documentation: http://manual.gromacs.org/documentation/2016.6/index.html (including release notes, install guide, user guide, reference manual)
Test Suite: http://gerrit.gromacs.org/download/regressiontests-2016.6.tar.gz


Comments

PyMOL 2.3 released

 

Just got this message

We are happy to announce the release of PyMOL 2.3. Download ready-to-use bundles from https://pymol.org/2/ or update your installation with "conda install -c schrodinger pymol". New features include:

  • Atom-level cartoon transparency
  • Fast MMTF export
  • Sequence viewer gaps display

This is the first time there are PyMOL bundles with Python 3. If you use custom or third-party Python 2 scripts, they might stop working until you convert them.

Full release notes are here: https://pymol.org/dokuwiki/?id=media:new23


Comments

International Year of the Periodic Table of Chemical Elements (IYPT 2019)

 

The United Nations General Assembly during its 74th Plenary Meeting proclaimed 2019 as the International Year of the Periodic Table of Chemical Elements (IYPT 2019) on 20 December 2017.

1869 is considered the year of discovery of the Periodic System by Dmitri Mendeleev. 2019 will be the 150th anniversary of the Periodic Table of Chemical Elements and has therefore been proclaimed the "International Year of the Periodic Table of Chemical Elements (IYPT2019)" by the United Nations General Assembly and UNESCO.

The IYPT website gives details of events and you can find out more by looking for the hashtag #IYPT2019. Of particular note is Mendeleev 150: 4th International Conference on the Periodic Table endorsed by IUPAC.

simplePT

The periodic table has always been a popular source of apps, with a variety of mobile apps available. There are a couple that I would highlight.

The Elements in Action

The periodic table comes to life with 79 video explorations of the weird, wonderful, and sometimes alarming properties of the elements. Filmed by BAFTA award winner Max Whitby in partnership with Theodore Gray, author of the iconic book and app The Elements, and previously available only in a few museum installations, this is the most beautifully filmed collection of videos ever assembled to explore and explain what makes each element unique and fascinating.

There is a companion book The Elements: A Visual Exploration of Every Known Atom in the Universe that is also very popular.

The Periodic Table Project

To celebrate the International Year of Chemistry (IYC), Chem 13 News magazine together with the University of Waterloo's Department of Chemistry and the Faculty of Science encouraged chemistry educators and enthusiasts worldwide to adopt an element and artistically interpret that element. The project created a periodic table as a mosaic of science and art. The apps include the creative process behind each tile along with basic atomic properties of the element. The apps work to truly highlight the artistic expression of the Periodic Table Project.

Periodic Table

Created by the Royal Society of Chemistry. Did you know that neodymium is used in microphones? Or europium in Euro bank notes to help stop counterfeiting? These are just two of the absorbing facts in our free, user-friendly and customisable app, based on our popular and well-respected Royal Society of Chemistry Periodic Table website.

The RSC also created Top Trumps Chemistry, a card game to learn more about the elements.

There are also several online interactive periodic tables:

Ptable

RSCperiodictable

ElementsTable

PeriodicTable

There are many events around the world registered on the IYPT website, and if you are organising something you can add them to the list.


Comments

2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry

 

In June 2018 the First RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry meeting was held in London. This proved to be enormously popular: there were more oral abstract and poster submissions than we had space for, and it was so over-subscribed that we could have filled a venue double the size.

Planning for the second meeting is now in full swing, and it will be held in Cambridge 2-3 September 2019.

Event : 2nd RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry
Dates : Monday-Tuesday, 2nd to 3rd September 2019
Place : Fitzwilliam College, Cambridge, UK
Websites : Event website, and RSC website.

Twitter #AIChem19

aifirst_announcement-

Applications for both oral and poster presentations are welcomed. Posters will be displayed throughout the day and applicants are asked if they wish to provide a two-minute flash oral presentation when submitting their abstract. The closing dates for submissions are:

  • 31st March for oral and
  • 5th July for poster

Full details can be found on the Event website.


Comments

A new method for 3D printing

 

Whereas the usual 3D printing methods create a 3D object by printing it layer by layer, a recent publication in Science (DOI) describes a method in which objects are created by passing light through liquid acrylate. Computed Axial Lithography (CAL) allows the generation of arbitrary geometries volumetrically through photopolymerization. The process was inspired by the 3D image reconstruction used in CT scans.

The exposure process takes about two minutes for an object a few centimetres across; the team recreated a version of Auguste Rodin’s sculpture The Thinker a few centimetres tall.

The supplementary materials include details for printing several objects.

I've added it to the 3D printing page.


Comments

New release of MayaChemTools

 

A new release of MayaChemTools is now available. It comprises a fantastic collection of Perl and Python scripts, modules, and classes to support a variety of day-to-day computational discovery needs.

The core set of command line Perl scripts available in the current release of MayaChemTools has no external dependencies and provides functionality for the following tasks:

  • Manipulation and analysis of data in SD, CSV/TSV, sequence/alignments, and PDB files
  • Listing information about data in SD, CSV/TSV, Sequence/Alignments, PDB, and fingerprints files
  • Calculation of a key set of physicochemical properties, such as molecular weight, hydrogen bond donors and acceptors, logP, and topological polar surface area
  • Generation of 2D fingerprints corresponding to atom neighborhoods, atom types, E-state indices, extended connectivity, MACCS keys, path lengths, topological atom pairs, topological atom triplets, topological atom torsions, topological pharmacophore atom pairs, and topological pharmacophore atom triplets
  • Generation of 2D fingerprints with atom types corresponding to atomic invariants, DREIDING, E-state, functional class, MMFF94, SLogP, SYBYL, TPSA and UFF
  • Similarity searching and calculation of similarity matrices using available 2D fingerprints
  • Listing properties of elements in the periodic table, amino acids, and nucleic acids
  • Exporting data from relational database tables into text files

The command line Python scripts based on RDKit provide functionality for the following tasks:

  • Calculation of molecular descriptors and partial charges
  • Comparison of 3D molecules based on RMSD and shape
  • Conversion between different molecular file formats
  • Enumeration of compound libraries and stereoisomers
  • Filtering molecules using SMARTS, PAINS, and names of functional groups
  • Generation of graph and atomic molecular frameworks
  • Generation of images for molecules
  • Performing structure minimization and conformation generation based on distance geometry and forcefields
  • Performing R group decomposition
  • Picking and clustering molecules based on 2D fingerprints and various clustering methodologies
  • Removal of duplicate molecules and salts from molecules

The command line Python scripts based on PyMOL provide functionality for the following tasks:

  • Aligning macromolecules
  • Splitting macromolecules into chains and ligands
  • Listing information about macromolecules
  • Calculation of physicochemical properties
  • Comparison of macromolecules based on RMSD
  • Conversion between different ligand file formats
  • Mutating amino acids and nucleic acids
  • Generating Ramachandran plots
  • Visualizing X-ray electron density and cryo-EM density
  • Visualizing macromolecules in terms of chains, ligands, and ligand binding pockets
  • Visualizing cavities and pockets in macromolecules
  • Visualizing macromolecular interfaces
  • Visualizing surface and buried residues in macromolecules

Comments

Programming Languages for Chemical Information

 

This looks like it should be well worth bookmarking.

https://www.biomedcentral.com/collections/programming-languages

This thematic series comprises a set of invited papers, each one describing the use of a single language for the development of cheminformatics software that implements algorithms and analyses, and aims to cover a variety of language paradigms. The issue will be rolling, such that as papers on new languages are submitted they will be automatically added to this issue.

The first article (DOI) is by Kevin Theisen (of ChemDoodle fame), reviewing HTML5/JavaScript. Apparently there have been more lines of JavaScript written than all other programming languages combined, so it seems appropriate as a kick-off article.


Comments

Using the Python 3 library fpsim2 for similarity searches

 

FPSim2 is a new tool for fast similarity search on big compound datasets (>100 million) being developed at ChEMBL. It was developed as a Python 3 library to support either in-memory or out-of-core fast similarity searches on such dataset sizes.

It is built using RDKit and can be installed using conda. It requires Python 3.6 and a recent version of RDKit.

I've written a couple of Jupyter notebooks to demonstrate its use.

You can read the full tutorial here, and download the notebooks.
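
For readers new to fingerprint similarity searching, the underlying calculation can be sketched in plain Python. This is not the FPSim2 API; it is a minimal illustration that assumes binary fingerprints stored as Python integers and the usual Tanimoto coefficient, with hypothetical function names `tanimoto` and `search`.

```python
from typing import Dict, List, Tuple


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two bit-vector fingerprints held as integers."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")   # bits set in either fingerprint
    return common / total if total else 0.0


def search(query: int, database: Dict[str, int], threshold: float = 0.7) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in database.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Tiny made-up 4-bit fingerprints; real fingerprints have hundreds or thousands of bits.
    database = {"mol_a": 0b1111, "mol_b": 0b0111, "mol_c": 0b0001}
    for name, score in search(0b1111, database):
        print(name, round(score, 2))
```

A linear scan like this is fine for small sets; dedicated tools exist precisely because it does not scale to hundreds of millions of compounds.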






Comments

Easy Markdown updated

 

Easy Markdown has been updated to version 1.8.

Text written in Markdown is plain text that reads naturally to humans and translates automatically into correctly coded HTML web pages. In Easy Markdown the window is split into two parts: as you type plain text on the left, you see on the right the resulting web page as it will appear on the web.

There are many other Markdown editors detailed here.


Comments

A few thoughts on scientific software

 

When I wrote "A few thoughts on scientific software" I was somewhat surprised by the interest and amount of feedback I got. I've since added two more pages based on the feedback:

A listing of open-source cheminformatics toolkits and Open Source Python Data Science Libraries.

If you have any other suggestions feel free to let me know.


Comments

Chemical reactions from US patents (1976-Sep2016)

 

Great work by NextMove, an open, machine-readable, freely-reusable, annotated reaction data set, available for download here https://figshare.com/articles/ChemicalreactionsfromUSpatents1976-Sep2016/5104873

Reactions extracted by text-mining from United States patents published between 1976 and September 2016. The reactions are available as CML or reaction SMILES. Note that the reaction SMILES are derived from the CML.

Reaction SMILES

For convenience the reaction SMILES includes tab delimited columns for: PatentNumber, ParagraphNum, Year, TextMinedYield, CalculatedYield
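
A minimal Python sketch of reading one such line is shown below. It assumes the reaction SMILES is the first tab-delimited field, followed by the columns in the order listed above, and that anything after a space inside the first field is an optional extension that can be ignored. The `ReactionRecord` type, the `parse_line` function and the example line are hypothetical and not taken from the data set.

```python
from typing import List, NamedTuple, Optional


class ReactionRecord(NamedTuple):
    reactants: List[str]
    agents: List[str]
    products: List[str]
    patent_number: str
    paragraph_num: str
    year: Optional[int]
    text_mined_yield: Optional[str]
    calculated_yield: Optional[str]


def _components(part: str) -> List[str]:
    """Split one side of a reaction SMILES into its dot-separated components."""
    return [smiles for smiles in part.split(".") if smiles]


def parse_line(line: str) -> ReactionRecord:
    """Parse one tab-delimited line: reaction SMILES plus five metadata columns."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (6 - len(fields))  # tolerate missing trailing columns
    rxn, patent, paragraph, year, mined, calculated = fields[:6]
    rxn = rxn.split(" ", 1)[0]  # assumption: drop any extension after a space
    reactants, agents, products = rxn.split(">")  # reactants>agents>products
    return ReactionRecord(
        reactants=_components(reactants),
        agents=_components(agents),
        products=_components(products),
        patent_number=patent,
        paragraph_num=paragraph,
        year=int(year) if year else None,
        text_mined_yield=mined or None,  # yields kept as text, None if absent
        calculated_yield=calculated or None,
    )


if __name__ == "__main__":
    # Made-up example line (an esterification), not a record from the data set.
    example = "CCO.CC(=O)O>OS(=O)(=O)O>CC(=O)OCC\tUS0000000A\t0001\t1990\t85%\t"
    print(parse_line(example))
```

Yields are left as strings here because their exact formatting in the file is not described above.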

Now that we have a large initial data set it would be great if others could contribute using the same format.

There is a fabulous detailed review of this invaluable resource on the Depth-First blog http://depth-first.com/articles/2019/01/28/the-nextmove-patent-reaction-dataset/


Comments

Alvascience

 

Just came across this.

Alvascience cheminformatics tools.

BMFpred is an easy-to-use software implementing the QSAR models described in “F. Grisoni, V.Consonni, M.Vighi (2018). Acceptable-by-design QSARs to predict the dietary biomagnification of organic chemicals in fish, Integrated Environmental Assessment and Management” to predict the laboratory-based fish Biomagnification Factor (BMF) of chemicals.

alvaDesc is the next generation tool for the calculation of a wide range of molecular descriptors and a number of molecular fingerprints. Specifically it calculates almost 4000 descriptors independent of 3-dimensional information, such as constitutional, topological and pharmacophore descriptors. It includes ETA and Atom-type E-state indices together with functional groups and fragment counts. Additionally, alvaDesc implements an extensive number of 3-dimensional descriptors such as 3D-autocorrelation, Weighted Holistic Invariant Molecular descriptors (WHIM) and GETAWAY.

alvaDesc_mac

Also available as a KNIME node, the alvaDesc KNIME Plugin contains three KNIME nodes:

  • Descriptor: calculates molecular descriptors
  • Fingerprint: calculates molecular fingerprints
  • Molecule Reader: reads standard molecule files and can be used as a source for the other two nodes (which are also compatible with KNIME standard molecule nodes)

Comments

Fortran

 

Just looking at the ARCHER application usage over the last month: lots of materials modelling codes, electronic structure calculations and quantum-mechanical molecular dynamics from first principles.

Screenshot 2019-01-23 at 16.10.35

ARCHER is the latest UK National Supercomputing Service. The ARCHER Service started in November 2013 and is presently expected to run till November 2019. ARCHER provides a capability resource to allow researchers to run simulations and calculations that require large numbers of processing cores working in a tightly-coupled, parallel fashion.

It is also clear that Fortran still dominates.

Screenshot 2019-01-23 at 16.11.19

As I've said many times, I'm not a big Fortran user, but the Fortran on a Mac page is the most accessed page on the site.


Comments

Comparison of bioactivity predictions

 

Small molecules can potentially bind to a variety of biomolecular targets, and whilst counter-screening against a wide variety of targets is feasible, it can be rather expensive and is probably only realistic once a compound has been identified as being of particular interest. For this reason there is considerable interest in building computational models to predict potential interactions. With the advent of large data sets of well-annotated biological activity such as ChEMBL and BindingDB this has become possible.

ChEMBL 24 contains 15,207,914 activity data on 12,091 targets and 2,275,906 compounds; BindingDB contains 1,454,892 binding data for 7,082 protein targets and 652,068 small molecules.

These predictions may aid understanding of the molecular mechanisms underlying a molecule's bioactivity and help predict potential side effects or cross-reactivity.

Whilst there are a number of sites that can be used to predict bioactivity data, I'm going to compare one site, Polypharmacology Browser 2 (PPB2) http://ppb2.gdb.tools, with two tools that can be downloaded to run the predictions locally: one based on Jupyter notebooks and models built by the ChEMBL group using ChEMBL data https://github.com/madgpap/notebooks/blob/master/targetpred21_demo.ipynb, and a more recent random forest model, PIDGIN. If you are using proprietary molecules it is unwise to use the online tools.

Read the article here

Comments

CrystalMaker 10.4

 

The much-awaited CrystalMaker 10.4 is now shipping - complete with full “Dark Mode”.

CrystalMaker and CrystalDiffract are real 64-bit Mac programs, written in Cocoa/Objective-C, with beautiful real Mac interfaces. They’re not Windows or Unix applications reskinned via Qt, Java or wxWidgets; they’re the real deal: 100% pure Mac. Thus they are able to offer:

  • Retina display
  • Multi-touch
  • Force touch
  • Haptic feedback
  • Touch bar interface (MacBook Pro)
  • Dark Mode
  • Full-screen mode and Spaces
  • Quick Look
  • Finder thumbnails
  • QuickTime video
  • Apple Help
  • Code-signed, sandboxed, with “hardened runtime” for maximum security

CrystalMaker 10.4 has over 60 new features, of which the most-important are probably:-

  1. Dark Mode
  2. Sleek new structures library with integrated CrystalViewer (1,100 structures; fully customizable)
  3. New energy-modelling engine: makes designing your own molecules quick and easy, with vibrational spectra simulation.
  4. Live powder diffraction: link CrystalMaker 10.4 with CrystalDiffract 6.8 so that editing a crystal in CrystalMaker automatically updates its simulated powder diffraction pattern in CrystalDiffract.
  5. Interpolate Structures command - makes animating structural behaviour smooth and seamless.
  6. Customizable Atoms Inspector and coordinates display.
  7. Spring-loaded sidebars: move the mouse to the edge of the screen to show the relevant sidebar (works great in full-screen mode).
  8. Powerful video sizing/compression options.
  9. Fat sticks display option (great for emphasising structural channels).
  10. Advanced control over axial vectors with scaling, fonts, positioning, inset etc.

More details are available from the download page http://crystalmaker.com/crystalmaker/

They also have a set of video tutorials available.

CrystalMaker X lets you import data from over 40 different formats, with instant display and powerful customization. CrystalMaker can handle multi-structure files such as DL_POLY HISTORY: use CrystalMaker's synchronization and animation capabilities to rapidly understand structural behaviour and lattice dynamics, or to visualize the trajectory of a simulation. CrystalMaker X can also handle truly massive structures. Take advantage of our unique "Depth Profiling" tool to rapidly scan areas of interest in massive structures, ideal for characterizing the results from computer models.



Comments

TS Calc The mathematical equations tool

 

TS Calc is a document-based application whose documents can be built and used as calculation models for specific mathematical and technical problems. It is a completely different approach to solving math problems compared with the usual one of using spreadsheets.



Comments

Optimizing colormaps with consideration for color vision deficiency to enable accurate interpretation of scientific data

 

Around 4% of the population suffer from colour blindness in one form or another, with red/green colour blindness being the most common. Sadly, in many plots, graphs and presentations little effort is made to make things easier for people with colour blindness.

Color blindness, also known as color vision deficiency (CVD), is the decreased ability to see color or differences in color. Simple tasks such as selecting ripe fruit, choosing clothing, and reading traffic lights can be more challenging. Color blindness may also make some educational activities more difficult.

A recent publication seeks to address this need: Optimizing colormaps with consideration for color vision deficiency to enable accurate interpretation of scientific data DOI

While there have been some attempts to make aesthetically pleasing or subjectively tolerable colormaps for those with CVD, our goal was to make optimized colormaps for the most accurate perception of scientific data by as many viewers as possible. We developed a Python module, cmaputil, to create CVD-optimized colormaps, which imports colormaps and modifies them to be perceptually uniform in CVD-safe colorspace while linearizing and maximizing the brightness range. The module is made available to the science community to enable others to easily create their own CVD-optimized colormaps.
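One of the properties mentioned above, a brightness that changes steadily along the colormap, can be checked roughly with a few lines of Python. This is only a sketch and is not the cmaputil API: it uses CIE L* lightness as a simpler stand-in for the perceptual colour space used in the paper, it does not simulate CVD, and the two colour ramps are hypothetical values chosen for illustration.

```python
def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in the range 0..1."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def lightness(rgb):
    """Return CIE L* (0..100) for an sRGB colour given as three floats in 0..1."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # relative luminance
    return 116 * y ** (1 / 3) - 16 if y > 0.008856 else 903.3 * y


def is_monotonic_lightness(colours):
    """True if lightness strictly increases from the first colour to the last."""
    values = [lightness(c) for c in colours]
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical colour ramps, for illustration only.
blue_to_yellow = [(0.0, 0.1, 0.3), (0.3, 0.35, 0.45),
                  (0.6, 0.6, 0.45), (1.0, 0.9, 0.3)]
rainbow = [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0),
           (1.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

print(is_monotonic_lightness(blue_to_yellow))  # lightness rises steadily
print(is_monotonic_lightness(rainbow))         # lightness rises, then falls
```

A ramp whose lightness rises and then falls, like the rainbow example, can make two quite different data values look equally bright, which is part of the problem the paper sets out to address.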

journal.pone.0199239.g001


Comments

LICHEM: Layered Interacting CHEmical Models

 

An update to LICHEM: Layered Interacting CHEmical Models has been published DOI

LICHEM is an open-source (GPLv3) interface between QM and MM software so that QM/MM calculations can be performed with polarizable and frozen electron density force fields. Functionality is also present for standard point-charge based force fields, pure MM, and pure QM calculations.

Available from GitHub https://github.com/CisnerosResearch/LICHEM.

Note: on OS X machines, the SEDI, TEX, BIB, and CXXFLAGS variables will need to be modified.

Comments

Workshop on Computational Tools for Drug Discovery

 

Registration opened just before Christmas and apparently a number of people signed up over the festive period. Remember there are a limited number of places and it is first come, first served.

Registration and full details are here.

Computational Tools Flyer

This workshop is intended to provide expert tutorials to get you started and show what can be achieved with the software.

Comments

GROMACS 2019 released

 

I just noticed GROMACS 2019 was released on Dec 31 2018.

GROMACS http://www.gromacs.org is one of the major software packages for the simulation of biological macromolecules. It is aimed at performing the simulation of large, biologically relevant systems, with a focus on being both efficient and flexible to allow the research of a number of different systems.

Several important performance improvements

  • Simulations now automatically run using update groups of atoms whose coordinate updates have only intra-group dependencies. These can include both constraints and virtual sites. This improves performance by eliminating overheads during the update, at no cost.
  • Intel integrated GPUs are now supported with OpenCL for offloading non-bonded interactions.
  • PME long-ranged interactions can now also run on a single AMD GPU using OpenCL, which means many fewer CPU cores are needed for good performance with such hardware.

Release notes here


Comments

Macs in Chemistry Annual Site Review

 

At the end of each year I have a look at the website analytics to see which items were the most popular.

Over the year there were 70,000 unique visitors with 25% visiting the site on multiple occasions. The US provided 30% of the visitors and the UK 7% with Germany, India and France around 5%. As you might expect the majority were Mac users (56%), but there were also Windows (25%), iOS (12%), and Android (2.5%) users.

Of the Mac users, 51% are now using 10.14 (Mojave), 27% 10.13 with all older versions each well below 10%.

Chrome and Safari were the preferred browsers with both around 40%.

The most popular web pages were (other than the main page)

The popularity of the Fortran on a Mac page has continued for several years now and it has been updated several times with user provided information.

The most viewed blog pages in 2018 were

The update to Mojave seems to have been another fairly smooth transition, with most software developers reporting no major issues.

A couple of recent additions have generated significant interest.

As has an article I wrote on my thoughts on scientific software

Comments

A Jupyter Kernel for Swift

 

I'm constantly impressed by the expansion of Jupyter; it is rapidly becoming the first-choice platform for interactive computing.

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

The latest expansion is a Jupyter Kernel for Swift, intended to make it possible to use Jupyter with the Swift for TensorFlow project.

Swift for TensorFlow is a new way to develop machine learning models. It gives you the power of TensorFlow directly integrated into the Swift programming language. With Swift, you can write the following imperative code, and Swift automatically turns it into a single TensorFlow Graph and runs it with the full performance of TensorFlow Sessions on CPU, GPU and TPU.

Requires macOS 10.13.5 or later, with Xcode 10.0 beta or later.


Comments

The International Year of the Periodic Table

 

2019 is the International Year of the Periodic Table.

1869 is considered as the year of discovery of the Periodic System by Dmitri Mendeleev. 2019 will be the 150th anniversary of the Periodic Table of Chemical Elements and has therefore been proclaimed the "International Year of the Periodic Table of Chemical Elements (IYPT2019)" by the United Nations General Assembly and UNESCO.

If you want to brush up on the table there are a couple of mobile apps to help you.

Periodic Table
The Elements in Action
The Periodic Table Project

You can follow events for the year on Twitter https://twitter.com/iypt2019

Comments