A blog post describing a package that implements an R client to extract data from the Target validation platform.
The Open Targets Platform is a comprehensive and robust data integration for access to and visualisation of potential drug targets associated with disease. It brings together multiple data types and aims to assist users to identify and prioritise targets for further investigation.
This is an alternative to the public REST API.
The latest update to the CRAN R archive brings the total number of packages to 9004.
2016-08-22: 9000 packages
2016-02-29: 8000 packages
2015-08-12: 7000 packages
2014-10-29: 6000 packages
2013-11-08: 5000 packages
2012-08-23: 4000 packages
2011-05-12: 3000 packages
2009-10-04: 2000 packages
2007-04-12: 1000 packages
2004-10-01: 500 packages
2003-04-01: 250 packages
There is a listing of data analysis tools for Mac OSX here.
I just thought I'd flag a paper in Journal of Cheminformatics, RRegrs: an R package for computer-aided model selection with multiple regression models DOI.
We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending on the caret package.
The results of the 16th annual KDnuggets Software Poll on data analysis tools is in.
The top 10 tools by share of users were
R, 46.9% share ( 38.5% in 2014, 37% in 2013)
RapidMiner, 31.5% ( 44.2% in 2014, 39% in 2013)
SQL, 30.9% ( 25.3% in 2014, NA in 2013)
Python, 30.3% ( 19.5% in 2014, 13% in 2013)
Excel, 22.9% ( 25.8% in 2014, 28% in 2013)
KNIME, 20.0% ( 15.0% in 2014, 6% in 2013)
Hadoop, 18.4% ( 12.7% in 2014, 9% in 2013)
Tableau, 12.4% ( 9.1% in 2014, NA 2013)
SAS, 11.3 (10.9% in 2014, 10.7% in 2013)
Spark, 11.3% ( 2.6% in 2014, NA in 2013)
The results very much reflect my own interactions, whilst R has a significant installed user base and of course a vast repository of open source packages, Python seems to be gaining traction. Certainly in part because Python seems to have become the lingua franca for scientific computing.
There is a listing of data analysis tools for Mac OS X here.
Whilst R is a very comprehensive statistical and data analysis package it does have a very steep learning curve.
R Instructor is an iPhone, iPad and iPod Touch application that uses plain, non-technical language and over 30 videos to explain how to make and modify plots, manage data and conduct both parametric and non-parametric statistical tests.
Now added to the mobile science site.
ChemmineOB provides an R interface to a subset of cheminformatics functionalities implemented by the OpelBabel C++ project. OpenBabel is an open source cheminformatics toolbox that includes utilities for structure format interconversions, descriptor calculations, compound similarity searching and more. ChemineOB aims to make a subset of these utilities available from within R. For non-developers, ChemineOB is primarily intended to be used from ChemmineR as an add-on package rather than used directly.
I just noticed that there is an update to R on the CRAN website
This binary distribution of R and the GUI supports 64-bit Intel based Macs on Mac OS X 10.6 (Leopard) or higher. Since R 3.0.0 the binary is a single-arch build and contains only the x86_64 (64-bit Intel) architecture. PowerPC Macs and 32-bit Macs are only supported by building from sources or by older binary R versions. The default package type is "mac.binary" and the binary repository layout has changed accordingly.
I’m not a big user of R a free software environment for statistical computing and graphics, but occasionally I notice cheminformatics modules being published. The latest issue of Bioinformatics DOI has a paper describing “fmcsR: Mismatch Tolerant Maximum Common Substructure Searching in R”.
The fmcsR package provides an R interface, with the time consuming steps of the FMCS algorithm implemented in C++. It includes utilities for pairwise compound comparisons, structure similarity searching, clustering and visualization of MCSs. In comparison to an existing MCS tool, fmcsR shows better time performance over a wide range of compound sizes. When mismatching of atoms or bonds is turned on, the compute times increase as expected, and the resulting FMCSs are often R1C5 substantially larger than their strict MCS counterparts. Based on R1C6 extensive virtual screening (VS) tests, the flexible matching feature enhances the enrichment of active structures at the top of MCS-based similarity search results. With respect to overall and early enrichment performance, FMCS outperforms most of the seven other VS methods considered in these tests.
fmcsR is freely available for all common operating systems from the Bioconductor site http://www.bioconductor.org/packages/devel/bioc/html/fmcsR.html.