Macs in Chemistry

Insanely great science

Aabel Review

A couple of months ago Gigawiz updated their flagship data analysis and graphing application, Aabel. I've recently had a chance to spend a little time using it and I thought I'd post my impressions. There are an increasing number of high-quality data analysis tools now available under Mac OS X, ranging from simple spreadsheet applications to powerful 3D data and statistical analysis tools. One thing that you are quickly aware of is that Aabel produces absolutely stunning graphs, but it is much more than a pretty face: it also has powerful statistical and data exploration tools. Aabel comes with an impressive 1000 page manual that describes the application in great detail, with plenty of screenshots to aid understanding. It also has a much shorter "QuickStart Guide" to get you up and running instantly. The other thing I'd add is that I had a couple of questions when writing this and the Gigawiz support is excellent.

Data Import

When you first open Aabel you are presented with an empty spreadsheet. You can import a variety of different data formats, including delimited ASCII (space, tab, comma etc), Excel, FORTRAN formatted data, binary formats (8, 16, 32, 64 bit data), dBase (.dbf), as well as Matlab or Splus 2-column ASCII. The import dialog gives you a number of options to customise the import, and also gives you a preview of the data, so if the first row contains column headings you can import it correctly.
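For readers who want to see what the header-row option amounts to, here is a minimal Python sketch of reading delimited text with or without a heading row. It is only an illustration of the idea, not Aabel's import code; the function name `read_delimited`, the fallback column names and the sample values are all made up for this example.

```python
import csv
import io

def read_delimited(text, delimiter=",", has_header=True):
    """Split delimited text into (column names, data rows).

    If has_header is False, placeholder names col1, col2, ... are generated
    (an assumption made for this sketch).
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return [], []
    if has_header:
        return rows[0], rows[1:]
    header = [f"col{i + 1}" for i in range(len(rows[0]))]
    return header, rows

# Hypothetical tab-delimited sample, not taken from the HERG dataset.
sample = "name\tlogP\tpIC50\ncpd-1\t2.1\t5.4\ncpd-2\t3.7\t6.2\n"

header, data = read_delimited(sample, delimiter="\t")
raw_header, raw_data = read_delimited(sample, delimiter="\t", has_header=False)
```

With the header option on, the first row becomes the column names; with it off, the same row is treated as data.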

Import is very quick: I used a data set of approx 1000 rows with 50 columns and the import was done in a screen refresh on my MacBook Pro. With the spreadsheet populated you can immediately start exploring the dataset. Each of the variables is automatically assigned a data type, which can be modified by simply clicking on the dropdown menu at the top of each column. If you click on the control button at the top of each column you get a summary of the data, together with a simple histogram. You can alter the bin size or transparency using the sliders. This is a great way to quickly look at the distribution of data. The display is smart enough to know that if you click on a column of categorical data you don't get the slider to change bin size. Here you can also limit the range to eliminate erroneous data, and there is also the option to add units to the data.
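The effect of the bin-size slider and the range limit can be pictured with a short Python sketch. This is a rough illustration under my own assumptions (fixed-width bins starting at the smallest retained value), not a description of how Aabel bins data; the `histogram` function and the values are hypothetical.

```python
from collections import Counter

def histogram(values, bin_width, lower=None, upper=None):
    """Count values per fixed-width bin, ignoring values outside [lower, upper].

    Returns a dict mapping each bin's left edge to the number of values in it.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    kept = [v for v in values
            if (lower is None or v >= lower) and (upper is None or v <= upper)]
    if not kept:
        return {}
    start = min(kept) if lower is None else lower
    counts = Counter(int((v - start) // bin_width) for v in kept)
    return {start + i * bin_width: counts[i] for i in sorted(counts)}

# Hypothetical column; 99.0 stands in for an erroneous entry.
values = [1.0, 1.5, 2.2, 2.8, 3.1, 3.3, 4.9, 99.0]

coarse = histogram(values, bin_width=2.0, upper=10.0)  # wide bins
fine = histogram(values, bin_width=1.0, upper=10.0)    # narrow bins
```

Setting `upper=10.0` plays the role of the range limit: the outlier is dropped before binning, so it no longer stretches the histogram.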

I often find I'm adding extra columns of data to an existing data set, perhaps adding a calculated property generated by an external molecular modeling package. One nice feature is the ability to reorder columns (and rows): simply select "reorder" from the spreadsheet menu and you are presented with a dialog listing all the columns (or rows) that allows you to drag and drop them into the preferred order. The HERG dataset I'm using has a number of molecules included as internal standards. These I can mark using the Symbols palette, and now whenever I create a graph those data points will automatically be marked in any plots that are created. In addition you can use the toggle button on the worksheet to dynamically display the object label. There is also a wide selection of data processing tools. If I create a new column called "Efficiency" and then select "New data processing pipeline" from the "File" menu, the following dialog appears, in which I can insert details of how the new column data should be calculated. You have to remember to click "Compute" or the new data will not be inserted.

Plotting and Charting

Aabel is a tremendously powerful plotting application and I can only describe a few of the many features. The HERG data set I'm using here has a column of categorical data. To plot it, use the chart category "Categorical Histogram, Pareto, Spine, Ogive", and from the corresponding Variables & Plot Options palette choose the intended categorical variable, as shown in the figure below.

We can now add a second plot using the Graphics Sublayers Manager palette; within the palette, open the rightmost menu and choose "Add New Chart Using Active Pipeline" (see the image below).

If we now add a scatterplot, then activate the categorical histogram and click the bin whose data should be highlighted, the corresponding points will be highlighted on the X-Y plot as shown in the image below. This linking and brushing is a really useful way to explore the data in multiple plots, and the ability to interact in this manner is one feature that sets the more extensive applications apart from the cheap and cheerful plotting and charting applications.
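Linking and brushing boils down to sharing a row selection between views. The Python sketch below shows the idea in its simplest form: clicking a histogram bin selects row indices, and the scatterplot highlights the points with those indices. The `brush` function, the field names and the compound data are all hypothetical and have nothing to do with Aabel's internals.

```python
def brush(rows, category_key, selected_category):
    """Return the indices of the rows that fall in the clicked category bin."""
    return [i for i, row in enumerate(rows)
            if row[category_key] == selected_category]

# Hypothetical rows; "class" is the categorical column, x and y are plotted.
rows = [
    {"name": "cpd-1", "class": "acid", "x": 1.2, "y": 5.1},
    {"name": "cpd-2", "class": "base", "x": 2.4, "y": 6.3},
    {"name": "cpd-3", "class": "base", "x": 3.1, "y": 6.9},
    {"name": "cpd-4", "class": "neutral", "x": 0.7, "y": 4.8},
]

# "Clicking" the bin for bases selects rows 1 and 2 ...
highlighted = brush(rows, "class", "base")
# ... and the linked scatterplot highlights the matching (x, y) points.
points = [(rows[i]["x"], rows[i]["y"]) for i in highlighted]
```

Because the selection is a set of row indices rather than something tied to one chart, any number of plots drawn from the same worksheet can respond to it.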

Alternatively, if you have colour coded the rows in the spreadsheet you can use those colours in the plots as shown below. All the plots can be customised and resized for publication, so you can change font sizes, rescale axes, dash separation, titles, colours etc.

3D and 4D Plots

Aabel can create 3D plots, and by using colour and shape it is possible to map further parameters to achieve complex representations. The diagram below shows a rotating 3D plot in which selected rows have been flagged to identify specific features; in this case the green points are zwitterions and the purple points are acids. In the lower 3D scatter plot a fourth dimension has been used to colour code the points. These plots can be rotated using sliders.

This interaction between the plots, and the ability for the user to manipulate the plot, is invaluable.

Statistical Analysis

Of course most of the time you will be looking for correlations between experimental data and an array of variables, looking to build a predictive model. Since we had the Olympics this year, for this review I'm using the results from the long jump competition. As you can see in the image below, the winning jump has generally increased over time.

The scatterplot was created in Aabel by selecting the "New Visualization pipeline" option and simply clicking on the scatterplot icon. A control-click on the axis opens up a dialog box that allows complete customisation of all aspects of the plot. The exception is the size of the plot, which is controlled by the "Graphic Sublayers Manager", available from the last but one button on the left-hand side button palette; alternatively you can highlight the chart and drag a corner to resize. If we now click on the "Statistics" icon, the "Stats Analyzer" dialog opens and allows you to set up the calculation.

The results are then posted back to the chart and the predicted results added to a worksheet, which can then be plotted as shown below.
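To give a feel for what "predicted results" means here, the sketch below fits a straight line by ordinary least squares and evaluates it at each x value. I'm assuming a simple linear fit purely for illustration; the review doesn't depend on it, and this is not Aabel's code. The numbers are placeholders chosen to lie on a line, not the actual Olympic long jump results.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Placeholder data only: NOT the real winning distances.
years = [1900, 1920, 1940, 1960]
distances = [7.0, 7.4, 7.8, 8.2]

slope, intercept = fit_line(years, distances)
# Predicted values, the kind of column that gets written back to a worksheet.
predicted = [slope * year + intercept for year in years]
```

The `predicted` list corresponds to the column of fitted values that can then be plotted alongside the original data.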

For large datasets it is often useful to create a cross-correlation matrix to identify variables that might be correlated. Using the 1000 by 50 HERG dataset I described previously, create a "New Visualization pipeline" and select the "Statistics" option as shown below. The cross-correlation is completed almost instantly.

The results can be presented either as a heat map, where highly correlated variables are "hotter" colours, or as a correlation table. Unfortunately there is no hot link between heat map and table.
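For anyone unfamiliar with a cross-correlation matrix, the following Python sketch computes pairwise Pearson correlation coefficients between columns. I'm assuming Pearson's r as the measure, which the review doesn't specify; the function names and the three tiny columns are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Map every pair of column names to their Pearson correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical columns: b rises with a, c falls as a rises.
columns = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
```

A heat map is just this table with each coefficient mapped to a colour, which is why values near +1 or -1 stand out as candidates for correlated variables.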

Aabel offers a huge selection of statistical tools and also includes example worksheets for most of them, as shown below. These are an absolutely invaluable resource for understanding the program.

Aabel is an outstanding plotting and graphing tool that gives a wide variety of really beautiful plots; in addition it is packed with a wealth of powerful statistical tools. This is a really solid upgrade, adding a wide range of additional statistical tools and plots. There have also been a few tweaks to the UI which should help new users. At $575 Aabel is a long way from being the most expensive data analysis package available, and it is also available with an academic license for educational end users. Aabel also has AppleScript support, a feature I'll look at in a subsequent review.

There are a number of alternative data analysis packages listed here that might be of interest, and there is a collection of reviews of scientific applications listed here.