Tabula is awesome!
02 02 16 - Filed in: data analysis open source
I recently needed to download the supplementary information provided with a publication, and my heart sank when I saw it was provided as a PDF file. My worst fears were justified when I tried to simply copy and paste SMILES strings together with 5 columns of data into a spreadsheet: there was no chance of it copying across in an ordered manner!
Then I tried Tabula, a tool for "liberating data tables locked inside PDF files". It worked perfectly: nearly 2000 rows of data spread over 11 pages were converted to a CSV file in a couple of mouse clicks. This is wonderful and should be part of any data scientist's toolkit.
It is included on the Data Analysis Tools page but really deserves a special mention.
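With nearly 2000 rows coming out of a PDF, it may be worth a quick sanity check on the exported CSV before trusting it. The sketch below is only an illustration using Python's standard `csv` module: the function name `check_rows`, the sample data, and the expected count of 6 columns (one SMILES column plus the 5 data columns mentioned above) are my assumptions, not anything Tabula provides.

```python
import csv
import io

# Assumption: 1 SMILES column + 5 data columns, as in the table described above.
EXPECTED_COLUMNS = 6


def check_rows(csv_text, expected_columns=EXPECTED_COLUMNS):
    """Split CSV text into well-formed rows and the numbers of malformed rows."""
    good, bad = [], []
    reader = csv.reader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):
        # Ignore completely blank lines rather than flagging them.
        if not any(cell.strip() for cell in row):
            continue
        if len(row) == expected_columns:
            good.append(row)
        else:
            bad.append(row_no)
    return good, bad


# Hypothetical sample standing in for the exported file; the third row
# is deliberately short to show how a broken row is reported.
sample = (
    "SMILES,a,b,c,d,e\n"
    "CCO,1.0,2.0,3.0,4.0,5.0\n"
    "c1ccccc1,1.5,2.5,3.5\n"
)

good, bad = check_rows(sample)
print(len(good), "rows look fine; rows to inspect:", bad)
```

For a real export, the text of the CSV file would be read from disk and passed to `check_rows` in place of `sample`; any row numbers reported can then be compared against the original PDF by eye.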