Friday, December 25, 2009

Parsing Excel Spreadsheets in Perl

Last week at work I was playing around trying to read Nessus Scan results which were pre dumped in an Excel file. I used Jon McNamara's Perl modules to parse and write Excel files. They largely worked very well, just as it was documented.

One place though where I got a little stuck was when I was reading rows from one file and conditionally writing it to another Excel file. So for eg. My first Excel file had say 5 rows and I wanted to write only 2,4 and 5 but not 1 and 3. What kept happening was that 2,4 and 5 did get written, but 1 and 3 also got written as blank rows leaving gaps in the Excel sheet. I worked around it by forcing zeros to get written for Rows 1 and 3 as I didn't want blank rows in the destination file (Cant Autofilter if they are present) but it did get a little messy as complexity increased.

So I decided to use the modules just for vanilla read and write. I read data from Excel and instead of parsing it then and there dumped it into a TAB separated text file. I parsed , stripped and did what I wanted with the contents of the text file and then used the WriteExcel module to write it back to the new Excel sheet. That got the job done just like I wanted it.

Now maybe I was missing something and should have RTFM more carefully, but this little hack kind of helped me when I was under a little time pressure so figured I'd put it out so someone else stuck..could try this out.

No comments: