Friday, June 8, 2007

More than development.log but less than breakpointing

Ran into a problem today that wouldn't reveal itself in the development.log but didn't seem worthy of firing up breakpointing. I had added some more parsing features to the tour operator site scraping tool and I got odd errors on a few pages. The errors didn't show up right away; they appeared toward the end of the page scraping process. Quick background: the pages consist of tour itineraries listing the activities of each day throughout the tour. Fortunately each day has the same markup. Using Hpricot, the tool first collects the contents of the tags that identify the day of the tour, then cycles back to collect the city for each day, then cycles back to get the description of each day's activities, and so on. Many passes through the same datastream. At the end, the tool splices all the collections together to form records corresponding to each day's data points.
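To make the splice step concrete, here is a minimal sketch in plain Ruby. It is not the tool's actual code: the three arrays stand in for the results of separate passes over a page, and the names (day_labels, cities, descriptions, splice_days) and sample values are made up for illustration.

```ruby
# Hypothetical results of three separate passes over the same page.
day_labels   = ["Day 1", "Day 2", "Day 3"]
cities       = ["Lima", "Cusco", "Cusco"]
descriptions = ["Arrive and transfer to hotel", "Fly to Cusco", "City tour"]

# Splice the parallel collections into one record (a Hash) per day.
# splice_days is an illustrative helper, not part of any library.
def splice_days(day_labels, cities, descriptions)
  day_labels.zip(cities, descriptions).map do |label, city, description|
    { day: label, city: city, description: description }
  end
end

records = splice_days(day_labels, cities, descriptions)
records.each { |record| puts record.inspect }
```

One property of this design worth noting: Array#zip pads with nil when a later collection is shorter than the first, so a pass that misses an item stays quiet until the records are built or used.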

Because the error appeared so late in the process, breakpointing would have meant jumping in and out of irb many times before reaching it.

I had forgotten about logger, the Ruby facility for generating your own log messages. It is easy to use. A single logger call with the message "processing day: #{itin_day.inner_html[4,3]}" writes "processing day: 4" to development.log.

Half a dozen of these scattered through the tool provided enough information in the log to see exactly where the problem was.
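For anyone who wants to try this outside a Rails app, here is a small standalone sketch using Logger from Ruby's standard library. The assumptions are mine: the log goes to an in-memory StringIO rather than development.log, the heading strings are invented sample data, and logger.info is just one of the available severity methods.

```ruby
require "logger"
require "stringio"

# Log to an in-memory buffer so the sketch runs anywhere;
# in a Rails app the built-in logger writes to log/development.log.
log_output = StringIO.new
logger = Logger.new(log_output)

# Keep only the message text so the output is easy to read.
logger.formatter = proc { |_severity, _time, _progname, msg| "#{msg}\n" }

# Hypothetical day headings, standing in for itin_day.inner_html.
day_headings = ["Day 3", "Day 4", "Day 12"]

day_headings.each do |heading|
  # heading[4, 3] takes up to three characters starting at index 4,
  # which is where the day number begins in "Day N".
  logger.info("processing day: #{heading[4, 3]}")
end

puts log_output.string
```

The last line logged before a failure tells you which day the tool was working on, which is usually enough to find the offending markup.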

Thanks logger!
