Saturday, March 15, 2008

Calling helpers in the rails console.

I was trying to debug some asset tag helper issues when I found this...

Did you know you can call helpers directly in the rails console? Just use the helper object...

>> helper.link_to "this", "that"
=> <a href="that">this</a>


You can also call custom helpers (from your app), but I haven't tried it:

>> helper :my_custom_helper

Wednesday, March 12, 2008

UTF-8 and hpricot

I needed to take the text tagged in an XML document and make URL strings out of it. I used hpricot to parse the XML, like this:

doc = Hpricot.XML(itinerary_day_description)

and then used xpath to find the text within the <cite> tags that will form the basis of the url I need:

activities = doc/("cite")
activities.each do |activity|
  title = activity.innerText
  link = "<a href=\"/redboxes/activity/#{title}\">"
  # etc - you get the idea...
end

I hit a problem when the text contained non-ASCII UTF-8 characters (ñ, é, etc).

Hpricot conveniently converted them to HTML entities, and then innerText turned each entity into a meaningless character.

Not only does hpricot perform the HTML entity encoding in the initial XML document, but it performs it again every time the XML document gets processed.

Here's what I had to do to make this work.
  1. Use innerHTML instead of innerText. It preserves the HTML entity encoding that innerText didn't.
  2. Use the awesome HTMLEntities module from Paul Battley. It converts the HTML entities in the title back to native UTF-8 characters.
  3. Use CGI.escape for URL encoding.
So the final code snippet looks like:

require 'cgi'
require 'hpricot'
require 'htmlentities'

doc = Hpricot.XML(itinerary_day_description)
coder = HTMLEntities.new
activities = doc/("cite")
activities.each do |activity|
  title = coder.decode(activity.innerHTML)
  link = "<a href=\"/redboxes/activity/#{CGI.escape(title)}\">"
  # etc
end
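If you want to try the decode-then-escape step on its own, here is a rough sketch that needs neither hpricot nor the HTMLEntities gem. It swaps in CGI.unescapeHTML from the standard library as a stand-in for the decoder, which is an assumption on my part: it only understands numeric entities and a handful of basic named ones, so it is not a full replacement. The activity_href method and the sample title are made up for illustration.

```ruby
require 'cgi'

# Hypothetical helper (not from the original code): takes entity-encoded
# text, such as what innerHTML hands back, and builds the activity URL path.
def activity_href(encoded_title)
  # Stand-in for HTMLEntities#decode. CGI.unescapeHTML handles numeric
  # entities like &#233; but not named ones like &eacute;.
  title = CGI.unescapeHTML(encoded_title)

  # CGI.escape percent-encodes the UTF-8 bytes and turns spaces into "+".
  "/redboxes/activity/#{CGI.escape(title)}"
end

puts activity_href("Caf&#233; del Ni&#241;o")
# prints /redboxes/activity/Caf%C3%A9+del+Ni%C3%B1o
```

Doing the steps in this order matters: escaping the entity text directly would percent-encode the "&", "#" and ";" characters instead of the letter they stand for.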