Thursday, January 3, 2008

PDF/Writer

Decided to use PDF/Writer to generate a chunky report. Before I say how much I enjoyed using it, let me list its shortcomings:
  • No UTF-8 support yet. So those European characters are garbled.
  • No built-in support for text wrapping images or widow/orphan control.
  • No explicit support for headers, footers, TOC, and other common document elements.
  • No support for HTML tags, CSS-like layout, or any layout, really.
  • It's slow.
But it's Ruby, so there is some nice use of blocks and other rubyisms that made it the best choice for me. I thought I'd include a couple of chunks of code to illustrate how you can use PDF/Writer. I'm sure there are better ways, but I couldn't find any code examples outside the demos on the site. So, hoping this helps someone...

Here's my main page footer: three lines of text below a horizontal rule, including the date, page numbering and copyright notices:

pdf.start_page_numbering pdf.absolute_right_margin, 44, 9, :right, "Page <PAGENUM> of <TOTALPAGENUM>."
pdf.open_object do |footer|
  pdf.save_state
  center = pdf.margin_x_middle

  # Measure each string so it can be centered around the page midpoint.
  t1 = "#{tour_code.name} (#{tour_code.identifier})"
  w1 = pdf.text_width(t1, 10) / 2.0
  t2 = "Printed: #{Time.now.to_date}"
  w2 = pdf.text_width(t2, 9)
  t3 = "Copyright for brochure text and overview map belongs to #{tour_code.tour_operator.name}."
  w3 = pdf.text_width(t3, 9) / 2.0
  t4 = "Copyright for photos belongs to the respective owners. All else Copyright by MyTripScrapbook.com."
  w4 = pdf.text_width(t4, 9) / 2.0

  # Arguments: x, y, width, text, font size.
  pdf.add_text_wrap(center - w1, 44, w1 * 2, t1, 10)
  pdf.add_text_wrap(pdf.absolute_left_margin, 44, w2, t2, 9)
  pdf.add_text_wrap(center - w3, 32, w3 * 3, t3, 9, :left)
  pdf.add_text_wrap(center - w4, 22, w4 * 2, t4, 9)

  # Horizontal rule above the footer text.
  pdf.line(pdf.absolute_left_margin, 60, pdf.absolute_right_margin, 60).stroke
  pdf.restore_state
  pdf.close_object
  pdf.add_object(footer, :all_pages)
end


The first line sets up the page-numbering string pattern. The block that follows creates an arbitrary chunk of repeating text, in this case a footer. pdf is the PDF/Writer object that had been created earlier. Nothing remarkable here, except that some dimensions come from method calls (pdf.absolute_left_margin) and others are absolute coordinates.
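
The centering arithmetic used for the footer lines can be pulled out into a small helper. This is only a sketch in plain Ruby: centered_box is a hypothetical name, not part of PDF/Writer, and it assumes you have already measured the string (for example with pdf.text_width) and know the horizontal midpoint of the page.

```ruby
# Hypothetical helper (not part of PDF/Writer): given the horizontal midpoint
# of the page and the measured width of a string, return the left x coordinate
# and the box width that would center the string around that midpoint.
def centered_box(center_x, text_width)
  half = text_width / 2.0
  [center_x - half, half * 2]
end

# Example values are made up: a midpoint of 306 and a 120-unit-wide string.
x, width = centered_box(306.0, 120.0)
puts "x=#{x} width=#{width}"
```

The two return values line up with the x and width arguments passed to add_text_wrap in the footer.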

PDF/Writer allows you to place text or images in two ways: automatically at the current insertion point, allowing line wrap to happen, or at a specified point on the page. To create a text-wrap effect around an image (in this case a map), I temporarily expanded the left margin by the width of the map, inserted the text at the current insertion point so that it wrapped in the narrower column, and inserted the image into the empty wide left margin. Then I checked which of the image and the text column was taller, and advanced the insertion point to there plus a bit of padding.

marker = pdf.y
overview = tour_code.tour_code_overview
# Since we're using the thumb (200px wide) version of the map, we need to adjust the dimensions from the full size.
overview.height = overview.height*200/overview.width
overview.width = 200
pdf.left_margin += overview.width + 40
description = remove_pesky_tags(overview.description)
pdf.text description, :justification=>:full, :font_size=>10
text_height = (marker - pdf.y)
pdf.add_image_from_file RAILS_ROOT + overview.public_filename(:t), 72, marker - overview.height - 10
if overview.height > text_height
  # The image is taller than the text column, so move below the image.
  pdf.move_pointer(overview.height - text_height + 20)
end
pdf.left_margin -= overview.width + 40
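
The arithmetic behind this layout can be checked on its own, without a PDF object. The sketch below is plain Ruby with hypothetical helper names (scale_to_width, image_bottom_y, extra_advance); none of them exist in PDF/Writer. It assumes y is measured upward from the bottom of the page and that an image is positioned by its bottom left corner.

```ruby
# Hypothetical helpers (not part of PDF/Writer) mirroring the layout arithmetic.

# Scale an image to a target width, keeping its aspect ratio.
def scale_to_width(width, height, target_width)
  [target_width, height * target_width / width.to_f]
end

# y coordinate for the image's bottom left corner so that its top edge sits
# `gap` units below the marker (the y position where the text started).
def image_bottom_y(marker, image_height, gap = 10)
  marker - image_height - gap
end

# Extra distance to move the insertion point down when the image is taller
# than the text column; zero when the text is already the taller of the two.
def extra_advance(image_height, text_height, padding = 20)
  image_height > text_height ? image_height - text_height + padding : 0
end

# Made-up numbers: an 800x600 map scaled to a 200-wide thumbnail.
width, height = scale_to_width(800, 600, 200)
puts image_bottom_y(700, height)
puts extra_advance(height, 100)
```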


Prior to starting a new heading, I added a primitive widow control. Fortunately I already knew the height of the text block (90 units):

# Not enough room left above the bottom margin for the 90-unit block.
if pdf.y - 90 < pdf.absolute_bottom_margin
  pdf.start_new_page(true)
end
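
The same check can be written as a standalone predicate, which makes it easy to try out with numbers. This is a sketch in plain Ruby; needs_new_page? is a hypothetical name, and it assumes y is measured upward from the bottom of the page, so the block fits only if the current y minus the block height stays at or above the bottom margin.

```ruby
# Hypothetical helper (not part of PDF/Writer): true when a block of the given
# height, starting at current_y, would run below the bottom margin.
def needs_new_page?(current_y, block_height, bottom_margin)
  current_y - block_height < bottom_margin
end

# Made-up numbers: a 90-unit block with a 36-unit bottom margin.
puts needs_new_page?(100, 90, 36)
puts needs_new_page?(400, 90, 36)
```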


PDF/Writer also includes a handy way to measure the height of a text block without actually printing it. I'll use that for cases when I don't know the height in advance.

For a final example, it was useful to display two images side by side without having to explicitly measure each one and use coordinate positioning. So I used two columns. Here I print a text block, start 2 column mode, print the first image, start new column and print the second, then go back to single column mode for the final text block.

pdf.text activity.title + blank_line, :justification=>:center, :font_size=>12
pdf.start_columns 2, 2 # Start 2 columns.
pdf.image RAILS_ROOT+activity.public_filename(1, 300), :pad=>0, :justification=>:center
pdf.start_new_page # Starts next column.
pdf.image RAILS_ROOT+activity.public_filename(2, 300), :pad=>0, :justification=>:center if activity.image2_width
pdf.stop_columns # Back to single column.
pdf.text remove_pesky_tags(activity.description) + blank_line, :justification=>:full, :font_size=>10


You'll notice the remove_pesky_tags method which helps deal with (only some of) the issues of HTML tags and HTML entities.

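def self.remove_pesky_tags(text)
  # A paragraph boundary becomes a blank line; remaining <p> tags are dropped.
  new_text = text.gsub(/<\/p>\s*<p>/, "\n\n")
  new_text.gsub!(/<p>/, "")
  new_text.gsub!(/<\/p>/, "")
  # Entities and inline tags that PDF/Writer would otherwise print literally.
  new_text.gsub!(/&quot;/, "'")
  new_text.gsub!(/<cite>/, "")
  new_text.gsub!(/<\/cite>/, "\n")
  new_text.gsub!(/&nbsp;/, "")
  new_text.gsub!(/&aacute;/, "a")
  new_text.gsub!(/&eacute;/, "e")
  new_text.gsub!(/<br\s*\/?>/, "")
  new_text.gsub!(/<sup>/, "")
  new_text.gsub!(/<\/sup>/, "")
  new_text
end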
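
As the list of substitutions grows, a table-driven variant may be easier to maintain. The following is a minimal sketch in plain Ruby, not the code used for the report: strip_markup and SUBSTITUTIONS are hypothetical names, and the particular tags, entities and replacements in the table are illustrative assumptions.

```ruby
# Hypothetical substitution table: each entry is [pattern, replacement].
# Order matters: paragraph boundaries are handled before lone <p> tags.
SUBSTITUTIONS = [
  [%r{</p>\s*<p>}, "\n\n"],  # paragraph boundary becomes a blank line
  [%r{</?p>},      ""],      # any remaining <p> or </p>
  [%r{<br\s*/?>},  "\n"],    # line break tag
  [/&quot;/,       "'"],
  [/&nbsp;/,       " "],
  [/&aacute;/,     "a"],     # crude transliteration, since UTF-8 is unavailable
  [/&eacute;/,     "e"],
  [%r{</?sup>},    ""]
].freeze

# Apply every substitution in order and return the cleaned string.
def strip_markup(text)
  SUBSTITUTIONS.inject(text) do |result, (pattern, replacement)|
    result.gsub(pattern, replacement)
  end
end

puts strip_markup("<p>Caf&eacute; &quot;Sol&quot;</p>\n<p>Next<sup>1</sup></p>")
```

Using gsub rather than gsub! here avoids mutating the caller's string.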

There was only one gotcha for me: images are placed using coordinates relative to their bottom left corner, not their top left corner as you might expect. Also, text_width returned inconsistent results for me.

With lots of reporting to be done, this was only the first day's taste of using PDF/Writer. I'm sure there'll be lots more helper methods to write and subtleties to understand. High on the list is figuring out a way to use the QuickRef class and the techbook application. QuickRef provides a DSL for working with brochures. Techbook implements a markup language suitable for manual-like documents that goes a long way towards widow/orphan control, text tags, and other useful features.

Many thanks to Austin Ziegler, the original author, and the current band of workers who are working to update this terrific library.

2 comments:

bryanl said...

The Ruports guys have taken pdfwriter over and hopefully they can make it better. Maybe you can help them out?

Jim. said...

Thanks Bryan! Why didn't I even think of Ruports in my initial search? I'll go check them out. Thanks again!