xml, html output

My HTML output class is now in what I expect to be a reasonably stable state. It’s not by any means a perfect solution for the world’s HTML needs, but it can generate the output that I want without much excess typing, which is all that matters.

Actually, it divided into two classes this morning. First, XmlOutput:


  class XmlOutput
    def initialize(io)
      @io = io
      @indentation = 0
      @elements = []
    end

    def element(*element_and_attributes)
      if (block_given?)
        open_element(element_and_attributes)
        yield(self)
        close_element
      else
        write_indented_element(element_and_attributes)
      end
    end

    def inline_element(*element_and_attributes)
      "<#{element_and_attributes.join(" ")}>" +
        yield +
        "</#{element_and_attributes[0]}>"
    end

    def line
      if (block_given?)
        indent
        @io.write(yield)
      end

      @io.write("\n")
    end

    # FIXME (2007-07-21, carlton): Can I use define_method to
    # construct a method taking a block?
    def self.define_element(element, *attributes)
      module_eval element_def("element", element, attributes)
    end

    def self.define_inline_element(element, *attributes)
      module_eval element_def("inline_element", element, attributes)
    end

    def self.element_def(method, element, attributes)
      %Q{def #{element}(#{attr_args(attributes)} &block)
           #{method}("#{element}", #{attr_vals(attributes)} &block)
         end}
    end

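    # Array#to_s stopped joining elements in Ruby 1.9, so join explicitly
    # before the result is interpolated into the generated source.
    def self.attr_args(attributes)
      attributes.map { |attribute| attribute.to_s + "_arg, " }.join
    end

    def self.attr_vals(attributes)
      attributes.map do |attribute|
        '"' + attribute.to_s + '=\\"#{' + attribute.to_s + '_arg}\\"", '
      end.join
    end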

    def write_indented_element(element_and_attributes)
      line { "<#{element_and_attributes.join(" ")} />" }
    end

    def open_element(element_and_attributes)
      line { "<#{element_and_attributes.join(" ")}>" }
      @indentation += 2
      @elements.push(element_and_attributes[0])
    end

    def close_element
      element = @elements.pop
      @indentation -= 2
      line { "</#{element}>" }
    end

    def indent
      @io.write(" " * @indentation)
    end
  end
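To see what define_element and define_inline_element actually hand to module_eval, here is a standalone sketch that rebuilds the same source string and prints it instead of evaluating it. The module name ElementSource is made up for this example, and the string is assembled line by line rather than with %Q purely to keep the printed indentation tidy.

```ruby
# Standalone sketch: rebuilds the source string that element_def produces,
# without the module_eval call, so the generated method can be inspected.
module ElementSource # hypothetical name, not part of the classes above
  def self.element_def(method, element, attributes)
    [
      "def #{element}(#{attr_args(attributes)}&block)",
      "  #{method}(\"#{element}\", #{attr_vals(attributes)}&block)",
      "end"
    ].join("\n")
  end

  # One positional parameter per attribute, e.g. "href_arg, ".
  def self.attr_args(attributes)
    attributes.map { |attribute| "#{attribute}_arg, " }.join
  end

  # One name="value" string per attribute, with the value interpolated
  # when the generated method runs, not when the source is built.
  def self.attr_vals(attributes)
    attributes.map do |attribute|
      '"' + attribute.to_s + '=\\"#{' + attribute.to_s + '_arg}\\"", '
    end.join
  end
end

puts ElementSource.element_def("inline_element", :a, [:href])
# Prints:
#   def a(href_arg, &block)
#     inline_element("a", "href=\"#{href_arg}\"", &block)
#   end
```

With an empty attribute list, both helpers return an empty string, so the generated method takes only the block.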

I’ve given up on the whole public/protected/private distinction, for now: I don’t see much point in it for programming that I’m doing by myself. But I suppose it does have uses when explaining code to others: if you were to use the class directly, then you’d use element, inline_element, and line. The first, element, is for an XML element that you deem important enough to put the opening and closing tags on their own lines (perhaps head and body for HTML); inline_element is for XML elements that you want to stick in the middle of lines (perhaps cite and a for HTML). And line is for text that you’re inserting, either passed as a string or generated via inline_element. They all take blocks, to fill in the middle of either the elements or the lines; two of them (element and line) do something useful if not given a block, and the third could easily enough if I need that functionality. Oh, and the element functions have a crappy way of specifying attributes: each one is passed as a preformatted name="value" string.
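For a feel of how those three methods combine when used directly, here is a small runnable sketch. MiniXmlOutput is a hypothetical, condensed copy of the class above (the private helpers are folded into element and line) so that the example stands on its own; the div, cite, and br tags are just sample input.

```ruby
require "stringio"

# Condensed copy of XmlOutput so the example runs by itself.
class MiniXmlOutput # hypothetical name for this sketch
  def initialize(io)
    @io = io
    @indentation = 0
  end

  # With a block: open tag, indented body, close tag.
  # Without a block: a single self-closing tag.
  def element(*element_and_attributes)
    tag = element_and_attributes.join(" ")
    return line { "<#{tag} />" } unless block_given?

    line { "<#{tag}>" }
    @indentation += 2
    yield(self)
    @indentation -= 2
    line { "</#{element_and_attributes[0]}>" }
  end

  # Returns a string instead of writing, so it can sit inside a line.
  def inline_element(*element_and_attributes)
    "<#{element_and_attributes.join(" ")}>" + yield +
      "</#{element_and_attributes[0]}>"
  end

  # Writes one indented line, or a bare newline when given no block.
  def line
    @io.write(" " * @indentation + yield) if block_given?
    @io.write("\n")
  end
end

io = StringIO.new
o = MiniXmlOutput.new(io)
o.element("div", 'id="main"') do
  o.line { "See " + o.inline_element("cite") { "the Pickaxe book" } + "." }
  o.element("br")
end
puts io.string
# Prints:
#   <div id="main">
#     See <cite>the Pickaxe book</cite>.
#     <br />
#   </div>
```

Writing to a StringIO rather than $stdout makes the result easy to compare against an expected string in a test.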

That works well enough, but it still requires more typing (in my case, manifesting itself as lines longer than 80 columns) than would be ideal, which is where the class functions define_element and define_inline_element come in. Here’s HtmlOutput:


  class HtmlOutput < XmlOutput
    define_inline_element :a, :href

    define_inline_element :span, :class

    define_inline_element :li
    alias_method :inline_li, :li

    define_inline_element :title

    define_inline_element :h1
    define_inline_element :h2

    define_element :head
    define_element :body

    define_element :div, :id

    define_element :ul, :class
    alias_method :ul_class, :ul
    define_element :ul

    define_element :li

    define_element :link, :rel, :type, :href

    def html(&block)
      line { "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"" }
      line { "  \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">" }
      element("html", "xmlns=\"http://www.w3.org/1999/xhtml\"",
              "xml:lang=\"en\"", "lang=\"en\"", &block)
    end
  end

This lets me create methods corresponding to the elements that I care about. If those elements take attributes (as in <a href=...>), I pass the attribute names as extra arguments (define_inline_element :a, :href), and the generated methods take arguments that are the values for the attributes. So, if I want to generate the following:


  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
      <title>The Title</title>
      <link rel="stylesheet" type="text/css" href="styles.css" />
    </head>

    <body>
      <h1>Main Header</h1>
      <ul>
        <li><a href="http://site/page/">link text</a></li>
      </ul>
    </body>
  </html>

I can write this:


  o = HtmlOutput.new($stdout)  # assumption: any object with #write works here
  o.html do
    o.head do
      o.line { o.title { "The Title" } }
      o.link("stylesheet", "text/css", "styles.css")
    end

    o.line

    o.body do
      o.line { o.h1 { "Main Header" } }
      o.ul do
        o.line do
          o.inline_li do
            o.a("http://site/page/") { "link text" }
          end
        end
      end
    end
  end

Admittedly, this isn’t the eighth wonder of the world or anything, but I do think the interface will work pretty well for the specific uses that I have in mind. Or maybe not – I read the relevant chapter in the Pickaxe book this morning; they describe a library with an interface basically identical to what I ended up with, but then comment that people almost never use it, typically preferring to use some sort of HTML template with embedded Ruby instead. And maybe I’ll switch to a solution like that as I get more used to the area.

However that turns out, there are two bits that I want to talk about. One is what I discussed in my previous post, that it was a lot of fun starting with a complex bit of output and refactoring my way into a class that generated it. I won’t yet propose that as the way to go in all situations, and I’m not even sure it actively helped me here: if I’d started out wanting to build up a solution from scratch instead of decompose one out of a monolithic print statement, I don’t see any reason to believe it would have turned out differently or gone any slower. But it was a very pleasant way to develop code, I’m confident it didn’t slow me down at all, and I only spent about 10 minutes of development time wondering what was the best thing to do next. If nothing else, it will give me further motivation to write my acceptance tests early: currently, I have them in mind from the start of a task, but I don’t usually actually write them until the code that they’re testing is finished. That delay isn’t usually for any good reason, it’s simply because I don’t yet like writing acceptance tests as much as I like doing other things, but if I can start to see real effects out of writing the acceptance tests earlier, I’d probably switch to doing so. (It would help if I started using Fit, too; for now, though, I’m not convinced I’m working in areas where that is an obvious win.)

The second bit I want to emphasize is that I love the way the definition of HtmlOutput looks. This is the second time in this project that I’ve done something like that: there’s a base class that implements class functions designed to let you provide functionality in a subclass without writing explicit method definitions in that subclass! Much more fun than sticking in protected hooks here and there, and when it works the subclass definitions are dramatically shorter (and freer of boilerplate repetition) than they would be if I were, say, programming in Java. As the FIXME comment shows, I’m not entirely comfortable with the implementation in this particular case, and now that I think about it, I’m not entirely comfortable with my implementation in the other case either, but the fact that I can do it at all pleases me greatly.
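On the question in the FIXME comment: on Ruby 1.9 and later, the block handed to define_method can declare its own &block parameter, so the string-based module_eval could in principle be replaced. What follows is only a sketch of that idea, not the implementation above; the class name InlineTags is invented, and it covers just the inline case.

```ruby
# Sketch of a define_method-based variant. InlineTags is a hypothetical
# class written for this example only.
class InlineTags
  def inline_element(*element_and_attributes)
    "<#{element_and_attributes.join(" ")}>" + yield +
      "</#{element_and_attributes[0]}>"
  end

  def self.define_inline_element(element, *attributes)
    # The parameter list ends in &block, so the generated method can
    # forward the caller's block on to inline_element.
    define_method(element) do |*values, &block|
      unless values.size == attributes.size
        raise ArgumentError, "#{element} expects #{attributes.size} value(s)"
      end

      attrs = attributes.zip(values).map { |name, value| %Q{#{name}="#{value}"} }
      inline_element(element.to_s, *attrs, &block)
    end
  end

  define_inline_element :a, :href
  define_inline_element :li
end

o = InlineTags.new
puts o.li { o.a("http://site/page/") { "link text" } }
# Prints: <li><a href="http://site/page/">link text</a></li>
```

One trade-off: the generated methods take a splat, so the argument count has to be checked by hand, whereas the module_eval version gets a fixed arity for free.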

So: I can generate one particular piece of HTML. Now I just have to have that HTML vary based on the contents of a database. Shouldn’t be too hard; I hope I’ll find a few more ways in which the implementation improves upon its Java counterpart.

Published 7/21/2007 & Filed in Programming

malvasia bianca
