Old Emmanuel Oga's Weblog (new one is at www.emmanueloga.com)

Pretty printing xhtml with nokogiri and xslt

Posted in ruby, xhtml, xml by emmanueloga on September 29, 2009

[UPDATE]

Check this gist for a command-line version of the XML indenter in this post.

Today I was looking for a way to pretty print XHTML. Good ol’ REXML supports this in a very simple way:

require "rexml/document"
doc = REXML::Document.new("<some>XML</some>")
doc.write($stdout, 4) # second argument: number of spaces per indent level

This generates a nicely indented XML document. But REXML was not robust enough for my needs. Luckily, we now have a couple of excellent choices in Ruby for parsing XML, including hpricot, nokogiri and the libxml-ruby bindings.
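As a minimal sketch of the REXML approach, the indented output can also be collected into a string instead of being written to $stdout. The helper name pretty_xml is made up for this example, and the exact line breaks REXML produces around text nodes may differ from what you expect:

```ruby
require "rexml/document"

# Hypothetical helper: returns the document re-serialized with the given indent.
def pretty_xml(source, indent = 4)
  doc = REXML::Document.new(source)
  out = String.new          # mutable buffer; REXML appends to it with <<
  doc.write(out, indent)    # same call as above, with a String as the output
  out
end

puts pretty_xml("<a><b>text</b><c/></a>")
```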

I did not find a way to pretty print XHTML with any of these libraries that is as easy as REXML’s, though. But I did find a way to do it using XSLT. Nokogiri supports applying XSLT to an XML document (the libxml bindings probably do too; hpricot does not). Here is how:

    require "nokogiri"

    xsl  = Nokogiri::XSLT(File.read("pretty_print.xsl"))
    html = Nokogiri(File.read("source.html"))
    File.open("output.html", "w") { |f| f << xsl.apply_to(html).to_s }

That’s it, simple enough. I got the idea from this DZone snippet.

For the xslt file I used this nice one I found on this site: http://www.printk.net/~bds/indent.html

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="ISO-8859-1"/>
  <xsl:param name="indent-increment" select="'   '"/>

  <xsl:template name="newline">
    <xsl:text disable-output-escaping="yes">
</xsl:text>
  </xsl:template>

  <xsl:template match="comment() | processing-instruction()">
    <xsl:param name="indent" select="''"/>
    <xsl:call-template name="newline"/>
    <xsl:value-of select="$indent"/>
    <xsl:copy />
  </xsl:template>

  <xsl:template match="text()">
    <xsl:param name="indent" select="''"/>
    <xsl:call-template name="newline"/>
    <xsl:value-of select="$indent"/>
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>

  <xsl:template match="text()[normalize-space(.)='']"/>

  <xsl:template match="*">
    <xsl:param name="indent" select="''"/>
    <xsl:call-template name="newline"/>
    <xsl:value-of select="$indent"/>
    <xsl:choose>
      <xsl:when test="count(child::*) > 0">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:apply-templates select="*|text()">
            <xsl:with-param name="indent" select="concat ($indent, $indent-increment)"/>
          </xsl:apply-templates>
          <xsl:call-template name="newline"/>
          <xsl:value-of select="$indent"/>
        </xsl:copy>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

Locos X Rails conference tomorrow

Posted in Uncategorized by emmanueloga on April 2, 2009

The Locos X Rails conference starts tomorrow in Buenos Aires, Argentina. I’m very excited about it! Yesterday I met Desi McAdam, of DevChix fame, and also the keynote speaker Obie Fernandez, founder of HashRocket. I’m eager to hear their talks, and hope to see you at the conf. too!

Desi and Obie at the press meeting for the LocosXRails conf.


LocosXRails Conference: almost here!

Posted in Uncategorized by emmanueloga on March 10, 2009

We are a little over 20 days away from the start of the first Ruby and Rails conference in Argentina. Haven’t signed up yet? You can do it on the web: http://www.locosxrails.com/registration

If you have already signed up, spread the word with one of the following badges:

Cheers!


Locos X Rails Conference: Registration Opened!

Posted in Uncategorized by emmanueloga on March 4, 2009

Locos X Rails

You can now register for the conf. A ticket costs 300 Argentine pesos. Of course, before signing up you will want to know who is giving the talks:

http://eventioz.com/events/locos-x-rails-conference/speakers

See you there!


Locos por Rails Conference in Buenos Aires, Argentina

Posted in people, rails, ruby by emmanueloga on January 23, 2009

Badge Locos X Rails Conference

At last, a Ruby and Rails conference in Argentina! Locos Por Rails Conference 2009 will be held on April 3rd and 4th at the Universidad de Palermo in Buenos Aires (C.A.B.A.), Argentina. You are all cordially invited. Registration is not open yet, but the call for papers already is. Send yours!


Handling URIs in ruby a walk in the park? Don’t think so….

Posted in rails, ruby, Uncategorized by emmanueloga on October 17, 2008

I got sick and tired of juggling String URLs and URI objects just to get a correct URI for HTTP GETting, POSTing, etc…

For example, if you

Net::HTTP.post_form URI("www.somewhere.com")

you get an error…. but if you

Net::HTTP.post_form URI("www.somewhere.com/")

(notice the trailing slash) you don’t… Boring stuff. The URI lib does provide a normalize method, but it does not always add the trailing “/”. For consistency I wanted normalize to always add the trailing slash. Update: it seems my “consistency” policy is not correct… Oh, well… at least Google does not like it. 🙂 I’m updating my Gist to remove that behavior. Of course! Adding a trailing slash means you are asking for a directory and not for a resource, so you can’t go berserk adding trailing slashes to all your URLs :p

But the most boring stuff is joining URIs and adding query params to them… This should be simple, right?

uri = URI("something.com/?some=params")
uri.query = "other=params" # WRONG, previous params are overwritten
uri = URI("something.com/")
uri.query << "other=params" # WRONG, previous query is nil
uri = URI("something.com/?some=params")
uri.query << "other=params" # WRONG, params should be joined with & char...

We need to juggle with the URI object (http://github.com/jnunemaker/httparty/tree/master/lib/httparty.rb#L113) to get the job done. More boring stuff. I wrote two simple methods to handle these problems (http://gist.github.com/17342). Now I won't have to manually tweak those URLs again... nevermore! (I hope :-).


  describe NormalizeURI do
    it "should add scheme and final / to an uri" do
      NormalizeURI("www.yahoo.com?something=true").to_s.should == "http://www.yahoo.com/?something=true"
    end
  end

  describe JoinURI do
    it "should join a string, an uri and additional query params" do
      one = URI("www.yahoo.com?uno=dos")
      two = URI("/peteco/carabal?tres=4&cinco=seis")
      result = "http://www.yahoo.com/peteco/carabal/?uno=dos&tres=4&cinco=seis"
      JoinURI(one.to_s, two, :more => :params).to_s.should == "#{ result }&more=params"
    end
  end
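
For comparison, the query-merging part can be sketched with nothing but the standard library. This is not the code from the gist: add_query_params is a hypothetical helper, and it assumes the URL already carries a scheme (it does no normalizing).

```ruby
require "uri"

# Hypothetical helper: keeps the existing query and appends the new params.
def add_query_params(url, params)
  uri      = URI.parse(url)
  existing = URI.decode_www_form(uri.query || "")            # [] when there is no query
  merged   = existing + params.map { |k, v| [k.to_s, v.to_s] }
  uri.query = merged.empty? ? nil : URI.encode_www_form(merged) # pairs are joined with "&"
  uri.to_s
end

puts add_query_params("http://something.com/?some=params", other: "params")
puts add_query_params("http://something.com/", other: "params")
```

Decoding the old query into pairs first avoids all three problems shown earlier: nothing is overwritten, a nil query is handled, and the "&" separators come from encode_www_form.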

This is such small stuff I don’t think a gem for this would be cool…. maybe later (no gemspec fighting yet hehe). And yes, because I’m lazy I’m using active_support this time…. I’m using this inside rails anyways.


RubyArg meeting (Marcel Molina Jr. visited us, what a treat!)

Posted in people, rails, rspec by emmanueloga on August 18, 2008

Last Friday Pedro Visintin organized a meetup of Rails developers, with a deluxe guest: Marcel Molina Jr.! It was all great; thanks, Pedro, for organizing the gathering. Not only did we talk about very interesting topics, I also learned a couple of photography lessons:

Diego is leaving

The same photo again

We don't all fit

Now we all fit


Ubuntu gutsy problem number two: ruby script/console fails to start (readline)

Posted in rails, ubuntu by emmanueloga on November 29, 2007

Just for the record, problem number one was setting up mongrel_cluster in init.d.

Soooo. Now “ruby script/console” fails with a “require ‘readline’ failure” or something like it. The problem was that, for some reason, when I built the Ruby interpreter (1.8.6) the readline extension was not built and installed, even though it ships with the standard Ruby distribution. So I did the following:

  • “locate readline” (you will probably need a “sudo updatedb” first, or better, just look in the source code of your Ruby interpreter, wherever you keep it. You know where that is, right? 🙂)
  • Go to the ext/readline subdirectory of your Ruby source tree. In my case: “cd ~/packages/ruby-1.8.6-p110/ext/readline”
  • “ruby extconf.rb” (this one generates the makefile for the library)
  • “make”
  • “sudo make install”

In Ubuntu you’ll probably need libreadline5 and libreadline5-dev _before_ running make:

sudo apt-get install libreadline5 libreadline5-dev

Now “ruby script/console” should work.
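
A quick way to check the result without starting the console is to try loading the extension directly. This is only a small sketch; readline_available? is a name made up for the example.

```ruby
# Hypothetical check: true when `require "readline"` succeeds, false on LoadError.
def readline_available?
  require "readline"
  true
rescue LoadError
  false
end

puts(readline_available? ? "readline loads fine" : "readline is missing: rebuild ext/readline")
```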


alias versus alias_method

Posted in ruby by emmanueloga on November 1, 2007

alias and alias_method do the same thing: they copy a method and give the copy a different name:

class Test
  def test
    puts "hola"
  end
  alias copia_test test
  alias_method :copia_test2, :test
end

Test.new.test        # > "hola"
Test.new.copia_test  # > "hola"
Test.new.copia_test2 # > "hola"

I got curious about the difference between alias and alias_method. It turns out they are the same thing, except that:

  1. alias is a reserved word in Ruby
  2. alias takes method identifiers as parameters, with no need for symbols or strings (just as in def method_name, where method_name is an identifier and not a string or symbol). This behavior can be quite confusing at first.
  3. alias_method is an instance method of Module, so it is called at class (or module) level
  4. alias_method takes its parameters separated by a comma (strings or symbols), like any other method.

The consequences are simple:

  • alias_method can be redefined, and alias cannot.
  • alias can be misused (outside the context of method definitions)

Being a method of Module, alias_method helps you use it correctly: in the context where the methods of a class or module are defined.

Conclusion: in most cases, alias_method is what I need.
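
A small sketch of those consequences (the class Tracked and its aliases list are made up for this example): since alias_method is an ordinary method, a class can override it and can call it with names computed at runtime, while the alias keyword bypasses the override entirely.

```ruby
class Tracked
  # Override alias_method for this class only, recording every alias created.
  def self.alias_method(new_name, old_name)
    (@aliases ||= []) << new_name.to_sym
    super # delegate to Module#alias_method
  end

  def self.aliases
    @aliases || []
  end

  def test
    "hola"
  end

  alias_method :copia_test, :test
  # Names can be built dynamically because this is a plain method call.
  %w[uno dos].each { |n| alias_method "test_#{n}", :test }

  # The keyword takes bare identifiers and does not go through the override.
  alias copia_keyword test
end

p Tracked.aliases
p Tracked.new.copia_keyword
```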


Ohhh!!! So beautiful!!!

Posted in blogging by emmanueloga on October 11, 2007

Ah! Behold and marvel:

def test
  puts "hola"
end

Syntax highlighting!!!! Isn't it beautiful???? Conclusion: WordPress gives Blogger a thrashing and sends it off to bed!!!