Thursday, September 30, 2010

Nokogiri HTML and XML parser

Nokogiri is a new Ruby gem that provides HTML, XML, SAX and Reader parsers.

It parses and searches XML/HTML faster than Hpricot (the current de facto Ruby HTML parser). It also boasts XPath support, CSS3 selector support (a big deal, because CSS3 selectors are mega powerful) and the ability to be used as a "drop-in" replacement for Hpricot.

In an Hpricot vs. Nokogiri benchmark, Nokogiri was 7 times faster at initially loading an XML document, 5 times faster at searching for content with an XPath expression, and 1.62 times faster at searching for content with a CSS selector.

Here is an example:

require 'nokogiri'

require 'open-uri'

# Get a Nokogiri::HTML::Document for the page we're interested in...

doc = Nokogiri::HTML(open(''))

# Do funky things with it using Nokogiri::XML::Node methods...

# Search for nodes by css

doc.css('h3.r a.l').each do |link|

  puts link.content

end

# Search for nodes by xpath

doc.xpath('//h3/a[@class="l"]').each do |link|

  puts link.content

end

# Or mix and match.

doc.search('h3.r a.l', '//h3/a[@class="l"]').each do |link|

  puts link.content

end


Source :

Saturday, September 25, 2010

My libraries in the lib folder are not loading in Rails 3

My libraries in the lib folder are not loading in Rails 3!

Is it a bug? No.

Rails 3 does not automatically load files in the lib folder. If you create a module or class in lib and try to use it, the server will throw an "uninitialized constant Modulename (NameError)". You have to manually require the files that define the modules you want to use.

Possible reasons for this behaviour are:

  • It makes a Rails application behave more like any other Ruby project, leaving autoloading to the directories specific to Rails.
  • It improves performance.
  • Autoloading in lib was always causing some pain, since some people had files like "employee.rb" with no "Employee" module defined in them.
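The last point can be illustrated with plain Ruby's own autoload, which is used here only as an analogy and is not the mechanism Rails itself uses. In this sketch the temporary directory stands in for lib, and the file names employee.rb and report.rb are hypothetical: one file defines the constant its name implies, the other does not.

```ruby
require 'tmpdir'

# A temporary directory stands in for a lib folder (hypothetical files).
Dir.mktmpdir do |lib|
  # employee.rb defines the constant its file name implies.
  File.write(File.join(lib, 'employee.rb'), "module Employee; end\n")
  # report.rb is named like a constant but never defines Report.
  File.write(File.join(lib, 'report.rb'), "REPORT_LOADED = true\n")

  # Map each constant to its file; nothing is loaded yet.
  Object.autoload(:Employee, File.join(lib, 'employee.rb'))
  Object.autoload(:Report, File.join(lib, 'report.rb'))

  # The first reference triggers the load and finds the constant.
  puts Employee.inspect

  # The file is loaded, but the expected constant is still missing.
  begin
    Report
  rescue NameError => e
    puts "NameError: #{e.message}"
  end
end
```

A file whose name does not match a constant it defines gives the loader nothing to find, which is the kind of pain the bullet describes.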

There is a way to enable this feature again:

Put this in config/application.rb:

config.autoload_paths += %W(#{config.root}/lib)

config.autoload_paths += Dir["#{config.root}/lib/**/"]
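To see which paths each of those two lines contributes, here is a small plain-Ruby sketch that runs the same expressions against a throwaway directory tree. The helper name lib_directories and the billing/reports folders are invented for illustration; root plays the role of config.root.

```ruby
require 'tmpdir'
require 'fileutils'

# Returns the directories matched by the glob from the second config line.
# The trailing slash in the pattern restricts matches to directories.
def lib_directories(root)
  Dir["#{root}/lib/**/"]
end

Dir.mktmpdir do |root|
  # Hypothetical layout: lib/employee.rb and lib/billing/reports/
  FileUtils.mkdir_p(File.join(root, 'lib', 'billing', 'reports'))
  FileUtils.touch(File.join(root, 'lib', 'employee.rb'))

  top_level = %W(#{root}/lib)      # expression from the first config line
  nested = lib_directories(root)   # expression from the second config line

  puts top_level
  puts nested
end
```

The first expression is a single path, while the glob walks lib recursively, so nested folders such as lib/billing/reports are picked up and plain files such as employee.rb are not.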