Last Updated: July 19, 2018
· 123aswin123

Using The Nokogiri Gem To Parse Nested XML Data In Ruby

Nokogiri (http://www.nokogiri.org/) is one of the most powerful Ruby gems for parsing XML and HTML.

The site itself does not say much about parsing nested XML. On the other hand, the documentation (http://www.rubydoc.info/github/sparklemotion/nokogiri) is too much to read through just to parse a simple XML tree.

Consider the following XML tree (the same tree given on the Nokogiri website: http://www.nokogiri.org/tutorials/searching_a_xml_html_document.html):

<root>
  <sitcoms>
    <sitcom>
      <name>Married with Children</name>
      <characters>
        <character>Al Bundy</character>
        <character>Bud Bundy</character>
        <character>Marcy Darcy</character>
      </characters>
    </sitcom>
    <sitcom>
      <name>Perfect Strangers</name>
      <characters>
        <character>Larry Appleton</character>
        <character>Balki Bartokomous</character>
      </characters>
    </sitcom>
  </sitcoms>
  <dramas>
    <drama>
      <name>The A-Team</name>
      <characters>
        <character>John "Hannibal" Smith</character>
        <character>Templeton "Face" Peck</character>
        <character>"B.A." Baracus</character>
        <character>"Howling Mad" Murdock</character>
      </characters>
    </drama>
  </dramas>
</root>

Following the syntax given on the official docs page:

require 'nokogiri'

doc = Nokogiri::XML(File.open("shows.xml"))

# Visit every <character> element, wherever it sits in the document
doc.xpath('//character').each do |char_element|
  puts char_element.text
end

The output of the above code would be:

Al Bundy
Bud Bundy
Marcy Darcy
Larry Appleton
Balki Bartokomous
John "Hannibal" Smith
Templeton "Face" Peck
"B.A." Baracus
"Howling Mad" Murdock

But most probably you, like I did, want to display or access the sitcom name first, then the character names under it, and repeat that for the remaining sitcoms. This code does it for you:

require 'nokogiri'

xml_file = File.read("shows.xml")

doc = Nokogiri::XML.parse(xml_file)

# Outer loop: one pass per <sitcom> element
doc.xpath('//sitcom').each do |sitcom_element|
  puts "\nShow Name : " + sitcom_element.xpath('name').text
  # Inner loop: only the characters nested under this sitcom
  sitcom_element.xpath('characters/character').each_with_index do |character_element, index|
    puts "    #{index + 1}.Character : " + character_element.text
  end
end

The output of the above code is:

Show Name : Married with Children
    1.Character : Al Bundy
    2.Character : Bud Bundy
    3.Character : Marcy Darcy

Show Name : Perfect Strangers
    1.Character : Larry Appleton
    2.Character : Balki Bartokomous

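1) The key observation here is that

    doc.xpath('//tag_name')

just returns a flat list of all the elements with that tag name, wherever they are in the document. It is like a Ctrl+F operation.

2) When we use a relative path on an element we already have, such as

    sitcom_element.xpath('characters/character')

the path is followed from that element, so we are able to access the elements in a tree-like fashion.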
