Last Updated: October 12, 2018
· 123aswin123

Using The Nokogiri Gem To Parse Nested XML Data In Ruby

Nokogiri (http://www.nokogiri.org/) is one of the most powerful Ruby gems for parsing XML and HTML.

The site itself does not contain enough information on parsing XML. The API documentation (http://www.rubydoc.info/github/sparklemotion/nokogiri), on the other hand, is too much to read through just to parse a simple XML tree.

Consider the following XML tree (the same tree used on the Nokogiri website: http://www.nokogiri.org/tutorials/searching_a_xml_html_document.html):

[shows.xml]
<root>
  <sitcoms>
    <sitcom>
      <name>Married with Children</name>
      <characters>
        <character>Al Bundy</character>
        <character>Bud Bundy</character>
        <character>Marcy Darcy</character>
      </characters>
    </sitcom>
    <sitcom>
      <name>Perfect Strangers</name>
      <characters>
        <character>Larry Appleton</character>
        <character>Balki Bartokomous</character>
      </characters>
    </sitcom>
  </sitcoms>
  <dramas>
    <drama>
      <name>The A-Team</name>
      <characters>
        <character>John "Hannibal" Smith</character>
        <character>Templeton "Face" Peck</character>
        <character>"B.A." Baracus</character>
        <character>"Howling Mad" Murdock</character>
      </characters>
    </drama>
  </dramas>
</root>

Following the syntax given on the official docs page:

require 'nokogiri'

doc = Nokogiri::XML(File.open("shows.xml"))

# Print the text of every <character> element in the document.
doc.xpath('//character').each do |char_element|
  puts char_element.text
end

The output of the above code would be:

Al Bundy
Bud Bundy
Marcy Darcy
Larry Appleton
Balki Bartokomous
John "Hannibal" Smith
Templeton "Face" Peck
"B.A." Baracus
"Howling Mad" Murdock

But most probably you, like me, want to display or access each sitcom's name first, then its character names, and repeat that for the remaining sitcoms. This code does that:

require 'nokogiri'

xml_file = File.read("shows.xml")
doc = Nokogiri::XML.parse(xml_file)

# Visit each <sitcom>, then search relative to it for its own name and characters.
doc.xpath('//sitcom').each do |sitcom_element|
  puts "\nShow Name : " + sitcom_element.xpath('name').text

  # each_with_index counts from 0, so add 1 for the displayed number.
  sitcom_element.xpath('characters/character').each_with_index do |character_element, index|
    puts "    #{index + 1}.Character : " + character_element.text
  end
end

The output of the above code is:

Show Name : Married with Children
    1.Character : Al Bundy
    2.Character : Bud Bundy
    3.Character : Marcy Darcy

Show Name : Perfect Strangers
    1.Character : Larry Appleton
    2.Character : Balki Bartokomous

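1) The key observation here is that

element.xpath("//tag_name")

returns every tag_name element in the whole document, no matter which element it is called on. It works like a Ctrl+F search.

2) When we use

element.xpath("tag_name")

the search is relative to element and only matches its direct tag_name children, so a longer path such as "characters/character" lets us walk the tree level by level.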
