5/19/2016

xpath

XPath is very easy once you understand the fundamentals behind it. XPath stands for XML Path; it is used to navigate through elements and attributes in an XML document. As I said earlier, both HTML and XML follow the DOM structure, so we can also use XPath to navigate the elements and attributes of an HTML document.

<html>
 <body>
   <div class="1">
     <div class="1"></div>
     <div class="2">
        <div class="1">
           <a id="1">HI</a>
           <a id="2">
              <h1>HI</h1>
           </a>
        </div>
     </div>
     <div class="3"></div>
   </div>
 </body>
</html>

The above HTML code can be visualized as a DOM tree.

In the Document Object Model (DOM), the document is the parent object; then come html, body, and their children and siblings. The document is the outer wrapper of all elements, so if we want to navigate to the h1 element, we start from html inside the document.

If we want to reach body, we can write it as
navigate so far (xpath) : /html/body

Inside the body tag there is only one child, which is a div element, so we can write it as
navigate so far (xpath) : /html/body/div

Inside that div element there are three div elements, which we can count as 1, 2, 3; without an index, the expression returns all three div elements. We want to navigate to the second div element, so we write it as

navigate so far (xpath) : /html/body/div/div[2]

Inside div[2] there is only one child, which is a div element, so we can write it as
navigate so far (xpath) : /html/body/div/div[2]/div

Inside this div element we can see two anchor elements, which we can count as 1, 2; without an index, the expression returns both anchor elements. We want to navigate to the second anchor element, so we write it as


navigate so far (xpath) : /html/body/div/div[2]/div/a[2]

Inside a[2] there is only one child, the h1, which is our required element, so we can write it as

navigate so far (xpath) : /html/body/div/div[2]/div/a[2]/h1
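The same walk can be tried out in code. Below is a minimal sketch using Python's standard-library xml.etree.ElementTree, which supports only a subset of XPath. Two assumptions: the SAMPLE string is a well-formed copy of the page above with three child divs, as the text describes, and because ElementTree searches relative to the root element (here html), the leading /html of each path is written as a dot.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the sample page (assumption: three child divs, as described).
SAMPLE = """<html>
 <body>
   <div class="1">
     <div class="1"></div>
     <div class="2">
       <div class="1">
         <a id="1">HI</a>
         <a id="2">
           <h1>HI</h1>
         </a>
       </div>
     </div>
     <div class="3"></div>
   </div>
 </body>
</html>"""

root = ET.fromstring(SAMPLE)  # root is the <html> element, so "." stands for /html

# Each entry extends the previous one by a single step, like the walk-through.
steps = [
    "./body",
    "./body/div",
    "./body/div/div[2]",
    "./body/div/div[2]/div",
    "./body/div/div[2]/div/a[2]",
    "./body/div/div[2]/div/a[2]/h1",
]

for path in steps:
    node = root.find(path)  # first element matching the path, or None
    print(path, "->", node.tag, node.attrib)

target = root.find(steps[-1])
print("text of the required element:", target.text)
```

Every step narrows the search by one level, which is exactly how the absolute path was built by hand.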

That is how an XPath is built. There are two types of XPath:

  1. Absolute XPath
  2. Relative XPath

The XPath we built above is called an absolute path, because it contains the whole path from the root. The problem with an absolute path is that any change in the DOM can break the navigation. For example:

In the above XPath /html/body/div/div[2]/div/a[2]/h1, if div[2] becomes div[1] or is deleted, the navigation will no longer work. This is where relative XPath comes into play.
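To see how fragile the absolute path is, here is a small sketch, again with Python's xml.etree.ElementTree and a well-formed copy of the sample page. The extra div with class="0" is a made-up change, inserted only to shift the positions; the leading /html is written as a dot because ElementTree searches relative to the root element.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the sample page (assumption: three child divs, as described).
SAMPLE = """<html>
 <body>
   <div class="1">
     <div class="1"></div>
     <div class="2">
       <div class="1">
         <a id="1">HI</a>
         <a id="2"><h1>HI</h1></a>
       </div>
     </div>
     <div class="3"></div>
   </div>
 </body>
</html>"""

ABSOLUTE = "./body/div/div[2]/div/a[2]/h1"  # "." stands for /html here

root = ET.fromstring(SAMPLE)
before = root.find(ABSOLUTE)
print("before the change:", before.tag)

# Hypothetical DOM change: a new div is added in front of the three existing ones,
# so the div we wanted is now div[3] instead of div[2].
outer = root.find("./body/div")
outer.insert(0, ET.Element("div", {"class": "0"}))

after = root.find(ABSOLUTE)
print("after the change:", after)  # find() returns None when nothing matches
```

The path itself did not change, only the page did, and that is enough to lose the element.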

If we want to write the above absolute path as a relative path, we can write it as .//a[@id='2']/h1 or .//*[@id='2']/h1. Let us break these expressions into parts.

[@id='2'] means: take any element whose id is 2. In XPath, attributes are written with a leading @, so @id is the id attribute, here with the value 2. The whole test is enclosed in square brackets, which you may remember from JS syntax.

So if we write a[@id='2'], the expression returns the anchor tag <a> whose id attribute is 2.

If we write a[@id='2']/h1, it returns the h1 element that comes directly after a[@id='2'] in the path, that is, the h1 child of that anchor.

If we write //a[@id='2']/h1, the two forward slashes mean: select the matching nodes in the document no matter where they are.

If we write .//a[@id='2']/h1, the dot in front selects the current node, so the search starts from the current node.


If we write *[@id='2']/h1, the asterisk in front of the square brackets means that any element may appear there; we do not care about the element name. Since the id is unique and belongs to the <a> tag, *[@id='2'] always matches the anchor element. Thus we get .//*[@id='2']/h1.
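The relative expressions above can be checked with a short sketch, once more using Python's xml.etree.ElementTree on a well-formed copy of the sample page (an assumption, since the page itself is HTML). The variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the sample page (assumption: three child divs, as described).
SAMPLE = """<html>
 <body>
   <div class="1">
     <div class="1"></div>
     <div class="2">
       <div class="1">
         <a id="1">HI</a>
         <a id="2"><h1>HI</h1></a>
       </div>
     </div>
     <div class="3"></div>
   </div>
 </body>
</html>"""

root = ET.fromstring(SAMPLE)  # the current node is the <html> element

# Search the whole subtree of the current node for <a id="2">, then take its h1.
by_tag = root.find(".//a[@id='2']/h1")

# Same search, but any element name is accepted as long as id is 2.
by_any = root.find(".//*[@id='2']/h1")

# Without the leading .// only direct children of the current node are tested,
# so from <html> nothing matches ...
direct = root.find("a[@id='2']/h1")

# ... but from the div that holds the anchors the same expression works.
parent = root.find("./body/div/div[2]/div")
from_parent = parent.find("a[@id='2']/h1")

print(by_tag.tag, by_any.tag, direct, from_parent.tag)
```

Both relative forms reach the same h1 without naming any of the divs on the way, so they keep working when divs are added or removed around the anchor.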


Keep in mind that an expression is always written and read from left to right.

Now we can write path expressions using predicates, which XPath supports.
For example, [@id='2'] is a predicate; predicates are always enclosed in square brackets. There are many other predicates that can be used in path expressions.

For example, in div[2] the 2 is a predicate. We can also use last(), position(), and @ for selecting attributes, which I already covered.

If we want to use the predicates last() and position(), we can write them like this:

.//a[last()]/h1 means: select the last anchor element in this path, that is a[2].

With the position() predicate we can write it as .//a[position()>1]/h1, which is also a[2].
You can build any kind of meaningful expression by mixing and matching these predicates.
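Here is a small sketch of the positional predicates, using Python's xml.etree.ElementTree on a well-formed copy of the sample page. One caveat: ElementTree implements only a subset of XPath. It accepts an integer index or last() inside the brackets, but not a comparison such as position()>1; for that you need a full XPath 1.0 engine. In the sketch the comparison is imitated with a plain list slice, which is a stand-in and not XPath itself.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the sample page (assumption: three child divs, as described).
SAMPLE = """<html>
 <body>
   <div class="1">
     <div class="1"></div>
     <div class="2">
       <div class="1">
         <a id="1">HI</a>
         <a id="2"><h1>HI</h1></a>
       </div>
     </div>
     <div class="3"></div>
   </div>
 </body>
</html>"""

root = ET.fromstring(SAMPLE)

# last() picks the last <a> among its siblings, then we step down to its h1.
last_h1 = root.find(".//a[last()]/h1")

# A plain integer index does the same job here, because there are two anchors.
second_h1 = root.find(".//a[2]/h1")

# Stand-in for a[position()>1]: take the anchors of one parent and skip the first.
parent = root.find("./body/div/div[2]/div")
after_first = parent.findall("a")[1:]

print(last_h1.tag, second_h1.tag, [a.get("id") for a in after_first])
```

All three selections end up at the second anchor, which matches what the expressions above say.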
