document.evaluate does not return the correct node for a text-node XPath

Asked: 2013-06-08 13:25:51

Tags: android xpath webview document.evaluate

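I am building a text "highlighter" for Android inside a WebView. I get an XPath expression for the selected range in the HTML from a function (shown further down), which gives: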

  

/HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]/text()[5]

I then evaluate that XPath expression in JavaScript like this:

var resNode = document.evaluate('/HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]/text()[5]',document,null,XPathResult.FIRST_ORDERED_NODE_TYPE ,null);
var startNode = resNode.singleNodeValue;

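But startNode comes back as null.

Here is the interesting part:

If I evaluate the XPath expression '/HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]' with the same call, it returns the correct node, i.e. the div.

The only difference between the two XPaths is that the first ends in a text-node step, while the second stops at the div.

The same code works fine in desktop browsers.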

Edited: sample HTML

<html>
<head>
<script></script>
</head>
<body>
<div id="mainpage" class="highlighter-context">
<div>       Some text here also....... </div>
<div>      Some text here also.........</div>
<div>
  <h1 class="heading"></h1>
  <div class="left_side">
    <ol></ol>
    <h1></h1>
    <div class="text_bio">
    In human beings, height, colour of eyes, complexion, chin, etc. are 
    some recognisable features. A feature that can be recognised is known as 
    character or trait. Human beings reproduce through sexual reproduction. In this                
    process, two individuals one male and another female are involved. Male produces   
    male gamete or sperm and female produces female gamete or ovum. These gametes fuse 
    to form zygote which develops into a new young one which resembles to their parent. 
     During the process of sexual reproduction 
    </div>
  </div>
  <div class="righ_side">
  Some text here also.........
  </div>
  <div class="clr">
         Some text here also.......
  </div>
</div>
</div>
</body>
</html>

Getting the XPath:

var selection = window.getSelection(); 
var range = selection.getRangeAt(0); 
var xpJson = '{startXPath :"'+makeXPath(range.startContainer)+      
             '",startOffset:"'+range.startOffset+
             '",endXPath:"'+makeXPath(range.endContainer)+ 
             '",endOffset:"'+range.endOffset+'"}';
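Building the string by hand leaves the keys unquoted, so the result is not strict JSON and JSON.parse would reject it. A minimal sketch of the same payload built with JSON.stringify follows. serializeRange is a hypothetical helper name, not part of the original code, and the path-building function is passed in as a parameter so the snippet does not depend on the DOM:

```javascript
// Hypothetical helper: builds the same four fields as the hand-written string,
// but lets JSON.stringify handle the quoting and escaping.
// `toXPath` is whatever function turns a node into a path (e.g. makeXPath).
function serializeRange(range, toXPath) {
    return JSON.stringify({
        startXPath: toXPath(range.startContainer),
        startOffset: range.startOffset,
        endXPath: toXPath(range.endContainer),
        endOffset: range.endOffset
    });
}
```

It would be called as serializeRange(window.getSelection().getRangeAt(0), makeXPath). Note that the offsets are then stored as numbers rather than strings.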

The function that builds the XPath:

function makeXPath(node, currentPath) {
    currentPath = currentPath || '';
    switch (node.nodeType) {
    case 3: // TEXT_NODE
    case 4: // CDATA_SECTION_NODE
        // Position = number of preceding text siblings + 1.
        return makeXPath(node.parentNode, 'text()[' + (document.evaluate('preceding-sibling::text()', node, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength + 1) + ']');
    case 1: // ELEMENT_NODE
        // Position = number of preceding siblings with the same name + 1.
        return makeXPath(node.parentNode, node.nodeName + '[' + (document.evaluate('preceding-sibling::' + node.nodeName, node, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength + 1) + ']' + (currentPath ? '/' + currentPath : ''));
    case 9: // DOCUMENT_NODE: the root has been reached
        return '/' + currentPath;
    default:
        return '';
    }
}

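I am not using XML; this is HTML loaded in a WebView.

I also tried Rangy's serialization and deserialization: Rangy's "serialize" works, but "deserialize" does not.

Does anyone have an idea what is going wrong?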

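Update

I finally found the root cause of the problem (not solved yet :()

What is actually happening in the Android WebView: somehow the WebView changes the DOM structure of the loaded HTML page. Even though the DIV does not contain any separate text nodes in the markup, when I select text in that DIV I get a text node for every line in it. For example, for the same HTML page and the same text selection, the XPath obtained from the WebView is completely different from the one a desktop browser gives.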


XPath from desktop browser:
startXPath: /HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]/text()[1]
startOffset: 184
endXPath: /HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]/text()[1]
endOffset: 342

XPath from WebView:
startXPath: /HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]/text()[3]
startOffset: 0
endXPath: /HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]/text()[4]
endOffset: 151
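
If the WebView really holds the div's text as several adjacent text nodes, the text()[n] index depends on how the text happened to be split, while a character offset counted from the start of the parent does not. The sketch below is only one possible workaround under that assumption, which this post does not confirm. It converts a (text node, offset) pair into such a parent-relative offset and back. toParentTextOffset and fromParentTextOffset are hypothetical names, and only direct text children are counted, which fits the text_bio div because it has no element children:

```javascript
// Sketch: store a position as "characters from the start of the parent's
// direct text children" instead of "offset inside the n-th text node".
// Only nodeType, nodeValue, parentNode and childNodes are used, so the
// functions accept real DOM nodes as well as plain test objects.
var TEXT_NODE = 3;

// Hypothetical helper: length of all text siblings before `textNode`, plus `offset`.
function toParentTextOffset(textNode, offset) {
    var total = 0;
    var children = textNode.parentNode.childNodes;
    for (var i = 0; i < children.length; i++) {
        if (children[i] === textNode) {
            return total + offset;
        }
        if (children[i].nodeType === TEXT_NODE) {
            total += children[i].nodeValue.length;
        }
    }
    return -1; // textNode was not found among its parent's children
}

// Hypothetical helper: inverse of the above, for whatever split the DOM has now.
function fromParentTextOffset(parent, charOffset) {
    var remaining = charOffset;
    var children = parent.childNodes;
    for (var i = 0; i < children.length; i++) {
        var child = children[i];
        if (child.nodeType !== TEXT_NODE) {
            continue; // element children are skipped, not counted
        }
        if (remaining <= child.nodeValue.length) {
            return { node: child, offset: remaining };
        }
        remaining -= child.nodeValue.length;
    }
    return null; // charOffset lies past the end of the parent's text
}
```

With this, the stored data would be the XPath of the parent element plus two character offsets, and the range would be rebuilt with fromParentTextOffset on whichever device loads it.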

1 answer:

Answer 0 (score: 0)

In your example, the path /HTML[1]/BODY[1]/DIV[1]/DIV[3]/DIV[1]/DIV[1]/text()[5] selects the fifth text child node of this div element:

<div class="text_bio">
In human beings, height, colour of eyes, complexion, chin, etc. are 
some recognisable features. A feature that can be recognised is known as 
character or trait. Human beings reproduce through sexual reproduction. In this                
process, two individuals one male and another female are involved. Male produces   
male gamete or sperm and female produces female gamete or ovum. These gametes fuse 
to form zygote which develops into a new young one which resembles to their parent. 
 During the process of sexual reproduction 
</div>

That div has a single text child node, so I don't see why text()[5] should select anything.
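
To check this on the device rather than from the markup, one could list the text children the DOM actually holds for that div. The XPath 1.0 data model never has two adjacent text nodes, but the DOM allows them, so the two views can disagree. A minimal sketch follows; textChildLengths is a hypothetical helper name, and it reads only childNodes, nodeType and nodeValue:

```javascript
// Hypothetical diagnostic: returns the length of every direct text child,
// in document order. One entry means one text node; several entries mean
// the text is split into several DOM text nodes.
function textChildLengths(element) {
    var lengths = [];
    var children = element.childNodes;
    for (var i = 0; i < children.length; i++) {
        if (children[i].nodeType === 3) { // 3 = TEXT_NODE
            lengths.push(children[i].nodeValue.length);
        }
    }
    return lengths;
}
```

In the page it could be called as textChildLengths(document.querySelector('.text_bio')). If it reports more than one entry in the WebView, calling the standard DOM method normalize() on that element before building or evaluating the path merges adjacent text nodes. Whether that is enough to make the WebView agree with the desktop browser is untested here.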