Question

Consider the following page:

<n1 class="a">1</n1>
<n1 class="b"><b>bold</b>2</n1>

If I first select the second n1 using class="b", the first n1 (class="a") should be excluded, and this indeed seems to work:

library(rvest)
b_nodes = read_html('<n1 class="a">1</n1>
<n1 class="b"><b>bold</b>2</n1>') %>% 
  html_nodes(xpath = '//n1[@class="b"]')
b_nodes
# {xml_nodeset (1)}
# [1] <n1 class="b"><b>bold</b>2</n1>

However, if we now query this "subsetted" page:

b_nodes %>% html_nodes(xpath = '//n1')
# {xml_nodeset (2)}
# [1] <n1 class="a">1</n1>
# [2] <n1 class="b"><b>bold</b>2</n1>

How did the first node get "rediscovered"?

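Note: I know how to get what I want using two separate XPath expressions. This is a conceptual question about why the "subsetting" does not work as expected. My understanding is that the first selection should have excluded the first node entirely; the b_nodes object should not even know that node exists.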

Answer 1

html_nodes(xpath = '//n1')

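//n1 is short for /descendant-or-self::node()/n1. Because the path begins with /, it is evaluated from the document root, so it searches the whole document no matter which nodes you selected before.

Change it to .//n1. The leading . stands for the current node, that is, whatever you selected previously.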
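As a minimal sketch of the difference, reusing the page from the question (the variable names page, all_n1, bold, none and self are only illustrative), the snippet below compares an absolute path with relative ones. Note that, as far as XPath semantics go, .//n1 looks only at descendants of the current node, so on b_nodes it should come back empty; descendant-or-self::n1 is one way to include the current node itself.

```r
library(rvest)

page <- read_html('<n1 class="a">1</n1>
<n1 class="b"><b>bold</b>2</n1>')
b_nodes <- html_nodes(page, xpath = '//n1[@class="b"]')

# Absolute path: evaluated from the document root, so both n1 nodes
# are found (this is the 2-node result shown in the question).
all_n1 <- html_nodes(b_nodes, xpath = '//n1')

# Relative path: evaluated from each node in b_nodes, so only the
# descendants of the class="b" node are searched.
bold <- html_nodes(b_nodes, xpath = './/b')

# .//n1 searches descendants only; the selected n1 has no n1 inside it.
none <- html_nodes(b_nodes, xpath = './/n1')

# descendant-or-self::n1 also considers the current node itself.
self <- html_nodes(b_nodes, xpath = 'descendant-or-self::n1')
```

The node set returned by html_nodes still points into the original document rather than a copy of a fragment, which is why an absolute path can reach nodes outside the selection.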

Answer 2

I'm not sure what you are trying to do, but why not iterate over the nodes with foreach? I mean:

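// Parse the markup with SimpleXML (PHP); a single root element is required.
$XML = simplexml_load_string('
<n1s>
<n1 class="a">1</n1>
<n1 class="b"><b>bold</b>2</n1></n1s>');

$valueA = '';
$valueB = '';
// Visit every n1 node and keep it according to its class attribute.
foreach ($XML->xpath('//n1') as $n1) {
    switch ((string)$n1['class']) {
        case 'a':
            $valueA = $n1;
            break;
        case 'b':
            $valueB = $n1;
            break;
    }
}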

I hope this helps. Regards!

Why does XPath find the excluded node again?
