Indexing the result of xmlParse() in R

Time: 2013-04-20 07:22:59

Tags: xml r

I have data generated by xmlParse(). I was able to get a reference to an XMLNode named root:
> class(root)
[1] "XMLInternalElementNode" "XMLInternalNode"        "XMLAbstractNode"     

When I run

> root[[2]][[1]]
<tr class="party-republican race-winner"><th rowspan="5" class="results-county">Autauga <span class="precincts-reporting">100.0% Reporting</span></th>&#13;
                                &#13;
                                <th scope="row" class="results-candidate">M. Romney</th>&#13;
                                <td class="results-party"><abbr title="Republican">GOP</abbr></td>&#13;
                                <td class="results-percentage">72.6%</td>&#13;
                                <td class="results-popular">    17,366</td>&#13;
                            </tr> 

I am trying to reference the value next to the tag:

<td class="results-percentage">

However, executing root[[1]][[2]][["<td class='results-percentage'>"]] returns NULL.

What am I doing wrong that prevents me from accessing the 72.6% value?

1 answer:

Answer 0: (score: 1)

You should supply a valid XPath expression, like this:

  //td[@class='results-percentage'] ## any td element whose class attribute is exactly 'results-percentage'
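
For context on why the [[ attempt came back NULL: as far as I can tell, [[ on a node looks children up by position or by element name, not by a fragment of markup. Here is a minimal sketch of that, assuming a cut-down copy of your row that keeps only two of the td cells. The names doc, tr, cells and classes are mine, not from your code.

```r
library(XML)

# Cut-down copy of the row from the question (assumption: only two <td> cells kept).
doc <- xmlParse(
  '<tr><td class="results-party">GOP</td><td class="results-percentage">72.6%</td></tr>',
  asText = TRUE
)
tr <- xmlRoot(doc)

# A markup string is not the name of any child, so nothing is found.
by_markup <- tr[["<td class='results-percentage'>"]]

# An element name does match, but only the first <td> child is returned.
first_td <- xmlValue(tr[["td"]])

# Without XPath, the children have to be filtered by attribute by hand.
cells <- xmlChildren(tr)
classes <- sapply(cells, xmlGetAttr, "class")
pct <- xmlValue(cells[[which(classes == "results-percentage")[1]]])
```

The manual filter works, but it is the kind of bookkeeping that the XPath predicate does for you in one expression.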

Using your data:

library(XML)
dd <- xmlParse('<tr class="party-republican race-winner"><th rowspan="5" class="results-county">Autauga <span class="precincts-reporting">100.0% Reporting</span></th>&#13;
  &#13;
  <th scope="row" class="results-candidate">M. Romney</th>&#13;
  <td class="results-party"><abbr title="Republican">GOP</abbr></td>&#13;
  <td class="results-percentage">72.6%</td>&#13;
  <td class="results-popular">    17,366</td>&#13;
  </tr> ',asText=TRUE)

Then apply the XPath:

getNodeSet(dd, "//td[@class='results-percentage']/text()")[[1]]
72.6% 
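
Note that what getNodeSet() hands back here is a text node, not a character string. If you need a plain string, one option is to pass the node through xmlValue(). A small sketch, assuming a one-cell stand-in for the row above (doc, nodes and pct are illustrative names):

```r
library(XML)

# One-cell stand-in for the parsed row (assumption: the other cells are omitted).
doc <- xmlParse('<tr><td class="results-percentage">72.6%</td></tr>', asText = TRUE)

# getNodeSet() returns a list-like set of nodes; each match here is a text node.
nodes <- getNodeSet(doc, "//td[@class='results-percentage']/text()")

# xmlValue() converts a node into an ordinary character string.
pct <- xmlValue(nodes[[1]])
```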

Or use xpathSApply:

xpathSApply(dd, "//td[@class='results-percentage']",xmlValue)
[1] "72.6%"
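
If you eventually want numbers rather than strings, the values still need cleaning after extraction. The following is only a sketch under my own assumptions: it uses the three relevant cells from your row, and the cleaning rules (drop the percent sign, drop everything that is not a digit from the vote count) are my guess at what you would want.

```r
library(XML)

# The candidate, percentage and vote cells from the question's row.
doc <- xmlParse(
  '<tr><th class="results-candidate">M. Romney</th><td class="results-percentage">72.6%</td><td class="results-popular">    17,366</td></tr>',
  asText = TRUE
)

# Pull each cell out as a character string.
pct_raw   <- xpathSApply(doc, "//td[@class='results-percentage']", xmlValue)
votes_raw <- xpathSApply(doc, "//td[@class='results-popular']", xmlValue)

# Strip the percent sign, then convert to a number.
pct <- as.numeric(sub("%", "", pct_raw, fixed = TRUE))

# Remove padding spaces and the thousands separator, then convert to an integer.
votes <- as.integer(gsub("[^0-9]", "", votes_raw))
```

Because xpathSApply() returns one element per matching node, the same lines should also work on a document with many rows, giving vectors instead of single values.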