Question

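This is a simple XPath exercise, but I cannot get it to work.

When I inspect the button's element (using Google Chrome), it shows the tree below. I want to grab the title, e.g. "Distinguished Contributor" or "Board Manager".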

<span class="author-by"></span>

<span class="UserName lia-user-name">

    <img id="display_3" class="lia-user-rank-icon-left" alt="Distinguished Contributor" title="Distinguished Contributor"></img>

.....

<span class="author-by"></span>

<span class="UserName lia-user-name">

    <img id="display_25" class="lia-user-rank-icon-left" alt="Board Manager" title="Board Manager"></img>

So far I have tried:

> xpathSApply(htmltree, "//img[@class='lia-user-rank-icon-left']", xmlGetAttr, "href")

> test = "//img/@title"
> a <- xpathSApply(htmltree, test, function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))

and a few other variants, but nothing has worked yet. Any guidance would be much appreciated!

Answer 1

Here is an example that gets the image source of the images with class 'dno'. I think in your case you have to change 'dno' and 'src'.

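library(RCurl)
library(XML)
# Download the page and parse it into an HTML tree
text = getURL("http://stackoverflow.com/questions/23024062/r-right-xpath-to-grab-the-text-using-xpathsapply")
d = htmlParse(text)
# Select every <img> node whose class attribute is exactly 'dno'
L = xpathApply(d, "//img[@class='dno']")
# Read the 'src' attribute of each selected node
sapply(L, xmlGetAttr, "src")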

You can replace the last two lines with xpathApply(d, "//img[@class='dno']", xmlGetAttr, "src"). For debugging purposes, however, it is better to split it into two commands.
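Applied to the markup in the question, the same two-step pattern might look like the sketch below. It assumes the class is 'lia-user-rank-icon-left' and the wanted attribute is 'title', as shown in the question's tree. The inline HTML string is a stand-in for the real page: the closing </span> tags are added and the </img> tags dropped only to keep the fragment self-contained. Note that the img elements shown have no 'href' attribute, so asking xmlGetAttr for "href" returns NULL.

```r
library(XML)

# Stand-in for the real page: the fragment from the question, with closing
# </span> tags added (an assumption) so it parses as a complete snippet.
html <- '
<span class="author-by"></span>
<span class="UserName lia-user-name">
  <img id="display_3" class="lia-user-rank-icon-left" alt="Distinguished Contributor" title="Distinguished Contributor">
</span>
<span class="author-by"></span>
<span class="UserName lia-user-name">
  <img id="display_25" class="lia-user-rank-icon-left" alt="Board Manager" title="Board Manager">
</span>'

# asText = TRUE tells htmlParse the argument is markup, not a file name or URL
htmltree <- htmlParse(html, asText = TRUE)

# Step 1: select the rank icons by class (inspect 'nodes' here when debugging)
nodes <- getNodeSet(htmltree, "//img[@class='lia-user-rank-icon-left']")

# Step 2: read the 'title' attribute of each selected node
titles <- sapply(nodes, xmlGetAttr, "title")
print(titles)

# The same thing in a single call
titles_one_step <- xpathSApply(htmltree,
                               "//img[@class='lia-user-rank-icon-left']",
                               xmlGetAttr, "title")
```

If the class attribute on the real page carries additional class names, the exact match [@class='...'] will not select those nodes; contains(@class, 'lia-user-rank-icon-left') is a common, looser alternative.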

R: the right XPath to grab the text using xpathSApply
