Question

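I'm trying to use rvest to extract ISO country information from Wikipedia, including the links to other pages. I can't find a way to get just the link values (the href attribute) without the attribute name coming along with them. I tried the XPath string() function, but that causes an error. The code below is easy to run and should be self-explanatory.

Any help is appreciated!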

library(rvest)
library(dplyr)

searchPage <- read_html("https://en.wikipedia.org/wiki/ISO_3166-2")
nodes <- html_node(searchPage, xpath = '(//h2[(span/@id = "Current_codes")]/following-sibling::table)[1]')
codes <- html_nodes(nodes, xpath = 'tr/td[1]/a/text()')
names <- html_nodes(nodes, xpath = 'tr/td[2]//a[@title]/text()')
#Following brings back data but attribute name as well
links <- html_nodes(nodes, xpath = 'tr/td[2]//a[@title]/@href')
#Following returns nothing
links2 <- html_nodes(nodes, xpath = 'tr/td[2]//a[@title]/@href/text()')
#Following Errors
links3 <- html_nodes(nodes, xpath = 'string(tr/td[2]//a[@title]/@href)')
#Following Errors
links4 <- sapply(nodes, function(x) { x %>% read_html() %>% html_nodes("tr/td[2]//a[@title]") %>% html_attr("href") })

Answer 1

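You should include more information in your question. "Self-explanatory" almost made me skip this question (hint: consider providing enough verbal detail, out of respect for other people's time, alongside the broken code).

I say that because I don't know whether this is what you need, since you never really said.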
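If the goal is simply the href values as a character vector, one approach that should work is to select the `<a>` elements themselves and then read the attribute with `html_attr()`, rather than selecting `@href` in the XPath. An attribute node has no `text()` children, which is likely why `@href/text()` returns nothing, and `string(...)` yields a string rather than a node set, which `html_nodes()` does not accept. The sketch below uses a small inline table as a stand-in for the Wikipedia page (the two rows and the `tbl`, `anchors`, and `country_names` names are illustrative assumptions, not the real data), so it runs without network access.

```r
library(rvest)

# Hypothetical stand-in for the "Current codes" table on the Wikipedia page
html <- '<table>
  <tr><td><a href="/wiki/ISO_3166-2:AD">AD</a></td>
      <td><a href="/wiki/Andorra" title="Andorra">Andorra</a></td></tr>
  <tr><td><a href="/wiki/ISO_3166-2:AE">AE</a></td>
      <td><a href="/wiki/United_Arab_Emirates" title="United Arab Emirates">United Arab Emirates</a></td></tr>
</table>'

tbl <- html_node(read_html(html), "table")

# Select the <a> elements, not their attributes or text nodes.
# ".//tr" matches rows whether or not the parser wrapped them in <tbody>.
anchors <- html_nodes(tbl, xpath = ".//tr/td[2]//a[@title]")

# Pull the attribute value and the link text from the same node set
links <- html_attr(anchors, "href")
country_names <- html_text(anchors)

print(data.frame(name = country_names, link = links))
```

The links on the real page are relative as well, so something like `xml2::url_absolute(links, "https://en.wikipedia.org")` should turn them into full URLs if that is what you are after.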

