Question

I'm trying to use the rvest package in R to extract the digits of pi from a website, but it keeps giving me an XML error.

library(rvest)
pisite <- read_html("http://www.eveandersson.com/pi/digits/1000000")
pitable <- pisite %>%
  html_node(xpath = "/html/body/table[2]/tbody/tr/td[1]/pre/text()[1]")

I keep getting this result:

{xml_missing}
<NA>

Note that I copied the XPath value from Chrome's inspect tool, although it does look different from the XPaths I have obtained before.

I'm not sure what to change. I'm guessing it's something really simple. Any ideas?

Answer 1

Maybe this can help:

library(rvest)
library(dplyr)
# read the page
pisite <- read_html("http://www.eveandersson.com/pi/digits/1000000")

# select the <pre> node(s) and extract their text
pi <- pisite %>% html_nodes("pre") %>% html_text()

# remove the newlines (\n) so that only the digits remain
pi <- gsub("\n", "", pi)

pi
[1] "3.1415926535897932384626433832795028841971   ...and so on..."
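
As for why the original XPath returned {xml_missing}: a likely cause is the tbody step. Browsers add a tbody element to the DOM for tables whose source does not contain one, so an XPath copied from Chrome's inspector can include a step that is absent from the HTML that read_html() parses. Here is a minimal sketch of that effect, using a made-up inline page rather than the real site (the markup below is an assumption; I have not checked the actual page source):

```r
library(rvest)

# Hypothetical stand-in for the page: tables written without <tbody>.
# The real site's markup may differ.
html <- '<html><body>
<table><tr><td>header</td></tr></table>
<table><tr><td><pre>3.
1415926535 8979323846
</pre></td></tr></table>
</body></html>'
page <- read_html(html)

# XPath in the style copied from the browser, including the tbody step.
# The parsed document has no tbody, so this is expected to be {xml_missing}.
with_tbody <- html_node(page, xpath = "/html/body/table[2]/tbody/tr/td[1]/pre")

# The same path with the tbody step removed finds the <pre> node.
without_tbody <- html_node(page, xpath = "/html/body/table[2]/tr/td[1]/pre")

# Strip all whitespace (newlines and spaces), not just "\n".
digits <- gsub("\\s+", "", html_text(without_tbody))
digits
```

If that is what is happening on the real page, dropping /tbody from the copied XPath should work as well, though selecting "pre" directly as above is shorter and does not depend on the table layout.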
