Question

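I want to scrape data from this link, and I have written the following code in R. However, it does not work: it only returns the first page of results. Apparently the loop is not working. Does anyone know what is wrong with the loop?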


Answer 1

Are you sure the loop is the problem? I would expect it to fetch the first page of results 40 times. Notice that

webpage <- read_html(paste0("http://search.beaconforfreedom.org/search/censored_publications/result.html?author=&cauthor=&title=&country=7327&language=&censored_year=&censortype=&published_year=&censorreason=&sort=t&page=, i"))

is not the same as this (the last ten characters differ; the quotation mark moves):

webpage <- read_html(paste0("http://search.beaconforfreedom.org/search/censored_publications/result.html?author=&cauthor=&title=&country=7327&language=&censored_year=&censortype=&published_year=&censorreason=&sort=t&page=", i))

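In R, paste0 concatenates its arguments into one string with no separator. But you are passing it only one string, because the i sits inside the quotation marks. So every iteration requests the literal URL ending in page=, i instead of page=1 through page=40. Close the double quote before the comma (page=", i) so that the URL and the value of i are pasted together.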
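To make the difference concrete, here is a minimal sketch. The names base_url, wrong, right, urls and fetch_pages are mine, not from the question, and the 40-page count is taken from the text above. The helper assumes the rvest package is installed and is only defined here, not called, so nothing is downloaded.

```r
# URL from the question, ending in "page=" so a page number can be appended.
base_url <- "http://search.beaconforfreedom.org/search/censored_publications/result.html?author=&cauthor=&title=&country=7327&language=&censored_year=&censortype=&published_year=&censorreason=&sort=t&page="

i <- 3

# The i is inside the quotes, so it is literal text, not the loop variable.
wrong <- paste0("page=, i")

# The i is a separate argument, so its value is appended to the string.
right <- paste0("page=", i)

# paste0 is vectorised, so all 40 page URLs can be built in one call.
urls <- paste0(base_url, 1:40)

# Hypothetical helper (assumes rvest is installed): parse every page in turn.
fetch_pages <- function(page_urls) {
  lapply(page_urls, rvest::read_html)
}
```

Printing wrong and right shows the literal text versus the appended value of i. If the site is reachable, fetch_pages(urls) should return a list with one parsed document per page.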

I am not an R programmer, but that just jumped out at me.

Source for the behavior of paste0.

Web scraping in R using a for loop
