Scraping a web page: getting around a server error

Time: 2014-03-30 21:06:06

Tags: r curl

I am trying to scrape this web page:

parenturl = "http://www.liberty.co.uk/fcp/product/Liberty//Rosa-A-Tana-Lawn/1390"

but I get the following error:

srcpage = getURLContent(GET(parenturl)$url,timeout(10))
Error in function (type, msg, asError = TRUE)  : Empty reply from server

Is there a way to get around this and scrape the page?

Many thanks in advance.

1 answer:

Answer 0: (score: 0)

Try using the httr library:

library(httr)

pg <- GET("http://www.liberty.co.uk/fcp/product/Liberty//Rosa-A-Tana-Lawn/1390")
print(content(pg))
# too much to paste here
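
If the plain GET still comes back with "Empty reply from server", one thing worth trying is sending a browser-like User-Agent and retrying a few times. This is only a sketch: whether this particular server rejects requests for that reason is an assumption, and fetch_page() and its default User-Agent string are hypothetical names made up for illustration, not part of httr.

```r
# Sketch only: fetch_page() is a hypothetical helper, not an httr function.
# Assumption: the server drops requests that lack a browser-like User-Agent.
fetch_page <- function(url, times = 3,
                       ua = "Mozilla/5.0 (compatible; R httr)") {
  tryCatch({
    # RETRY() repeats the request up to `times` attempts, backing off between tries
    resp <- httr::RETRY(
      "GET", url,
      httr::user_agent(ua),   # send an explicit User-Agent header
      httr::timeout(10),      # give up on a single attempt after 10 seconds
      times = times
    )
    httr::stop_for_status(resp)  # turn HTTP 4xx/5xx responses into R errors
    httr::content(resp, as = "text", encoding = "UTF-8")
  }, error = function(e) {
    # Report the failure and return NULL instead of stopping the script
    message("Request failed: ", conditionMessage(e))
    NULL
  })
}
```

Calling fetch_page(parenturl) then gives the page source as a single string, or NULL if every attempt failed, so the calling code can check is.null() before parsing.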