Question

I'm trying to use the xml2 package to scrape some tables from ESPN.com. As an example, I'd like to pull the Week 7 fantasy quarterback rankings into R. The URL is:

http://www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-quarterback-rankings

I'm trying to do this with the "read_html()" function, because that's the one I'm most familiar with. Here is my syntax and the error it produces:

> wk.7.qb.rk = read_html("www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks", which = 1)
Error: 'www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks' does not exist in current working directory ('C:/Users/Brandon/Documents/Fantasy/Football/Daily').

I've also tried "read_xml()", only to get the same error:

> wk.7.qb.rk = read_xml("www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks", which = 1)
Error: 'www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks' does not exist in current working directory ('C:/Users/Brandon/Documents/Fantasy/Football/Daily').

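Why is R looking for this URL in the working directory? I've used this function with other URLs and had some success. What is it about this particular URL that makes it behave differently from the others? And how can I change that?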

Answer 1

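I got this error when I was running read_html in a loop to go through 20 pages. After the 20th page the loop kept running with no URL left, so it started calling read_html with NAs for the remaining iterations. Hope this helps!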
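
A minimal sketch of how one might guard against this, under a couple of assumptions. The helper names (has_scheme, fix_url, scrape_all) are hypothetical and not part of xml2. The sketch drops NA or empty entries before they reach read_html, and it also prepends "http://" when a string has no scheme: the calls in the question pass the address starting at "www." rather than "http://", and as far as I can tell xml2 treats a string without a scheme as a local file path, which would explain the "does not exist in current working directory" message.

```r
# Hypothetical helper: TRUE only for strings that start with an explicit scheme.
has_scheme <- function(x) !is.na(x) & grepl("^(https?|ftp)://", x)

# Hypothetical helper: drop NA/empty entries, then add "http://" where missing.
# Assumption: the site is reachable over plain http; use "https://" if needed.
fix_url <- function(x) {
  x <- x[!is.na(x) & nzchar(x)]
  ifelse(has_scheme(x), x, paste0("http://", x))
}

# Hypothetical wrapper: only cleaned URLs are ever passed to read_html.
# Needs the xml2 package and network access when actually called.
scrape_all <- function(urls) lapply(fix_url(urls), xml2::read_html)

# The address as typed in the question, followed by an NA like the one a
# loop can produce once it runs past the last page.
urls <- c(
  "www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks",
  NA
)
clean <- fix_url(urls)
print(clean)

# pages <- scrape_all(urls)  # uncomment to fetch; requires network access
```

Filtering the vector up front, rather than checking inside the loop body, keeps the loop from ever running more iterations than there are real URLs.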

Webscraping in R, "...does not exist in current working directory" error
