Question

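I am trying to download a file in R on a remote server that sits behind a number of proxies. Something, and I can't work out what, is causing a cached copy of the file to be returned whenever I try to access it on that server, whether through R or just through a web browser.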

I have tried using cacheOK=FALSE in the download.file call, but this has no effect.

Following "Is there a way to force browsers to refresh/download images?", I tried appending a random suffix to the end of the URL:

download.file(url = paste("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?",
                          format(Sys.time(), "%d%m%Y"),sep=""), 
              destfile = "F-F_Research_Data_Factors_daily.zip", cacheOK=FALSE)

This produces, for example, the following URL:

http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?17092012

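When accessed from a web browser on the remote server, this does indeed return the latest version of the file. However, when accessed with download.file in R, it returns a corrupted zip archive. Both WinRAR and R's unzip function complain that the zip file is corrupt.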

unzip("F-F_Research_Data_Factors_daily.zip")
1: In unzip("F-F_Research_Data_Factors_daily.zip") :
internal error in unz code

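I don't understand why downloading this file through R produces a corrupted file, while downloading it through a web browser works without any problem.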

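Can anyone suggest either a way to defeat the cache from within R (I'm not hopeful about this), or a reason why download.file doesn't like my URL with ?someRandomString appended to the end?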

Answer 1

It will work if you use mode="wb":

download.file(url = paste("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?",format(Sys.time(),"%d%m%Y"),sep=""), 
          destfile = "F-F_Research_Data_Factors_daily.zip", mode='wb', cacheOK=FALSE)
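
A likely explanation, going by my reading of the download.file documentation: on Windows, when mode is not supplied and the URL ends in .zip (among a few other extensions), R switches to a binary transfer on its own. With the date suffix appended, the URL no longer ends in .zip, so the default text mode is used and the archive gets damaged. The sketch below only illustrates that reasoning. The helper names cache_busted_url, ends_in_zip and fetch_zip are made up for this example, and the pattern check is an assumption about the kind of test R applies, not a copy of its internals.

```r
# Hypothetical helper: append a date stamp as a query string to defeat caches.
cache_busted_url <- function(url, stamp = format(Sys.time(), "%d%m%Y")) {
  paste0(url, "?", stamp)
}

# Hypothetical helper: does the URL still end in ".zip"?
# Assumption: this mimics the kind of check download.file() makes when
# 'mode' is missing; it is not R's actual internal code.
ends_in_zip <- function(url) grepl("\\.zip$", url)

base_url <- paste0("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/",
                   "ftp/F-F_Research_Data_Factors_daily.zip")
busted_url <- cache_busted_url(base_url)

ends_in_zip(base_url)    # TRUE
ends_in_zip(busted_url)  # FALSE: the suffix hides the .zip ending

# Hypothetical wrapper: always ask for a binary transfer, then try to list
# the archive contents as a basic readability check.
fetch_zip <- function(url, destfile) {
  download.file(url = url, destfile = destfile, mode = "wb", cacheOK = FALSE)
  unzip(destfile, list = TRUE)  # data frame of entries when the zip is readable
}

# Not run here, since it needs network access:
# fetch_zip(busted_url, "F-F_Research_Data_Factors_daily.zip")
```

Passing mode = "wb" explicitly, as above, avoids depending on what the URL happens to end in.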

download.file() fails when a random suffix is appended to the file name
