Zip file downloaded from the CDC FTP site with R is corrupt no matter which download method is used

Date: 2014-04-19 07:57:37

Tags: r curl ftp download httr

I'm trying to download this zipped file from the CDC with R. It downloads fine from Firefox, so I immediately tried setInternet2(TRUE), but that still doesn't work.

In every one of the cases below, I get:

  

z <- unzip(tf)

Warning message:
In unzip(tf) : zip file is corrupt

Here are the first two lines shared by all of my attempts:

fn <- 'ftp://ftp.cdc.gov/pub/health_statistics/nchs/datasets/dvs/natality/nat2012us.zip'
tf <- tempfile() ; td <- tempdir()

And here is what I have tried:

# fails
download.file(fn,tf,mode='wb')
z <- unzip( tf , exdir = td )

# fails
setInternet2(TRUE)
download.file(fn,tf,mode='wb')
z <- unzip( tf , exdir = td )

# fails
download.file(fn,tf,mode='wb',cacheOK=FALSE)
z <- unzip( tf , exdir = td )

# fails
setInternet2(TRUE)
download.file(fn,tf,mode='wb',cacheOK=FALSE)
z <- unzip( tf , exdir = td )

# fails
library(downloader)
download(fn,tf,mode='wb')
z <- unzip( tf , exdir = td )

# fails
library(httr)
resp <- GET(fn)
writeBin(content(resp, "raw"), tf)

# fails
library(RCurl)
x <- getBinaryURL( fn )
writeBin( x , tf )
z <- unzip(tf)


# in every case:
> file.info(tf)$size
[1] 228799759
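Because the size is identical across every method, the download itself may be intact and the failure may lie in the extraction step. One cheap way to check that, sketched here with a hypothetical helper `looks_like_zip()` (not part of any package), is to test whether the file begins with the `PK\003\004` signature that a non-empty zip archive starts with:

```r
# Hypothetical helper: TRUE if the file begins with the local-file-header
# signature "PK\003\004" that a non-empty zip archive starts with.
looks_like_zip <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  sig <- readBin(con, what = "raw", n = 4L)
  identical(sig, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Self-contained demonstration on two small temporary files
good <- tempfile()
writeBin(as.raw(c(0x50, 0x4b, 0x03, 0x04, 0x00)), good)
bad <- tempfile()
writeBin(charToRaw("<html>"), bad)

looks_like_zip(good)
looks_like_zip(bad)
```

On the real download this would be `looks_like_zip(tf)`. If the signature is present, `unzip(tf, list = TRUE)` can also be tried to list the archive members and their uncompressed sizes without extracting anything.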
Apologies if this turns out to be something silly.

1 answer:

Answer 0 (score: 2)

It looks like the internal unzip method on Windows (unzip = "internal") is the problem. Calling WinRAR through shell() works around it:

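fn <- 'ftp://ftp.cdc.gov/pub/health_statistics/nchs/datasets/dvs/natality/nat2012us.zip'
tf <- tempfile()
# download in binary mode, exactly as in the question
download.file( fn , tf , mode = 'wb' )

# path to the WinRAR executable (adjust to the local install location)
wr <- normalizePath( "C:/Program Files/WinRAR/WinRAR.exe" )

td <- tempdir()
# WinRAR's 'x' command extracts the archive with full paths into td
shell( paste0( '"' , wr , '" x ' , tf , ' ' , td ) )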
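If WinRAR is not installed, a similar workaround may be possible through the `unzip` argument of `utils::unzip()`, which accepts the path to an external unzip program instead of `"internal"`. The sketch below is an assumption-laden variant, not the method from the answer above: `pick_unzip_method()` and `extract_zip()` are hypothetical helpers, and it assumes an `unzip` executable is on the PATH (otherwise it falls back to the internal method, which is the one that failed here).

```r
# Hypothetical helper: choose the value passed to the `unzip` argument of
# utils::unzip(). Uses an external program if one is given or found on the
# PATH, otherwise falls back to "internal".
pick_unzip_method <- function(program = Sys.which("unzip")) {
  program <- unname(program)
  if (nzchar(program)) program else "internal"
}

# Hypothetical wrapper: extract `zipfile` into `exdir` with the chosen method
extract_zip <- function(zipfile, exdir, method = pick_unzip_method()) {
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  utils::unzip(zipfile, exdir = exdir, unzip = method)
}

# Usage with the objects defined above (not run here):
# extract_zip(tf, td)
```

Whether this helps depends on the external program handling the archive better than R's internal code. The `?unzip` help page notes limits on very large archive members for the internal method, which might be relevant to this file, though nothing above confirms that as the cause.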