Download and read a zipped xml file with R

Time: 2014-07-28 22:50:31

Tags: xml r zip

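Following this answer by Dirk Eddelbuettel, I am trying to read an xml file from a zip archive for further processing. Apart from the URL and the file name, the only change I made to the referenced code is replacing read.table with xmlInternalTreeParse.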

library(XML)
temp <- tempfile()
download.file("http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&downfile=data%2Fnrg_105a.sdmx.zip",temp)
doc <- xmlInternalTreeParse(unz(temp, "nrg_105a.dsd.xml"))
unlink(temp)
closeAllConnections()

However, this returns the following error:

Error in file.exists(file) : invalid 'file' argument

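traceback() shows that the error comes from a function call inside the parser, so temp seems to be an inappropriate reference in this case. Is there a way to make this work?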

1 answer:

Answer 0: (score: 2)

You can try:

# Make a temporary file (tf) and a temporary folder (tdir)
tf <- tempfile(tmpdir = tdir <- tempdir())

## Download the zip file 
download.file("http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&downfile=data%2Fnrg_105a.sdmx.zip", tf)

## Unzip it in the temp folder
xml_files <- unzip(tf, exdir = tdir)

## Parse the first file
doc <- xmlInternalTreeParse(xml_files[1])

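## Delete the downloaded archive and the extracted files
## (not tdir itself, which is the R session's temporary directory)
unlink(c(tf, xml_files))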
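
A likely reason the original call fails is that unz() returns a connection, whereas xmlInternalTreeParse() expects a file name or a string containing the XML; that would be consistent with the file.exists() error. As a minimal sketch of an alternative that skips extraction, the text can be read from the connection and parsed with asText = TRUE. The helper name read_xml_from_zip is hypothetical and not part of the XML package, and the sketch assumes the archive member is small enough to hold in memory.

```r
library(XML)

# Hypothetical helper (not from the original post or the XML package):
# parse one XML member of a zip archive without extracting it to disk.
read_xml_from_zip <- function(zip_path, member) {
  con <- unz(zip_path, member)        # connection to the archive member
  on.exit(close(con), add = TRUE)     # release the connection when done
  # Read the member as text and join the lines into a single string
  txt <- paste(readLines(con, warn = FALSE), collapse = "\n")
  # Parse from the string instead of from a file name
  xmlParse(txt, asText = TRUE)
}
```

With the archive from the question downloaded to temp, this could be called as read_xml_from_zip(temp, "nrg_105a.dsd.xml").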