Question

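I am trying to use R to download and extract a .csv file from a web page.

This question is a duplicate of "Using R to download zipped data file, extract, and import data".

I cannot get that solution to work, but this may be because of the URL I am using.

I am trying to download the .csv file from http://data.worldbank.org/country/united-kingdom (in the Download Data drop-down list).

Using @Dirk's solution from the link above, I tried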

temp <- tempfile()
download.file("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",temp)
con <- unz(temp, "gbr_Country_en_csv_v2.csv")
dat <- read.table(con, header=T, skip=2)
unlink(temp)

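I got the extended link by looking at the page source, which I suspect is causing the problem, although it works if I paste it into the address bar.

The file downloads with the correct size: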

download.file("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",temp)
# trying URL 'http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv'
# Content type 'application/zip' length 332358 bytes (324 Kb)
# opened URL
# downloaded 324 Kb

# also tried unzip but get this warning
con <- unzip(temp, "gbr_Country_en_csv_v2.csv")
# Warning message:
# In unzip(temp, "gbr_Country_en_csv_v2.csv") :
# requested file not found in the zip file

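But these are the file names I get when I download the files manually.

I would really appreciate any pointers on where I am going wrong. Thanks.

I am using Windows 8, R version 3.1.0.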

Answer 1

To get your data to download and unzip, you need to set mode="wb":

download.file("...",temp, mode="wb")
unzip(temp, "gbr_Country_en_csv_v2.csv")
dd <- read.table("gbr_Country_en_csv_v2.csv", sep=",",skip=2, header=T)

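It looks like the default is "w", which assumes a text file. That would be fine for a plain csv file. But since the download is compressed, it is a binary file, hence the "wb". Without the "wb" part, you cannot open the zip at all.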
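As a minimal sketch of the binary-mode download described above, the two helpers below separate the download from the read and list the archive's members first, which also helps diagnose a "requested file not found in the zip file" warning. The function names `download_zip` and `read_zip_csv` and the `pattern` argument are illustrative assumptions, not part of the original answer; only base R and `utils` functions are used.

```r
# Hypothetical helper: download a zip archive in binary mode, return its local path.
download_zip <- function(url) {
  dest <- tempfile(fileext = ".zip")
  download.file(url, dest, mode = "wb")  # "wb" keeps the bytes intact on Windows
  dest
}

# Hypothetical helper: read the first archive member whose name matches `pattern`.
read_zip_csv <- function(zip_path, pattern = "\\.csv$", ...) {
  members <- unzip(zip_path, list = TRUE)$Name  # names actually stored in the archive
  hits <- grep(pattern, members, value = TRUE)
  if (length(hits) == 0L) {
    stop("no member matches '", pattern, "'; archive contains: ",
         paste(members, collapse = ", "))
  }
  # read.csv opens the unz() connection and closes it when done
  read.csv(unz(zip_path, hits[1]), ...)
}

# Usage (needs network access; URL and file name prefix taken from the question):
# zip_path <- download_zip("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv")
# dat <- read_zip_csv(zip_path, pattern = "^gbr_Country.*\\.csv$", skip = 2)
# unlink(zip_path)
```

Matching by pattern rather than by an exact name is a design choice here: if the member names inside the archive differ from what you expect, the error message shows what the archive really contains.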

Answer 2

Almost everything is fine. In this case you only need to specify that it is a comma-separated file, for example by passing sep="," to read.table:

temp <- tempfile()
download.file("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv", 
              temp)
con <- unz(temp, "gbr_Country_en_csv_v2.csv")
dat <- read.table(con, header=T, skip=2, sep=",")
unlink(temp)

With this small change I can import your csv without any problems.

HTH, Luca
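To illustrate the effect of sep="," without needing a download, here is a small self-contained sketch. The sample lines are made up to mimic the layout implied by skip=2 (two preamble lines, then a header row); they are not real World Bank data.

```r
# Made-up sample: two preamble lines, a header row, then comma-separated data.
csv_lines <- c(
  "Data Source,Hypothetical sample",
  "Last Updated Date,2014-01-01",
  "Country Name,Country Code,Indicator Name,1960",
  "United Kingdom,GBR,\"Population, total\",52400000"
)

# read.table splits on whitespace by default, so sep = "," must be given explicitly.
dat <- read.table(text = csv_lines, header = TRUE, skip = 2, sep = ",")

# read.csv is read.table with comma-oriented defaults (sep = ",", header = TRUE).
dat2 <- read.csv(text = csv_lines, skip = 2)

str(dat)
```

Note that the quoted field "Population, total" stays in one column even though it contains a comma, and that the column header 1960 is turned into a syntactically valid name because check.names is TRUE by default.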

Answer 3

The World Bank Development Indicators can be obtained with the WDI package. For example,

library(WDI)
inds <- WDIsearch(field = "indicator")[, 1]
GB <- WDI("GB", indicator = inds)

For details, see the WDIsearch and WDI functions and the reference manual.
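For a narrower request than fetching every matching indicator, a sketch along the following lines may be enough. The wrapper name `get_uk_indicator`, the indicator code "SP.POP.TOTL" (total population) and the year range are assumptions chosen for illustration; they do not come from the original answer. The call needs the WDI package and network access.

```r
# Hypothetical wrapper around WDI::WDI() for a single indicator and year range.
get_uk_indicator <- function(indicator = "SP.POP.TOTL", start = 2000, end = 2010) {
  if (!requireNamespace("WDI", quietly = TRUE)) {
    stop("the WDI package is required: install.packages('WDI')")
  }
  # country takes ISO-2 codes; "GB" is the United Kingdom, as in the answer above
  WDI::WDI(country = "GB", indicator = indicator, start = start, end = end)
}

# Usage (needs network access):
# pop <- get_uk_indicator()
# head(pop)
```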

Using R to download zipped data file, extract, and import .csv
