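I am trying to download an SDMX data series from this website into a data frame: http://stats.oecd.org/Index.aspx?DatasetCode=MEI_CLI
The SDMX data URL is the one assigned to file in the code below.
This is what I have tried so far: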
library(XML2R)
file <- "http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MEI_CLI/LOLITOAA.AUS+AUT+BEL+CAN+CHL+CZE+DNK+EST+FIN+FRA+DEU+GRC+HUN+IRL+ISR+ITA+JPN+KOR+LUX+MEX+NLD+NZL+NOR+POL+PRT+SVK+SVN+ESP+SWE+CHE+TUR+GBR+USA+EA19+G4E+G-7+NAFTA+OECDE+OECD+ONM+A5M+BRA+CHN+IND+IDN+RUS+ZAF.M/all?startTime=2000-01&endTime=2015-05"
obs <- XML2Obs(file)
tables <- collapse_obs(obs)
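How can I collect the time, country, and data value into a data frame in one step?
I also need to be able to retrieve data starting from 2000 (the default settings only cover two years).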
Answer 0: (score: 3)
Try the rvest package with the original link:
library(rvest)
k1 <- read_html("http://stats.oecd.org/Index.aspx?DatasetCode=MEI_CLI") %>%
  html_table(fill = TRUE, header = FALSE) %>%
  .[[1]] # keep only the first table in the returned list
View(k1) # inspect the table; we only need row 4 (the header) and rows 8 through the last row
names(k1) <- unlist(k1[4, ]) # use row 4 as the column names
# Column 2 is not needed either; it is all NA
data <- k1[8:nrow(k1), -2] # final data
rm(k1) # k1 is no longer needed
head(data)
Country Australia Austria Belgium Canada Chile Czech Republic Denmark Estonia Finland France Germany Greece Hungary Ireland Israel
8 Jun-2013 99.5 99.9 99.4 99.6 100.2 98.1 99.5 97.9 99.6 99.1 100.0 99.1 98.8 100.0 100.0
9 Jul-2013 99.5 100.0 99.6 99.7 100.1 98.4 99.7 98.2 99.8 99.2 100.2 99.0 98.7 100.1 99.9
10 Aug-2013 99.5 100.2 99.8 99.8 100.0 98.7 100.0 98.8 99.9 99.4 100.5 99.0 98.7 100.2 99.8
11 Sep-2013 99.6 100.3 100.1 99.9 99.9 99.2 100.2 99.4 100.0 99.6 100.7 99.0 98.7 100.2 99.7
12 Oct-2013 99.6 100.5 100.3 100.0 99.8 99.6 100.3 99.8 100.2 99.7 100.9 99.1 98.7 100.2 99.7
13 Nov-2013 99.7 100.6 100.5 100.0 99.7 100.0 100.4 100.0 100.4 99.8 101.0 99.3 98.6 100.2 99.6
Italy Japan Korea Mexico Netherlands New Zealand Norway Poland Portugal Slovak Republic Slovenia Spain Sweden Switzerland Turkey
8 99.2 100.7 100.2 99.0 99.2 101.3 100.1 99.9 98.4 98.6 98.1 99.2 99.0 100.2 100.7
9 99.4 100.8 100.2 98.8 99.4 101.5 100.2 100.0 98.6 98.8 98.0 99.5 99.1 100.3 100.5
10 99.7 101.0 100.1 98.7 99.5 101.6 100.4 100.1 98.9 99.0 98.0 99.7 99.1 100.4 100.3
11 99.9 101.1 100.1 98.7 99.7 101.6 100.5 100.2 99.3 99.2 98.0 100.0 99.2 100.6 100.2
12 100.0 101.3 100.0 98.8 99.8 101.7 100.7 100.2 99.8 99.5 98.1 100.2 99.3 100.7 100.1
13 100.2 101.4 100.0 98.8 99.9 101.7 100.9 100.3 100.2 99.9 98.2 100.4 99.4 100.8 99.9
United Kingdom United States Euro area (19 countries) Four Big European G7 NAFTA OECD - Europe OECD - Total OECD + Major Six NME
8 100.2 100.4 99.4 99.7 100.2 100.2 99.6 100.0 100.0
9 100.4 100.4 99.6 99.9 100.3 100.2 99.8 100.1 100.0
10 100.6 100.4 99.8 100.1 100.4 100.2 100.0 100.2 100.1
11 100.8 100.4 100.0 100.3 100.5 100.2 100.1 100.3 100.1
12 100.9 100.4 100.2 100.5 100.5 100.2 100.3 100.3 100.2
13 101.0 100.4 100.4 100.6 100.6 100.2 100.4 100.4 100.2
Major Five Asia Brazil China (People's Republic of) India Indonesia Russia South Africa
8 100.3 99.5 100.5 99.0 100.4 99.1 100.6
9 100.3 99.4 100.5 99.0 100.0 99.2 100.6
10 100.3 99.3 100.5 98.9 99.5 99.3 100.6
11 100.2 99.3 100.5 98.9 99.1 99.4 100.6
12 100.2 99.3 100.4 98.8 98.9 99.6 100.5
13 100.2 99.3 100.3 98.8 98.7 99.7 100.4
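The table above is in wide form (one column per country), whereas the question asks for time, country, and value side by side. A minimal sketch of that reshaping in base R follows; it uses a small stand-in data frame named wide (a hypothetical name) with a few values copied from the output above, and assumes the scraped cells arrive as character strings.

```r
# Stand-in for the scraped table: the first column holds the period,
# each remaining column holds one country (values copied from the output above).
wide <- data.frame(
  Country   = c("Jun-2013", "Jul-2013"),
  Australia = c("99.5", "99.5"),
  Austria   = c("99.9", "100.0"),
  stringsAsFactors = FALSE
)

countries <- names(wide)[-1]

# Build one row per (time, country) pair. unlist() walks the country
# columns one after another, so time is repeated once per country and
# each country name is repeated once per row of the wide table.
long <- data.frame(
  time    = rep(wide[[1]], times = length(countries)),
  country = rep(countries, each = nrow(wide)),
  value   = as.numeric(unlist(wide[-1], use.names = FALSE)),
  stringsAsFactors = FALSE
)

print(long)
```

The same steps should apply to the full data object, with wide replaced by data, provided its first column is the period and every other column is a country or country group.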
If you are willing to put in the extra effort of typing in the header names yourself, you can use readHTMLTable from the XML package.
library(XML)
k2<-readHTMLTable("http://stats.oecd.org/Index.aspx?DatasetCode=MEI_CLI",header=FALSE) #first list gives the data
head(k2[[1]]) # as before you need to remove second column which is blank here
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19
1 Jun-2013 99.5 99.9 99.4 99.6 100.2 98.1 99.5 97.9 99.6 99.1 100.0 99.1 98.8 100.0 100.0 99.2 100.7
2 Jul-2013 99.5 100.0 99.6 99.7 100.1 98.4 99.7 98.2 99.8 99.2 100.2 99.0 98.7 100.1 99.9 99.4 100.8
3 Aug-2013 99.5 100.2 99.8 99.8 100.0 98.7 100.0 98.8 99.9 99.4 100.5 99.0 98.7 100.2 99.8 99.7 101.0
4 Sep-2013 99.6 100.3 100.1 99.9 99.9 99.2 100.2 99.4 100.0 99.6 100.7 99.0 98.7 100.2 99.7 99.9 101.1
5 Oct-2013 99.6 100.5 100.3 100.0 99.8 99.6 100.3 99.8 100.2 99.7 100.9 99.1 98.7 100.2 99.7 100.0 101.3
6 Nov-2013 99.7 100.6 100.5 100.0 99.7 100.0 100.4 100.0 100.4 99.8 101.0 99.3 98.6 100.2 99.6 100.2 101.4
V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38
1 100.2 99.0 99.2 101.3 100.1 99.9 98.4 98.6 98.1 99.2 99.0 100.2 100.7 100.2 100.4 99.4 99.7 100.2 100.2
2 100.2 98.8 99.4 101.5 100.2 100.0 98.6 98.8 98.0 99.5 99.1 100.3 100.5 100.4 100.4 99.6 99.9 100.3 100.2
3 100.1 98.7 99.5 101.6 100.4 100.1 98.9 99.0 98.0 99.7 99.1 100.4 100.3 100.6 100.4 99.8 100.1 100.4 100.2
4 100.1 98.7 99.7 101.6 100.5 100.2 99.3 99.2 98.0 100.0 99.2 100.6 100.2 100.8 100.4 100.0 100.3 100.5 100.2
5 100.0 98.8 99.8 101.7 100.7 100.2 99.8 99.5 98.1 100.2 99.3 100.7 100.1 100.9 100.4 100.2 100.5 100.5 100.2
6 100.0 98.8 99.9 101.7 100.9 100.3 100.2 99.9 98.2 100.4 99.4 100.8 99.9 101.0 100.4 100.4 100.6 100.6 100.2
V39 V40 V41 V42 V43 V44 V45 V46 V47 V48
1 99.6 100.0 100.0 100.3 99.5 100.5 99.0 100.4 99.1 100.6
2 99.8 100.1 100.0 100.3 99.4 100.5 99.0 100.0 99.2 100.6
3 100.0 100.2 100.1 100.3 99.3 100.5 98.9 99.5 99.3 100.6
4 100.1 100.3 100.1 100.2 99.3 100.5 98.9 99.1 99.4 100.6
5 100.3 100.3 100.2 100.2 99.3 100.4 98.8 98.9 99.6 100.5
6 100.4 100.4 100.2 100.2 99.3 100.3 98.8 98.7 99.7 100.4
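Tables read this way may come back with character or factor columns rather than dates and numbers. The sketch below shows one way to convert them; raw is a hypothetical stand-in for k2[[1]] after the blank column has been dropped, and parse_period is a hypothetical helper, not a function from the XML package.

```r
# Stand-in for the scraped table: a period column plus two value columns,
# stored as factors to mimic what an HTML table reader may return.
raw <- data.frame(
  V1 = c("Jun-2013", "Jul-2013"),
  V3 = factor(c("99.5", "99.5")),
  V4 = factor(c("99.9", "100.0")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn "Jun-2013" into a Date on the first of the month.
# month.abb is R's built-in vector of English month abbreviations, so the
# lookup does not depend on the session locale.
parse_period <- function(x) {
  x <- as.character(x)
  month <- match(substr(x, 1, 3), month.abb)
  year  <- as.integer(substr(x, 5, 8))
  as.Date(sprintf("%04d-%02d-01", year, month), format = "%Y-%m-%d")
}

clean <- raw
clean$V1 <- parse_period(raw$V1)
# Go through as.character() first so that factor columns yield their labels
# rather than their internal integer codes.
clean[-1] <- lapply(raw[-1], function(col) as.numeric(as.character(col)))

str(clean)
```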
Answer 1: (score: 3)
Since this is an SDMX data source, you can use the rsdmx package (available on CRAN).
All you need is the following:
myURL <- "http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MEI_CLI/LOLITOAA.AUS+AUT+BEL+CAN+CHL+CZE+DNK+EST+FIN+FRA+DEU+GRC+HUN+IRL+ISR+ITA+JPN+KOR+LUX+MEX+NLD+NZL+NOR+POL+PRT+SVK+SVN+ESP+SWE+CHE+TUR+GBR+USA+EA19+G4E+G-7+NAFTA+OECDE+OECD+ONM+A5M+BRA+CHN+IND+IDN+RUS+ZAF.M/all?startTime=2013-06&endTime=2015-05"
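library(rsdmx) # provides readSDMX()
# To retrieve data from 2000 onwards, set startTime=2000-01 in myURL
sdmx.obj <- readSDMX(myURL)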
sdmx.df <- as.data.frame(sdmx.obj)
head(sdmx.df)
That's it! Feel free to check the rsdmx wiki, which contains more examples.
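Because the time range is just a pair of query parameters in the URL, requesting data from 2000 comes down to changing startTime. A small sketch of assembling the URL is shown here; build_oecd_url is a hypothetical helper (not part of rsdmx), and the reading of the key as subject, locations, and frequency separated by dots is an assumption based on the shape of the URL used above.

```r
# Hypothetical helper (not part of rsdmx): assemble a GetData URL in the
# same shape as the one used above. The argument names are assumptions.
build_oecd_url <- function(dataset, subject, locations, freq, start, end) {
  # Locations are joined with "+", and the key parts are joined with ".".
  key <- paste(subject, paste(locations, collapse = "+"), freq, sep = ".")
  paste0("http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/",
         dataset, "/", key,
         "/all?startTime=", start, "&endTime=", end)
}

# Three countries only, to keep the example short.
url_2000 <- build_oecd_url("MEI_CLI", "LOLITOAA", c("AUS", "AUT", "USA"),
                           "M", "2000-01", "2015-05")
print(url_2000)
```

The resulting string can be passed to readSDMX() in place of myURL.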