Importing SDMX data into a data frame in R

Time: 2015-06-17 11:19:36

Tags: xml r

I am trying to download an SDMX data series from this website into a data frame: http://stats.oecd.org/Index.aspx?DatasetCode=MEI_CLI

The SDMX data URL is:

http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MEI_CLI/LOLITOAA.AUS+AUT+BEL+CAN+CHL+CZE+DNK+EST+FIN+FRA+DEU+GRC+HUN+IRL+ISR+ITA+JPN+KOR+LUX+MEX+NLD+NZL+NOR+POL+PRT+SVK+SVN+ESP+SWE+CHE+TUR+GBR+USA+EA19+G4E+G-7+NAFTA+OECDE+OECD+ONM+A5M+BRA+CHN+IND+IDN+RUS+ZAF.M/all?startTime=2000-01&endTime=2015-05

Then I tried:

library(XML2R)

file <- "http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MEI_CLI/LOLITOAA.AUS+AUT+BEL+CAN+CHL+CZE+DNK+EST+FIN+FRA+DEU+GRC+HUN+IRL+ISR+ITA+JPN+KOR+LUX+MEX+NLD+NZL+NOR+POL+PRT+SVK+SVN+ESP+SWE+CHE+TUR+GBR+USA+EA19+G4E+G-7+NAFTA+OECDE+OECD+ONM+A5M+BRA+CHN+IND+IDN+RUS+ZAF.M/all?startTime=2000-01&endTime=2015-05"

obs <- XML2Obs(file)
tables <- collapse_obs(obs)

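How can I now collect the time, country, and data values into a data frame?

I also need to be able to collect data from 2000 onward (the default setting only covers two years).

2 answers:

Answer 0 (score: 3)

Try the rvest package with the original link: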

library(rvest)
# read_html() replaces html(), which is deprecated in newer rvest versions
k1 <- read_html("http://stats.oecd.org/Index.aspx?DatasetCode=MEI_CLI") %>%
  html_table(fill = TRUE, header = FALSE) %>%
  .[[1]]  # keep only the first table in the list

View(k1)  # inspect the table: only row 4 (header) and rows 8 to the last row are needed
names(k1) <- as.character(unlist(k1[4, ]))  # use row 4 as the column names

# Column 2 is not needed either; it is all NA
data <- k1[8:nrow(k1), -2]  # final data
rm(k1)  # k1 is no longer needed

head(data)
    Country Australia Austria Belgium Canada Chile Czech Republic Denmark Estonia Finland France Germany Greece Hungary Ireland Israel
8  Jun-2013      99.5    99.9    99.4   99.6 100.2           98.1    99.5    97.9    99.6   99.1   100.0   99.1    98.8   100.0  100.0
9  Jul-2013      99.5   100.0    99.6   99.7 100.1           98.4    99.7    98.2    99.8   99.2   100.2   99.0    98.7   100.1   99.9
10 Aug-2013      99.5   100.2    99.8   99.8 100.0           98.7   100.0    98.8    99.9   99.4   100.5   99.0    98.7   100.2   99.8
11 Sep-2013      99.6   100.3   100.1   99.9  99.9           99.2   100.2    99.4   100.0   99.6   100.7   99.0    98.7   100.2   99.7
12 Oct-2013      99.6   100.5   100.3  100.0  99.8           99.6   100.3    99.8   100.2   99.7   100.9   99.1    98.7   100.2   99.7
13 Nov-2013      99.7   100.6   100.5  100.0  99.7          100.0   100.4   100.0   100.4   99.8   101.0   99.3    98.6   100.2   99.6
   Italy Japan Korea Mexico Netherlands New Zealand Norway Poland Portugal Slovak Republic Slovenia Spain Sweden Switzerland Turkey
8   99.2 100.7 100.2   99.0        99.2       101.3  100.1   99.9     98.4            98.6     98.1  99.2   99.0       100.2  100.7
9   99.4 100.8 100.2   98.8        99.4       101.5  100.2  100.0     98.6            98.8     98.0  99.5   99.1       100.3  100.5
10  99.7 101.0 100.1   98.7        99.5       101.6  100.4  100.1     98.9            99.0     98.0  99.7   99.1       100.4  100.3
11  99.9 101.1 100.1   98.7        99.7       101.6  100.5  100.2     99.3            99.2     98.0 100.0   99.2       100.6  100.2
12 100.0 101.3 100.0   98.8        99.8       101.7  100.7  100.2     99.8            99.5     98.1 100.2   99.3       100.7  100.1
13 100.2 101.4 100.0   98.8        99.9       101.7  100.9  100.3    100.2            99.9     98.2 100.4   99.4       100.8   99.9
   United Kingdom United States Euro area (19 countries) Four Big European    G7 NAFTA OECD - Europe OECD - Total OECD + Major Six NME
8           100.2         100.4                     99.4              99.7 100.2 100.2          99.6        100.0                100.0
9           100.4         100.4                     99.6              99.9 100.3 100.2          99.8        100.1                100.0
10          100.6         100.4                     99.8             100.1 100.4 100.2         100.0        100.2                100.1
11          100.8         100.4                    100.0             100.3 100.5 100.2         100.1        100.3                100.1
12          100.9         100.4                    100.2             100.5 100.5 100.2         100.3        100.3                100.2
13          101.0         100.4                    100.4             100.6 100.6 100.2         100.4        100.4                100.2
   Major Five Asia Brazil China (People's Republic of) India Indonesia Russia South Africa
8            100.3   99.5                        100.5  99.0     100.4   99.1        100.6
9            100.3   99.4                        100.5  99.0     100.0   99.2        100.6
10           100.3   99.3                        100.5  98.9      99.5   99.3        100.6
11           100.2   99.3                        100.5  98.9      99.1   99.4        100.6
12           100.2   99.3                        100.4  98.8      98.9   99.6        100.5
13           100.2   99.3                        100.3  98.8      98.7   99.7        100.4
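
The table above is in wide format (one column per country), while the question asks for time, country, and value. A minimal sketch of reshaping it to long format with base R follows. It runs on a small stand-in data frame, `wide`, which is a hypothetical subset copied from the output above and not the full scraped table; the names `wide`, `value_cols`, and `long` are illustrative only.

```r
# Small stand-in for the scraped table: same layout as `data` above,
# with a "Country" column holding the month and one column per country.
wide <- data.frame(
  Country   = c("Jun-2013", "Jul-2013"),
  Australia = c("99.5", "99.5"),
  Austria   = c("99.9", "100.0"),
  stringsAsFactors = FALSE
)

# Every column except the first holds the values for one country.
value_cols <- setdiff(names(wide), "Country")

# Stack the country columns into long format: one row per (time, country).
long <- data.frame(
  time    = rep(wide$Country, times = length(value_cols)),
  country = rep(value_cols, each = nrow(wide)),
  value   = as.numeric(unlist(wide[value_cols], use.names = FALSE)),
  stringsAsFactors = FALSE
)

long
```

The same steps should carry over to the full `data` table, since scraped cells typically arrive as character strings and need the `as.numeric()` conversion.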

If you are willing to make the extra effort of typing in the header names, you can use readHTMLTable from the XML package:

library(XML)
k2 <- readHTMLTable("http://stats.oecd.org/Index.aspx?DatasetCode=MEI_CLI", header = FALSE)  # the first list element holds the data
head(k2[[1]])  # as before, remove the second column, which is blank here

      V1 V2   V3    V4    V5    V6    V7    V8    V9   V10   V11  V12   V13  V14  V15   V16   V17   V18   V19
1 Jun-2013    99.5  99.9  99.4  99.6 100.2  98.1  99.5  97.9  99.6 99.1 100.0 99.1 98.8 100.0 100.0  99.2 100.7
2 Jul-2013    99.5 100.0  99.6  99.7 100.1  98.4  99.7  98.2  99.8 99.2 100.2 99.0 98.7 100.1  99.9  99.4 100.8
3 Aug-2013    99.5 100.2  99.8  99.8 100.0  98.7 100.0  98.8  99.9 99.4 100.5 99.0 98.7 100.2  99.8  99.7 101.0
4 Sep-2013    99.6 100.3 100.1  99.9  99.9  99.2 100.2  99.4 100.0 99.6 100.7 99.0 98.7 100.2  99.7  99.9 101.1
5 Oct-2013    99.6 100.5 100.3 100.0  99.8  99.6 100.3  99.8 100.2 99.7 100.9 99.1 98.7 100.2  99.7 100.0 101.3
6 Nov-2013    99.7 100.6 100.5 100.0  99.7 100.0 100.4 100.0 100.4 99.8 101.0 99.3 98.6 100.2  99.6 100.2 101.4
    V20  V21  V22   V23   V24   V25   V26  V27  V28   V29  V30   V31   V32   V33   V34   V35   V36   V37   V38
1 100.2 99.0 99.2 101.3 100.1  99.9  98.4 98.6 98.1  99.2 99.0 100.2 100.7 100.2 100.4  99.4  99.7 100.2 100.2
2 100.2 98.8 99.4 101.5 100.2 100.0  98.6 98.8 98.0  99.5 99.1 100.3 100.5 100.4 100.4  99.6  99.9 100.3 100.2
3 100.1 98.7 99.5 101.6 100.4 100.1  98.9 99.0 98.0  99.7 99.1 100.4 100.3 100.6 100.4  99.8 100.1 100.4 100.2
4 100.1 98.7 99.7 101.6 100.5 100.2  99.3 99.2 98.0 100.0 99.2 100.6 100.2 100.8 100.4 100.0 100.3 100.5 100.2
5 100.0 98.8 99.8 101.7 100.7 100.2  99.8 99.5 98.1 100.2 99.3 100.7 100.1 100.9 100.4 100.2 100.5 100.5 100.2
6 100.0 98.8 99.9 101.7 100.9 100.3 100.2 99.9 98.2 100.4 99.4 100.8  99.9 101.0 100.4 100.4 100.6 100.6 100.2
    V39   V40   V41   V42  V43   V44  V45   V46  V47   V48
1  99.6 100.0 100.0 100.3 99.5 100.5 99.0 100.4 99.1 100.6
2  99.8 100.1 100.0 100.3 99.4 100.5 99.0 100.0 99.2 100.6
3 100.0 100.2 100.1 100.3 99.3 100.5 98.9  99.5 99.3 100.6
4 100.1 100.3 100.1 100.2 99.3 100.5 98.9  99.1 99.4 100.6
5 100.3 100.3 100.2 100.2 99.3 100.4 98.8  98.9 99.6 100.5
6 100.4 100.4 100.2 100.2 99.3 100.3 98.8  98.7 99.7 100.4
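
As a rough sketch of the "typing in the header names" step, the snippet below drops the blank second column and assigns names by hand. It uses a hypothetical stand-in, `k2_tbl`, holding only the first few columns of the output above; the mapping of V3 and V4 to Australia and Austria is an assumption based on the column order in the rvest result.

```r
# Stand-in for head(k2[[1]]): V1 is the month, V2 is blank, V3 and V4 are
# the first two country columns (values copied from the output above).
k2_tbl <- data.frame(
  V1 = c("Jun-2013", "Jul-2013"),
  V2 = c("", ""),
  V3 = c("99.5", "99.5"),
  V4 = c("99.9", "100.0"),
  stringsAsFactors = FALSE
)

# Drop the blank second column, then name the remaining columns by hand.
k2_tbl <- k2_tbl[, -2]
names(k2_tbl) <- c("Time", "Australia", "Austria")

# Convert the value columns to numeric; as.character() guards against
# columns that were read in as factors.
k2_tbl[-1] <- lapply(k2_tbl[-1], function(x) as.numeric(as.character(x)))

str(k2_tbl)
```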
  

Answer 1 (score: 3)

For SDMX data sources, you can use the rsdmx package (available on CRAN).

All you need is the following:

myURL <- "http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MEI_CLI/LOLITOAA.AUS+AUT+BEL+CAN+CHL+CZE+DNK+EST+FIN+FRA+DEU+GRC+HUN+IRL+ISR+ITA+JPN+KOR+LUX+MEX+NLD+NZL+NOR+POL+PRT+SVK+SVN+ESP+SWE+CHE+TUR+GBR+USA+EA19+G4E+G-7+NAFTA+OECDE+OECD+ONM+A5M+BRA+CHN+IND+IDN+RUS+ZAF.M/all?startTime=2013-06&endTime=2015-05"
library(rsdmx)  # provides readSDMX()
sdmx.obj <- readSDMX(myURL)
sdmx.df <- as.data.frame(sdmx.obj)
head(sdmx.df)

That's it! Feel free to check out the rsdmx wiki, which contains more examples.
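
Since the question also asks for data starting in 2000, one option is to assemble the query URL from its parts so the time range is easy to change. The sketch below is only an illustration: `build_oecd_url` is a hypothetical helper (not part of rsdmx), and it simply follows the pattern of the URLs shown in this thread, with a shortened country list.

```r
# Hypothetical helper (not part of rsdmx): assembles a query URL that
# follows the pattern of the URLs used in this thread.
build_oecd_url <- function(countries, start, end,
                           dataset = "MEI_CLI",
                           subject = "LOLITOAA",
                           freq = "M") {
  # Key has the form SUBJECT.COUNTRY1+COUNTRY2+...FREQUENCY
  key <- paste(subject, paste(countries, collapse = "+"), freq, sep = ".")
  paste0("http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/",
         dataset, "/", key, "/all?startTime=", start, "&endTime=", end)
}

url_2000 <- build_oecd_url(c("AUS", "AUT", "BEL"), "2000-01", "2015-05")
url_2000

# With network access and rsdmx installed, the result could then be read with:
# library(rsdmx)
# sdmx.df <- as.data.frame(readSDMX(url_2000))
```

The network call is left commented out so the snippet runs offline; whether the service still answers at this address is not something the sketch can guarantee.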