Calculating growth rates for panel data

Asked: 2014-11-30 15:12:44

Tags: r

data("Grunfeld",package = "AER")
library("plm") 
gr <- subset(Grunfeld, firm %in% c("General Electric","General Motors"))
pgr <- plm.data(gr,index = c("firm","year"))

row.names   firm    year    invest  value   capital
1   1   General Motors  1935    317.6   3078.5  2.8
2   2   General Motors  1936    391.8   4661.7  52.6
3   3   General Motors  1937    410.6   5387.1  156.9
4   4   General Motors  1938    257.7   2792.2  209.2
5   5   General Motors  1939    330.8   4313.2  203.4
6   6   General Motors  1940    461.2   4643.9  207.2
7   7   General Motors  1941    512.0   4551.2  255.2
8   8   General Motors  1942    448.0   3244.1  303.7
9   9   General Motors  1943    499.6   4053.7  264.1
10  10  General Motors  1944    547.5   4379.3  201.6
11  11  General Motors  1945    561.2   4840.9  265.0
12  12  General Motors  1946    688.1   4900.9  402.2
13  13  General Motors  1947    568.9   3526.5  761.5
14  14  General Motors  1948    529.2   3254.7  922.4
15  15  General Motors  1949    555.1   3700.2  1020.1
16  16  General Motors  1950    642.9   3755.6  1099.0
17  17  General Motors  1951    755.9   4833.0  1207.7
18  18  General Motors  1952    891.2   4924.9  1430.5
19  19  General Motors  1953    1304.4  6241.7  1777.3
20  20  General Motors  1954    1486.7  5593.6  2226.3
21  41  General Electric    1935    33.1    1170.6  97.8
22  42  General Electric    1936    45.0    2015.8  104.4
23  43  General Electric    1937    77.2    2803.3  118.0
24  44  General Electric    1938    44.6    2039.7  156.2
25  45  General Electric    1939    48.1    2256.2  172.6
26  46  General Electric    1940    74.4    2132.2  186.6
27  47  General Electric    1941    113.0   1834.1  220.9
28  48  General Electric    1942    91.9    1588.0  287.8
29  49  General Electric    1943    61.3    1749.4  319.9
30  50  General Electric    1944    56.8    1687.2  321.3
31  51  General Electric    1945    93.6    2007.7  319.6
32  52  General Electric    1946    159.9   2208.3  346.0
33  53  General Electric    1947    147.2   1656.7  456.4
34  54  General Electric    1948    146.3   1604.4  543.4
35  55  General Electric    1949    98.3    1431.8  618.3
36  56  General Electric    1950    93.5    1610.5  647.4
37  57  General Electric    1951    135.2   1819.4  671.3
38  58  General Electric    1952    157.3   2079.7  726.1
39  59  General Electric    1953    179.5   2371.6  800.3
40  60  General Electric    1954    189.6   2759.9  888.9

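That is the data. I want to compute the growth rate of invest, so the first value for General Motors should be NA and the first value for General Electric should be NA as well. In other words, I want to compute the growth rate by group.

If I use the following commands: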

pgr$invest_growth <- NA
pgr$invest_growth <- c(NA, diff(pgr$invest) / pgr$invest[-length(pgr$invest)])

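I get a result, but for the row named 41 (the first General Electric row) I get a growth rate computed between General Motors and General Electric, whereas I want the value in that row to be NA.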
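To make the problem concrete, here is a minimal sketch on a made-up two-firm data frame (the names toy, naive and fixed are illustrative and not part of the question). It assumes the rows are already sorted by firm and, within each firm, by time. The plain diff() runs straight across the firm boundary, and blanking the first row of each firm is one simple repair.

```r
# Made-up data: two firms, four periods each, sorted by firm
toy <- data.frame(
  firm   = rep(c("A", "B"), each = 4),
  invest = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Growth rate computed over the whole column, ignoring the firm
naive <- c(NA, diff(toy$invest) / toy$invest[-nrow(toy)])

# Row 5 is the first row of firm B, but it is not NA:
# it compares B's first value (5) with A's last value (4)
naive[5]

# Blank the first row of every firm (only valid because rows are sorted by firm)
first_of_firm <- !duplicated(toy$firm)
fixed <- replace(naive, first_of_firm, NA)
fixed
```

This repair only works for one-period growth rates on sorted data; the grouped approaches in the answers are more general.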

For example, given data like this:

id firm invest
1  A    1
2  A    2
3  A    3
4  A    4
1  B    5
2  B    6
3  B    7
4  B    8

I want to get:

id firm invest growth rate
1  A    1      NA
2  A    2      1
3  A    3      0.5
4  A    4      0.3333
1  B    5      NA
2  B    6      0.2
3  B    7      0.1666
4  B    8      0.14

So what is the command for this? Thanks a lot.

5 answers:

Answer 0 (score: 1)

I would use ave in base R and write my own function.

data("Grunfeld", package = "AER")
Grunfeld <- Grunfeld[order(Grunfeld$firm, Grunfeld$year), ]
myGR <- function(x, n=1) {
    c(rep(NA, n), diff(x, n) / head(x, -1*n))
}
Grunfeld$grInvest <- ave(Grunfeld$invest, Grunfeld$firm, FUN=myGR)   
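As a quick check of this approach, the sketch below applies the same myGR function to a made-up data frame whose values mirror the question's expected output (toy, growth and growth2 are illustrative names). It also shows one way to pass a different n through ave, using an anonymous wrapper function.

```r
# Same helper as in the answer: n-period growth rate, padded with n leading NAs
myGR <- function(x, n = 1) {
  c(rep(NA, n), diff(x, n) / head(x, -1 * n))
}

# Made-up data matching the small example in the question
toy <- data.frame(
  id     = rep(1:4, 2),
  firm   = rep(c("A", "B"), each = 4),
  invest = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# One-period growth rate within each firm
toy$growth <- ave(toy$invest, toy$firm, FUN = myGR)

# Two-period growth rate: wrap myGR so that n = 2 is passed along
toy$growth2 <- ave(toy$invest, toy$firm, FUN = function(v) myGR(v, n = 2))

toy
```

Because ave writes each group's result back into the positions that group occupies, the rows do not need to be sorted by firm, only ordered by time within each firm.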

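Edit: this fails if some years are missing (the approach is not aware of the year, only of relative position), but I usually fill those gaps in with expand.grid.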

fullPanel <- expand.grid(unique(Grunfeld$firm), min(Grunfeld$year):max(Grunfeld$year))
names(fullPanel) <- c("firm", "year")
GrunfeldAlt <- Grunfeld[-5, ]
GrunfeldAlt <- merge(GrunfeldAlt, fullPanel, all=TRUE)
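The snippet above stops after the merge. A minimal sketch of the remaining steps on made-up data follows (toy, toyGap and toyFull are illustrative names): once the missing firm-year is restored as an NA row, sorting and applying the same ave call yields NA on both sides of the gap instead of a growth rate that silently spans two years.

```r
myGR <- function(x, n = 1) {
  c(rep(NA, n), diff(x, n) / head(x, -1 * n))
}

# Made-up panel: two firms, years 2001 to 2004
toy <- data.frame(
  firm   = rep(c("A", "B"), each = 4),
  year   = rep(2001:2004, 2),
  invest = c(1, 2, 3, 4, 5, 6, 7, 8)
)
toyGap <- toy[-3, ]  # drop firm A, year 2003, to create a gap

# Every firm-year combination that should exist
fullPanel <- expand.grid(
  firm = unique(toyGap$firm),
  year = min(toyGap$year):max(toyGap$year),
  stringsAsFactors = FALSE
)

# Outer join on firm and year: the missing row comes back with invest = NA
toyFull <- merge(toyGap, fullPanel, all = TRUE)
toyFull <- toyFull[order(toyFull$firm, toyFull$year), ]

# The positional growth rate is now safe: the gap propagates as NA
toyFull$growth <- ave(toyFull$invest, toyFull$firm, FUN = myGR)
toyFull
```

Without the fill step, firm A's 2004 value would be compared with its 2002 value and reported as if it were a one-year growth rate.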

Answer 1 (score: 1)

Use the dplyr package.
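A minimal sketch of a grouped growth rate with dplyr, which may differ from what this answer originally showed. The toy data frame and the column name invest_growth are assumptions for illustration; arrange(), group_by(), mutate(), lag() and ungroup() are standard dplyr functions.

```r
library(dplyr)

# Made-up data: two firms, four periods each
toy <- data.frame(
  firm   = rep(c("A", "B"), each = 4),
  year   = rep(2001:2004, 2),
  invest = c(1, 2, 3, 4, 5, 6, 7, 8)
)

toy <- toy %>%
  arrange(firm, year) %>%                          # order by time within each firm
  group_by(firm) %>%                               # lag() now restarts for each firm
  mutate(invest_growth = invest / lag(invest) - 1) %>%
  ungroup()

toy
```

Like the positional approaches, this relies on the rows being consecutive in time within each firm; it does not check the year column for gaps.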

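Answer 2 (score: 0)

Edit: the DataCombine package provides the PercChange command. If you don't want percentages, you can also use its slide command.

library(DataCombine)  # provides PercChange() and slide()
data(mtcars)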

x<-split(mtcars,f=as.factor(mtcars$gear))

for(i in names(x)){
    x[[i]]<-PercChange(data=x[[i]],Var='mpg')
}

data<-unsplit(x,f=as.factor(mtcars$gear))

Answer 3 (score: 0)

There are many functions for computing returns, such as TTR::ROC, quantmod::dailyReturn and quantmod::Delt, but I'll use yours.

myReturn <- function(x) c(NA, diff(x)/x[-length(x)])

There are at least a dozen ways to do this. Search for "split, apply, combine", "group by", "by", "tapply", "ave". Here are a couple of ways.

data.table

library(data.table)
setDT(pgr) # convert to a data.table
pgr[, invest_growth:=myReturn(invest), by=firm] 
# if you don't want to create a column, pgr[, myReturn(invest), by=firm] will do.

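split, lapply

# unsplit() puts each firm's result back into that firm's original rows,
# so this does not depend on row order or on unused factor levels
pgr$invest_growth <- unsplit(lapply(split(pgr$invest, pgr$firm), myReturn), pgr$firm)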
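One thing to watch with split is that it returns the groups in group order, not in row order. The sketch below uses a made-up data frame whose rows are deliberately interleaved (toy, by_group_order and by_row_order are illustrative names) to show the difference between flattening the result with unlist and putting it back with unsplit.

```r
myReturn <- function(x) c(NA, diff(x) / x[-length(x)])

# Made-up data with the two firms interleaved, ordered by year within each firm
toy <- data.frame(
  firm   = c("B", "A", "B", "A", "B", "A"),
  year   = c(1, 1, 2, 2, 3, 3),
  invest = c(5, 1, 6, 2, 7, 3)
)

pieces <- lapply(split(toy$invest, toy$firm), myReturn)

# Flattening gives all of A's values first, then B's: wrong for these rows
by_group_order <- unlist(pieces, use.names = FALSE)

# unsplit() returns each value to the row it came from
by_row_order <- unsplit(pieces, toy$firm)

toy$invest_growth <- by_row_order
toy
```

When the data are already sorted by firm and the grouping factor has no unused levels, both versions agree; unsplit simply removes that precondition.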

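Answer 4 (score: 0)

The collapse R package, with its generic function fgrowth and the associated growth operator G, now provides a fast solution to this problem. It also has methods for plm objects: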

data("Grunfeld",package = "AER")
library("plm") 
gr <- subset(Grunfeld, firm %in% c("General Electric","General Motors"))
pgr <- pdata.frame(gr, index = c("firm","year"))

library(collapse)
# This computes the proper panel growth rate of all numeric variables
G(pgr)
head(G(pgr))
                              firm year  G1.invest   G1.value  G1.capital
General Motors-1935 General Motors 1935         NA         NA          NA
General Motors-1936 General Motors 1936  23.362720  51.427643 1778.571429
General Motors-1937 General Motors 1937   4.798367  15.560847  198.288973
General Motors-1938 General Motors 1938 -37.238188 -48.168774   33.333333
General Motors-1939 General Motors 1939  28.366317  54.473175   -2.772467
General Motors-1940 General Motors 1940  39.419589   7.667161    1.868240

# You could add those growth rates to the data.frame using
add_vars(pgr) <- G(pgr, keep.ids = FALSE)

# You can compute the growth rate of invest only, and turn off the automatic renaming, with
G(pgr, cols = "invest", stubs = FALSE)

                                  firm year     invest
General Motors-1935     General Motors 1935         NA
General Motors-1936     General Motors 1936  23.362720
General Motors-1937     General Motors 1937   4.798367
General Motors-1938     General Motors 1938 -37.238188
General Motors-1939     General Motors 1939  28.366317

# You could also compute the growth rate of the panel-series directly:
G(pgr$invest)
  General Motors-1935   General Motors-1936   General Motors-1937   General Motors-1938   General Motors-1939   General Motors-1940 
                   NA             23.362720              4.798367            -37.238188             28.366317             39.419589 
  General Motors-1941   General Motors-1942   General Motors-1943   General Motors-1944   General Motors-1945   General Motors-1946 
            11.014744            -12.500000             11.517857              9.587670              2.502283             22.612259 
  General Motors-1947   General Motors-1948   General Motors-1949   General Motors-1950   General Motors-1951   General Motors-1952 
           -17.323064             -6.978379              4.894180             15.816970             17.576606             17.899193 
  General Motors-1953   General Motors-1954 General Electric-1935 General Electric-1936 General Electric-1937 General Electric-1938 
            46.364452             13.975774                    NA             35.951662             71.555556            -42.227979 
General Electric-1939 General Electric-1940 General Electric-1941 General Electric-1942 General Electric-1943 General Electric-1944 
             7.847534             54.677755             51.881720            -18.672566            -33.297062             -7.340946 
General Electric-1945 General Electric-1946 General Electric-1947 General Electric-1948 General Electric-1949 General Electric-1950 
            64.788732             70.833333             -7.942464             -0.611413            -32.809296             -4.883011 
General Electric-1951 General Electric-1952 General Electric-1953 General Electric-1954 
            44.598930             16.346154             14.113160              5.626741 

fgrowth / G is implemented entirely in C++, so it outperforms the other solutions in terms of speed and can also be applied to much larger panels:

library(microbenchmark)

microbenchmark(G(pgr))
Unit: microseconds
   expr    min     lq     mean median      uq     max neval
 G(pgr) 47.302 48.642 54.48262 49.533 50.6495 186.086   100

By the way: with flag / L / F and fdiff / D, collapse also implements similar solutions for lags, leads, and lagged / leaded and iterated differences.
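As a rough illustration of those related functions, the sketch below calls flag, fdiff and fgrowth on plain vectors, passing the group through g and the time variable through t. The toy data frame and the new column names are assumptions for illustration. Since fgrowth returns percentages by default, the result is divided by 100 here to obtain fractions like those in the question.

```r
library(collapse)

# Made-up panel: two firms, years 2001 to 2004
toy <- data.frame(
  firm   = rep(c("A", "B"), each = 4),
  year   = rep(2001:2004, 2),
  invest = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Panel lag: previous year's value within the same firm
toy$lag_invest <- flag(toy$invest, 1, g = toy$firm, t = toy$year)

# Panel first difference within each firm
toy$diff_invest <- fdiff(toy$invest, 1, g = toy$firm, t = toy$year)

# Panel growth rate, converted from percent to a fraction
toy$growth <- fgrowth(toy$invest, 1, g = toy$firm, t = toy$year) / 100

toy
```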