Convert all columns in a list of data frames to time series

Asked: 2017-03-24 10:35:00

Tags: r list time-series lapply

I have a list like this:

df1 <- data.frame(var1 = 1:5, var2 = 6:10)
rownames(df1) <- 2001:2005
df2 <- data.frame(var5 = 21:25, var6 = 26:30)
rownames(df2) <- 2006:2010
mylist <- list(df1,df2)

> mylist
[[1]]
     var1 var2
2001    1    6
2002    2    7
2003    3    8
2004    4    9
2005    5   10

[[2]]
     var5 var6
2006   21   26
2007   22   27
2008   23   28
2009   24   29
2010   25   30

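How can I convert every column of each data frame to a time series, with start and end given by the minimum and maximum of the rownames of the respective data frame?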

I tried:

lapply(mylist, function(x) {apply(x, 2, function(y) ts(y, start = min(rownames(y), end = max(rownames(y)))))})

which results in:

 Error in if (nobs != ndata) data <- if (NCOL(data) == 1) { : 
  missing value where TRUE/FALSE needed 

but that error does not make any sense to me.

1 answer:

Answer 0 (score: 3)

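We can use lapply instead of apply to loop over the columns, because apply coerces the data frame to a matrix and returns a matrix, so the ts class is lost. Inside the inner function, y is a single column with no row names of its own, so the row names have to come from the data frame x. Also, row.names are character while ts expects numeric start and end values, so convert them with as.numeric before taking min/max.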

lst1 <- lapply(mylist, function(x) lapply(x, function(y) ts(y, 
      start = min(as.numeric(row.names(x))), end = max(as.numeric(row.names(x))))))
lst1[[1]][[1]]
#Time Series:
#Start = 2001 
#End = 2005 
#Frequency = 1 
#[1] 1 2 3 4 5
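
The reasons the original apply attempt fails can be checked directly. The following is only a small sketch; the names df, inner_rownames_null and m are illustrative and not part of the question.

```r
# Illustrative data frame with the same shape as df1 in the question
df <- data.frame(var1 = 1:5, var2 = 6:10)
rownames(df) <- 2001:2005

# Inside apply(), each column arrives as a vector without dimensions,
# so rownames(y) is NULL and cannot supply start/end
inner_rownames_null <- apply(df, 2, function(y) is.null(rownames(y)))

# Even with a valid start, apply() simplifies the ts results into a plain matrix
m <- apply(df, 2, function(y) ts(y, start = 2001))
is.matrix(m)  # the result is a matrix
is.ts(m)      # the ts class is gone

# row.names() of a data frame are character, hence the as.numeric() above
is.character(min(row.names(df)))
```

This is why the answer reads the row names from x (the data frame) rather than from y (the column).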

If we need the ts objects as columns, assign the output back to the data.frame so that the structure stays the same as before

 lst2 <- lapply(mylist, function(x) {
       x[] <- lapply(x, function(y) ts(y, start = min(as.numeric(row.names(x))),
                     end = max(as.numeric(row.names(x)))))
         x})
 str(lst2)
 #List of 2
 #$ :'data.frame':       5 obs. of  2 variables:
 # ..$ var1: Time-Series [1:5] from 2001 to 2005: 1 2 3 4 5
 # ..$ var2: Time-Series [1:5] from 2001 to 2005: 6 7 8 9 10
 #$ :'data.frame':       5 obs. of  2 variables:
 # ..$ var5: Time-Series [1:5] from 2006 to 2010: 21 22 23 24 25
 # ..$ var6: Time-Series [1:5] from 2006 to 2010: 26 27 28 29 30
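
As an alternative to the answer above (a different technique, not what the answer itself does), each data frame can be passed to ts as a whole, which yields one multivariate time series per list element. This is a sketch that assumes the row names are consecutive years with no gaps; the names lst3 and yrs are illustrative.

```r
# Rebuild the example list from the question
df1 <- data.frame(var1 = 1:5, var2 = 6:10)
rownames(df1) <- 2001:2005
df2 <- data.frame(var5 = 21:25, var6 = 26:30)
rownames(df2) <- 2006:2010
mylist <- list(df1, df2)

# ts() coerces a data frame to a numeric matrix, giving an "mts" object
lst3 <- lapply(mylist, function(x) {
  yrs <- as.numeric(row.names(x))  # character row names -> numeric years
  ts(x, start = min(yrs), end = max(yrs))
})

tsp(lst3[[2]])        # start, end and frequency of the second series
lst3[[1]][, "var1"]   # a single column is still a time series
```

With this form the columns share one time index, which can be convenient for plotting or windowing, whereas the list-of-ts and data.frame versions above keep each column as a separate object.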