I have a list like this:
df1 <- data.frame(var1 = 1:5, var2 = 6:10)
rownames(df1) <- 2001:2005
df2 <- data.frame(var5 = 21:25, var6 = 26:30)
rownames(df2) <- 2006:2010
mylist <- list(df1,df2)
> mylist
[[1]]
     var1 var2
2001    1    6
2002    2    7
2003    3    8
2004    4    9
2005    5   10

[[2]]
     var5 var6
2006   21   26
2007   22   27
2008   23   28
2009   24   29
2010   25   30
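How can I convert every column of every data frame into a time series, with start and end given by the minimum and maximum of the rownames of the corresponding data frame?
I tried: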
lapply(mylist, function(x) {apply(x, 2, function(y) ts(y, start = min(rownames(y), end = max(rownames(y)))))})
which results in:
Error in if (nobs != ndata) data <- if (NCOL(data) == 1) { :
missing value where TRUE/FALSE needed
but that does not make any sense to me.
Answer 0 (score: 3)
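We can use lapply instead of apply to loop over the columns. apply first converts the data frame to a matrix, so each column reaches the function as a plain vector without the data frame's row names, and its output is a matrix, so all classes (including ts) are lost. Also, row.names are stored as character, while ts needs numeric start and end values, so it is better to convert them with as.numeric before taking min/max. Note that the row names are taken from the data frame x, not from the column y.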
lst1 <- lapply(mylist, function(x) lapply(x, function(y) ts(y,
start = min(as.numeric(row.names(x))), end = max(as.numeric(row.names(x))))))
lst1[[1]][[1]]
#Time Series:
#Start = 2001
#End = 2005
#Frequency = 1
#[1] 1 2 3 4 5
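The two points above can be checked directly. The following is only a small sketch that rebuilds df1 from the question; the variable name yrs is introduced here for illustration and is not part of the original answer.

```r
# Rebuild the first data frame from the question
df1 <- data.frame(var1 = 1:5, var2 = 6:10)
rownames(df1) <- 2001:2005

# Row names are stored as character, even though they were assigned from integers
class(row.names(df1))

# Inside apply(), each column arrives as a plain vector,
# so rownames(y) is NULL for every column
apply(df1, 2, function(y) is.null(rownames(y)))

# Converting the row names first gives numeric bounds usable as ts() start/end
yrs <- as.numeric(row.names(df1))
range(yrs)
```

This is why the working code reads the row names from the data frame x and converts them before calling min and max.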
If we need the ts objects to stay as columns, assign the output back to the data.frame with x[] <- so that the structure is kept as before
lst2 <- lapply(mylist, function(x) {
x[] <- lapply(x, function(y) ts(y, start = min(as.numeric(row.names(x))),
end = max(as.numeric(row.names(x)))))
x})
str(lst2)
#List of 2
#$ :'data.frame': 5 obs. of 2 variables:
# ..$ var1: Time-Series [1:5] from 2001 to 2005: 1 2 3 4 5
# ..$ var2: Time-Series [1:5] from 2001 to 2005: 6 7 8 9 10
#$ :'data.frame': 5 obs. of 2 variables:
# ..$ var5: Time-Series [1:5] from 2006 to 2010: 21 22 23 24 25
# ..$ var6: Time-Series [1:5] from 2006 to 2010: 26 27 28 29 30
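Since the row-name conversion is repeated for start and end, one possible variation is to factor it into a small helper. This is a sketch only: to_ts_columns is a hypothetical name, and it assumes the row names are consecutive years that as.numeric can parse.

```r
# Hypothetical helper (not from the original answer): converts every column
# of one data frame to an annual ts, assuming row names are consecutive years
to_ts_columns <- function(x) {
  yrs <- as.numeric(row.names(x))   # character row names -> numeric years
  x[] <- lapply(x, function(y) ts(y, start = min(yrs), end = max(yrs)))
  x                                 # still a data.frame, columns are now ts
}

# Same input data as in the question
df1 <- data.frame(var1 = 1:5, var2 = 6:10)
rownames(df1) <- 2001:2005
df2 <- data.frame(var5 = 21:25, var6 = 26:30)
rownames(df2) <- 2006:2010
mylist <- list(df1, df2)

lst2 <- lapply(mylist, to_ts_columns)

# Inspect the time attributes (start, end, frequency) of one converted column
tsp(lst2[[2]]$var6)
```

Calling tsp on any column is a quick way to confirm that start and end match the row names of the data frame the column came from.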