Subsetting a list of data frames using a for loop

Time: 2014-01-15 19:05:04

Tags: r for-loop subset

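My question is: why does the last statement, "a <- ...", give me a subset of that one data frame in the list, while automating the same step with a for loop over all the data frames in the list gives me various warnings instead of the answer I am looking for?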

time <- c(1:20)
temp <- c(2,3,4,5,6,2,3,4,5,6,2,3,4,5,6,2,3,4,5,6)
data <- data.frame(time,temp)

tmp <- c(1,diff(data[[2]]))
tmp2 <- tmp < 0
tmp3 <- cumsum(tmp2)
data1 <- split(data, tmp3)

#this does not work.  I want to automate the successful process below through all data frames in the list "data1"
for(i in 1:length(data1)){
   finale[i] <- subset(data1[[i]], data1[[i]][,2] > 3)
}

#this works to give me a part of what I want
a <- subset(data1[[1]], data1[[1]][,2] >3)

2 answers:

Answer 0 (score: 1)

You may want to try lapply:

lapply(data1, function(x) subset(x, x[,2]>3))

Using a for loop gives the same result:

finale <- vector("list", length(data1))
for(i in 1:length(data1)){
  finale[[i]] <- subset(data1[[i]], data1[[i]][,2] > 3)
}

This works because I preallocated finale with a type (list) and a length. Your version failed because you never declared what finale should be.
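
As a minimal sketch of why the brackets matter, assuming a small made-up list `dfs` (hypothetical, not the data from the question): assigning with `[[i]]` stores the whole data frame as one list element, while `[i]` expects a replacement of length 1 and, given a two-column data frame, warns and keeps only the first column.

```r
# Hypothetical toy data, not from the question
dfs <- list(
  data.frame(time = 1:3, temp = c(2, 4, 6)),
  data.frame(time = 4:6, temp = c(3, 5, 1))
)

# Preallocate a list with one slot per data frame
good <- vector("list", length(dfs))
for (i in seq_along(dfs)) {
  # [[i]] replaces the i-th element with the whole data frame
  good[[i]] <- subset(dfs[[i]], temp > 3)
}

bad <- vector("list", length(dfs))
for (i in seq_along(dfs)) {
  # [i] wants a replacement of length 1, but a data frame is a list of its
  # columns (length 2 here); R warns and stores only the first column.
  # suppressWarnings() is used here only to keep the sketch quiet.
  suppressWarnings(bad[i] <- subset(dfs[[i]], temp > 3))
}

is.data.frame(good[[1]])  # TRUE
is.data.frame(bad[[1]])   # FALSE
```

`seq_along(dfs)` is used instead of `1:length(dfs)` so that the loop body is skipped when the list is empty.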

Answer 1 (score: 0)

You are trying to store a data.frame (a 2D object) in a vector (a 1D object). Just define finale as a list and the code will run:

time <- c(1:20)
temp <- c(2,3,4,5,6,2,3,4,5,6,2,3,4,5,6,2,3,4,5,6)
data <- data.frame(time,temp)

tmp <- c(1,diff(data[[2]]))
tmp2 <- tmp < 0
tmp3 <- cumsum(tmp2)
data1 <- split(data, tmp3)

#this now works: finale is defined as a list before looping over all data frames in the list "data1"
finale <- vector(mode='list')
for(i in 1:length(data1)){
   finale[[i]] <- subset(data1[[i]], data1[[i]][,2] > 3) # Use [[i]] instead of [i]
}

To combine everything into a single data.frame:

finale <- do.call(rbind, finale)
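
A minimal sketch of what this step does, using a hypothetical toy list `pieces` rather than the data above: `do.call(rbind, pieces)` is equivalent to calling `rbind(pieces[[1]], pieces[[2]], ...)`, so the rows of every data frame are stacked into one.

```r
# Hypothetical toy list of already-filtered data frames, not from the question
pieces <- list(
  data.frame(time = 3:5, temp = c(4, 5, 6)),
  data.frame(time = 8:10, temp = c(4, 5, 6))
)

# Same as rbind(pieces[[1]], pieces[[2]]): stack all rows into one data frame
combined <- do.call(rbind, pieces)

nrow(combined)   # 6
names(combined)  # "time" "temp"
```

If the list is named, as the result of `lapply` over a `split` result is, `rbind` prefixes each row name with the name of the list element it came from.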