R - Looping through multiple data frames in a list

Time: 2013-04-25 14:46:22

Tags: r loops

Update: result of dput(ldf[[1]])

No problem. Here it is:

     "A 04/18/2013 06:34:58 3D9.1C2D9F22C2", "A 04/18/2013 06:34:58 3D9.1C2D9F22C2",
     "A 04/18/2013 06:38:24 3D9.1C2DDAE977", "A 04/18/2013 06:42:38 3D9.1C2DA0E0B5",
     "A 04/18/2013 06:42:38 3D9.1C2DA0E0B5", "A 04/18/2013 07:07:49 3D9.1C2DD9D3CF",
     "A 04/18/2013 07:07:49 3D9.1C2DD9D3CF")

  • The problem may be that some rows do not contain all 4 variables.

-----------------------------------------------------------------------------

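I have gotten a lot of help with this puzzle from this forum, but I am still stuck. I am trying to loop through a list of 30 data frames whose data were read in from text files. I keep getting an error message, and the target data frame is empty at the end of the loop. Can anyone see where the problem is?

Here is some sample data: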

[73] "E 04/21/2013 14:05:01 3D9.1C2DF6F22D" "E 04/21/2013 14:05:01 3D9.1C2DF6F22D"
[75] "E 04/21/2013 14:47:54 3D9.1C2DF6F22D" "E 04/21/2013 14:47:54 3D9.1C2DF6F22D"

[[26]]
[1] "E 04/22/2013 17:07:02 3D9.1C2DDAC745" "E 04/22/2013 17:07:02 3D9.1C2DDAC745"
[3] "E 04/22/2013 17:07:02 3D9.1C2DDAC745"

[[27]]
[1] "F 04/17/2013 15:14:39 3D9.1C2D1DB26E" "F 04/17/2013 15:14:43 3D9.1C2D1DB26E"
[3] "F 04/17/2013 15:14:43 3D9.1C2D1DB26E" "F 04/17/2013 15:14:43 3D9.1C2D1DB26E"

Here is my loop code:

new <- data.frame()

for (i in 1:length(ldf)) {
 a[i] <- as.data.frame(ldf[i])
 a[i] <- as.data.frame(a[i][-1,])
 names(a[i]) <- "id"
 c[i] <- strsplit(as.character(a[i]$id)," ")
 reader[i] = sapply(c[i],function(x)x[1])
 date[i] = sapply(c[i],function(x)x[2])
 time[i] = sapply(c[i],function(x)x[3])
 code[i] = sapply(c[i],function(x)x[4])
 out[i] <- as.data.frame(cbind(reader[i],date[i],time[i],code[i]))

new <- rbind(new, out[i])
}

Here is the error message I get:

Error in [<-.data.frame(`*tmp*`, i, 
value = list(c..A.04.17.2013.12.24.07.3D9.1C2D1DB26E....A.04.17.2013.12.24.07.3D9.1C2D1DB26E... = c(1L,  
: replacement element 1 has 337 rows, need 394

Thanks!

2 answers:

Answer 0: (score: 1)

If I understand correctly, you want this:

ldf <- list(c("E 04/21/2013 14:05:01 3D9.1C2DF6F22D","E 04/21/2013 14:05:01 3D9.1C2DF6F22D","E 04/21/2013 14:47:54 3D9.1C2DF6F22D","E 04/21/2013 14:47:54 3D9.1C2DF6F22D"),
c("E 04/22/2013 17:07:02 3D9.1C2DDAC745","E 04/22/2013 17:07:02 3D9.1C2DDAC745","E 04/22/2013 17:07:02 3D9.1C2DDAC745"),
c("F 04/17/2013 15:14:39 3D9.1C2D1DB26E","F 04/17/2013 15:14:43 3D9.1C2D1DB26E","F 04/17/2013 15:14:43 3D9.1C2D1DB26E","F 04/17/2013 15:14:43 3D9.1C2D1DB26E"))

do.call(rbind,lapply(ldf,function(x) data.frame(do.call(rbind,strsplit(x," ")))))
   X1         X2       X3             X4
1   E 04/21/2013 14:05:01 3D9.1C2DF6F22D
2   E 04/21/2013 14:05:01 3D9.1C2DF6F22D
3   E 04/21/2013 14:47:54 3D9.1C2DF6F22D
4   E 04/21/2013 14:47:54 3D9.1C2DF6F22D
5   E 04/22/2013 17:07:02 3D9.1C2DDAC745
6   E 04/22/2013 17:07:02 3D9.1C2DDAC745
7   E 04/22/2013 17:07:02 3D9.1C2DDAC745
8   F 04/17/2013 15:14:39 3D9.1C2D1DB26E
9   F 04/17/2013 15:14:43 3D9.1C2D1DB26E
10  F 04/17/2013 15:14:43 3D9.1C2D1DB26E
11  F 04/17/2013 15:14:43 3D9.1C2D1DB26E

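Note that all columns are of class factor. (This holds for R versions before 4.0.0; from 4.0.0 onward, data.frame produces character columns unless stringsAsFactors = TRUE is set.)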
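To get character columns regardless of R version, and to give the columns meaningful names, one possible variation is sketched below. It is not part of the original answer. The column names reader, date, time and code are borrowed from the question's loop, parse_lines is a hypothetical helper name, and the sketch assumes every line has exactly four space-separated fields.

```r
# Sample data in the same shape as the answer's list of character vectors
ldf <- list(
  c("E 04/21/2013 14:05:01 3D9.1C2DF6F22D", "E 04/21/2013 14:47:54 3D9.1C2DF6F22D"),
  c("F 04/17/2013 15:14:39 3D9.1C2D1DB26E")
)

# parse_lines is a hypothetical helper: split each line on spaces and
# bind the pieces into a character data frame with named columns.
# Assumption: every line has exactly four fields.
parse_lines <- function(x) {
  parts <- do.call(rbind, strsplit(x, " ", fixed = TRUE))
  out <- data.frame(parts, stringsAsFactors = FALSE)
  names(out) <- c("reader", "date", "time", "code")
  out
}

# One data frame per list element, stacked into a single result
result <- do.call(rbind, lapply(ldf, parse_lines))
str(result)
```

Keeping the columns as character makes it easier to convert date and time later, for example with as.Date or as.POSIXct.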

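Answer 1: (score: 0)

Is ldf your list of data frames? If so, you are not indexing its elements correctly: single brackets return an object of class list, so you are operating on a list rather than on the data frame inside it. Look at this toy example: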

L <- list( x=matrix(1:4,nrow=2) , y=matrix(1:4,nrow=2) )
L
#$x
#    [,1] [,2]
#[1,]    1    3
#[2,]    2    4

#$y
#    [,1] [,2]
#[1,]    1    3
#[2,]    2    4

class(L[1])
[1] "list"
class(L[[1]])
[1] "matrix"
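Applied to a loop like the one in the question, the difference matters because ldf[i] is still a list of length one, while ldf[[i]] is the element itself. The following is a minimal sketch, not the asker's or the answerer's code. It assumes each element is a character vector of space-separated lines, as the dput output suggests; the names txt, fields, pieces and parsed are illustrative.

```r
ldf <- list(
  c("E 04/22/2013 17:07:02 3D9.1C2DDAC745", "E 04/22/2013 17:07:02 3D9.1C2DDAC745"),
  c("F 04/17/2013 15:14:39 3D9.1C2D1DB26E")
)

# Preallocate a list and fill it with one data frame per element
pieces <- vector("list", length(ldf))
for (i in seq_along(ldf)) {
  # [[i]] extracts the element itself; [i] would return a sub-list.
  # If the elements were one-column data frames instead, something like
  # as.character(ldf[[i]][[1]]) would be needed here (an assumption).
  txt <- ldf[[i]]
  fields <- strsplit(txt, " ", fixed = TRUE)
  pieces[[i]] <- data.frame(
    reader = sapply(fields, `[`, 1),
    date   = sapply(fields, `[`, 2),
    time   = sapply(fields, `[`, 3),
    code   = sapply(fields, `[`, 4),
    stringsAsFactors = FALSE
  )
}

# Combine once at the end instead of growing a data frame inside the loop
parsed <- do.call(rbind, pieces)
```

Collecting the pieces in a list and calling rbind once avoids the repeated copying that comes with rbind inside the loop.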

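You are also using a lot of looping constructs inside your loop, which does not make much sense: functions such as sapply are provided as conveniences so that you do not have to write the loop yourself. Using a subset of your data, you can use lapply to access each data.frame in the list, and then use apply on the columns of each data frame to run strsplit.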
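The code that originally followed is not available here, so the block below is only a sketch in the spirit of that suggestion. It swaps in a plain column extraction with [[1]] instead of apply, which is a substitution and not the answerer's method. It assumes each list element is a one-column data frame; split_one is a hypothetical helper, and the second sample row is deliberately made incomplete to show how rows with fewer than four fields can be padded with NA.

```r
# Assumed input: a list of one-column data frames
ldf <- list(
  data.frame(id = c("E 04/21/2013 14:05:01 3D9.1C2DF6F22D",
                    "E 04/21/2013 14:47:54"),          # made-up incomplete row
             stringsAsFactors = FALSE),
  data.frame(id = "F 04/17/2013 15:14:39 3D9.1C2D1DB26E",
             stringsAsFactors = FALSE)
)

# split_one is a hypothetical helper: split the single column on spaces
# and return a four-column character data frame
split_one <- function(df) {
  fields <- strsplit(as.character(df[[1]]), " ", fixed = TRUE)
  # Indexing 1:4 pads short rows with NA and drops any extra fields
  fields <- lapply(fields, function(f) f[1:4])
  out <- data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  names(out) <- c("reader", "date", "time", "code")
  out
}

# lapply visits each data frame; the results are stacked once at the end
combined <- do.call(rbind, lapply(ldf, split_one))

# Rows that were missing at least one field
incomplete <- combined[!complete.cases(combined), ]
```

Checking incomplete afterwards gives a quick way to see whether short rows, as suspected in the question's update, are present in the data.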