How to merge lists of data.frames by column in R

Asked: 2018-02-17 02:35:18

Tags: r loops merge

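I want to merge two data.frames taken from two different lists (by column) when the value of a column is the same in both. This is my solution, but it is very slow.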

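merged <- list()  # collects one merged data.frame per matching pair
for (j in seq_along(s49)) {
  for (i in seq_along(s39)) {
    # if the value of column "merge" is the same in both data.frames
    if (s39[[i]]$merge[1] == s49[[j]]$merge[1]) {
      # merge the data.frames and keep the result
      merged[[length(merged) + 1]] <- merge(s39[[i]], s49[[j]], by = "merge")
    }
  }
}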
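The nested loop compares every pair of data.frames, so its cost grows with length(s39) * length(s49). A minimal base R sketch of one alternative is to extract the key of each data.frame once and let match() find the partners. The toy s39 and s49 below are hypothetical stand-ins for the real lists. The sketch assumes each key occurs at most once in s39, and it stacks the matching pair with rbind() (as the rbind/split approach in this question does) rather than calling merge().

```r
# Toy stand-ins for s39 and s49 (hypothetical data, not the real flight records)
s39 <- list(
  data.frame(time = 1:2, merge = "EBAW LEIB"),
  data.frame(time = 3:4, merge = "EBAW LEMG")
)
s49 <- list(
  data.frame(time = 5:6, merge = "EBAW LEIB"),
  data.frame(time = 7:8, merge = "EBBR LEIB")
)

# Key of each data.frame = first value of its "merge" column
keys39 <- vapply(s39, function(d) as.character(d$merge[1]), character(1))
keys49 <- vapply(s49, function(d) as.character(d$merge[1]), character(1))

# For each element of s49, position of the matching element in s39 (NA if none)
idx <- match(keys49, keys39)
hit <- which(!is.na(idx))

# Stack each matching pair row-wise and name the result by its key
pairs <- Map(function(i, j) rbind(s39[[i]], s49[[j]]), idx[hit], hit)
names(pairs) <- keys49[hit]
```

Here pairs holds one data.frame per key present in both lists. If a key can appear several times within one list, match() only returns the first hit, so a split()-based grouping would be the safer choice.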

Edit

time      lat       lon      callsign   OR   DE ICAOType   merge
1504539460 39.02001 1.482148   JAF6LY EBAW LEIB     E190 EBAW LEIB
1504539475 51.16286 4.521561   JAF6LY EBAW LEIB     E190 EBAW LEIB
1504539497 51.15481 4.502335   JAF6LY EBAW LEIB     E190 EBAW LEIB
1504539519 51.14867 4.482498   JAF6LY EBAW LEIB     E190 EBAW LEIB
1504539541 51.14499 4.455566   JAF6LY EBAW LEIB     E190 EBAW LEIB

time        lat         lon      callsign   OR   DE ICAOType  merge
1504442638 36.72127 -4.42139880   JAF32X EBAW LEMG     E190  EBAW LEIB
1504442653 51.17394  4.54910278   JAF32X EBAW LEMG     E190  EBAW LEIB
1504442675 51.16878  4.57587990   JAF32X EBAW LEMG     E190  EBAW LEIB
1504442697 51.16277  4.60563660   JAF32X EBAW LEMG     E190  EBAW LEIB
1504442719 51.15363  4.63652740   JAF32X EBAW LEMG     E190  EBAW LEIB
1504442741 51.13408  4.64803335   JAF32X EBAW LEMG     E190  EBAW LEIB
1504442763 51.11506  4.62890625   JAF32X EBAW LEMG     E190  EBAW LEIB

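The first data.frame is part of the first list and the second data.frame is part of the second list. The column "merge" has the same value in both, so I want to merge them. I want to do this for all the data.frames in the lists.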

a<-do.call("rbind", s39)
b<-do.call("rbind", s49)
c<-rbind(a,b)
d<-split(c, c$merge)

This is another possible solution, but I have millions of records, so it would be very slow.

Reproducible example:

df <- data.frame(col1 = sample(c(1,2), 10, replace = TRUE),
                 col2 = as.factor(sample(10)), col3 = "a")
df2 <- data.frame(col1 = sample(c(1,2), 10, replace = TRUE),
                  col2 = as.factor(sample(10)), col3 = "b")
df3 <- data.frame(col1 = sample(c(1,2), 10, replace = TRUE),
                  col2 = as.factor(sample(10)), col3 = "c")
my.list <- list(df, df2, df3)

df4 <- data.frame(col1 = sample(c(1,2), 10, replace = TRUE),
                  col2 = as.factor(sample(10)), col3 = "c")
df5 <- data.frame(col1 = sample(c(1,2), 10, replace = TRUE),
                  col2 = as.factor(sample(10)), col3 = "d")
df6 <- data.frame(col1 = sample(c(1,2), 10, replace = TRUE),
                  col2 = as.factor(sample(10)), col3 = "a")
my.list2 <- list(df4, df5, df6)

# this is my solution (slow for millions of records)
a<-do.call("rbind", my.list)
b<-do.call("rbind", my.list2)
c<-rbind(a,b)
d<-split(c, c$col3)
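With many data.frames, do.call("rbind", ...) is often the slow step, since rbind on data.frames was not designed for binding long lists. One possible way to speed up the same bind-then-split idea, assuming the data.table package is installed (an assumption; the question itself only uses base R), is rbindlist(). The helper make_df below is hypothetical and only rebuilds the example lists with a fixed seed.

```r
library(data.table)

set.seed(1)  # fixed seed so the toy data is reproducible

# Hypothetical helper: one 10-row data.frame tagged with `tag` in col3
make_df <- function(tag) {
  data.frame(col1 = sample(c(1, 2), 10, replace = TRUE),
             col2 = as.factor(sample(10)),
             col3 = tag)
}
my.list  <- lapply(c("a", "b", "c"), make_df)
my.list2 <- lapply(c("c", "d", "a"), make_df)

# Bind all data.frames from both lists in a single pass
all_rows <- rbindlist(c(my.list, my.list2))

# Split into one data.table per distinct value of col3
d <- split(all_rows, by = "col3")
```

The result d is a named list of data.tables, one per value of col3. Whether this is fast enough for millions of records would need to be measured on the real data.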

Thank you very much.

0 answers:

No answers