Writing a nested for loop to concatenate rows that share a key in a data frame in R

Time: 2018-09-19 15:33:28

Tags: r loops

So, I have a key data frame of IDs:

IDs <- data.frame(c(123,456,789))

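I also have a data frame of split SQL queries that need to be concatenated (the queries were being truncated because of their length, so I had to split them into multiple parts):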

splitQueriesdf <- data.frame(ID = c(123,123,123,456,456,456,789,789,789), SplitQUery = c("SELECT", "* FROM", "tablename1","SELECT", "* FROM", "tablename2","SELECT", "* FROM", "tablename3"))

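I need to write a loop that concatenates the queries for the IDs present in the IDs data frame into a third data frame. nrow(IDs) will vary, so the loop needs to stay dynamic.

So I need the third data frame to look like: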

    ID  FullQuery
 1  123 SELECT * FROM tablename1
 2  456 SELECT * FROM tablename2
 3  789 SELECT * FROM tablename3

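My idea is that I need one loop over the length of IDs (so 3 iterations here) and a nested loop that appends the right rows together, but I am new to R and I am stuck. This is what I have so far: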

dataframe3= NULL
for (index in 1:nrow(IDs)){
   for (index2 in 1:nrow(splitQueriesdf)){ 
     dataframe3[index] <- rbind(splitQueriesdf[index2,4])
  }
}

Any help is greatly appreciated!

3 Answers:

Answer 0 (score: 2)

One option is aggregate from base R: group by "ID", then paste the "SplitQUery" column together.

splitQueriesdf$SplitQUery <- as.character(splitQueriesdf$SplitQUery)
aggregate(cbind(FullQuery = SplitQUery) ~ ID, splitQueriesdf,
          FUN = paste, collapse = ' ')
#  ID                FullQuery
#1 123 SELECT * FROM tablename1
#2 456 SELECT * FROM tablename2
#3 789 SELECT * FROM tablename3
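
The call above aggregates every ID that appears in splitQueriesdf. To keep only the IDs listed in the key data frame, as the question asks, the rows can be filtered first. A minimal sketch, assuming the key column in IDs is named ID (the question leaves it unnamed) and that the query parts are stored as character rather than factor:

```r
# Assumption: the key column is named ID; the question's IDs data frame has no column name
IDs <- data.frame(ID = c(123, 456, 789))
splitQueriesdf <- data.frame(
  ID = c(123, 123, 123, 456, 456, 456, 789, 789, 789),
  SplitQUery = c("SELECT", "* FROM", "tablename1",
                 "SELECT", "* FROM", "tablename2",
                 "SELECT", "* FROM", "tablename3"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose ID occurs in the key data frame
wanted <- splitQueriesdf[splitQueriesdf$ID %in% IDs$ID, ]

# collapse = " " is passed through aggregate() to paste(),
# so each group's parts are joined into a single string
result <- aggregate(SplitQUery ~ ID, data = wanted, FUN = paste, collapse = " ")
names(result)[2] <- "FullQuery"
print(result)
```

This relies on the parts of each query already being in the right row order within their ID, as they are in the question's data.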

Answer 1 (score: 1)

Using the data.table package, you can do the following:

library(data.table)
IDs <- data.frame(ID = c(123,456,789))
splitQueriesdf <- data.frame(ID = c(123,123,123,456,456,456,789,789,789), SplitQUery = c("SELECT", "* FROM", "tablename1","SELECT", "* FROM", "tablename2","SELECT", "* FROM", "tablename3"))

setDT(splitQueriesdf)
# Name the result column explicitly; without a name data.table would call it V1
splitQueriesdf[ID %in% IDs$ID, .(FullQuery = paste(SplitQUery, collapse = " ")), by = .(ID)]

   ID                FullQuery
1: 123 SELECT * FROM tablename1
2: 456 SELECT * FROM tablename2
3: 789 SELECT * FROM tablename3

Answer 2 (score: 1)

Using the tidyverse:

library(dplyr)
splitQueriesdf %>% group_by(ID) %>% summarise(query=paste(SplitQUery,collapse=" "))
## A tibble: 3 x 2
#     ID query                   
#  <dbl> <chr>                   
#1   123 SELECT * FROM tablename1
#2   456 SELECT * FROM tablename2
#3   789 SELECT * FROM tablename3
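
If an explicit loop is still preferred, as in the question, a single loop over the IDs is enough and no nested loop is needed. The following is a minimal sketch rather than a definitive implementation; it assumes the key column in IDs is named ID and that the query parts are stored as character, and it reuses the question's dataframe3 name for the result:

```r
# Assumption: the key column is named ID; the question's IDs data frame has no column name
IDs <- data.frame(ID = c(123, 456, 789))
splitQueriesdf <- data.frame(
  ID = c(123, 123, 123, 456, 456, 456, 789, 789, 789),
  SplitQUery = c("SELECT", "* FROM", "tablename1",
                 "SELECT", "* FROM", "tablename2",
                 "SELECT", "* FROM", "tablename3"),
  stringsAsFactors = FALSE
)

# Preallocate one row per ID, so the result size follows nrow(IDs)
dataframe3 <- data.frame(
  ID = IDs$ID,
  FullQuery = character(nrow(IDs)),
  stringsAsFactors = FALSE
)

# seq_len() is used instead of 1:nrow(IDs) so that an empty IDs data frame is handled safely
for (i in seq_len(nrow(IDs))) {
  # Logical indexing picks out every part belonging to the current ID
  parts <- splitQueriesdf$SplitQUery[splitQueriesdf$ID == IDs$ID[i]]
  dataframe3$FullQuery[i] <- paste(parts, collapse = " ")
}

print(dataframe3)
```

The inner loop from the question is replaced by the logical index splitQueriesdf$ID == IDs$ID[i], which selects all matching rows in one step. An ID with no matching rows ends up with an empty string in FullQuery.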