Combining two data frames and aggregating

Time: 2017-03-23 03:05:38

Tags: r dataframe merge aggregate

I have 2 data frames in the following format:

dt1

id     col1    col2    col3    col4 
___    ____    ____    _____   _____
 1      2       3       1       2
 2      3       4       1       1
 3      1       1       1       1
 4      1       2       1       2
 5      1       1       1       1
 6      1       2       1       2

dt2 

id     col1    col2    col3    col4 
___    ____    ____    _____   _____
 1      1       3       1       2
 2      3       4       1       0
 4      1       1       1       1
 6      1       2       1       2
 9      2       1       1       1
12      1       2       1       2

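I want to combine these two data frames by id and aggregate them (summing the values for matching ids), so that the resulting data frame looks like dt3:

dt3

id     col1    col2    col3    col4
___    ____    ____    _____   _____
 1      3       6       2       4
 2      6       8       2       1
 3      1       1       1       1
 4      2       3       2       3
 5      1       1       1       1
 6      2       4       2       4
 9      2       1       1       1
12      1       2       1       2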

I tried dt3=merge(dt1,dt2,all=TRUE) but it did not work. I also tried dt3=merge(dt1,dt2,by=id), which did not work either. Any help is appreciated.
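
As a sketch of why these calls do not produce dt3: merge() joins rows, it does not sum them. The example below assumes the sample data above (rebuilt inline) and assumes the second call was meant as by = "id" with the column name quoted. Without `by`, merge() matches on every shared column, so only rows that are identical in both frames collapse; with by = "id", the value columns end up side by side with .x and .y suffixes.

```r
# Sample data from the question, rebuilt inline so the sketch is self-contained
dt1 <- data.frame(id   = 1:6,
                  col1 = c(2L, 3L, 1L, 1L, 1L, 1L),
                  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
                  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
                  col4 = c(2L, 1L, 1L, 2L, 1L, 2L))
dt2 <- data.frame(id   = c(1L, 2L, 4L, 6L, 9L, 12L),
                  col1 = c(1L, 3L, 1L, 1L, 2L, 1L),
                  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
                  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
                  col4 = c(2L, 0L, 1L, 2L, 1L, 2L))

# No `by`: full outer join on all common columns (id, col1..col4).
# Only the id 6 rows are identical in both frames, so nothing is summed.
m_all <- merge(dt1, dt2, all = TRUE)

# by = "id": one row per id, with the value columns duplicated
# as col1.x, col1.y, ... instead of being added together.
m_id <- merge(dt1, dt2, by = "id", all = TRUE)

print(nrow(m_all))
print(names(m_id))
```

Either result would still need a separate summing step, which is what the answers below do after stacking the rows.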

4 Answers:

Answer 0: (score: 1)

We can use rbindlist from data.table and, after grouping by 'id', take the sum of each column:

library(data.table)
rbindlist(mget(paste0('dt', 1:2)))[, lapply(.SD, sum), by = id]
#    id col1 col2 col3 col4
#1:  1    3    6    2    4
#2:  2    6    8    2    1
#3:  3    1    1    1    1
#4:  4    2    3    2    3
#5:  5    1    1    1    1
#6:  6    2    4    2    4
#7:  9    2    1    1    1
#8: 12    1    2    1    2

Or with the tidyverse, using bind_rows, group_by and summarise_each:

library(dplyr)
bind_rows(dt1, dt2) %>% group_by(id) %>% summarise_each(funs(sum))

Answer 1: (score: 0)

The magic word you are looking for is rbind:

dt3 = rbind(dt1, dt2)

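Answer 2: (score: 0)

Since they have the same format and the columns match, stack them row by row.

dt3 <- data.frame(dt1)

dt3 <- rbind(dt3, dt2) # rbind stacks your observations row by row.

You can put it all on one line:

dt3 <- data.frame(rbind(dt1, dt2))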
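
Note that rbind() on its own only stacks the rows, so ids present in both frames appear twice in the result. A minimal base R sketch of the remaining aggregation step follows; it assumes the sample data from the question (rebuilt inline) and uses aggregate() with the formula interface to sum every other column by id. The name `stacked` is only illustrative.

```r
# Sample data from the question, rebuilt inline so the sketch is self-contained
dt1 <- data.frame(id   = 1:6,
                  col1 = c(2L, 3L, 1L, 1L, 1L, 1L),
                  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
                  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
                  col4 = c(2L, 1L, 1L, 2L, 1L, 2L))
dt2 <- data.frame(id   = c(1L, 2L, 4L, 6L, 9L, 12L),
                  col1 = c(1L, 3L, 1L, 1L, 2L, 1L),
                  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
                  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
                  col4 = c(2L, 0L, 1L, 2L, 1L, 2L))

# Step 1: stack the rows; ids 1, 2, 4 and 6 now occur twice
stacked <- rbind(dt1, dt2)

# Step 2: sum all non-id columns within each id
dt3 <- aggregate(. ~ id, data = stacked, FUN = sum)

print(dt3)
```

With this data the result has one row per distinct id, which matches the dt3 table requested in the question.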

Answer 3: (score: 0)

Here is a dplyr solution:

library(dplyr)
bind_rows(dt1, dt2) %>% group_by(id) %>% 
  summarise_all(sum)

Data

dt1  <- structure(
  list(id = 1:6, col1 = c(2L, 3L, 1L, 1L, 1L, 1L), 
       col2 = c(3L, 4L, 1L, 2L, 1L, 2L), 
       col3 = c(1L, 1L, 1L, 1L, 1L, 1L), 
       col4 = c(2L, 1L, 1L, 2L, 1L, 2L)), 
  .Names = c("id", "col1", "col2", "col3",  "col4"), 
  class = "data.frame", row.names = c(NA, -6L))


dt2 <- structure(
  list(id = c(1L, 2L, 4L, 6L, 9L, 12L), 
       col1 = c(1L, 3L, 1L, 1L, 2L, 1L), 
       col2 = c(3L, 4L, 1L, 2L, 1L, 2L), 
       col3 = c(1L, 1L, 1L, 1L, 1L, 1L), 
       col4 = c(2L, 0L, 1L, 2L, 1L, 2L)), 
  .Names = c("id", "col1", "col2", "col3", "col4"), 
  class = "data.frame", row.names = c(NA, -6L))
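
In dplyr 1.0.0 and later, across() is the documented replacement for scoped verbs such as summarise_all(). The following is a sketch of the same pipeline in that style, assuming such a dplyr version is installed and using the sample data rebuilt inline; it is an alternative spelling, not a change to the approach above.

```r
library(dplyr)

# Sample data from the question, rebuilt inline so the sketch is self-contained
dt1 <- data.frame(id   = 1:6,
                  col1 = c(2L, 3L, 1L, 1L, 1L, 1L),
                  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
                  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
                  col4 = c(2L, 1L, 1L, 2L, 1L, 2L))
dt2 <- data.frame(id   = c(1L, 2L, 4L, 6L, 9L, 12L),
                  col1 = c(1L, 3L, 1L, 1L, 2L, 1L),
                  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
                  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
                  col4 = c(2L, 0L, 1L, 2L, 1L, 2L))

# Stack the rows, group by id, then sum every non-grouping column.
# Inside summarise(), everything() skips the grouping column `id`.
dt3 <- bind_rows(dt1, dt2) %>%
  group_by(id) %>%
  summarise(across(everything(), sum))

print(dt3)
```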