I have two data frames in R:

ship_no bay_1 bay_2 bay_3 bay_5 bay_6
ABC 0 10 15 20 30
DEF 10 20 0 25 10
ERT 0 10 0 10 0

ship_no bay_1 bay_2 bay_7 bay_5 bay_6
ABC 10 10 10 0 0
DEF 10 10 0 15 10
ERT 0 0 0 10 0
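
For reproducibility, the two tables above could be built as follows. This is only a sketch: the names df1 and df2 are assumptions (chosen to match the names used in the answer), and integer columns are assumed.

```r
# First table; the name df1 is an assumption
df1 <- data.frame(
  ship_no = c("ABC", "DEF", "ERT"),
  bay_1   = c(0L, 10L, 0L),
  bay_2   = c(10L, 20L, 10L),
  bay_3   = c(15L, 0L, 0L),
  bay_5   = c(20L, 25L, 10L),
  bay_6   = c(30L, 10L, 0L),
  stringsAsFactors = FALSE
)

# Second table; the name df2 is an assumption.
# It has bay_7 where the first table has bay_3.
df2 <- data.frame(
  ship_no = c("ABC", "DEF", "ERT"),
  bay_1   = c(10L, 10L, 0L),
  bay_2   = c(10L, 10L, 0L),
  bay_7   = c(10L, 0L, 0L),
  bay_5   = c(0L, 15L, 10L),
  bay_6   = c(0L, 10L, 0L),
  stringsAsFactors = FALSE
)
```
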
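I want to merge them on the key column ship_no, summing the values of columns that share a name.
My desired data frame is: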
ship_no bay_1 bay_2 bay_3 bay_5 bay_6 bay_7
ABC 10 20 15 20 30 10
DEF 20 30 0 40 20 0
ERT 0 10 0 20 0 0
How can I do this in R?

Answer 0 (score: 2)
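We can place the datasets in a list, bind them with rbindlist (fill = TRUE fills columns missing from either dataset with NA), then group by "ship_no" and take the sum of the other columns: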
library(data.table)
rbindlist(list(df1, df2), fill = TRUE)[,lapply(.SD, sum, na.rm = TRUE) , ship_no]
# ship_no bay_1 bay_2 bay_3 bay_5 bay_6 bay_7
#1: ABC 10 20 15 20 30 10
#2: DEF 20 30 0 40 20 0
#3: ERT 0 10 0 20 0 0
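
To see why na.rm = TRUE is needed, it may help to look at the intermediate stacked table. The sketch below rebuilds the question's data (df1 and df2 are assumed names, with integer columns assumed) and splits the one-liner into two steps; the variable names stacked and result are illustrative only.

```r
library(data.table)

# Question data; df1/df2 are assumed names
df1 <- data.frame(ship_no = c("ABC", "DEF", "ERT"),
                  bay_1 = c(0L, 10L, 0L),   bay_2 = c(10L, 20L, 10L),
                  bay_3 = c(15L, 0L, 0L),   bay_5 = c(20L, 25L, 10L),
                  bay_6 = c(30L, 10L, 0L),  stringsAsFactors = FALSE)
df2 <- data.frame(ship_no = c("ABC", "DEF", "ERT"),
                  bay_1 = c(10L, 10L, 0L),  bay_2 = c(10L, 10L, 0L),
                  bay_7 = c(10L, 0L, 0L),   bay_5 = c(0L, 15L, 10L),
                  bay_6 = c(0L, 10L, 0L),   stringsAsFactors = FALSE)

# Step 1: stack the rows; a column absent from one input
# (bay_7 in df1, bay_3 in df2) is filled with NA for that input's rows
stacked <- rbindlist(list(df1, df2), fill = TRUE)
print(stacked[, .(ship_no, bay_3, bay_7)])

# Step 2: sum every remaining column within each ship_no,
# dropping the NAs introduced by the fill
result <- stacked[, lapply(.SD, sum, na.rm = TRUE), by = ship_no]
print(result)
```

Without na.rm = TRUE, the sums for bay_3 and bay_7 would come out as NA, because every ship has an NA in each of those columns after the fill.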
Another option is dplyr:
library(dplyr)
bind_rows(df1, df2) %>%
  group_by(ship_no) %>%
  # sum every non-grouping column, ignoring the NAs introduced by bind_rows()
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))
# A tibble: 3 x 7
# ship_no bay_1 bay_2 bay_3 bay_5 bay_6 bay_7
# <chr> <int> <int> <int> <int> <int> <int>
#1 ABC 10 20 15 20 30 10
#2 DEF 20 30 0 40 20 0
#3 ERT 0 10 0 20 0 0
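
If neither package is available, the same idea can probably be expressed in base R. This is an additional sketch, not part of the original answer: it assumes the question's data in df1 and df2 (assumed names), and pad_columns is a hypothetical helper introduced only for illustration. It fills missing columns with 0 instead of NA, so no na.rm is needed.

```r
# Question data; df1/df2 are assumed names
df1 <- data.frame(ship_no = c("ABC", "DEF", "ERT"),
                  bay_1 = c(0L, 10L, 0L),   bay_2 = c(10L, 20L, 10L),
                  bay_3 = c(15L, 0L, 0L),   bay_5 = c(20L, 25L, 10L),
                  bay_6 = c(30L, 10L, 0L),  stringsAsFactors = FALSE)
df2 <- data.frame(ship_no = c("ABC", "DEF", "ERT"),
                  bay_1 = c(10L, 10L, 0L),  bay_2 = c(10L, 10L, 0L),
                  bay_7 = c(10L, 0L, 0L),   bay_5 = c(0L, 15L, 10L),
                  bay_6 = c(0L, 10L, 0L),   stringsAsFactors = FALSE)

# Every column name that appears in either data frame
all_cols <- union(names(df1), names(df2))

# Hypothetical helper: add any missing column as 0, then
# put the columns in a common order so rbind() lines them up
pad_columns <- function(d, cols) {
  for (col in setdiff(cols, names(d))) d[[col]] <- 0L
  d[cols]
}

stacked <- rbind(pad_columns(df1, all_cols), pad_columns(df2, all_cols))

# Sum all other columns within each ship_no
result <- aggregate(. ~ ship_no, data = stacked, FUN = sum)
print(result)
```
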