Add the same column to a data frame again, based on a previous column

Time: 2020-04-16 15:13:53

Tags: r dataframe merge

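I want to merge two data frames on a common key column (the first column), but I then want to add the same column a second time, this time keyed on the second column of the merged result:

Code snippet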

clusering_matrix_example <- data.frame(BGC = c("BGC1", "BGC2", "BGC3", "BGC4"), Family = c("10","20","30","40"))
network_matrix_example <- data.frame(BGC1 = c("BGC1", "BGC1", "BGC1", "BGC2", "BGC2", "BGC2", "BGC3", "BGC3", "BGC3", "BGC4", "BGC4", "BGC4"),
                                     BGC2 = c("BGC2", "BGC3", "BGC4", "BGC1", "BGC3", "BGC4", "BGC1", "BGC2", "BGC4", "BGC1", "BGC2", "BGC3"),
                                     score = c(1,2,3,1,4,5,2,4,6,3,5,6))
network_output_example <- merge(network_matrix_example, clusering_matrix_example, by.x= "BGC1", by.y = "BGC")

network_output_example <- merge(network_matrix_example, clusering_matrix_example, by.x= "BGC2", by.y = "BGC")

Current output

BGC1    BGC2    score Family
BGC1    BGC2    1     10
BGC1    BGC3    2     10
BGC1    BGC4    3     10
BGC2    BGC1    1     20
BGC2    BGC3    4     20
BGC2    BGC4    5     20
BGC3    BGC1    2     30
BGC3    BGC2    4     30
BGC3    BGC4    6     30
BGC4    BGC1    3     40
BGC4    BGC2    5     40
BGC4    BGC3    6     40

Desired output

BGC1    BGC2    score Family1   Family2
BGC1    BGC2    1     10        20
BGC1    BGC3    2     10        30
BGC1    BGC4    3     10        40
BGC2    BGC1    1     20        10
BGC2    BGC3    4     20        30
BGC2    BGC4    5     20        40
BGC3    BGC1    2     30        10
BGC3    BGC2    4     30        20
BGC3    BGC4    6     30        40
BGC4    BGC1    3     40        10
BGC4    BGC2    5     40        20
BGC4    BGC3    6     40        30

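2 answers:

Answer 0: (score: 3)

The last column is missing because the second merge is run against the old data frame "network_matrix_example" instead of the newly merged "network_output_example", so the result of the first merge is overwritten.

The code should look like this: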

clusering_matrix_example <- data.frame(BGC = c("BGC1", "BGC2", "BGC3", "BGC4"), Family = c("10","20","30","40"))
network_matrix_example <- data.frame(BGC1 = c("BGC1", "BGC1", "BGC1", "BGC2", "BGC2", "BGC2", "BGC3", "BGC3", "BGC3", "BGC4", "BGC4", "BGC4"),
                                         BGC2 = c("BGC2", "BGC3", "BGC4", "BGC1", "BGC3", "BGC4", "BGC1", "BGC2", "BGC4", "BGC1", "BGC2", "BGC3"),
                                         score = c(1,2,3,1,4,5,2,4,6,3,5,6))
network_output_example <- merge(network_matrix_example, clusering_matrix_example, by.x= "BGC1", by.y = "BGC")

network_output_example <- merge(network_output_example, clusering_matrix_example, by.x= "BGC2", by.y = "BGC")
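With the default settings, the two merged columns come out as "Family.x" and "Family.y", and the key of the second merge moves to the first position. A minimal sketch of getting the column names and order from the desired output, assuming the `suffixes` argument of base `merge` is acceptable here (the intermediate name `first_merge_example` and `stringsAsFactors = FALSE` are my additions, not part of the original code):

```r
# Data from the question (identifiers kept as in the original).
clusering_matrix_example <- data.frame(
  BGC = c("BGC1", "BGC2", "BGC3", "BGC4"),
  Family = c("10", "20", "30", "40"),
  stringsAsFactors = FALSE
)
network_matrix_example <- data.frame(
  BGC1 = rep(c("BGC1", "BGC2", "BGC3", "BGC4"), each = 3),
  BGC2 = c("BGC2", "BGC3", "BGC4", "BGC1", "BGC3", "BGC4",
           "BGC1", "BGC2", "BGC4", "BGC1", "BGC2", "BGC3"),
  score = c(1, 2, 3, 1, 4, 5, 2, 4, 6, 3, 5, 6),
  stringsAsFactors = FALSE
)

# First merge: look up the family of the BGC1 column.
# first_merge_example is a hypothetical intermediate name.
first_merge_example <- merge(network_matrix_example, clusering_matrix_example,
                             by.x = "BGC1", by.y = "BGC")

# Second merge runs on the result of the first one. Both inputs now have a
# "Family" column, so suffixes turns them into Family1 (from x) and Family2 (from y).
network_output_example <- merge(first_merge_example, clusering_matrix_example,
                                by.x = "BGC2", by.y = "BGC",
                                suffixes = c("1", "2"))

# merge() puts the key column first and sorts by it; restore the original layout.
network_output_example <- network_output_example[
  order(network_output_example$BGC1, network_output_example$BGC2),
  c("BGC1", "BGC2", "score", "Family1", "Family2")
]
rownames(network_output_example) <- NULL
print(network_output_example)
```

The final reordering step is only cosmetic; the lookup itself is done by the two merges.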

Answer 1: (score: 1)

Hi, I don't know if this is the smartest way, but it gives the desired result:

library(dplyr)
#your line:
network_output_example <- merge(network_matrix_example, clusering_matrix_example, by.x= "BGC1", by.y = "BGC")

# add left_join:
network_output_example %>% left_join(clusering_matrix_example, by= c("BGC2"= "BGC"))
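Because every BGC appears exactly once in the lookup table, the same result can probably also be reached without any join. The following is only a sketch of that idea using base R's `match`; the helper `lookup_family` and the name `network_lookup_example` are hypothetical, and `stringsAsFactors = FALSE` is my addition:

```r
# Data from the question (identifiers kept as in the original).
clusering_matrix_example <- data.frame(
  BGC = c("BGC1", "BGC2", "BGC3", "BGC4"),
  Family = c("10", "20", "30", "40"),
  stringsAsFactors = FALSE
)
network_matrix_example <- data.frame(
  BGC1 = rep(c("BGC1", "BGC2", "BGC3", "BGC4"), each = 3),
  BGC2 = c("BGC2", "BGC3", "BGC4", "BGC1", "BGC3", "BGC4",
           "BGC1", "BGC2", "BGC4", "BGC1", "BGC2", "BGC3"),
  score = c(1, 2, 3, 1, 4, 5, 2, 4, 6, 3, 5, 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the Family value for each key.
# match() gives the row position of each key in the lookup table
# (NA for keys that are not present).
lookup_family <- function(keys) {
  clusering_matrix_example$Family[match(keys, clusering_matrix_example$BGC)]
}

# Work on a copy so the original data frame stays untouched.
network_lookup_example <- network_matrix_example
network_lookup_example$Family1 <- lookup_family(network_lookup_example$BGC1)
network_lookup_example$Family2 <- lookup_family(network_lookup_example$BGC2)
print(network_lookup_example)
```

Unlike `merge`, this keeps the rows in their original order and never changes the number of rows, which only holds as long as the keys in the lookup table are unique.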