dplyr::full_join not working as expected

Asked: 2017-12-11 16:01:45

Tags: r dataframe dplyr

The input is:

x <- data.frame(
   input.number = c(0,1,2,1,1),
   input.layer = c(0.0,0.0,0.0,0.0,0.5),
   output.number = c(1,1,1,1,1),
   output.layer = c(1.0,1.0,1.0,0.5,1.0),
   weights = c(-4.9076530, -2.8328544, -0.8687123, -2.8328544, -2.8328544)
)
y <- data.frame(
   input.number = 2,
   input.layer = 0,
   output.number = 1,
   output.layer = 0.5,
   weights = 0
)

I join them by running:

dplyr::full_join(x, y, by = c("input.number", "input.layer", "output.number", "output.layer"), suffix = c('','.dupe'))

The result is a data.frame with a duplicated column:

   input.number input.layer output.number output.layer    weights weights.dupe
 1            0         0.0             1          1.0 -4.9076530           NA
 2            1         0.0             1          1.0 -2.8328544           NA
 3            2         0.0             1          1.0 -0.8687123           NA
 4            1         0.0             1          0.5 -2.8328544           NA
 5            1         0.5             1          1.0 -2.8328544           NA
 6            2         0.0             1          0.5         NA            0

Since the new row is not a dupe, I was expecting something like this:

   input.number input.layer output.number output.layer    weights
 1            0         0.0             1          1.0 -4.9076530           
 2            1         0.0             1          1.0 -2.8328544           
 3            2         0.0             1          1.0 -0.8687123           
 4            1         0.0             1          0.5 -2.8328544           
 5            1         0.5             1          1.0 -2.8328544           
 6            2         0.0             1          0.5         0

1 answer:

Answer 0 (score: 1)

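full_join is working correctly. weights is not listed in by, so it is treated as an ordinary column and kept from both tables, which is why it picks up the suffix. However, since x and y have the same column names and you want to add a "new row" to the data frame, the bind_rows function from the dplyr package seems to be what you are looking for.

dplyr::bind_rows(x, y)
#   input.number input.layer output.number output.layer    weights
# 1            0         0.0             1          1.0 -4.9076530
# 2            1         0.0             1          1.0 -2.8328544
# 3            2         0.0             1          1.0 -0.8687123
# 4            1         0.0             1          0.5 -2.8328544
# 5            1         0.5             1          1.0 -2.8328544
# 6            2         0.0             1          0.5  0.0000000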

Or you can use the rbind function from base R.

rbind(x, y)
#   input.number input.layer output.number output.layer    weights
# 1            0         0.0             1          1.0 -4.9076530
# 2            1         0.0             1          1.0 -2.8328544
# 3            2         0.0             1          1.0 -0.8687123
# 4            1         0.0             1          0.5 -2.8328544
# 5            1         0.5             1          1.0 -2.8328544
# 6            2         0.0             1          0.5  0.0000000
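If a join is still needed, for example because y might contain key combinations that already exist in x, one possible approach is to keep the full_join and then fold the two weights columns into one. The sketch below is only an illustration, not part of the original answer: the names keys, joined and merged are made up here, and preferring x's weight over y's when both exist is an assumption about what is wanted.

```r
library(dplyr)

x <- data.frame(
  input.number = c(0, 1, 2, 1, 1),
  input.layer = c(0.0, 0.0, 0.0, 0.0, 0.5),
  output.number = c(1, 1, 1, 1, 1),
  output.layer = c(1.0, 1.0, 1.0, 0.5, 1.0),
  weights = c(-4.9076530, -2.8328544, -0.8687123, -2.8328544, -2.8328544)
)
y <- data.frame(
  input.number = 2, input.layer = 0, output.number = 1,
  output.layer = 0.5, weights = 0
)

# Hypothetical helper vector: the columns used as join keys.
keys <- c("input.number", "input.layer", "output.number", "output.layer")

# weights is not a key, so the join keeps one copy per table
# (weights from x, weights.dupe from y).
joined <- full_join(x, y, by = keys, suffix = c("", ".dupe"))

# Fold the two copies together: take x's weight where present,
# otherwise fall back to y's (assumed preference).
merged <- joined %>%
  mutate(weights = coalesce(weights, weights.dupe)) %>%
  select(-weights.dupe)
```

For this particular input, where no key combination in y already exists in x, merged should contain the same rows as bind_rows(x, y). The two approaches differ only when keys overlap: bind_rows would then append a second row, while the join plus coalesce keeps a single row per key combination.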
