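Merging data frames with join functions

Time: 2018-08-01 16:58:40

Tags: r join dplyr

I am trying to merge two data frames that have different numbers of rows and columns and share only a few variables. This is a dummy example; the actual data contains more than 300 rows and 100 columns.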

library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# modified mtcars data
# move the row names into a column first: as_tibble() drops row names by default in current tibble versions
modified.mtcars.df <- rownames_to_column(mtcars, var = "motor_cars")
modified.mtcars.df <- as_tibble(modified.mtcars.df)
modified.mtcars.df <- modified.mtcars.df %>% select(motor_cars, mpg, cyl, disp)

# delete some rows
modified.mtcars.df <- modified.mtcars.df[1:28, ]

# create an empty data frame with the same shape and column names as mtcars
empty.tb <- as_tibble(data.frame(matrix(NA_real_, nrow = nrow(mtcars), ncol = ncol(mtcars))))
colnames(empty.tb) <- colnames(mtcars)
empty.tb <- add_column(empty.tb, motor_cars = NA_character_, .before = 1)
empty.tb$motor_cars <- rownames(mtcars)
  

Merge the two data frames. The output should resemble mtcars, with the same number of rows and columns: 32 x 12.

new.mtcars.df <- full_join(empty.tb, modified.mtcars.df)

  

Joining, by = c("motor_cars", "mpg", "cyl", "disp"). Result: 60 x 12. The car rows are added with NA values; the last 4 rows of the original dataset are also NA.
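A likely reason for the 60 rows, sketched on toy data: when `by` is omitted, `full_join()` joins on every column the two tables share, so the NA placeholders in one table never match the real values in the other and every row from both sides is kept. The tables `x` and `y` and their values are hypothetical stand-ins, not the data from the question.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins: x plays the role of empty.tb, y of modified.mtcars.df
x <- tibble(key = c("a", "b", "c"), val = NA_real_, other = NA_real_)
y <- tibble(key = c("a", "b"), val = c(1, 2))

# No `by` given: the join uses all shared columns (key and val).
# val is NA in x but 1 or 2 in y, so no row of x matches a row of y.
natural <- full_join(x, y)

dim(natural)  # 3 unmatched rows from x plus 2 from y
```

With the question's data the same effect would give 32 + 28 = 60 rows.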

new.mtcars1.df <- full_join(empty.tb, modified.mtcars.df, by = "motor_cars")

  

32 x 15. The rows are merged, but 3 variables are added with a .y suffix. Most of the values are NA.
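The extra columns appear because a join on a single key keeps both copies of every other shared column and disambiguates them with `.x` and `.y` suffixes. A minimal sketch on hypothetical toy tables (`x`, `y`, `key`, `val` and `other` are illustrative names only) shows the effect and one possible way to collapse each pair with `coalesce()`:

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for the two tables in the question
x <- tibble(key = c("a", "b", "c"), val = NA_real_, other = NA_real_)
y <- tibble(key = c("a", "b"), val = c(1, 2))

# Joining on the key alone matches rows, but `val` exists on both sides
keyed <- full_join(x, y, by = "key")
names(keyed)  # "key" "val.x" "other" "val.y"

# Take the first non-NA value of each .x/.y pair, then drop the pair
merged <- keyed %>%
  mutate(val = coalesce(val.x, val.y)) %>%
  select(key, val, other)
```

For the question's data this would mean one `coalesce()` call per shared column (mpg, cyl, disp), which may get tedious with 100 columns.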

new.mtcars2.df <- full_join(modified.mtcars.df, empty.tb, by = "motor_cars")

  

32 x 15

new.mtcars3.df <- full_join(empty.tb, modified.mtcars.df, by = "motor_cars")

  

32 x 15
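One approach that might give the desired 32 x 12 shape is `rows_patch()`, which overwrites only the NA cells of the first table with matching values from the second. This is a sketch, assuming dplyr 1.0.0 or later and assuming every key in the smaller table also exists in the larger one; the inputs are rebuilt in compact form here, and `full.tb` and `patched` are names introduced only for illustration.

```r
library(dplyr)
library(tibble)

# Rebuild the two inputs from the question in compact form
full.tb <- as_tibble(rownames_to_column(mtcars, var = "motor_cars"))
modified.mtcars.df <- full.tb[1:28, c("motor_cars", "mpg", "cyl", "disp")]
empty.tb <- full.tb %>%
  mutate(across(-motor_cars, ~ rep(NA_real_, length(.x))))

# Fill the NA cells of empty.tb with the values from modified.mtcars.df,
# matching rows on the key column; no new rows or columns are added
patched <- rows_patch(empty.tb, modified.mtcars.df, by = "motor_cars")

dim(patched)  # 32 rows, 12 columns
```

Rows 29 to 32 and the columns missing from `modified.mtcars.df` stay NA, since there is nothing to patch them with.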

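I tried all combinations of inner_join, anti_join, left_join, semi_join and right_join, and also swapped modified.mtcars.df and empty.tb as x and y in each of those functions.

Any help would be greatly appreciated. Thanks.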

0 answers:

No answers