How do I merge two data frames in R that have the same column names and may share values in one variable?

Time: 2020-04-08 13:24:04

Tags: r

How can I merge these two data frames?

df1:

Name   Type   Price
A       1      NA
B       2      2.5
C       3      2.0

df2:

Name   Type   Price
A       1      1.5
D       2      2.5
E       3      2.0

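As you can see, both data frames have the same column names, and one value in Name, A, appears in both. df1 has no price for A, while df2 does. I would like to get the output below, so that rows with the same value in Name are merged: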

Name   Type   Price
A       1      1.5
B       2      2.5
C       3      2.0
D       2      2.5
E       3      2.0

3 answers:

Answer 0 (score: 0)

We can do a full_join of df1 and df2 on Name, then use coalesce on Type and Price to take the first non-NA value from each pair of columns.

library(dplyr)

full_join(df1, df2, by = 'Name') %>%
   mutate(Type = coalesce(Type.x, Type.y), 
          Price = coalesce(Price.x, Price.y)) %>%
   select(names(df1))

#  Name Type Price
#1    A    1   1.5
#2    B    2   2.5
#3    C    3   2.0
#4    D    2   2.5
#5    E    3   2.0

Similarly, in base R:

transform(merge(df1, df2, by = 'Name', all = TRUE), 
           Price = ifelse(is.na(Price.x), Price.y, Price.x), 
           Type = ifelse(is.na(Type.x), Type.y, Type.x))[names(df1)]
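To make the base R version easier to follow, here is a minimal sketch that splits it into steps and shows the suffixed columns that merge creates. It assumes the same df1 and df2 as in the question, built here with character Name columns rather than factors; the names merged and result are only for illustration.

```r
# Sample data from the question (Name kept as character for simplicity)
df1 <- data.frame(Name = c("A", "B", "C"), Type = 1:3,
                  Price = c(NA, 2.5, 2), stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("A", "D", "E"), Type = 1:3,
                  Price = c(1.5, 2.5, 2), stringsAsFactors = FALSE)

# Full outer join on Name; the non-key columns that exist in both
# inputs get the suffixes .x (from df1) and .y (from df2)
merged <- merge(df1, df2, by = "Name", all = TRUE)
names(merged)
# "Name" "Type.x" "Price.x" "Type.y" "Price.y"

# Prefer the df1 value and fall back to df2 when it is NA
merged$Type  <- ifelse(is.na(merged$Type.x),  merged$Type.y,  merged$Type.x)
merged$Price <- ifelse(is.na(merged$Price.x), merged$Price.y, merged$Price.x)

# Keep only the original columns, in the original order
result <- merged[names(df1)]
result
```

Note that this prefers df1 whenever both data frames carry a non-NA value for the same Name; swap the .x and .y arguments if df2 should win instead.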

Data

df1 <- structure(list(Name = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), Type = 1:3, Price = c(NA, 2.5, 2)), 
class = "data.frame", row.names = c(NA, -3L))

df2 <- structure(list(Name = structure(1:3, .Label = c("A", "D", "E"
), class = "factor"), Type = 1:3, Price = c(1.5, 2.5, 2)), 
class = "data.frame", row.names = c(NA, -3L))

Answer 1 (score: 0)

It seems you want to row-bind the data frames together, then remove the rows where Price is NA and sort by Name.

library(data.table)

setDT(rbind(df1, df2))[!is.na(Price)][order(Name)]
#    Name Type Price
# 1:    A    1   1.5
# 2:    B    2   2.5
# 3:    C    3   2.0
# 4:    D    2   2.5
# 5:    E    3   2.0
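The same bind, filter, sort sequence can also be sketched in base R without data.table. This is only an illustration on the question's sample data (built here with character Name columns); the names stacked, kept and result are made up for the example.

```r
# Sample data from the question (Name kept as character for simplicity)
df1 <- data.frame(Name = c("A", "B", "C"), Type = 1:3,
                  Price = c(NA, 2.5, 2), stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("A", "D", "E"), Type = 1:3,
                  Price = c(1.5, 2.5, 2), stringsAsFactors = FALSE)

# Stack the two data frames on top of each other
stacked <- rbind(df1, df2)

# Drop the rows that have no price
kept <- stacked[!is.na(stacked$Price), ]

# Sort by Name and reset the row names
result <- kept[order(kept$Name), ]
rownames(result) <- NULL
result
```

This approach relies on each Name having exactly one non-NA price across the two data frames. A Name whose price is NA in both would disappear from the result, and a Name with a price in both would appear twice.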

Answer 2 (score: 0)

Here is a base R solution using merge + complete.cases:

dfout <- subset(u <- merge(df1,df2,all= TRUE),complete.cases(u))

which gives

> dfout
  Name Type Price
1    A    1   1.5
3    B    2   2.5
4    C    3   2.0
5    D    2   2.5
6    E    3   2.0
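The gap in the row names above (there is no row 2) comes from the intermediate merge. As a rough sketch on the question's sample data (built here with character Name columns, with u as in the answer): when merge is called without by, it joins on all shared columns, so the two A rows do not match each other and both survive until complete.cases filters out the one with the missing price.

```r
# Sample data from the question (Name kept as character for simplicity)
df1 <- data.frame(Name = c("A", "B", "C"), Type = 1:3,
                  Price = c(NA, 2.5, 2), stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("A", "D", "E"), Type = 1:3,
                  Price = c(1.5, 2.5, 2), stringsAsFactors = FALSE)

# No `by` argument: merge joins on every common column (Name, Type, Price)
u <- merge(df1, df2, all = TRUE)
nrow(u)                  # 6 rows: A appears twice, once with Price NA

# complete.cases flags the rows that contain no NA at all
keep <- complete.cases(u)
sum(keep)                # 5 rows remain

dfout <- u[keep, ]
dfout
```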

Data

df1 <- structure(list(Name = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), Type = 1:3, Price = c(NA, 2.5, 2)), 
class = "data.frame", row.names = c(NA, -3L))

df2 <- structure(list(Name = structure(1:3, .Label = c("A", "D", "E"
), class = "factor"), Type = 1:3, Price = c(1.5, 2.5, 2)), 
class = "data.frame", row.names = c(NA, -3L))