Merge two tables and keep the smaller value in R

Time: 2019-07-25 09:23:45

Tags: r merge

I want to merge two tables and, for each Task they have in common, keep the smaller value. The tables look like this:

x<-data.frame("Task"=c("A","B","C","D","E"),"FC"=c(12,NA,15,14,NA),FH=c(13,15,NA,17,20))
  Task FC FH
1    A 12 13
2    B NA 15
3    C 15 NA
4    D 14 17
5    E NA 20
y<-data.frame("Task"=c("B","C","F","G"),"FC"=c(NA,12,20,NA),FH=c(NA,17,18,NA))
  Task FC FH
1    B NA NA
2    C 12 17
3    F 20 18
4    G NA NA


How can I use the `merge` function to get a result like this:
  Task FC FH
1    A 12 13
2    B NA 15
3    C 12 17
4    D 14 17
5    E NA 20
6    F 20 18
7    G NA NA

2 answers:

Answer 0 (score: 4)

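One option is to do a full join and then keep the minimum of each column for each Task:

# merge() with all = TRUE joins on every shared column (Task, FC, FH),
# so the result is effectively the union of the rows of x and y
out <- aggregate(. ~ Task, merge(x, y, all = TRUE), min, na.rm = TRUE, na.action = "na.pass")
out

#  Task  FC  FH
#1    A  12  13
#2    B Inf  15
#3    C  12  17
#4    D  14  17
#5    E Inf  20
#6    F  20  18
#7    G Inf Inf

Where a Task has no non-missing values, min() returns Inf (with a warning) instead of NA, but these can be changed back to NA if needed:

out[out == Inf] <- NA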

The dplyr equivalent is:

library(dplyr)

full_join(x, y) %>%
   group_by(Task) %>%
   summarise_all(min, na.rm = TRUE)
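
Both versions above return Inf for a Task whose values are all missing. As a minimal sketch of one way to get NA directly, assuming base R only: the helper name min_or_na is hypothetical (it is not part of the original answer), and rbind() is swapped in for the full join, which is enough here because both tables have the same columns.

```r
# Sample data from the question
x <- data.frame(Task = c("A", "B", "C", "D", "E"),
                FC = c(12, NA, 15, 14, NA),
                FH = c(13, 15, NA, 17, 20))
y <- data.frame(Task = c("B", "C", "F", "G"),
                FC = c(NA, 12, 20, NA),
                FH = c(NA, 17, 18, NA))

# Hypothetical helper: the minimum of the non-missing values,
# or NA when every value is missing (so no Inf and no warning)
min_or_na <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) NA_real_ else min(v)
}

# Stack the two tables, then take the per-Task minimum of each column;
# na.action = na.pass keeps rows that contain NA
stacked <- rbind(x, y)
out <- aggregate(. ~ Task, stacked, min_or_na, na.action = na.pass)
out
```

With this helper, the separate step that replaces Inf with NA is not needed.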

Answer 1 (score: 3)

A data.table solution is:

library(data.table)

# Bring together the two tables (union of their rows)
z <- funion(as.data.table(x), as.data.table(y))

# Find the min of FC and FH for each Task
z <- z[, .(FC = min(FC, na.rm = TRUE), FH = min(FH, na.rm = TRUE)), by = "Task"]

# Replace Infs returned by min with NA
z[is.infinite(FC), FC := NA]
z[is.infinite(FH), FH := NA]