Merge data frames and divide rows by group

Time: 2021-04-29 20:30:26

Tags: r dataframe dplyr division

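I want to divide the values in df1 by the values in df2. In this reproducible example, I am able to sum the values. How can I do the division instead? Thanks in advance!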

library(dplyr) # provides bind_rows, mutate_if, group_by, summarise_all, na_if

df1 <- data.frame(country = c("a", "b", "c"), year1 = c(1, 2, 3), year2 = c(1, 2, 3))
df2 <- data.frame(country = c("a", "b", "d"), year1 = c(1, 2, NA), year2 = c(1, 2, 3))

df3 <- bind_rows(df1, df2) %>%
  mutate_if(is.numeric, tidyr::replace_na, 0) %>%
  group_by(country) %>%
  summarise_all(., sum, na.rm = TRUE) %>%
  na_if(., 0)

The expected result is:

# A tibble: 4 x 3
  country year1 year2
  <chr>   <dbl> <dbl>
1 a           1     1
2 b           1     1
3 c          NA    NA
4 d          NA    NA

2 answers:

Answer 0: (score: 2)

Since some groups have 2 rows and others have only 1, use an if/else condition inside summarise/across: divide the first element by the last if the group has two elements, or else return NA.

library(dplyr) # version 1.0.4
library(tidyr)
bind_rows(df1, df2) %>% 
    mutate(across(where(is.numeric), replace_na, 0)) %>% 
    group_by(country) %>% 
    summarise(across(everything(), ~ if(n() == 2) first(.)/last(.) 
          else NA_real_))

-output

# A tibble: 4 x 3
#  country year1 year2
#* <chr>   <dbl> <dbl>
#1 a           1     1
#2 b           1     1
#3 c          NA    NA
#4 d          NA    NA
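
Note that first(.)/last(.) relies on the df1 row coming before the df2 row within each group, which bind_rows(df1, df2) guarantees here. As a sketch of a different technique (not the answerer's method), a full_join can make the numerator and denominator explicit. The suffixes ".num" and ".den" are names chosen for this illustration only.

```r
library(dplyr)

df1 <- data.frame(country = c("a", "b", "c"), year1 = c(1, 2, 3), year2 = c(1, 2, 3))
df2 <- data.frame(country = c("a", "b", "d"), year1 = c(1, 2, NA), year2 = c(1, 2, 3))

# Keep every country from both frames; unmatched cells become NA.
# The suffixes ".num" / ".den" are illustrative labels, not a dplyr convention.
res <- full_join(df1, df2, by = "country", suffix = c(".num", ".den")) %>%
  mutate(year1 = year1.num / year1.den,   # NA on either side yields NA
         year2 = year2.num / year2.den) %>%
  select(country, year1, year2) %>%
  arrange(country)

print(res)
```

With this layout, a country missing from either data frame gets NA automatically, so no if/else on the group size is needed. The column names are spelled out by hand, which is fine for two year columns but would get tedious for many.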

Answer 1: (score: 2)

Here is a base R option using merge + split.default

df <- merge(df1, df2, by = "country", all = TRUE)
cbind(
  df[1],
  list2DF(lapply(
    split.default(df[-1], gsub("\\.(x|y)", "", names(df)[-1])),
    function(v) do.call("/", v)
  ))
)

which gives

  country year1 year2
1       a     1     1
2       b     1     1
3       c    NA    NA
4       d    NA    NA
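
To see why this works, it may help to look at the intermediate objects. The sketch below unpacks the same steps one at a time. The variable names keys, groups and ratio_year1 are introduced here for illustration and do not appear in the answer.

```r
df1 <- data.frame(country = c("a", "b", "c"), year1 = c(1, 2, 3), year2 = c(1, 2, 3))
df2 <- data.frame(country = c("a", "b", "d"), year1 = c(1, 2, NA), year2 = c(1, 2, 3))

# Outer join: shared non-key columns get the default ".x" / ".y" suffixes.
df <- merge(df1, df2, by = "country", all = TRUE)
print(names(df))

# Strip the suffix so that year1.x and year1.y share the key "year1".
keys <- gsub("\\.(x|y)", "", names(df)[-1])

# split.default on a data frame splits its columns (not its rows) by key,
# giving one two-column data frame per year.
groups <- split.default(df[-1], keys)
print(names(groups))
print(names(groups$year1))

# Within a group, the first column comes from df1 and the second from df2,
# so dividing them is the df1 / df2 ratio for that year.
ratio_year1 <- groups$year1[[1]] / groups$year1[[2]]
print(ratio_year1)
```

The do.call("/", v) call in the answer performs the same division by passing the two columns of each group as the two arguments of the division operator.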