Combining data frames with different column requirements

Time: 2018-08-05 16:35:17

Tags: r merge

I have two data frames, one like this:

df1 <- data.frame(a_id = c(42,3234,1445,34),
                  text = c("sth1","sthe2","sthe3","sther4"),
                  product_id = c("Google","Yahoo","Amazon","Yahoo"))

and one like this:

df2 <- data.frame(b_id = c(55,78,2345467,23,42312,44),
                  text = c("st1","sth2","sth3","sth4","sth5","sth6"),
                  product_id = c("Yahoo","Google","Amazon","Amazon","Amazon","Google"))

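How can I stack these two data frames into one, matching on the shared columns (text and product_id), and merge the a_id and b_id columns into a single column in which every value is prefixed with "a" or "b" according to the data frame it came from?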

Here is an example of the expected output:

dfmerge = data.frame(ab_id = c("a42","a3234","a1445","a34","b55","b78","b2345467","b23","b42312","b44"),
                     text = c("sth1","sthe2","sthe3","sther4","st1","sth2","sth3","sth4","sth5","sth6"),
                     product_id = c("Google","Yahoo","Amazon","Yahoo","Yahoo","Google","Amazon","Amazon","Amazon","Google"))

2 answers:

Answer 0: (score: 2)

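We can do this with map. Place the datasets in a list and loop over them with map. For each dataset, use mutate to paste the first letter of the first column's name as a prefix onto that column's values, then rename the column to ab_id. map_df binds the resulting data frames together by row.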

library(tidyverse)
list(df1, df2) %>%
  map_df(~ .x %>%
           mutate(!! names(.x)[1] := paste0(substr(names(.x)[1], 1, 1),
                                            !! rlang::sym(names(.x)[1]))) %>%
           rename_at(1, ~ "ab_id"))
#     ab_id   text product_id
#1       a42   sth1     Google
#2     a3234  sthe2      Yahoo
#3     a1445  sthe3     Amazon
#4       a34 sther4      Yahoo
#5       b55    st1      Yahoo
#6       b78   sth2     Google
#7  b2345467   sth3     Amazon
#8       b23   sth4     Amazon
#9    b42312   sth5     Amazon
#10      b44   sth6     Google

The above can also be wrapped in a function:

fbind <- function(dat1, dat2) {
  list(dat1, dat2) %>%
    map_df(~ .x %>%
             mutate(!! names(.x)[1] := paste0(substr(names(.x)[1], 1, 1),
                                              !! rlang::sym(names(.x)[1]))) %>%
             rename_at(1, ~ "ab_id"))
}

fbind(df1, df2)
#     ab_id   text product_id
#1       a42   sth1     Google
#2     a3234  sthe2      Yahoo
#3     a1445  sthe3     Amazon
#4       a34 sther4      Yahoo
#5       b55    st1      Yahoo
#6       b78   sth2     Google
#7  b2345467   sth3     Amazon
#8       b23   sth4     Amazon
#9    b42312   sth5     Amazon
#10      b44   sth6     Google
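For readers who find the `!!` and `:=` syntax hard to follow, here is a minimal sketch of the same steps without tidy evaluation. It is not the code from the answer above: the helper name `add_prefix` is made up for illustration, `rename()` with `all_of()` assumes dplyr 1.0 or later, and `stringsAsFactors = FALSE` is set explicitly (an assumption) so that the text columns are plain character vectors on any R version.

```r
library(dplyr)

df1 <- data.frame(a_id = c(42, 3234, 1445, 34),
                  text = c("sth1", "sthe2", "sthe3", "sther4"),
                  product_id = c("Google", "Yahoo", "Amazon", "Yahoo"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(b_id = c(55, 78, 2345467, 23, 42312, 44),
                  text = c("st1", "sth2", "sth3", "sth4", "sth5", "sth6"),
                  product_id = c("Yahoo", "Google", "Amazon", "Amazon", "Amazon", "Google"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: rename the first column to ab_id and prefix its
# values with the first letter of the original column name.
add_prefix <- function(d) {
  id_col <- names(d)[1]            # "a_id" or "b_id"
  prefix <- substr(id_col, 1, 1)   # "a" or "b"
  d %>%
    rename(ab_id = all_of(id_col)) %>%
    mutate(ab_id = paste0(prefix, ab_id))
}

# Apply the helper to each data frame, then stack the results by row.
result <- bind_rows(lapply(list(df1, df2), add_prefix))
result
```

Doing the rename first means the later `mutate()` can refer to `ab_id` by its literal name, so no unquoting is needed.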

Answer 1: (score: 2)

You can do this in a function using base R only.

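myMerge <- function(x, y) {
  # Keep the names of the shared (non-id) columns, taken from x
  nm <- names(x)[-1]
  # Give both data frames identical temporary names so rbind() accepts them
  names(x) <- names(y) <- 1:3
  # Prefix the id column of each data frame with its source letter
  x[, 1] <- paste0("a", x[, 1])
  y[, 1] <- paste0("b", y[, 1])
  return(setNames(rbind(x, y), c("ab_id", nm)))
}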

Result:

> myMerge(df1, df2)
      ab_id   text product_id
1       a42   sth1     Google
2     a3234  sthe2      Yahoo
3     a1445  sthe3     Amazon
4       a34 sther4      Yahoo
5       b55    st1      Yahoo
6       b78   sth2     Google
7  b2345467   sth3     Amazon
8       b23   sth4     Amazon
9    b42312   sth5     Amazon
10      b44   sth6     Google
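myMerge assumes exactly two data frames with three columns each and hardcodes the "a" and "b" prefixes. If that is too restrictive, the following is a rough base R sketch that accepts any number of data frames and derives each prefix from the first letter of the first column's name. The function name `merge_with_prefix` is hypothetical, and the sketch assumes that all inputs have the same column names after the first column and that `stringsAsFactors = FALSE` is acceptable.

```r
df1 <- data.frame(a_id = c(42, 3234, 1445, 34),
                  text = c("sth1", "sthe2", "sthe3", "sther4"),
                  product_id = c("Google", "Yahoo", "Amazon", "Yahoo"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(b_id = c(55, 78, 2345467, 23, 42312, 44),
                  text = c("st1", "sth2", "sth3", "sth4", "sth5", "sth6"),
                  product_id = c("Yahoo", "Google", "Amazon", "Amazon", "Amazon", "Google"),
                  stringsAsFactors = FALSE)

# Hypothetical generalization of myMerge: any number of data frames,
# prefix taken from the first letter of each first column name.
merge_with_prefix <- function(..., id_name = "ab_id") {
  prefixed <- lapply(list(...), function(d) {
    prefix <- substr(names(d)[1], 1, 1)
    d[[1]] <- paste0(prefix, d[[1]])  # e.g. 42 becomes "a42"
    names(d)[1] <- id_name            # align the id column name for rbind()
    d
  })
  # rbind() matches data frame columns by name, so the remaining
  # columns must be named identically in every input.
  do.call(rbind, prefixed)
}

merged <- merge_with_prefix(df1, df2)
merged
```

The `id_name` argument is only a convenience for choosing a different name for the combined id column; with the default it yields the ab_id column asked for in the question.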