I want to combine the contents of two cells when a condition is met.
I have the following data frame:
df <- data.frame(page = c("a1","a1","a2","a2","a3"),
                 keyword = c("a,b,c", "a,b,c,d", "d,e,f","g","a"),
                 stringsAsFactors = FALSE)  # keep keyword as character (needed before R 4.0)
The condition in pseudocode:
if some cells of column page are equal (e.g. a1 and a2 appear two times)
then combine the content of column keyword and delete duplicate content.
This means that in the end I need a data frame like this:
page keyword
a1 a,b,c,d
a2 d,e,f,g
a3 a
I have tried several approaches, but none of them gave the correct result. Does anyone have an idea?
Answer 0: (score: 1)
With data.table, you can do:
library(data.table)
setDT(df)
df[, .(unlist(strsplit(keyword, split = ","))), by = page
][, .(keyword = toString(unique(V1))), by = page]
# page keyword
#1: a1 a, b, c, d
#2: a2 d, e, f, g
#3: a3 a
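Note that toString() joins with ", " (comma plus space), whereas the output requested in the question has no spaces. A minimal sketch of a variant that uses paste(collapse = ",") instead, assuming the keyword column is character; the names dt and res are only illustrative:

```r
library(data.table)

df <- data.frame(page = c("a1", "a1", "a2", "a2", "a3"),
                 keyword = c("a,b,c", "a,b,c,d", "d,e,f", "g", "a"),
                 stringsAsFactors = FALSE)
dt <- as.data.table(df)  # work on a copy instead of modifying df by reference

# Per page: split every keyword string on commas, drop duplicate tokens,
# and join the remaining tokens with a bare comma
res <- dt[, .(keyword = paste(unique(unlist(strsplit(keyword, split = ",", fixed = TRUE))),
                              collapse = ",")),
          by = page]
print(res)
```

This does the split and the de-duplication in a single grouped step, so no intermediate long table is created.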
Here is a tidyr and dplyr option:
library(dplyr); library(tidyr)
df %>%
separate_rows(keyword, sep = ",") %>%
group_by(page) %>%
summarise(keyword = toString(unique(keyword)))
# A tibble: 3 x 2
# page keyword
# <chr> <chr>
#1 a1 a, b, c, d
#2 a2 d, e, f, g
#3 a3 a
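If adding a package is not an option, the same idea can probably be expressed in base R. This is only a sketch under the assumption that keyword is a character column; by_page, merged and res are illustrative names, not part of either answer above:

```r
df <- data.frame(page = c("a1", "a1", "a2", "a2", "a3"),
                 keyword = c("a,b,c", "a,b,c,d", "d,e,f", "g", "a"),
                 stringsAsFactors = FALSE)

# Group the keyword strings by page (groups come back sorted by page name)
by_page <- split(df$keyword, df$page)

# For each page: split on commas, keep unique tokens, rejoin without spaces
merged <- vapply(by_page,
                 function(x) paste(unique(unlist(strsplit(x, ",", fixed = TRUE))),
                                   collapse = ","),
                 character(1))

res <- data.frame(page = names(merged),
                  keyword = unname(merged),
                  stringsAsFactors = FALSE)
print(res)
```

Because split() orders the groups by page name, the rows of res are sorted by page rather than by first appearance, which makes no difference for this example.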