My data frame 'df' has the following structure.
Assume there are 4 distinct stores and 4 distinct titles:
Title Store
T1 S1
T1 S2
T1 S3
T1 S4
T2 S1
T2 S2
T2 S4
T3 S1
T3 S4
T4 S1
T4 S2
Question:
I want to find the stores that are common to every combination of titles.
Expected output:
Title_combination Common_Store
T1,T2,T3,T4 S1
T1,T2,T3 S1,S4
T1,T2,T4 S1,S2
... and so on
Answer 0 (score: 1)
Using base R functions. Explanations are inline.
Data:
tbl <- read.table(text="Title Store
T1 S1
T1 S2
T1 S3
T1 S4
T2 S1
T2 S2
T2 S4
T3 S1
T3 S4
T4 S1
T4 S2", header=TRUE, stringsAsFactors=FALSE) #keep Title and Store as character in every R version
Code:
#get unique titles
titles <- unique(tbl$Title)
#combine rows into a single data.frame
do.call(rbind, unlist(
    #for each combination size n (1 up to the number of titles)
    lapply(seq_along(titles), function(n)
        #use combn to generate each combination of n titles and apply a function to it
        combn(titles, n, function(subtitles) {
            #successively intersect the stores of each title in subtitles,
            #starting from the stores of the first title
            cstores <- Reduce(function(s, t2) intersect(s, tbl$Store[tbl$Title==t2]),
                subtitles[-1],
                tbl$Store[tbl$Title==subtitles[1]])
            data.frame(
                Title_combi=paste(subtitles, collapse=","),
                Common_Store=paste(cstores, collapse=",")
            )
        }, simplify=FALSE) #don't simplify results from combn
    ),
    recursive=FALSE)) #unlist 1 level of combn results
Result:
# Title_combi Common_Store
# 1 T1 S1,S2,S3,S4
# 2 T2 S1,S2,S4
# 3 T3 S1,S4
# 4 T4 S1,S2
# 5 T1,T2 S1,S2,S4
# 6 T1,T3 S1,S4
# 7 T1,T4 S1,S2
# 8 T2,T3 S1,S4
# 9 T2,T4 S1,S2
# 10 T3,T4 S1
# 11 T1,T2,T3 S1,S4
# 12 T1,T2,T4 S1,S2
# 13 T1,T3,T4 S1
# 14 T2,T3,T4 S1
# 15 T1,T2,T3,T4 S1
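
To see what the intersection step does for a single combination, here is a minimal sketch that isolates it. It is not part of the code above: the names stores_by_title and common_stores are hypothetical, introduced only for illustration, and it swaps the explicit subsetting by tbl$Title for a split() lookup, assuming Title and Store are read as character columns.

```r
# Same data as in the answer, read as character columns
tbl <- read.table(text="Title Store
T1 S1
T1 S2
T1 S3
T1 S4
T2 S1
T2 S2
T2 S4
T3 S1
T3 S4
T4 S1
T4 S2", header=TRUE, stringsAsFactors=FALSE)

# Hypothetical helper object: one character vector of stores per title
stores_by_title <- split(tbl$Store, tbl$Title)

# Hypothetical helper function: intersect the store sets of the given
# titles, two at a time from left to right
common_stores <- function(subtitles) {
    Reduce(intersect, stores_by_title[subtitles])
}

common_stores(c("T1", "T2", "T3"))  # "S1" "S4"
common_stores(c("T3", "T4"))        # "S1"
```

Both calls agree with rows 11 and 10 of the result. Building the per-title list once with split() avoids re-filtering tbl for every combination, which may matter when there are many titles, since the number of combinations grows as 2^n - 1.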