Below is the data frame containing the campaign data:
Subject                     Response Rate(%)  Campaign Type  Channel
Buy Stunning Phone A        81.00             A              e-mail
Special Emi OFFER           81.00             B              e-mail
Buy Stunning Phone at EMI   73.00             C              SMS
The game changer is here.   85.00             A              SMS
Buy Stunnig Phone A         80.00             A              SMS
Special Emi OFFER           88.00             B              e-mail
Buy Stunning Phone at EMI   48.00             C              e-mail
The game changer is here.   48.00             A              e-mail
Buy Stunning Phone          89.00             A              e-mail
Special Emi OFFER           89.00             B              SMS
Buy Stunning Phone at EMI   69.00             C              SMS
I created a term-document matrix, shown below:
Word Frequency
big 10
upgrade 10
worth 10
latest 9
much 9
phone 8
exciting 8
back 7
colours 7
case 6
stylish 6
clear 6
experience 5
time 5
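I subset the data by channel type with dplyr, ordered by decreasing response rate. I want to highlight or list the words from the term-document matrix that appear in each subject: if a word from the matrix occurs in a subject, it should be listed in a separate column next to that subject. I have not been able to find a way to do this.
Answer 0 (score: 1)
Do you mean something like this?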
library(dplyr)
df <- read.table(header = TRUE, sep = "," ,text = "Subject,Response Rate(%),Campaign Type,Channel
Buy Stunning Phone A,81.00,A,e-mail
Special Emi OFFER,81.00,B,e-mail
Buy Stunning Phone at EMI,73.00,C,SMS
The game changer is here.,85.00,A,SMS
Buy Stunnig Phone A,80.00,A,SMS
Special Emi OFFER,88.00,B,e-mail
Buy Stunning Phone at EMI,48.00,C,e-mail
The game changer is here.,48.00,A,e-mail
Buy Stunning Phone,89.00,A,e-mail
Special Emi OFFER,89.00,B,SMS
Buy Stunning Phone at EMI,69.00,C,SMS")
df2 <- read.table(header = TRUE, sep = "," ,text = "Word,Frequency
big,10
upgrade,10
worth,10
latest,9
much,9
phone,8
exciting,8
back,7
colours,7
case,6
stylish,6
clear,6
experience,5
time,5")
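# For every word, find its position in each subject (-1 means no match).
# The result is a matrix with one row per subject and one column per word.
m <- sapply(trimws(as.character(df2$Word)), regexpr,
            text = as.character(df$Subject), ignore.case = TRUE)
# Collect the matched words for each subject into a comma-separated string.
df$keyWord <- sapply(seq_len(nrow(m)), function(idx) {
  hit <- m[idx, ] > 0
  paste0(names(hit)[hit], collapse = ",")
})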
df
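Note that regexpr matches substrings, so a word such as "case" would also be found inside "showcase". If whole-word matches are what you want, a variant along the following lines might help. This is only a minimal sketch: match_keywords is a hypothetical helper name, not part of the answer above, and it assumes the words contain no regular-expression metacharacters.

```r
# Hypothetical helper: for each subject, return the comma-separated words
# that occur in it as whole words (case-insensitive).
match_keywords <- function(subjects, words) {
  words <- trimws(as.character(words))
  vapply(as.character(subjects), function(s) {
    # \\b anchors the pattern at word boundaries.
    # Assumption: the words contain no regex metacharacters.
    hit <- vapply(words, function(w) {
      grepl(paste0("\\b", w, "\\b"), s, ignore.case = TRUE, perl = TRUE)
    }, logical(1))
    paste(words[hit], collapse = ",")
  }, character(1), USE.NAMES = FALSE)
}

# Example call on a few made-up subjects and words
match_keywords(c("Buy Stunning Phone A", "Special Emi OFFER", "Big showcase"),
               c("big", "phone", "case"))
```

With the data frames from the answer, this could be applied as df$keyWord <- match_keywords(df$Subject, df2$Word).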