Converting a grouped data frame to arules transactions

Date: 2018-03-29 11:46:25

Tags: r transactions arules

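I have a data frame that contains a series of actions (column "action") for each session (column "session"). An action can be repeated within the same session (for example, for session 01, a -> b -> a), because I am interested in understanding the order in which the actions occurred: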

 x <- data.frame(
   session = c("01", "01", "01", "02", "02", "02", "03", "03"),
   action = c("a", "b", "a", "c", "a", "c", "a", "b"))

I need to convert it to a transaction format so that I can use the 'arules' package to apply the apriori algorithm. The desired output would be:

01 a,b,a

02 c,a,c

03 a,b

where each session is essentially reported with its exact sequence of actions.

Which approach would you suggest?

Thanks.

2 Answers:

Answer 0: (score: 1)

Using base R, we can use aggregate:

aggregate(action~ session, x, FUN = toString)
#   session  action
#1      01 a, b, a
#2      02 c, a, c
#3      03    a, b

If we need to convert to transactions:

library(arules)  # apriori() is a function in arules, not a separate package
as(split(x$action, x$session), "transactions")
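
One caveat with the coercion above: an arules transaction is a set of items, so repeated actions and their order within a session are not kept. As far as I know, depending on the arules version, duplicated items either raise an error or are dropped with a warning. A minimal sketch that removes the repeats explicitly before coercing is shown below. The names `by_session` and `item_sets` are only illustrative, and the arules step is skipped if the package is not installed.

```r
# Sample data from the question
x <- data.frame(
  session = c("01", "01", "01", "02", "02", "02", "03", "03"),
  action  = c("a", "b", "a", "c", "a", "c", "a", "b"),
  stringsAsFactors = FALSE
)

# One character vector of actions per session, in the original row order
by_session <- split(x$action, x$session)

# Drop repeated actions explicitly, since a transaction is a set of items
item_sets <- lapply(by_session, unique)

# The coercion needs arules; this part is skipped when it is not installed
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  trans <- as(item_sets, "transactions")
  inspect(trans)
}
```

If the exact sequence a -> b -> a matters for the analysis, this set-based representation will not capture it, so it is worth checking whether that loss is acceptable before running apriori.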

Answer 1: (score: 0)

x <- data.frame(session=c("01","01","01","02","02", "02","03","03"), 
                action=c("a","b","a","c","a","c", "a","b"))

library(dplyr)

x %>%
  group_by(session) %>%
  summarise(action = paste0(action, collapse = ","))

# # A tibble: 3 x 2
# session action
#   <fct>   <chr> 
# 1 01      a,b,a 
# 2 02      c,a,c 
# 3 03      a,b
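
The collapsed strings match the desired printout, but they are plain character values rather than item lists. If a per-session list of items is needed afterwards (for example, as input to a later coercion), a base R sketch along these lines could split them back. The data frame `collapsed` here is a hypothetical stand-in for the summarised result above.

```r
# Hypothetical stand-in for the summarised output shown above
collapsed <- data.frame(
  session = c("01", "02", "03"),
  action  = c("a,b,a", "c,a,c", "a,b"),
  stringsAsFactors = FALSE
)

# Split each comma-separated string back into a character vector of actions
items <- strsplit(collapsed$action, ",", fixed = TRUE)

# Label each vector with its session id
names(items) <- collapsed$session

print(items)
```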