R - apriori() does not recognize the lhs in numeric transactions

Time: 2018-01-04 20:44:19

Tags: r apriori arules

I'm having trouble generating any rules with the arules package. I have 100,000 rows of transaction data that produce rules in SAS, but I can't get it to work in R.

[5]      {19,29,40,119,134}   
[6]      {24,40,45,67,141}    
[7]      {17,18,57,74,412}    
[8]      {16,79,90,150,498}   
[9]      {18,57,111,161,267}  
[10]     {11,75,131,427,429}  
[11]     {57,99,111,143,236} 

The transaction data looks like this. It originally came from a table in which each number was in its own column.

arules <- read.transactions('tid.csv', format = c("basket", "single"), sep = ",")
rules <- apriori(arules, parameter = list(supp = 0.1, conf = 0.1, target = "rules"))
summary(rules)

For reference, changing the support and confidence settings makes no difference. Sometimes I get this when I inspect the rules.

         lhs    rhs                   support      confidence   lift count
[1]      {}  => {8,11,96,112,432}     9.710623e-06 9.710623e-06 1    1    
[2]      {}  => {62,134,222,254,412}  9.710623e-06 9.710623e-06 1    1 

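Any idea why apriori isn't separating the items within each transaction? Does the data need to be reshaped into long format, and if so, how would I reshape this data frame?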

V2  V3  V4  V5  V6
8   11  96  112 432
10  35  39  76  119
18  38  68  141 267
29  36  57  61  63
19  29  40  119 134
24  40  45  67  141
17  18  57  74  412

1 Answer:

Answer 0: (score: 0)

If I understand correctly, give this a try and let us know whether it helps.

library(arules)
library(arulesViz)

# sample data
df <- read.table(text="V2  V3  V4  V5  V6
                 8   11  96  112 432
                 10  35  39  76  119
                 18  38  68  141 267
                 29  36  57  61  63
                 19  29  40  119 134
                 24  40  45  67  141
                 17  18  57  74  412", header=TRUE)
write.csv(df, "apriori_demo.csv", row.names = FALSE)

# read the CSV as basket-format transactions; skip = 1 drops the header row
trx <- read.transactions("apriori_demo.csv", format="basket", sep=",", skip=1)

# mine association rules
apriori_rule <- apriori(trx, parameter = list(supp = 0.1, conf = 0.1))
# these thresholds are copied from your post; you will obviously want better ones on real data
inspect(apriori_rule)
plot(apriori_rule, method="graph")
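
On the long-format part of the question: the rules in your post have a whole row such as {8,11,96,112,432} on the rhs, which suggests each line of the file was read as a single item rather than five. I can't see tid.csv, so that is only a guess. One way to sidestep the file entirely is to reshape the data frame to long form in memory and coerce it to transactions. The following is a minimal sketch, assuming the data frame looks like the sample in the question; the names `long`, `baskets`, `tid` and `item` are mine, not from the original post.

```r
# Sketch only: `df` mirrors the sample rows from the question.
# `long`, `baskets`, `tid` and `item` are illustrative names.
df <- data.frame(
  V2 = c(8, 10, 18, 29, 19, 24, 17),
  V3 = c(11, 35, 38, 36, 29, 40, 18),
  V4 = c(96, 39, 68, 57, 40, 45, 57),
  V5 = c(112, 76, 141, 61, 119, 67, 74),
  V6 = c(432, 119, 267, 63, 134, 141, 412)
)

# Long ("single") layout: one row per (transaction id, item) pair.
# unlist() walks the data frame column by column, so the row ids repeat per column.
long <- data.frame(
  tid  = rep(seq_len(nrow(df)), times = ncol(df)),
  item = as.character(unlist(df, use.names = FALSE)),
  stringsAsFactors = FALSE
)
# Sort by transaction and drop repeated pairs (an item may appear only once per transaction).
long <- unique(long[order(long$tid), ])
rownames(long) <- NULL

# One character vector of items per transaction.
baskets <- split(long$item, long$tid)

# The arules part only runs if the package is installed.
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  trx <- as(baskets, "transactions")
  print(summary(size(trx)))     # each transaction should hold 5 items for this sample
  print(head(itemLabels(trx)))  # labels should be single numbers, not whole rows
}
```

If `size(trx)` reports 1 for every transaction, or the item labels contain commas, the rows are still being treated as single items and apriori() has nothing to put on the lhs.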