I'm having trouble generating any rules with the arules package. I have 100,000 rows of transaction data that produce rules in SAS, but I can't get it to work in R.
[5] {19,29,40,119,134}
[6] {24,40,45,67,141}
[7] {17,18,57,74,412}
[8] {16,79,90,150,498}
[9] {18,57,111,161,267}
[10] {11,75,131,427,429}
[11] {57,99,111,143,236}
The transaction data, shown above, originally came from a table in which every number was in its own column.
arules <- read.transactions('tid.csv', format = c("basket", "single"), sep = ",")
rules <- apriori(arules, parameter = list(supp = 0.1, conf = 0.1, target = "rules"))
summary(rules)
For reference, changing the support and confidence settings makes no difference. Sometimes I get this when inspecting the rules:
    lhs    rhs                  support      confidence   lift count
[1] {}  => {8,11,96,112,432}    9.710623e-06 9.710623e-06 1    1
[2] {}  => {62,134,222,254,412} 9.710623e-06 9.710623e-06 1    1
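Any idea why apriori isn't splitting the items within each transaction? Does the data need to be reshaped into long format, and if so, how would I reshape this data frame?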
V2 V3 V4 V5 V6
8 11 96 112 432
10 35 39 76 119
18 38 68 141 267
29 36 57 61 63
19 29 40 119 134
24 40 45 67 141
17 18 57 74 412
Answer 0 (score: 0)
If I understand the problem correctly, try the following and let us know whether it helps.
library(arules)
library(arulesViz)
#sample data
df <- read.table(text="V2 V3 V4 V5 V6
8 11 96 112 432
10 35 39 76 119
18 38 68 141 267
29 36 57 61 63
19 29 40 119 134
24 40 45 67 141
17 18 57 74 412", header = TRUE)
write.csv(df, "apriori_demo.csv", row.names = FALSE)
#convert sample data into transactions format for apriori algorithm
trx <- read.transactions("apriori_demo.csv", format="basket", sep=",", skip=1)
#apriori rules
apriori_rule <- apriori(trx, parameter = list(supp = 0.1, conf = 0.1))
#note: you will need better-tuned parameters than the ones used in your post
inspect(apriori_rule)
plot(apriori_rule, method="graph")
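Regarding the long-format part of the question: if the table is already in R as a data frame, one possible alternative is to skip the CSV round trip and build the transactions in memory. The following is only a sketch under a few assumptions: it reuses the sample data from above, the column names `tid` and `item` are made up for illustration, and no row contains the same item twice (the coercion to transactions is expected to fail on duplicated items within a transaction, so deduplicate first if your real data has them).

```r
library(arules)

# Same sample data as above (wide layout: one transaction per row).
df <- read.table(text = "V2 V3 V4 V5 V6
8 11 96 112 432
10 35 39 76 119
18 38 68 141 267
29 36 57 61 63
19 29 40 119 134
24 40 45 67 141
17 18 57 74 412", header = TRUE)

m <- as.matrix(df)

# Long format: one (transaction id, item) pair per row.
# `tid` and `item` are hypothetical column names used only for this sketch.
long <- data.frame(tid = as.vector(row(m)),
                   item = as.character(m),
                   stringsAsFactors = FALSE)

# Group the item labels by transaction id and coerce the list to transactions.
trx <- as(split(long$item, long$tid), "transactions")

# Sanity checks: every transaction should contain 5 separate items,
# and the item labels should be single numbers, not whole rows.
summary(size(trx))
head(itemLabels(trx))
```

If `itemLabels()` on your original object returns whole rows such as "8,11,96,112,432" as single labels, the separator or quoting in tid.csv is probably not what `read.transactions` expected, which would explain the rules you are seeing.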