r - Apriori rules problem: no items on the lhs

Time: 2017-01-11 18:30:05

Tags: r associations apriori arules

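I followed the instructions at https://www.r-bloggers.com/implementing-apriori-algorithm-in-r/ to generate association rules, but I cannot get any items on the lhs of the rules. I think this is because my transactions are not being split into individual items.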

Here is a sample of my raw csv data:

  itemList
1 ContentManagement
2 Migration,Explorer
3 Explorer,Migration
4 Explorer,ContentManagement
5 Migration,Explorer

Then I run the following:

#load package required
library(arules)

#convert csv file to basket format
txn = read.transactions(
  file = "ItemList.csv",
  rm.duplicates = TRUE,
  format = "basket",
  sep = ",",
  col = 1
);
inspect(txn)

#remove quotes from transactions
txn@itemInfo$labels <- gsub("\"", "", txn@itemInfo$labels)

The transactions look like this:

[1] {ContentManagement}            1
[2] {Migration,Explorer}           2
[3] {Explorer,Migration}           3
[4] {Explorer,ContentManagement}   4
[5] {Migration,Explorer}           5

When I then run the following:

#run apriori algorithm
basket_rules <-
  apriori(txn,
          parameter = list(
            minlen = 1,
            sup = 0.01,
            conf = 0.01,
            target = "rules",
            maxtime=10
          ))
#basket_rules <- apriori(txn,parameter = list(sup = 0.00001, conf = 0.01, target="rules"),appearance = list(lhs = "Migration")))

#view rules
inspect(basket_rules)

it gives disappointing results, as shown below:

     lhs    rhs                                    support    confidence lift
[1]  {}  => {ContentManagement}                    0.01175068 0.01175068 1   
[2]  {}  => {Migration, Explorer}                  0.01226158 0.01226158 1   
[3]  {}  => {Explorer,Migration}                   0.02145777 0.02145777 1

Can you help?

1 Answer:

Answer 0: (score: 1)

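The problem is the structure of the file. It is not a comma-separated file, because the line number (row label) is separated from the comma-separated items by a space rather than a comma. Remove the line numbers so that only the items remain, and set col = NULL (the full argument name is cols) in read.transactions.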
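
To illustrate why the row label gets in the way, here is a minimal base-R sketch that splits one raw line from the question on commas. It only approximates what a basket-format reader does with sep = "," and does not reproduce read.transactions' actual parsing; the variable names are made up for illustration.

```r
# One raw line from the question's file: row label, a space, then the items.
raw_line <- "2 Migration,Explorer"

# Splitting on commas leaves the row label glued to the first field.
with_label <- strsplit(raw_line, ",")[[1]]
print(with_label)
# [1] "2 Migration" "Explorer"

# With the leading row label stripped, the split yields clean item names.
without_label <- strsplit(sub("^[0-9]+ ", "", raw_line), ",")[[1]]
print(without_label)
# [1] "Migration" "Explorer"
```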

If you wrote the file with R, make sure to use row.names = FALSE in write.csv.
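
As a rough sketch of the suggested fix, the following writes the five sample baskets from the question to a temporary file without row labels and reads them back with cols = NULL. Three things here are assumptions rather than part of the original answer: the temporary file, the use of writeLines (chosen so the lines are written unquoted), and minlen = 2 (chosen so that rules with an empty lhs are excluded).

```r
library(arules)

# The five sample baskets from the question, without row labels.
baskets <- c(
  "ContentManagement",
  "Migration,Explorer",
  "Explorer,Migration",
  "Explorer,ContentManagement",
  "Migration,Explorer"
)

# Write one basket per line, unquoted, to a temporary file (illustrative path).
path <- tempfile(fileext = ".csv")
writeLines(baskets, path)

# Read the file in basket format; cols = NULL means there is no
# transaction-id column, so every comma-separated field is an item.
txn <- read.transactions(
  file = path,
  format = "basket",
  sep = ",",
  cols = NULL,
  rm.duplicates = TRUE
)
inspect(txn)

# Mine rules; minlen = 2 (an assumption) requires at least one lhs item.
basket_rules <- apriori(
  txn,
  parameter = list(
    minlen = 2,
    support = 0.01,
    confidence = 0.01,
    target = "rules"
  )
)
inspect(basket_rules)
```

If the items are split correctly, itemLabels(txn) should list the three individual products rather than whole comma-joined strings.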