Using R apriori with numeric/continuous attributes

Date: 2016-07-24 10:43:29

Tags: r machine-learning data-mining apriori

I am trying to use Apriori in R on numeric attributes. I discretized the attributes first:

mat1 = discretize(table[1:699,1:1])
mat1 = cbind(mat1, discretize(table[1:699,2:2]))
rules <- apriori(mat1, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))

However, apriori treats every non-zero value as 1:

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target   ext
        0.9    0.1    1 none FALSE            TRUE     0.5      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 349 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[2 item(s), 699 transaction(s)] done [0.00s].
sorting and recoding items ... [2 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 done [0.00s].
writing ... [4 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
Warning message:
In asMethod(object) :
  matrix contains values other than 0 and 1! Setting all entries != 0 to 1.

Since apriori takes binary values as input, how can I apply association rule mining to continuous numeric attributes?

1 Answer:

Answer 0 (score: 0)

I would not use discretize(). Instead, apply the apriori() function to a data frame that holds your data as factors, which you can define like this:

mat1 <- data.frame(
  X1 = as.character(table[1:699, 1]),
  X2 = as.character(table[1:699, 2]),
  stringsAsFactors = TRUE  # the default before R 4.0.0; must be set explicitly in R >= 4.0.0
)
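
The warning in the question most likely comes from cbind(): applied to factors, it returns a matrix of integer level codes, and apriori() then coerces that matrix to a binary incidence matrix. If binning is still wanted (so that each item is an interval rather than a single distinct value), the discretized columns can be kept as factors in a data frame instead. The following is only a sketch under assumptions: it uses made-up data in place of the question's `table`, assumes a recent arules release in which discretize() accepts `breaks` (older releases used `categories`), and the names `df_num`, `df_fac`, the three bins, and the support/confidence thresholds are illustrative choices, not taken from the question.

```r
library(arules)

set.seed(42)
# Made-up numeric data standing in for the question's `table` (assumption)
df_num <- data.frame(X1 = runif(200, 0, 10), X2 = rnorm(200, 50, 10))

# cbind() on factors drops the labels and keeps only the integer level codes,
# so the result is a numeric matrix with values 1, 2, 3 rather than 0/1
codes <- cbind(cut(df_num$X1, 3), cut(df_num$X2, 3))
is.numeric(codes)

# Keep each discretized column as a factor inside a data frame instead
df_fac <- data.frame(
  X1 = discretize(df_num$X1, method = "interval", breaks = 3),
  X2 = discretize(df_num$X2, method = "interval", breaks = 3)
)

# Each column/level pair becomes its own item, labelled "X1=<interval>"
trans <- as(df_fac, "transactions")

# Thresholds are illustrative; tune them for real data
rules <- apriori(trans, parameter = list(supp = 0.05, conf = 0.5, target = "rules"))
inspect(rules)
```

With this layout each transaction contains exactly one item per column, so apriori() sees one binary item per interval and no coercion warning should be raised.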