R arules: how to exclude items from lhs / rhs

Time: 2016-12-23 14:31:14

Tags: r transactions apriori arules

I have loaded a file as transactions in R:

library(arules)

path = "my_file.csv"
t = read.transactions(path,format="single", sep=';',cols=c("ID","Products"))

#get the rules:
rules = apriori(t,parameter = list(supp=0.01, conf=0.33, minlen=2, maxlen=4))
#sort by confidence:
rules = sort(rules, by="confidence", decreasing=TRUE)
#inspect the first 10 rules:
inspect(rules[1:10])

The output is:

     lhs      rhs  support  confidence        lift
[1]  {e,b} => {a}     0.01        0.97  some_value      
[2]  {a}   => {f}     0.04        0.92  some_value 
[3]  {t,f} => {a}     0.12        0.91  some_value 
[4]  {b,j} => {a}     0.09        0.82  some_value 
[5]  {e}   => {a}     0.25        0.77  some_value 
[6]  {g,h} => {a}     0.05        0.56  some_value 
[7]  {p}   => {a}     0.31        0.54  some_value 
[8]  {q,n} => {h}     0.18        0.49  some_value 
[9]  {s}   => {a}     0.07        0.46  some_value 
[10] {s,d} => {a}     0.20        0.42  some_value 

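My problem is that the item {a} is too frequent. I would like to configure the apriori rule generator so that item {a}, or any other item I don't want to consider, does not appear in the generated rules. I know a simple way would be to remove item {a} from the transaction files before loading them. However, simple as it is, that approach is neither smart nor elegant, and it is time-consuming because I am working with hundreds of different transaction files.

Searching online, I found this pattern for specifying which items may appear in the lhs and rhs: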

rules = apriori(t,parameter = list(supp=0.01, conf=0.33, minlen=2, maxlen=4), appearance=list(default="lhs", rhs="b"))

Now the output of inspect is:

      lhs     rhs         support    confidence          lift
[1]  {a,b} => {b}     other_value   other_value   other_value      
[2]  {a}   => {b}     other_value   other_value   other_value       
[3]  {a,f} => {b}     other_value   other_value   other_value       
[4]  {b,j} => {b}     other_value   other_value   other_value      
[5]  {a}   => {b}     other_value   other_value   other_value       
[6]  {a,h} => {b}     other_value   other_value   other_value       
[7]  {a}   => {b}     other_value   other_value   other_value       
[8]  {q,a} => {b}     other_value   other_value   other_value      
[9]  {a}   => {b}     other_value   other_value   other_value       
[10] {a,d} => {b}     other_value   other_value   other_value       

So it is possible to tell apriori which items we want in the rhs (or lhs), but apparently not which items we don't want. At least, I could not do it this way (I don't want {a}):

    rules = apriori(t,parameter = list(supp=0.01, conf=0.33, minlen=2, maxlen=4), appearance=list(default="lhs", rhs!="a"))

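This produces an error.

Any suggestions? Thanks.

1 Answer:

Answer 0: (score: 3)

See ?APappearance

The first example there shows how to exclude individual items from itemsets. You can do the same when mining rules: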

 data("Adult")

 ## find only frequent itemsets which do not contain small or large income
 is <- apriori(Adult, parameter = list(support= 0.1, target="frequent"), 
   appearance = list(none = c("income=small", "income=large"),
   default="both"))
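
Applied to rule mining, the same idea might look like the sketch below. It assumes the Adult data set that ships with arules as a stand-in for the question's file, and uses the two income items as hypothetical placeholders for the unwanted item {a}; the thresholds are copied from the question. The second half shows an alternative that filters already-mined rules with subset(), which may be convenient when the exclusion list changes often.

```r
library(arules)
data("Adult")

# Hypothetical stand-ins for the unwanted item {a} from the question.
unwanted <- c("income=small", "income=large")

# Mine rules while keeping the unwanted items out of both lhs and rhs.
rules <- apriori(Adult,
  parameter  = list(supp = 0.1, conf = 0.33, minlen = 2, maxlen = 4),
  appearance = list(none = unwanted, default = "both"),
  control    = list(verbose = FALSE))

# Sanity check: count the rules that still mention an excluded item.
n_bad <- length(subset(rules, items %in% unwanted))

# Alternative: mine without restrictions, then drop rules afterwards.
all_rules <- apriori(Adult,
  parameter = list(supp = 0.1, conf = 0.33, minlen = 2, maxlen = 4),
  control   = list(verbose = FALSE))
filtered <- subset(all_rules, !(items %in% unwanted))

inspect(head(sort(rules, by = "confidence"), 10))
```

In subset(), `items` refers to the union of lhs and rhs; using `lhs` or `rhs` instead restricts the check to one side of the rule.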