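I ran the Apriori algorithm on my dataset and filtered the output so that only the positive (=1) items remain. The rules I get include mirrored duplicates, i.e.: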
    lhs             rhs            support     confidence lift
1   {Mobile=1}   => {Earphone=1}   0.025563474 0.09925997 0.3808200
2   {Earphone=1} => {Mobile=1}     0.025563474 0.09807662 0.3808200
3   {Jeans=1}    => {Shirt=1}      0.024030494 0.09389671 0.3637123
4   {Shirt=1}    => {Jeans=1}      0.024030494 0.09308297 0.3637123
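As you can see, rules 1 and 2 (and likewise rules 3 and 4) are the same pair with the LHS and RHS swapped. Is there a way to remove such rules from the final result?
My code is: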
library(arules)

transactions <- as(sold_data, "transactions")
rules <- apriori(transactions, parameter = list(support = 0.001, confidence = 0.05))
# Keep rules with a "=1" item on both sides and no "=0" item in the LHS,
# sorted by descending support
rules_subset <- sort(
  subset(rules,
    (lhs %in% c("Mobile=1", "Earphone=1", "Watch=1", "Jeans=1", "Shirt=1")) &
    !(lhs %in% c("Mobile=0", "Earphone=0", "Watch=0", "Jeans=0", "Shirt=0")) &
    (rhs %in% c("Mobile=1", "Earphone=1", "Watch=1", "Jeans=1", "Shirt=1"))
  ),
  decreasing = TRUE,
  by = "support"
)
inspect(rules_subset)
I am new to R, so if this question has already been asked, please let me know and I will delete it.
Answer 0 (score: 1)
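What you are describing are redundant rules, and they can be removed with the code below. As far as I understand it, is.subset(rules, rules) compares the full itemset of each rule (LHS and RHS together), so two rules that differ only in direction count as subsets of each other and the later one is dropped. Note that this also drops any rule whose itemset contains the itemset of an earlier rule, so it can remove more than just the mirrored pairs.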
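library(arules)

transactions <- as(sold_data, "transactions")
rules <- apriori(transactions, parameter = list(support = 0.001, confidence = 0.05))
# is.subset() may return a sparse matrix in recent arules versions; as.matrix() makes it dense
subset.matrix <- as.matrix(is.subset(rules, rules))
# Blank out the lower triangle and diagonal so each rule is only compared with earlier rules
subset.matrix[lower.tri(subset.matrix, diag = TRUE)] <- NA
# A rule is flagged if the itemset of at least one earlier rule is contained in its own itemset
redundant <- colSums(subset.matrix, na.rm = TRUE) >= 1
rules <- rules[!redundant]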
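If you only want to drop the mirrored pairs and keep every other rule, a narrower alternative might look like the sketch below. It is not part of the original recipe. It builds an order-independent key from the LHS and RHS labels and keeps the first rule per key. The toy transactions and the names `toy`, `pair_key` and `rules_unique` are made up for illustration; with your data you would start from your own `rules` object.

```r
library(arules)

# Toy transactions (hypothetical data, only here to make the sketch runnable)
toy <- as(list(
  c("Mobile", "Earphone"),
  c("Mobile", "Earphone"),
  c("Mobile"),
  c("Jeans", "Shirt"),
  c("Jeans", "Shirt"),
  c("Shirt")
), "transactions")

rules <- apriori(toy,
                 parameter = list(support = 0.1, confidence = 0.1, minlen = 2),
                 control = list(verbose = FALSE))

# Sort first so that, within each mirrored pair, the higher-confidence direction is kept
rules <- sort(rules, by = "confidence")

# Text labels of both sides, e.g. "{Mobile}" and "{Earphone}"
l <- labels(lhs(rules))
r <- labels(rhs(rules))

# Order-independent key: A => B and B => A produce the same string
pair_key <- paste(pmin(l, r), pmax(l, r))

# Keep only the first rule seen for each key
rules_unique <- rules[!duplicated(pair_key)]
inspect(rules_unique)
```

Sorting by confidence before deduplicating is a design choice: support and lift are identical for both directions of a pair, so confidence is the only one of the three measures that can tell them apart.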