I am going to implement a personal recommendation system using the Apriori algorithm. I know there are three useful concepts: 'support', 'confidence', and 'lift', and I already know what they mean. I also know how to find the frequent itemsets using support. But what are confidence and lift for, if we can already find frequent itemsets using support?
Could you explain why 'confidence' and 'lift' exist when 'support' is already applied, and how I should proceed with 'confidence' and 'lift' once I have already applied support to the data set?
I would be highly obliged if you could answer with SQL queries, since I am still an undergraduate. Thanks a lot.
Answer 0 (score: 1)
Support alone yields many redundant rules.
E.g.:
A -> B
A, C -> B
A, D -> B
A, E -> B
...
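The purpose of lift and similar measures is to remove complex rules that are not much better than the simple rule. In the case above, the simple rule A -> B may have lower confidence than the complex rules, but it has much more support. The other rules may be mere coincidental variations of this strong pattern, with only marginally higher confidence because of the smaller sample size.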
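To make the trade-off between a simple rule and a more specific one concrete, here is a minimal sketch in Python (not SQL) that computes the three measures directly from a list of baskets. The `TRANSACTIONS` data and the helper names `support`, `confidence`, and `lift` are made up for illustration; they are not taken from any particular library.

```python
from typing import FrozenSet, Iterable, Sequence

# Hypothetical toy baskets, invented purely for illustration.
TRANSACTIONS: Sequence[FrozenSet[str]] = [
    frozenset({"A", "B"}),
    frozenset({"A", "B"}),
    frozenset({"A", "B"}),
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B", "D"}),
    frozenset({"A", "B"}),
    frozenset({"A"}),
    frozenset({"A", "E"}),
    frozenset({"B"}),
    frozenset({"D", "E"}),
]


def support(itemset: Iterable[str], transactions: Sequence[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: Sequence[FrozenSet[str]]) -> float:
    """support(antecedent and consequent) / support(antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent: Iterable[str], consequent: Iterable[str],
         transactions: Sequence[FrozenSet[str]]) -> float:
    """confidence(antecedent -> consequent) / support(consequent)."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))


# Compare the simple rule A -> B with the more specific rule A, C -> B.
for ante in ({"A"}, {"A", "C"}):
    rule_support = support(ante | {"B"}, TRANSACTIONS)
    rule_conf = confidence(ante, {"B"}, TRANSACTIONS)
    rule_lift = lift(ante, {"B"}, TRANSACTIONS)
    print(sorted(ante), "-> B",
          f"support={rule_support:.2f} confidence={rule_conf:.2f} lift={rule_lift:.2f}")
```

On this toy data, A -> B has support 0.6 and confidence 0.75, while A, C -> B reaches confidence 1.0 but rests on a single basket (support 0.1). That is the situation described above: the specific rule looks stronger only because its sample is tiny.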
Similarly, if you have:
A -> B confidence: 90%
C -> D confidence: 90%
A, C -> B, D confidence: 80%
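then the last rule is actually bad, despite its high confidence! The first two rules yield the same result, but with higher confidence. So the last rule should not be read as 80% correct, but as -10% correct, if you assume the first two rules hold!
Thus, support and confidence alone are not enough to consider.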
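One way to read the -10% figure is as the combined rule's confidence minus the best confidence among the simpler rules it is built from. The sketch below assumes that reading; the `improvement` function and the `rule` helper are hypothetical names, and this is only one possible heuristic, not a standard part of Apriori.

```python
from typing import Dict, FrozenSet, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (antecedent, consequent)


def rule(antecedent: str, consequent: str) -> Rule:
    """Build a rule from comma-separated item names, e.g. rule("A,C", "B,D")."""
    return (frozenset(antecedent.split(",")), frozenset(consequent.split(",")))


# Confidence values copied from the example in the answer.
CONFIDENCE: Dict[Rule, float] = {
    rule("A", "B"): 0.90,
    rule("C", "D"): 0.90,
    rule("A,C", "B,D"): 0.80,
}


def improvement(target: Rule, confidences: Dict[Rule, float]) -> float:
    """Confidence of `target` minus the best confidence of any simpler sub-rule.

    A sub-rule (an assumption of this sketch) is any other known rule whose
    antecedent and consequent are subsets of the target's. If no sub-rule is
    known, the baseline is 0 and the rule's own confidence is returned.
    """
    antecedent, consequent = target
    simpler = [
        conf
        for (ante, cons), conf in confidences.items()
        if (ante, cons) != target and ante <= antecedent and cons <= consequent
    ]
    baseline = max(simpler, default=0.0)
    return confidences[target] - baseline


combined = rule("A,C", "B,D")
print(f"{improvement(combined, CONFIDENCE):+.0%}")  # prints -10%
```

A rule with a negative (or negligibly small) improvement adds nothing over the simpler rules, so it is a candidate for removal even though its raw confidence clears a typical threshold.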