Support and lift for FP-growth rules in Spark MLlib (Scala)

Time: 2016-07-18 13:09:04

Tags: scala apache-spark apache-spark-mllib

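I want to extract the support and lift of the association rules generated by FP-growth. After finding the rules with the code below, I go through the transactions by hand and compute support and lift myself. Is there a better way to get this information? Thanks!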

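import org.apache.spark.mllib.fpm.FPGrowth

// transactions is an RDD of item arrays, one array per transaction
val fpg = new FPGrowth()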
  .setMinSupport(0.2)
  .setNumPartitions(10)
val model = fpg.run(transactions)

model.freqItemsets.collect().foreach { itemset =>
  println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
}

val minConfidence = 0.8
model.generateAssociationRules(minConfidence).collect().foreach { rule =>
  println(
    rule.antecedent.mkString("[", ",", "]")
      + " => " + rule.consequent.mkString("[", ",", "]")
      + ", " + rule.confidence)
}

1 answer:

Answer 0 (score: 1)

Hmm, not elegant, but this is what I would do:

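// fpgrowth_model and get_rules are user-defined helpers (not shown), assumed to return DataFrames:
// freqs with columns (items, freq), rules with columns (antecedent, consequent, confidence).
// The $"..." column syntax needs the SparkSession implicits in scope.
val freqs = fpgrowth_model(transactions, min_supp = supp)
val supps = freqs.withColumn("support", $"freq" / total_transactions)
val rules = get_rules(transactions, min_supp = supp, min_confidence = conf)
// lift = confidence / support(consequent), so "support" below is the consequent's support
val cross_df = supps.join(rules, $"items" === $"consequent")
  .withColumn("lift", $"confidence" / $"support")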
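
To make the arithmetic behind that join explicit, here is a minimal sketch of the same idea on plain Scala collections, without Spark. The names `SimpleRule`, `ScoredRule` and `RuleMetrics.score` are made up for illustration and are not part of MLlib. It assumes the frequent itemsets and rules are small enough to collect to the driver, and that every rule's antecedent ∪ consequent and its consequent both appear among the frequent itemsets (which should hold for rules generated from those same itemsets).

```scala
// Hypothetical helper types; MLlib's own rule class is AssociationRules.Rule.
final case class SimpleRule(antecedent: Set[String], consequent: Set[String], confidence: Double)
final case class ScoredRule(rule: SimpleRule, support: Double, lift: Double)

object RuleMetrics {
  /** Attach support and lift to each rule.
    *
    * @param freqs frequent itemset -> absolute count. With the model from the question
    *              this could be built as
    *              model.freqItemsets.collect().map(fi => fi.items.toSet -> fi.freq).toMap
    * @param rules rules with their confidence
    * @param total number of transactions, e.g. transactions.count()
    */
  def score(freqs: Map[Set[String], Long], rules: Seq[SimpleRule], total: Long): Seq[ScoredRule] =
    rules.flatMap { rule =>
      for {
        unionFreq <- freqs.get(rule.antecedent ++ rule.consequent) // count of A ∪ C
        consFreq  <- freqs.get(rule.consequent)                    // count of C alone
      } yield {
        val support = unionFreq.toDouble / total                   // support(A => C)
        val lift    = rule.confidence / (consFreq.toDouble / total) // confidence / support(C)
        ScoredRule(rule, support, lift)
      }
    }
}
```

Rules whose itemsets are missing from `freqs` are silently dropped here. Itemsets are keyed by `Set` so that the item order inside the arrays does not matter for the lookup.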