c1,c2,c3
pencil,book,eraser
pen,book
If this is my dataset, I need combinations such as:
pencil
pencil,book
pencil,eraser
book,eraser
pen
pen,book
I did this with an RDD in the format below, but now my input is a DataFrame. How can I generate the combinations?
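val itemset = data.flatMap { line =>
  val arr = line.split(delimiter)
  (1 to arr.length).flatMap { y =>
    // combinations returns a one-shot Iterator; materialize it so the
    // debug printing below does not exhaust it before the map
    val combinations = arr.combinations(y).toList
    println("arr elements " + arr.mkString(","))
    combinations.foreach(x => println(x.mkString(",")))
    combinations.map { x => (x.toSet, 1) }
  }
}.reduceByKey(_ + _)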
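One possible way to carry the same idea over to a DataFrame is to move the per-row combination logic into a plain function that takes the cell values of a row and skips missing ones (the second sample row has no c3). The sketch below is only an illustration: `ItemsetSketch` and `rowCombinations` are hypothetical names, not part of the original code, and the Spark wiring in the comment assumes a DataFrame named `df` with the columns c1, c2, c3 and has not been run against a cluster.

```scala
object ItemsetSketch {
  // Hypothetical helper: every non-empty combination of the non-null,
  // non-blank cells of one row, each returned as a Set.
  def rowCombinations(cells: Seq[Any]): Seq[Set[String]] = {
    // A type pattern never matches null, so missing cells are dropped here.
    val items = cells.collect { case s: String if s.trim.nonEmpty => s.trim }.distinct
    (1 to items.length).flatMap { k =>
      // Materialize the one-shot Iterator before returning it.
      items.combinations(k).map(_.toSet).toList
    }
  }

  def main(args: Array[String]): Unit = {
    // The two sample rows from the question, without Spark.
    val rows = Seq(
      Seq[Any]("pencil", "book", "eraser"),
      Seq[Any]("pen", "book", null)
    )
    // Local stand-in for map((_, 1)).reduceByKey(_ + _).
    val counts = rows
      .flatMap(rowCombinations)
      .groupBy(identity)
      .map { case (itemset, hits) => (itemset, hits.size) }
    counts.foreach { case (itemset, n) => println(s"${itemset.mkString(",")} -> $n") }
  }
}

// Assumed Spark usage (df is a hypothetical DataFrame with columns c1, c2, c3):
//   val itemset = df.rdd
//     .flatMap(row => ItemsetSketch.rowCombinations(row.toSeq))
//     .map((_, 1))
//     .reduceByKey(_ + _)
```

Going through `df.rdd` keeps the original `reduceByKey` step unchanged; `Row.toSeq` yields the cells as `Seq[Any]`, which is why the helper accepts `Any` and filters for strings.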