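I want to group the rows of a Dataset so that two rows end up in the same group whenever their array fields share at least one element.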
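import org.apache.spark.sql.Dataset

case class Test(id: String, keys: Array[String])
// Assumes a SparkSession named `spark` with `import spark.implicits._` in scope.
val testDataset: Dataset[Test] = Seq(
  Test("id1", Array("key1", "key2", "key3")),
  Test("id2", Array("key1", "key4", "key5")),
  Test("id3", Array("key5", "key7", "key8")),
  Test("id4", Array("key9"))
).toDS()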
I want the output to be:
//Group1
[("id1", ["key1", "key2", "key3"]), ("id2", ["key1", "key4", "key5"]), ("id3", ["key5", "key7", "key8"])],
//Group2
[("id4", ["key9"])]
What is an efficient way to do this?
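One way to read the expected output: id1 and id3 share no key directly, but both overlap with id2, so the grouping is transitive. That makes this a connected-components problem. Below is a minimal sketch in plain Scala using union-find, assuming the rows fit in memory on one machine (for example after a `collect()`). The names `OverlapGrouping` and `groupByOverlap` are hypothetical helpers made up for illustration, not part of Spark. For data too large to collect, a distributed connected-components routine (such as the one in GraphX) would likely be a better fit; this sketch only illustrates the idea.

```scala
import scala.collection.mutable

// Same shape as the case class in the question.
case class Test(id: String, keys: Array[String])

object OverlapGrouping {
  // Hypothetical helper: groups rows whose `keys` overlap directly or
  // transitively, using union-find over row indices.
  def groupByOverlap(rows: Seq[Test]): Seq[Seq[Test]] = {
    val indexed = rows.toIndexedSeq
    // parent(i) == i means row i is the representative of its set.
    val parent = Array.tabulate(indexed.length)(identity)

    // Find the representative of i's set, shortening the path as we go.
    def find(i: Int): Int = {
      var x = i
      while (parent(x) != x) {
        parent(x) = parent(parent(x))
        x = parent(x)
      }
      x
    }

    // Merge two sets; the smaller root index always becomes the representative.
    def union(a: Int, b: Int): Unit = {
      val ra = find(a)
      val rb = find(b)
      if (ra != rb) parent(math.max(ra, rb)) = math.min(ra, rb)
    }

    // Remember the first row that used each key; any later row with the
    // same key is merged into that row's set.
    val firstRowWithKey = mutable.Map.empty[String, Int]
    for ((row, i) <- indexed.zipWithIndex; key <- row.keys) {
      firstRowWithKey.get(key) match {
        case Some(j) => union(i, j)
        case None    => firstRowWithKey(key) = i
      }
    }

    // Collect rows by representative, ordered by first appearance.
    indexed.indices
      .groupBy(find)
      .toSeq
      .sortBy(_._1)
      .map { case (_, idxs) => idxs.map(indexed) }
  }
}
```

Calling `OverlapGrouping.groupByOverlap(testDataset.collect().toSeq)` on the sample data should yield two groups: one containing id1, id2 and id3, and one containing only id4. The work is roughly linear in the total number of keys, since each key triggers at most one map lookup and one union.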