JavaPairRDD<PartitionKey, Iterable<Cat>> rddCat
JavaPairRDD<PartitionKey, Iterable<Dog>> rddDog
JavaPairRDD<PartitionKey, Iterable<Fish>> rddFish
JavaPairRDD<PartitionKey, Tuple3<Iterable<Cat>, Iterable<Dog>, Iterable<Fish>>>
This is all I have managed so far:
rddCat.cogroup(rddDog, rddFish)
--> FlatMapFunction<Tuple2<PartitionKey, Tuple3<Iterable<Iterable<Cat>>, Iterable<Iterable<Dog>>, Iterable<Iterable<Fish>>>>
JavaPairRDD<PartitionKey, Tuple2<Iterable<Cat>, Iterable<Dog>>> catDogRdd = rddCat.join(rddDog);
JavaPairRDD<PartitionKey, Tuple2<Tuple2<Iterable<Cat>, Iterable<Dog>>, Iterable<Fish>>> finalRdd = catDogRdd.join(rddFish);
Answer 0 (score: 0)
tl;dr Use join (that is, def join[W](other: RDD[(K, W)]): RDD[(K, (V, W))]).
I use Scala, and the following seems to work fine.
scala> r2.collect
res7: Array[(Int, Iterable[Int])] = Array((0,CompactBuffer(0, 1)), (3,CompactBuffer(6, 7)), (4,CompactBuffer(8, 9)), (1,CompactBuffer(2, 3)), (2,CompactBuffer(4, 5)))
scala> r3.collect
res8: Array[(Int, Iterable[Int])] = Array((0,CompactBuffer(0, 1, 2)), (3,CompactBuffer(9)), (1,CompactBuffer(3, 4, 5)), (2,CompactBuffer(6, 7, 8)))
scala> r5.collect
res9: Array[(Int, Iterable[Int])] = Array((0,CompactBuffer(0, 1, 2, 3, 4)), (1,CompactBuffer(5, 6, 7, 8, 9)))
scala> r2 join r3 join r5 collect
res10: Array[(Int, ((Iterable[Int], Iterable[Int]), Iterable[Int]))] = Array((0,((CompactBuffer(0, 1),CompactBuffer(0, 1, 2)),CompactBuffer(0, 1, 2, 3, 4))), (1,((CompactBuffer(2, 3),CompactBuffer(3, 4, 5)),CompactBuffer(5, 6, 7, 8, 9))))
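One thing worth noticing in the output above: only keys 0 and 1 survive, because join is an inner join and drops any key that is missing from one of the three RDDs (cogroup, by contrast, keeps every key and gives an empty Iterable for the missing sides). Here is a minimal sketch of that behaviour using plain Java maps instead of Spark, with the same data as the REPL session. The class InnerJoinSketch and the method join3 are hypothetical names for illustration, not Spark APIs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InnerJoinSketch {
    // Hypothetical helper (not a Spark API): keeps only the keys present in all
    // three maps and collects the three groups for each such key.
    static <K, V> Map<K, List<List<V>>> join3(
            Map<K, List<V>> a, Map<K, List<V>> b, Map<K, List<V>> c) {
        Map<K, List<List<V>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, List<V>> e : a.entrySet()) {
            K key = e.getKey();
            if (b.containsKey(key) && c.containsKey(key)) {
                out.put(key, Arrays.asList(e.getValue(), b.get(key), c.get(key)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Same groups as r2, r3 and r5 in the REPL session.
        Map<Integer, List<Integer>> r2 = new LinkedHashMap<>();
        r2.put(0, Arrays.asList(0, 1));
        r2.put(1, Arrays.asList(2, 3));
        r2.put(2, Arrays.asList(4, 5));
        r2.put(3, Arrays.asList(6, 7));
        r2.put(4, Arrays.asList(8, 9));

        Map<Integer, List<Integer>> r3 = new LinkedHashMap<>();
        r3.put(0, Arrays.asList(0, 1, 2));
        r3.put(1, Arrays.asList(3, 4, 5));
        r3.put(2, Arrays.asList(6, 7, 8));
        r3.put(3, Arrays.asList(9));

        Map<Integer, List<Integer>> r5 = new LinkedHashMap<>();
        r5.put(0, Arrays.asList(0, 1, 2, 3, 4));
        r5.put(1, Arrays.asList(5, 6, 7, 8, 9));

        // Keys 2, 3 and 4 are dropped because r5 does not contain them.
        System.out.println(join3(r2, r3, r5));
        // prints {0=[[0, 1], [0, 1, 2], [0, 1, 2, 3, 4]], 1=[[2, 3], [3, 4, 5], [5, 6, 7, 8, 9]]}
    }
}
```

If every key must be kept even when one side has no data, cogroup is probably the better starting point than chained joins.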
Answer 1 (score: 0)
I managed to do it with the help of Guava:
// given
final JavaPairRDD<Character, Iterable<Integer>> rdd1 = ...
final JavaPairRDD<Character, Iterable<Integer>> rdd2 = ...
final JavaPairRDD<Character, Iterable<Integer>> rdd3 = ...
// when
final JavaPairRDD<Character, Tuple3<Iterable<Iterable<Integer>>, Iterable<Iterable<Integer>>, Iterable<Iterable<Integer>>>> grouped = rdd1.cogroup(rdd2, rdd3);
final JavaPairRDD<Character, Tuple3<Iterable<Integer>, Iterable<Integer>, Iterable<Integer>>> flattened = grouped.mapValues(
t3 -> new Tuple3<>(Iterables.concat(t3._1()), Iterables.concat(t3._2()), Iterables.concat(t3._3()))
);
@Fundhor, I wonder how you managed to produce this signature on your first attempt. It does not seem possible.
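If pulling in Guava is not an option, the flattening step can probably be done with the standard library alone. Below is a minimal sketch, assuming it is acceptable to copy each group into a list (Guava's Iterables.concat returns a lazy view instead). The class FlattenSketch and the method flatten are hypothetical names, not part of Spark or Guava.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FlattenSketch {
    // Hypothetical helper: copies every element of every inner Iterable
    // into a single ArrayList, preserving encounter order.
    static <T> List<T> flatten(Iterable<? extends Iterable<? extends T>> nested) {
        List<T> out = new ArrayList<>();
        for (Iterable<? extends T> inner : nested) {
            for (T item : inner) {
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> nested =
                Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3));
        System.out.println(flatten(nested)); // prints [1, 2, 3]
    }
}
```

In the mapValues lambda above, each Iterables.concat(...) call could then be swapped for FlattenSketch.flatten(...). The trade-off is that the whole group is materialized in memory, which may matter for very large groups.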