Append a row to a pair RDD in Spark

Time: 2018-10-19 06:03:01

Tags: scala apache-spark

I have a pair RDD with existing values, for example: (1,2) (3,4) (5,6)

I want to append a row (7,8) to the same RDD.

How can I append to the same RDD in Spark?

1 answer:

Answer 0: (score: 0)

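You can use the union operation. RDDs are immutable, so nothing is appended in place: put the new row in its own single-element RDD and union it with the original, which gives you a new RDD containing all the rows.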

scala> val rdd1 = sc.parallelize(List((1,2), (3,4), (5,6)))
rdd1: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[1] at parallelize at <console>:24

scala> val rdd2 = sc.parallelize(List((7, 8)))
rdd2: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[1] at parallelize at <console>:24

scala> val unionOfTwo = rdd1.union(rdd2)
unionOfTwo: org.apache.spark.rdd.RDD[(Int, Int)] = UnionRDD[2] at union at <console>:28
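
If you want this outside the shell, here is a minimal sketch of the same union as a standalone program. The object name `AppendRowExample`, the helper `appendRow`, the app name, and the `local[*]` master are my own assumptions for illustration, not anything from the question; only `parallelize`, `union`, and `collect` come from the Spark RDD API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object AppendRowExample {

  // Hypothetical helper: returns a NEW RDD holding the rows of `rdd` plus `row`.
  // The input RDD is left untouched, because RDDs are immutable.
  def appendRow(rdd: RDD[(Int, Int)], row: (Int, Int)): RDD[(Int, Int)] =
    rdd.union(rdd.sparkContext.parallelize(Seq(row)))

  def main(args: Array[String]): Unit = {
    // Assumed local setup; on a cluster the master normally comes from spark-submit.
    val conf = new SparkConf().setAppName("append-row-example").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val rdd1 = sc.parallelize(List((1, 2), (3, 4), (5, 6)))
    val unionOfTwo = appendRow(rdd1, (7, 8))

    // collect() pulls every row to the driver, so use it only on small data.
    unionOfTwo.collect().foreach(println)

    sc.stop()
  }
}
```

Note that union does not remove duplicates, so appending a row that already exists gives you two copies; call distinct() on the result if that matters. If you append rows one at a time in a loop, each call adds another UnionRDD to the lineage, so it is usually better to gather the new rows into one collection and union once.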