Tags: apache-spark pyspark
How can I generate all ordered pairs (2-element combinations) from an RDD? For example:
rdd1 = sc.parallelize(['d', '112', 'b', 'c', 'i', 'a', 'e'])
Expected output:
[('d','112'), ('d','b'), ('d','c'), ('d','i'), ..., ('a','e')]
Thanks
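One possible approach (a sketch, not from the original post): the desired output is exactly the 2-element combinations of the list, taken in original order. On small data this can be done on the driver with `itertools.combinations`:

```python
from itertools import combinations

data = ['d', '112', 'b', 'c', 'i', 'a', 'e']

# All 2-element combinations, preserving the elements' original order:
# ('d', '112'), ('d', 'b'), ..., ('a', 'e')
pairs = list(combinations(data, 2))
print(pairs[:4])  # first few pairs
print(pairs[-1])  # ('a', 'e')
```

For a distributed version that stays in Spark, one could (assumption, not tested here) attach positions with `rdd.zipWithIndex()`, take `cartesian` of the indexed RDD with itself, and `filter` to keep only pairs where the first element's index is smaller than the second's, then `map` away the indices.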