I get an error on the line print(y.glom().collect()). Why does this happen?

Time: 2018-10-11 06:51:05

Tags: pyspark

x = sc.parallelize([[('Z','Mera')],[('B','Bharath')],[('M','Mahaan')],[('B','Bharath')],[('J','Jai Ho')]])
print(x.collect())

y = x.partitionBy(2, lambda z: 0 if z[0] < 'H' else 1)
#print(x.glom().collect())

print(y.glom().collect()) # Fix this

1 answer:

Answer 0 (score: 0)

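partitionBy expects an RDD of (key, value) pairs, but each element of x is a list holding a single tuple, so the element cannot be unpacked into a key and a value. Flatten the lists first, partition, then wrap each pair in a list again. Try this: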

>>> y = x.flatMap(lambda x: x).partitionBy(2, lambda z: 0 if z[0] < 'H' else 1).map(lambda x: [x])
>>> y.collect()
[[('B', 'Bharath')], [('B', 'Bharath')], [('Z', 'Mera')], [('M', 'Mahaan')], [('J', 'Jai Ho')]]
>>> print(y.glom().collect())
[[[('B', 'Bharath')], [('B', 'Bharath')]], [[('Z', 'Mera')], [('M', 'Mahaan')], [('J', 'Jai Ho')]]]
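
To see why the original call fails without starting Spark, here is a rough pure-Python sketch of the bucketing step. It is a simplified model, not PySpark's actual implementation: `partition_by` and `choose` are hypothetical names made up for this illustration, and the only assumption carried over from Spark is that each record is unpacked into a key and a value and that the partition function is called with the key.

```python
def partition_by(records, num_partitions, partition_func):
    """Simplified stand-in for RDD.partitionBy (hypothetical helper, not Spark code)."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in records:  # every record must unpack into exactly two items
        buckets[partition_func(key) % num_partitions].append((key, value))
    return buckets


def choose(key):
    # Same rule as in the question: keys starting before 'H' go to partition 0.
    return 0 if key[0] < 'H' else 1


nested = [[('Z', 'Mera')], [('B', 'Bharath')], [('M', 'Mahaan')],
          [('B', 'Bharath')], [('J', 'Jai Ho')]]

# Original shape: each record is a one-element list, so unpacking fails.
try:
    partition_by(nested, 2, choose)
    error = None
except ValueError as exc:
    error = exc
print(type(error).__name__)

# Fixed shape: flatten to (key, value) pairs, partition, then re-wrap each pair.
flat = [pair for record in nested for pair in record]
buckets = partition_by(flat, 2, choose)
wrapped = [[[pair] for pair in bucket] for bucket in buckets]
print(wrapped)
```

The second print gives the same nesting as the glom() output above. In the question the failure only shows up at collect() because Spark evaluates lazily, so partitionBy itself returns without touching the data.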