The code below works fine in PySpark (Spark 1.6.2). Can anyone help me write the equivalent code in Scala?
PySpark code:
locwtemps = sc.parallelize(['Hayward,71|69|71|71|72',
'Baumholder,46|42|40|37|39',
'Alexandria,50|48|51|53|44',
'Melbourne,88|101|85|77|74'])
kvpairs = locwtemps.map(lambda x: x.split(','))
locwtemps = kvpairs.flatMapValues(lambda x: x.split('|')).map(lambda x: (x[0], int(x[1])))
Output:
# [(u'Hayward', 71),
#  (u'Hayward', 69),
#  (u'Hayward', 71),
#  (u'Hayward', 71),
#  (u'Hayward', 72)]
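To make the intended transformation explicit, here is a minimal sketch in plain Python, without Spark, of what the split / flatMapValues / map chain does to each record. The helper name flat_map_values is hypothetical and only imitates the behaviour of the RDD method on an ordinary list; it is not part of the Spark API.

```python
def flat_map_values(pairs, f):
    """Hypothetical stand-in for RDD.flatMapValues on a plain list:
    apply f to each value and emit one (key, item) pair per item."""
    return [(key, item) for key, value in pairs for item in f(value)]

# Two of the sample records from the question
lines = ['Hayward,71|69|71|71|72',
         'Baumholder,46|42|40|37|39']

# Step 1: split each record into (city, "t1|t2|...")
kvpairs = [tuple(line.split(',')) for line in lines]

# Step 2: expand the temperature string, keeping the city as the key
expanded = flat_map_values(kvpairs, lambda v: v.split('|'))

# Step 3: convert each temperature to an integer
result = [(city, int(temp)) for city, temp in expanded]

print(result[:5])
```

Any Scala translation should produce the same (city, temperature) pairs in the same order.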
Answer 0 (score: -2)
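val locwtemps = sc.parallelize(List("Hayward,71|69|71|71|72", "Baumholder,46|42|40|37|39", "Alexandria,50|48|51|53|44", "Melbourne,88|101|85|77|74"))
// Split each record into Array(city, "t1|t2|...")
val kvpairs = locwtemps.map(_.split(','))
// Build (city, temps) pairs, emit one pair per temperature, then convert each temperature to Int.
// split('|') takes a Char, so the pipe is matched literally rather than as a regex.
kvpairs.map(x => (x(0), x(1))).flatMapValues(_.split('|')).mapValues(y => y.toInt).collect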