Hi, I'm a beginner in both Scala and Spark, so this may be easy, but I can't figure out how to solve it.
val a = sc.parallelize(List("dog","tiger","lion","cat","panther","eagle"))
val b = a.map(x => (x.length, x))
The desired output is:
Array[(Int,String)]=Array((4,lion),(7,panther),(3,dogcat),(5,tigereagle))
This is what I tried:
val res = a.collect()
for (i <- 0 to (res.length - 2)) {
  for (j <- 1 to (res.length - 1)) {
    if (res(i).length == res(j).length && res(i) != res(j)) println(res(i).concat(res(j)))
  }
}
But this does not give me the output in the desired form.
Answer 0 (score: 0)
Try this:
scala> val list = List("dog","tiger","lion","cat","panther","eagle")
list: List[String] = List(dog, tiger, lion, cat, panther, eagle)
scala> val r = list.groupBy(_.length).map {
  case (len, words) => len -> words.mkString("")
}
r: scala.collection.immutable.Map[Int,String] = Map(5 -> tigereagle, 4 -> lion, 7 -> panther, 3 -> dogcat)
scala> r.toArray
res3: Array[(Int, String)] = Array((5,tigereagle), (4,lion), (7,panther), (3,dogcat))
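One thing to watch for: the iteration order of the Map returned by groupBy is not guaranteed, so the tuples in the array can come out in any order. If a stable order matters, a minimal sketch (plain Scala collections, no Spark; the names words and grouped are only illustrative) is to sort by the key after converting:

```scala
// Plain Scala collections, no Spark needed for this part.
val words = List("dog", "tiger", "lion", "cat", "panther", "eagle")

// Group by length, join each group into one string, then sort by length
// so the result does not depend on the Map's iteration order.
val grouped: Array[(Int, String)] =
  words
    .groupBy(_.length)
    .map { case (len, ws) => (len, ws.mkString) }
    .toArray
    .sortBy(_._1)

println(grouped.mkString(", "))
// (3,dogcat), (4,lion), (5,tigereagle), (7,panther)
```

Within each group the words keep the order they had in the original list, which is why "dog" comes before "cat".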
Answer 1 (score: 0)
Use groupBy:
val a = sc.parallelize(List("dog","tiger","lion","cat","panther","eagle"))
a.groupBy(_.length).foreach(println)
This gives you one CompactBuffer per length:
(5,CompactBuffer(tiger, eagle))
(4,CompactBuffer(lion))
(3,CompactBuffer(dog, cat))
(7,CompactBuffer(panther))
Now you can use each CompactBuffer however you need. To get a single string per key, map over the RDD and call mkString on each CompactBuffer:
a.groupBy(_.length).map(x => (x._1, x._2.mkString("")))
Printing the result gives you:
(4,lion)
(7,panther)
(3,dogcat)
(5,tigereagle)
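As an alternative to groupBy, and closer to the (length, word) pair RDD from the question, reduceByKey can concatenate the strings for each key directly. This is only a sketch under some assumptions: it runs Spark in local mode, the names words, byLength and joined are illustrative, and SparkContext.getOrCreate is used so that it reuses the existing context when pasted into spark-shell. The order in which strings for the same key are concatenated is not guaranteed across partitions, so you may see "catdog" instead of "dogcat".

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: local mode for illustration. In spark-shell this reuses the existing context.
val conf = new SparkConf().setAppName("group-by-length").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

val words = sc.parallelize(List("dog", "tiger", "lion", "cat", "panther", "eagle"))

// Pair each word with its length: RDD[(Int, String)]
val byLength = words.map(w => (w.length, w))

// Concatenate all words that share the same length.
val joined = byLength.reduceByKey(_ + _)

// Bring the result to the driver and sort by length for stable printing.
val result: Array[(Int, String)] = joined.collect().sortBy(_._1)
result.foreach(println)

// When running as a standalone program, call sc.stop() at the end.
```

For a handful of strings either approach is fine; reduceByKey combines values on each partition before the shuffle, which tends to matter more on large datasets.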
Answer 2 (score: 0)
You can use groupBy to group the strings by their length, then map over the grouped result to transform the output into the shape you want.
val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "panther", "eagle"))
//group by length of the string and concatenate each string having same length
val b = a.groupBy(_.length).map(x => (x._1, x._2.mkString("")))
//print the output
b.foreach(print(_))
//output
//(3,dogcat)(4,lion)(7,panther)(5,tigereagle)
If you need the result as an Array[(Int, String)], use collect:
val array: Array[(Int, String)] = b.collect()