How to use a for loop to compare strings in Scala

Asked: 2018-03-23 10:11:52

Tags: scala apache-spark

Hi, I'm a beginner with Scala and Spark, so this may be easy, but I don't know how to solve this problem.

val a = sc.parallelize(List("dog","tiger","lion","cat","panther","eagle"))

val b = a.map(x => (x.length, x))

The desired output is:

Array[(Int,String)]=Array((4,lion),(7,panther),(3,dogcat),(5,tigereagle))

This is what I tried:

val res = a.collect()
for (i <- 0 to (res.length - 2)) {
  for (j <- 1 to (res.length - 1)) {
    if (res(i).length == res(j).length && res(i) != res(j))
      println(res(i).concat(res(j)))
  }
}

But I am not getting the output in the desired form.

3 Answers:

Answer 0: (score: 0)

Try this:

scala> val list = List("dog","tiger","lion","cat","panther","eagle")
list: List[String] = List(dog, tiger, lion, cat, panther, eagle)

scala> val r = list.groupBy(_.length).map {
     |   case (len, words) => len -> words.mkString("")
     | }
r: scala.collection.immutable.Map[Int,String] = Map(5 -> tigereagle, 4 -> lion, 7 -> panther, 3 -> dogcat)

scala> r.toArray
res3: Array[(Int, String)] = Array((5,tigereagle), (4,lion), (7,panther), (3,dogcat))
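
The iteration order of the resulting Map is not guaranteed, so it may not match the order in the question. As a minimal sketch (plain Scala collections, no Spark; the names `words` and `grouped` are only illustrative), sorting by length gives a stable order:

```scala
val words = List("dog", "tiger", "lion", "cat", "panther", "eagle")

// Group by length, join each group into one string,
// then sort by length so the order is deterministic.
val grouped: Array[(Int, String)] =
  words
    .groupBy(_.length)
    .map { case (len, ws) => (len, ws.mkString) }
    .toArray
    .sortBy(_._1)

println(grouped.mkString(", "))
// prints: (3,dogcat), (4,lion), (5,tigereagle), (7,panther)
```

Within each group, `groupBy` on a List keeps the elements in their original order, which is why "dog" comes before "cat".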

Answer 1: (score: 0)

Use groupBy

It's simple:
val a = sc.parallelize(List("dog","tiger","lion","cat","panther","eagle"))

a.groupBy(_.length).foreach(println)

This gives you one CompactBuffer per key, holding the strings of that length:
(5,CompactBuffer(tiger, eagle))
(4,CompactBuffer(lion))
(3,CompactBuffer(dog, cat))
(7,CompactBuffer(panther))

You can now use each CompactBuffer as needed. Map over the RDD and call mkString on each CompactBuffer to build a single string:

a.groupBy(_.length).map(x => (x._1, x._2.mkString("")))

This gives you:

(4,lion)
(7,panther)
(3,dogcat)
(5,tigereagle)
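
An alternative that starts from the `(length, word)` pairs in the question is `reduceByKey`. This is only a sketch: the local `SparkSession` setup and the app name are assumptions for running outside spark-shell, where `sc` already exists. String concatenation is not commutative, so when words with the same length sit in different partitions, the order inside each concatenated string may vary.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration; in spark-shell, reuse the provided `sc`.
val spark = SparkSession.builder().master("local[1]").appName("group-by-length").getOrCreate()
val sc = spark.sparkContext

val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "panther", "eagle"))

// Key each word by its length, then concatenate the words that share a key.
val b = a.map(x => (x.length, x)).reduceByKey(_ + _)

val result: Array[(Int, String)] = b.collect()
result.foreach(println)

spark.stop()
```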

Answer 2: (score: 0)

You can use groupBy to group the strings by length, then map over the grouped result to shape the output the way you want.

val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "panther", "eagle"))

//group by length of the string and concatenate each string having same length
val b = a.groupBy(_.length).map(x => (x._1, x._2.mkString("")))

//print the output
b.foreach(print(_))

//output
//(3,dogcat)(4,lion)(7,panther)(5,tigereagle)

If you need the result as an Array[(Int, String)], use collect:

val array: Array[(Int, String)] = b.collect()
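
Since the question asks about a for loop, here is a hedged sketch of a single-pass loop over the collected array. The `LinkedHashMap` is a choice made here for illustration, not something the answers above use; it keeps the keys in the order they were first seen.

```scala
import scala.collection.mutable

// Stand-in for the result of a.collect() in the question.
val words = Array("dog", "tiger", "lion", "cat", "panther", "eagle")

// Accumulate the concatenation for each length in one pass.
val byLength = mutable.LinkedHashMap.empty[Int, String]
for (w <- words) {
  byLength(w.length) = byLength.getOrElse(w.length, "") + w
}

val loopResult: Array[(Int, String)] = byLength.toArray
println(loopResult.mkString(", "))
// prints: (3,dogcat), (5,tigereagle), (4,lion), (7,panther)
```

Unlike the nested loop in the question, this visits each word once and never pairs a word with itself or produces both "dogcat" and "catdog".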