How to write a Scala function that takes a mapping function to a generic type

Asked: 2015-08-04 12:52:36

Tags: scala generics

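Using Spark 1.3.0 with Scala, I have two functions that do essentially the same thing to a given RDD[(Long, String, Boolean, String)]. They differ only in the mapping function that turns each (Long, String, Boolean, String) into a 2-element tuple: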

def rddToMap1(rdd: RDD[(Long, String, Boolean, String)]): Map[Long, Set[(String, Boolean)]] = {
rdd
  .map(t => (t._1, (t._2, t._3))) //mapping function 1
  .groupBy(_._1)
  .mapValues(_.toSet)
  .collect
  .toMap
  .mapValues(_.map(_._2))
  .map(identity)
}


def rddToMap2(rdd: RDD[(Long, String, Boolean, String)]): Map[(Long, String), Set[String]] = {
rdd
  .map(t => ((t._1, t._2), t._4)) //mapping function 2
  .groupBy(_._1)
  .mapValues(_.toSet)
  .collect
  .toMap
  .mapValues(_.map(_._2))
  .map(identity)
}

I want to write a generic function, genericRDDToMap, that I can later use to implement both rddToMap1 and rddToMap2.

This does not work:

def genericRDDToMap[A](rdd: RDD[(Long, String, Boolean, String)], mapFn: (Long, String, Boolean, String) => A) = {      
rdd     
  .map(mapFn) //ERROR       
  .groupBy(_._1)        
  .mapValues(_.toSet)       
  .collect      
  .toMap        
  .mapValues(_.map(_._2))       
  .map(identity)        
}

The (Eclipse) interpreter does not accept mapFn as a valid mapping function. It reports:

type mismatch; found : (Long, String, Boolean, String) => A required: ((Long, String, Boolean, String)) => ?
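The mismatch can be reproduced with plain Scala collections, without Spark. In Scala 2, the type (Long, String, Boolean, String) => A describes a function of four separate arguments, while map hands each element over as a single Tuple4. The sketch below is only an illustration; the names f4, f1, rows, and pairs are made up for it.

```scala
// A function of FOUR arguments (Function4).
val f4: (Long, String, Boolean, String) => (Long, String) =
  (id, name, _, _) => (id, name)

// A function of ONE argument that happens to be a Tuple4 (Function1).
// Function4#tupled converts the former into the latter.
val f1: ((Long, String, Boolean, String)) => (Long, String) = f4.tupled

val rows = Seq((1L, "a", true, "x"), (2L, "b", false, "y"))

// rows.map(f4)  // rejected in Scala 2: map expects a one-argument function
val pairs = rows.map(f1)
```

So either declare the parameter with the extra parentheses, or keep the four-argument form and pass mapFn.tupled to map.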

And even if I get past this, how would the compiler know that my generic type A has the _1 member that groupBy relies on?

To sum up: how should I do this?

1 answer:

Answer 0 (score: 0)

You are missing the parentheses around (Long, String, Boolean, String). If A is always some TupleX type, you can say so with an upper bound (here I use Tuple2):

  def genericRDDToMap[X, Y, A <: Tuple2[X,Y]](rdd: RDD[(Long, String, Boolean, String)], 
                         mapFn: ((Long, String, Boolean, String)) => A) (implicit ev: ClassTag[A])= {     
      ... 
  }
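As one possible way to fill in the body, here is a minimal sketch that drops the upper bound and instead makes the mapping function return a pair (K, V) directly, so the key type is known to the compiler. This is an assumption-laden variant, not the answerer's code: the type parameters K and V are introduced here, and groupByKey replaces the original groupBy(_._1) followed by .mapValues(_.map(_._2)), which should yield the same Map. It assumes Spark 1.3.0 on the classpath.

```scala
import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// Sketch: K and V are hypothetical type parameters, not from the original answer.
// The ClassTag context bounds are needed by Spark's pair-RDD operations.
def genericRDDToMap[K: ClassTag, V: ClassTag](
    rdd: RDD[(Long, String, Boolean, String)],
    mapFn: ((Long, String, Boolean, String)) => (K, V)): Map[K, Set[V]] =
  rdd
    .map(mapFn)          // RDD[(K, V)]
    .groupByKey()        // RDD[(K, Iterable[V])]
    .mapValues(_.toSet)  // RDD[(K, Set[V])]
    .collect()
    .toMap

// The two original functions expressed through the generic one.
def rddToMap1(rdd: RDD[(Long, String, Boolean, String)]): Map[Long, Set[(String, Boolean)]] =
  genericRDDToMap(rdd, t => (t._1, (t._2, t._3)))

def rddToMap2(rdd: RDD[(Long, String, Boolean, String)]): Map[(Long, String), Set[String]] =
  genericRDDToMap(rdd, t => ((t._1, t._2), t._4))
```

Returning (K, V) from mapFn answers the _1 question: the compiler sees a Tuple2, so the key and value are available without any bound on a separate type A.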