This is my function:
/**
 * @param spark               the active SparkSession
 * @param templateInfo        (code, (type_ids, content, lang))
 * @param pushedTemplatedInfo (CODE, (PUSH_DATE, PUSHED_CNT))
 * @param templateCycle
 * @param catTypeId
 * @param templateCount
 */
def getNormalTemplate(spark: SparkSession,
                      templateInfo: RDD[(String, (String, String, String))],
                      pushedTemplatedInfo: RDD[(String, (String, Int))],
                      templateCycle: Int, catTypeId: Int, templateCount: Int) = {
  val templateDate = pushUtil.getNextSomeday(templateCycle)
  println("templateDate:" + templateDate)
  val deleteTemplatedInfo = pushedTemplatedInfo.filter(_._2._1 >= templateDate).map(x => (x._1, x._2._1))
  val brpushedTemplatedMap = spark.sparkContext
    .broadcast(pushedTemplatedInfo.map(x => (x._1, x._2._2)).distinct().collectAsMap())
  val TemplateCodeSelection = templateInfo.filter(x => x._2._1 == catTypeId)
    .map(x => (x._1, brpushedTemplatedMap.value.getOrElse(x._1, 0)))
    .reduceByKey((x, y) => math.max(x, y))
    .subtractByKey(deleteTemplatedInfo)
    .sortBy(x => (x._2, x._1))(Ordering.Tuple2(Ordering.Int, Ordering.String.reverse))
  // (code, (type_ids, content, lang))
  val res = templateInfo.map(x => x._1)
}
Can anyone tell me why this fails to compile? I wrote the sortBy call following How to sort a list in Scala by two fields?

Answer 0: (score: 2)
If you look at the signature of the method sortBy, you will see that it takes two implicit parameters, an Ordering and a ClassTag. Since you are passing the Ordering explicitly, you need to pass a ClassTag for the tuple explicitly as well:
def sortBy[K](
    f: (T) => K,
    ascending: Boolean = true,
    numPartitions: Int = this.partitions.length)
    (implicit ord: Ordering[K], ctag: ClassTag[K]): RDD[T] = withScope {
  this.keyBy[K](f)
      .sortByKey(ascending, numPartitions)
      .values
}
You can create a ClassTag for the tuple like this:

ClassTag[(Int, String)](classOf[(Int, String)])
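Putting it together, here is a minimal sketch of the corrected call, assuming import scala.reflect.ClassTag is in scope; it supplies both explicit arguments to sortBy in the same order as the implicit parameter list in the signature above:

import scala.reflect.ClassTag

// Pass the Ordering and the ClassTag explicitly, matching
// (implicit ord: Ordering[K], ctag: ClassTag[K]) in sortBy's signature.
.sortBy(x => (x._2, x._1))(
  Ordering.Tuple2(Ordering.Int, Ordering.String.reverse),
  ClassTag[(Int, String)](classOf[(Int, String)]))

Alternatively, you can declare the Ordering as an implicit val, e.g. implicit val ord: Ordering[(Int, String)] = Ordering.Tuple2(Ordering.Int, Ordering.String.reverse), and then call .sortBy(x => (x._2, x._1)) with no explicit argument list, letting the compiler resolve both the Ordering and the ClassTag for you.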