Question

I have this:

(0,List(pablo, luca))
(1,List(marco))
(3,List(anna))
(2,List(fobi))

I would like to replace each Int (0, 1, 2, 3) with a corresponding string ("zero", "uno", "due", "tree"):

(zero,List(pablo, luca))
(uno,List(marco))
(tree,List(anna))
(due,List(fobi))

To do that, I am using this:

finalCommunitiesDetectedRdd: RDD[(Int, Seq[String])] = ...

def getNameOfBin(id: Int): String = id match {
    case 0  => "Low SA Users:"
    case 1  => "Medium-Low SA Users:"
    case 2  => "Medium-High SA Users:"
    case 3  => "High SA Users:"
    case other => "nothing" // what to do if nothing else matches
}

var finalCommunitiesDetectedWithNamesRdd: RDD[(String, Seq[String])] = finalCommunitiesDetectedRdd.map{ case (id, Seq(username)) => (getNameOfBin(id), Seq(username)) }

finalCommunitiesDetectedWithNamesRdd.foreach(println) // check

But I get:

18/01/20 10:38:32 ERROR Executor: Exception in task 0.0 in stage 49.0 (TID 26)
scala.MatchError: (0,List(pablo, luca)) (of class scala.Tuple2)

Why?

Answer 1

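Seq(username) matches only a sequence with exactly one element, so the two-element entry (0,List(pablo, luca)) matches no case and throws a MatchError. If you don't care about the contents of the tuple's second element, bind it to a plain variable instead: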

case (id, seq) => (getNameOfBin(id), seq)
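
To see the difference without a cluster, here is a minimal sketch that uses a plain Scala Seq as a stand-in for the RDD. That substitution is an assumption made so the snippet runs without Spark; the pattern in the case clause behaves the same way inside RDD.map. The names MatchDemo, communities, named and onlySingletons are illustrative and not from the question.

```scala
object MatchDemo {
  // Same mapping as in the question.
  def getNameOfBin(id: Int): String = id match {
    case 0 => "Low SA Users:"
    case 1 => "Medium-Low SA Users:"
    case 2 => "Medium-High SA Users:"
    case 3 => "High SA Users:"
    case _ => "nothing" // fallback when no other case matches
  }

  // Stand-in for finalCommunitiesDetectedRdd (a local Seq, not an RDD).
  val communities: Seq[(Int, Seq[String])] = Seq(
    (0, List("pablo", "luca")),
    (1, List("marco")),
    (3, List("anna")),
    (2, List("fobi"))
  )

  // A plain variable pattern binds the whole sequence, whatever its length.
  val named: Seq[(String, Seq[String])] =
    communities.map { case (id, users) => (getNameOfBin(id), users) }

  // Seq(username) matches one-element sequences only. collect skips the
  // non-matching entries, where map would throw a MatchError.
  val onlySingletons: Seq[(String, Seq[String])] =
    communities.collect { case (id, Seq(username)) => (getNameOfBin(id), Seq(username)) }

  def main(args: Array[String]): Unit = {
    named.foreach(println)
    println(s"entries matched by Seq(username): ${onlySingletons.size} of ${communities.size}")
  }
}
```

With the variable pattern all four entries are renamed. With Seq(username) only the three single-user entries match, which is why the first element of the RDD triggered the error.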
