I'm new to Scala and am having trouble with this code.
x.map{case (x1: Any, x2: Any,x3: String) => x1}.count()
throws the error
scala.MatchError: null
Here is x:
scala> x.cache()
res111: x.type = MapPartitionsRDD[522] at map at <console>:49
scala> x
res109: org.apache.spark.rdd.RDD[(Any, Any, String)] = MapPartitionsRDD[522] at map at <console>:49
scala> x.count()
res112: Long = 64508825
Any pointers would be appreciated.
Answer 0 (score: 1)
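The error message
scala.MatchError: null
clearly shows that x contains a null element where an
(Any, Any, String)
value is expected.
So you should filter out the nulls before counting: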
x.filter(_ != null).map{case (x1: Any, x2: Any,x3: String) => x1}.count()
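To see why the filter helps, here is a minimal sketch that reproduces the behaviour without Spark. It uses a plain Seq as a stand-in for the RDD; the object name NullFilterDemo and the sample rows are made up for illustration and are not from the question.

```scala
import scala.util.Try

object NullFilterDemo {
  // Hypothetical sample data standing in for the RDD; the second element is null.
  val rows: Seq[(Any, Any, String)] = Seq((1, "a", "x"), null, (2, "b", "y"))

  // Without the filter, the null element reaches the tuple pattern,
  // no case matches it, and the map throws scala.MatchError.
  val unfiltered: Try[Seq[Any]] =
    Try(rows.map { case (x1: Any, x2: Any, x3: String) => x1 })

  // Dropping the null elements first lets every remaining tuple match.
  val firsts: Seq[Any] =
    rows.filter(_ != null).map { case (x1: Any, x2: Any, x3: String) => x1 }

  def main(args: Array[String]): Unit = {
    println(unfiltered.isFailure)
    println(firsts.size)
  }
}
```

Note that this filter only removes elements that are themselves null. A tuple whose third field is null would still fail the x3: String pattern, because a typed pattern does not match null.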
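If you are not sure whether your data contains null values, you can instead add a catch-all case to the match, as shown below, and filter out the unmatched elements afterwards: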
x.map{_ match {
case (x1: Any, x2: Any,x3: String) => x1
case _ => "not matched"
}}.filter(_ != "not matched").count()
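As a possible alternative to the sentinel string, collect with a partial function skips every element the pattern does not match, so no placeholder value is needed. The sketch below again uses a plain Seq with made-up rows and a hypothetical object name, CollectDemo. RDD also has a collect overload that takes a partial function and returns an RDD, so x.collect { ... }.count() should behave analogously, though that has not been run against the data in the question.

```scala
object CollectDemo {
  // Hypothetical sample data: one null element and one tuple whose third field is null.
  val rows: Seq[(Any, Any, String)] =
    Seq((1, "a", "x"), null, (2, "b", null), (3, "c", "z"))

  // collect applies the partial function only where it is defined, so
  // elements that do not match the pattern are skipped instead of throwing.
  val firsts: Seq[Any] = rows.collect { case (x1: Any, _, _: String) => x1 }

  def main(args: Array[String]): Unit =
    println(firsts.size)
}
```

Like the catch-all version, this also drops tuples whose third field is null, since the String type pattern does not match null. Unlike the sentinel approach, it cannot accidentally discard a row whose first field happens to equal "not matched".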