Spark error: Unsupported literal type class org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema

Date: 2018-04-03 13:57:40

Tags: scala apache-spark apache-spark-sql

  

val crimeDF = spark.read.option("header", "true").csv("crime_data.csv")
crimeDF.show(5)

var a = crimeDF.select(max($"IncidntNum")).take(1)
println(a(0).toString)

 a: Array[org.apache.spark.sql.Row] = Array([991309046])
    [991309046]

I am running this command:

crimeDF.where($"IncidntNum ===" +a(0))

but I get this error:

java.lang.RuntimeException: Unsupported literal type class org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema [991309046]
  at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:77)
  at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:163)
  at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:163)
  at scala.util.Try.getOrElse(Try.scala:79)
  at org.apache.spark.sql.catalyst.expressions.Literal$.create(literals.scala:162)
  at org.apache.spark.sql.functions$.typedLit(functions.scala:112)
  at org.apache.spark.sql.functions$.lit(functions.scala:95)
  at org.apache.spark.sql.Column.$plus(Column.scala:648)
  ... 76 elided

2 answers:

Answer 0 (score: 3)

You are trying to pass the entire Row as a literal. Extract the value from the Row before using it in the follow-up query, with one of:

crimeDF.where($"IncidntNum" === a(0).getInt(0))

crimeDF.where($"IncidntNum" === a(0).getString(0))

Use the first form if the numbers in your output are Ints and the second if they are Strings.
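The stack trace in the question shows why the original call fails: `$"IncidntNum ===" + a(0)` invokes `Column.+`, which wraps its argument in `lit(...)`, and `lit` cannot turn a whole `Row` into a literal. The following is a minimal sketch of the fix, not the asker's actual data: the two-row DataFrame and the `Category` column are hypothetical stand-ins for `crime_data.csv`, and the values are kept as strings because `spark.read.csv` without the `inferSchema` option reads every column as a string.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.max

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("row-literal-sketch").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for crime_data.csv, with string-typed columns.
val crimeDF = Seq(
  ("150060275", "ASSAULT"),
  ("991309046", "THEFT")
).toDF("IncidntNum", "Category")

// take(1) returns Array[Row]; element 0 is still a Row, not a scalar.
val row: Row = crimeDF.select(max($"IncidntNum")).take(1)(0)

// Extract the scalar with the getter that matches the column type (String here).
val maxIncident: String = row.getString(0)

// Compare the column against the extracted value instead of against the Row.
val matches = crimeDF.where($"IncidntNum" === maxIncident)
matches.show()
```

Note that on a string column `max` compares lexicographically, so the result can differ from the numeric maximum when the values have different lengths.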

Answer 1 (score: 1)

You should do it like this:

var a = crimeDF.select(max($"IncidntNum")).take(1)(0).getAs[Int](0)

Then use either of the following:

crimeDF.where($"IncidntNum" === a)

crimeDF.where($"IncidntNum" === lit(a))
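One caveat on `getAs[Int]`: it only works when the column really holds integers. With the question's reader options (`header` only, no `inferSchema`), `IncidntNum` is read as a string, and assigning `getAs[Int](0)` to an `Int` would fail at runtime with a `ClassCastException`. The sketch below assumes one possible remedy, casting the column to `int` first; the in-memory data is a hypothetical stand-in for the CSV file.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, max}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("typed-max-sketch").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the CSV data: string values, as read without inferSchema.
val raw = Seq("150060275", "991309046").toDF("IncidntNum")

// Cast to int so that max(...) yields an integer column.
val typed = raw.withColumn("IncidntNum", $"IncidntNum".cast("int"))

// The column type is now IntegerType, so getAs[Int] is safe.
val a: Int = typed.select(max($"IncidntNum")).take(1)(0).getAs[Int](0)

// Both forms build the same predicate: === wraps a plain value in lit(...) itself.
val viaScalar = typed.where($"IncidntNum" === a)
val viaLit = typed.where($"IncidntNum" === lit(a))
```

Reading the file with `.option("inferSchema", "true")` is another way to get a numeric column, at the cost of an extra pass over the data.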