SparkR - Error in db.readTypedVector(con, colType, numRows): Unsupported type for deserialization: "some message from one of the columns"

Date: 2018-11-07 17:57:59

Tags: r apache-spark sparkr databricks

I have a SparkDataFrame and am running a function on it with gapplyCollect. This usually fails with the following error:

Error in handleErrors(returnStatus, conn) : 
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 25 in stage 694.0 failed 4 times, most recent failure: Lost task 25.3 in stage 694.0 (TID 24447, 10.139.64.4, executor 0): org.apache.spark.SparkException: R computation failed with
 Error in db.readTypedVector(con, colType, numRows) : 
  Unsupported type for deserialization: Some message from one of the columns. Calls: <Anonymous> -> lapply -> lapply -> FUN -> db.readTypedVector
Execution halted
    at org.apache.spark.api.r.RRunner.compute(RRunner.scala:108)
    at org.apache.spark.sql.execution.FlatMapGroupsInRExec$$anonfun$14.apply(objects.scala:455)
    at org.apache.spark.sql.execution.FlatMapGroupsInRExec$$anonfun$14.apply(objects.scala:432)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:842)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:842)

Any ideas what might be going wrong? The message is short, but it contains `[]`, `::` and `-` characters. If I truncate the message to just 10 characters, it works. I am using Azure Databricks.
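For reference, this is a minimal sketch of the pattern I'm using, with the truncation workaround applied before grouping. The column and function names (`df`, `msg`, `grp`) are placeholders, not the real ones from my job, and this requires a running Spark cluster:

```r
library(SparkR)

# Workaround: truncate the problematic character column (here
# hypothetically named "msg") to 10 characters before grouping,
# which avoids the deserialization error.
df <- withColumn(df, "msg", substr(df$msg, 1, 10))

# gapplyCollect(x, cols, func): func receives the grouping key and a
# local data.frame for each group, and must return a data.frame.
result <- gapplyCollect(
  df,
  "grp",
  function(key, x) {
    data.frame(grp = key[[1]], n = nrow(x), stringsAsFactors = FALSE)
  }
)
```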

0 Answers:

No answers yet