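I'm running a simple Spark aggregation: I read data from Avro files as a DataFrame, map the rows to case classes with rdd.map, and then run some aggregations such as count. Most of the time it works fine, but occasionally it throws a strange CodeGen exception: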
[ERROR] 2017-03-24 08:43:20,595 org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator logError - failed to compile: java.lang.NullPointerException
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */ return new SpecificUnsafeProjection(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
This is the code I'm using:
val deliveries = sqlContext.read.format("com.databricks.spark.avro").load(deliveryDir)
  .selectExpr("FROM_UNIXTIME(timestamp/1000, 'yyyyMMdd') as day",
    "FROM_UNIXTIME(timestamp/1000, 'yyyyMMdd_HH') as hour",
    "deliveryId"
  )
  .filter("valid = true").rdd
  .map(row => {
    val deliveryId = row.getAs[Long]("deliveryId")
    val uid = row.getAs[Long]("uid")
    val deviceModelId: Integer = if (row.getAs[Integer]("deviceModelId") == null) {
      0
    } else {
      row.getAs[Integer]("deviceModelId")
    }
    val delivery = new DeliveryEvent(deliveryId, row.getAs[Integer]("adId"), row.getAs[Integer]("adSpaceId"), uid, deviceModelId)
    eventCache.getDeliverCache().put(new Element(deliveryId, delivery))
    new InteractedAdInfo(row.getAs[String]("day"), delivery.deliveryId, delivery.adId, delivery.adSpaceId, uid, deviceModelId, deliveryEvent=1)
  })
deliveries.count()
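The null check on deviceModelId in the map above reads the column twice. As a minimal sketch, assuming the value arrives as a boxed java.lang.Integer (which is what row.getAs[Integer] returns), the same defaulting can be written once with Option. The object name NullSafeFields and the helper intOrDefault are hypothetical, and this is only a readability tweak; nothing here suggests it is related to the CodeGen exception.

```scala
object NullSafeFields {
  // Hypothetical helper: unwrap a possibly-null boxed Integer,
  // falling back to a default when the value is null.
  def intOrDefault(value: Integer, default: Int = 0): Int =
    Option(value).map(_.intValue).getOrElse(default)

  def main(args: Array[String]): Unit = {
    val missing: Integer = null
    val present: Integer = Integer.valueOf(42)
    println(intOrDefault(missing)) // prints 0
    println(intOrDefault(present)) // prints 42
  }
}
```

Inside the map, the if/else could then become val deviceModelId: Integer = NullSafeFields.intOrDefault(row.getAs[Integer]("deviceModelId")).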
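I can't reproduce the problem, but I hit it intermittently in production. I run this as a Java application with the Maven coordinates spark-core_2.11:2.1.0 and spark-avro_2.11:3.1.0.
What could be causing this? I start the application with java -Xms8G -Xmx12G -XX:PermSize=1G -XX:MaxPermSize=1G.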
Answer 0 (score: 0)
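I saw a similar error with a very simple action, spark.read.format("com.databricks.spark.avro").load(fn).cache.count, which failed intermittently on large Avro files (in the 4GB-10GB range in my tests). However, I was able to eliminate the error by removing the setting --conf spark.executor.cores=4 so that it falls back to its default of 1.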
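The same workaround can also be applied in code rather than on the command line. The sketch below assumes the application builds its own SparkSession; the object name, the app name, and the input path taken from args(0) are hypothetical. With spark.executor.cores set to 1, each executor JVM runs one task at a time (given the default spark.task.cpus of 1), which is the setting described above; why that avoids the error is not established here. Note that 1 is the default on YARN, while a standalone cluster defaults to all available cores on the worker, so there the value may need to be set explicitly.

```scala
import org.apache.spark.sql.SparkSession

object SingleCoreExecutors {
  def main(args: Array[String]): Unit = {
    // Hypothetical: the Avro input path is passed as the first argument.
    val fn = args(0)

    val spark = SparkSession.builder()
      .appName("avro-count-single-core") // hypothetical app name
      // One concurrent task per executor; must be set before the session is created.
      .config("spark.executor.cores", "1")
      .getOrCreate()

    // Same action as in the answer: load the Avro file, cache it, count the rows.
    val rows = spark.read.format("com.databricks.spark.avro").load(fn).cache().count()
    println(s"rows = $rows")

    spark.stop()
  }
}
```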
WARN TaskSetManager: Lost task 58.0 in stage 2.0 (TID 82, foo.com executor 10): java.lang.RuntimeException:
Error while encoding:
java.util.concurrent.ExecutionException:
java.lang.Exception: failed to compile: java.lang.NullPointerException
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */ return new SpecificUnsafeProjection(references);
/* 003 */ }