I am trying to write a DataFrame from Spark to Elasticsearch using a custom mapping ID. When I do, I get the following error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 14.0 failed 16 times, most recent failure: Lost task 0.15 in stage 14.0 (TID 860, ip-10-122-28-111.ec2.internal, executor 1): org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: [DataFrameFieldExtractor for field [[paraId]]] cannot extract value from entity [class java.lang.String] | instance
Below is the configuration I use to write to ES:
val config = Map(
  "es.nodes"              -> node,
  "es.port"               -> port,
  "es.clustername"        -> clustername,
  "es.net.http.auth.user" -> login,
  "es.net.http.auth.pass" -> password,
  "es.write.operation"    -> "upsert",
  "es.mapping.id"         -> "paraId",
  "es.resource"           -> "test/type")

df.saveToEs(config)
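For completeness, here is a minimal, self-contained sketch of the write path. The `saveToEs` extension method on a DataFrame comes from the implicit conversions in `org.elasticsearch.spark.sql`; the sample data and hosts below are placeholders, not my real input:

```scala
import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark.sql._ // brings saveToEs into scope on DataFrame

object EsUpsertExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("es-upsert").getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows; in my job the DataFrame comes from elsewhere.
    val df = Seq(("p1", "text one"), ("p2", "text two")).toDF("paraId", "body")

    val config = Map(
      "es.nodes"           -> "localhost",
      "es.port"            -> "9200",
      "es.write.operation" -> "upsert",
      "es.mapping.id"      -> "paraId", // must name a top-level column of df
      "es.resource"        -> "test/type")

    df.saveToEs(config)
  }
}
```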
I am using ES 5.6 and Spark 2.2.0. Please let me know if you have any insight into this.
Thanks!