I am running a PySpark job with spark-submit. It worked previously; I have executed this same job more than ten times. It is just a simple load of data from a CSV file into a Hive table, only about 500 records. Running the same command now fails with a vertex failure.
I am using the following spark-submit command:
spark-submit --num-executors 3 --executor-cores 3 --executor-memory 20g \
  --jars /usr/hdp/3.1.0.0-78/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar \
  --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.1.0.0-78.zip \
  main.py /user/hive/source    # last argument: source file location
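main.py itself is not shown. For context, a minimal sketch of what such a CSV-to-Hive load typically looks like with the Hive Warehouse Connector on HDP 3.1 (the database/table name and the CSV header option are assumptions, not taken from the question):

from pyspark.sql import SparkSession
from pyspark_llap import HiveWarehouseSession

spark = SparkSession.builder.appName("csv_to_hive_load").getOrCreate()

# Build an HWC session; this requires spark.sql.hive.hiveserver2.jdbc.url
# and the related HWC settings to be configured for the cluster.
hive = HiveWarehouseSession.session(spark).build()

# Read the CSV passed as the job argument (/user/hive/source above).
df = spark.read.option("header", "true").csv("/user/hive/source")

# Write into a Hive managed table through the connector.
# "db.target_table" is a placeholder, not from the question.
df.write.format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR) \
    .mode("append") \
    .option("table", "db.target_table") \
    .save()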
I am getting the following error message:
Error while processing statement: FAILED: Execution Error,
return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1,
vertexId=vertex_1599711935259_0207_17_00, diagnostics=[Task failed, taskId=task_1599711935259_0207_17_00_000000,
diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) :
Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0,
Vertex vertex_1500011935259_0207_17_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed,
vertexName=Reducer 2, vertexId=vertex_1500011935259_0207_17_00,
diagnostics=[Vertex received Kill while in RUNNING state.,
Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1,
Vertex vertex_1500011935259_0207_17_00 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
Can anyone suggest how to resolve this error?
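(A note on reading this trace: "return code 2 from TezTask" and OWN_TASK_FAILURE are generic wrappers; the actual root cause is the first exception in the failed task attempt's log, retrievable with yarn logs -applicationId application_1599711935259_0207, the application id being inferred here from the vertex id.)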
Answer 0 (score: 0)
This problem is solved; it turned out to be caused by a mismatch in the following parameter settings.
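The answer's actual parameter list did not survive in this copy of the page. As a hedged illustration only: with this setup, the usual mismatch is between the requested executor memory and YARN's maximum container size, or between the HWC connection settings and the cluster; every concrete value below is an assumption, not from the original answer.

from pyspark.sql import SparkSession

# Hypothetical values for illustration. The point is consistency:
# executor memory plus overhead must fit within
# yarn.scheduler.maximum-allocation-mb on the cluster, and the HWC
# JDBC URL must point at the actual HiveServer2 Interactive (LLAP)
# endpoint.
spark = (
    SparkSession.builder
    .appName("csv_to_hive_load")
    # 20g per executor (as in the question) can exceed the YARN
    # container cap on a small cluster; a smaller value may be needed.
    .config("spark.executor.memory", "4g")
    .config("spark.executor.memoryOverhead", "1g")
    # Placeholder host/port; must match the cluster's LLAP endpoint.
    .config("spark.sql.hive.hiveserver2.jdbc.url",
            "jdbc:hive2://llap-host:10500/default")
    .getOrCreate()
)

The same settings can equivalently be passed on the spark-submit line with --conf.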