Custom delimited text file from a DataFrame

Time: 2019-03-01 09:13:54

Tags: apache-spark dataframe hadoop apache-spark-sql

I am using Spark 1.6 and am trying to create a delimited file from a DataFrame.

The field delimiter is '|^', so I am concatenating the DataFrame's columns while selecting from a temp table.
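As an aside, Spark SQL's concat_ws builds the same delimited string with the separator given only once; a minimal sketch of that idea, using the BNUC_TEMP table and column names from the code below:

// Same delimited line via concat_ws (available since Spark 1.5).
// Note: concat_ws skips NULL arguments, whereas concat returns
// NULL if any argument is NULL, so missing columns lose a field
// rather than nulling the whole row.
val delimited = context.sql(
  """select concat_ws('|^',
      'VALID', RECORD_ID, DATA_COL1, DATA_COL2, 'P',
      DATA_COL4, DATA_COL5, concat(DATA_COL6, 'GBP'),
      from_unixtime(unix_timestamp(ACTION_DATE)),
      from_unixtime(unix_timestamp(UPDATED_DATE)))
    from BNUC_TEMP""")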

Now the code below fails every time with this error:

ERROR scheduler.TaskSetManager: Task 172 in stage 9.0 failed 4 times; aborting job
19/03/01 09:10:15 ERROR datasources.InsertIntoHadoopFsRelation: Aborting job.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 172 in stage 9.0 failed 4 times, most recent failure: Lost task 172.3 in stage 9.0 (TID 1397, tplhc01d104.iuser.iroot.adidom.com, executor 7): org.apache.spark.SparkException: Task failed while writing rows.
        at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:272)

The code I am using is this:

tempDF.registerTempTable("BNUC_TEMP")

context.sql("select concat('VALID','|^', RECORD_ID,'|^', DATA_COL1,'|^', DATA_COL2,'|^','P','|^', DATA_COL4,'|^', DATA_COL5,'|^', DATA_COL6,'GBP','|^',from_unixtime(unix_timestamp( ACTION_DATE)),'|^',from_unixtime(unix_timestamp( UPDATED_DATE))) from BNUC_TEMP")
.write.mode("overwrite")
.text("/user/USERNAME/landing/staging/BNU/temp/")

0 Answers:

No answers.