Getting LeaseExpiredException randomly in Spark Streaming

Date: 2018-07-02 14:05:14

Tags: apache-spark hadoop hdfs spark-streaming parquet

I have a Spark Streaming job (Spark 2.1.1 on Cloudera 5.12) with Kafka as input and HDFS as output (Parquet format). The problem is that I randomly get a LeaseExpiredException (not in every mini-batch):


    org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/qoe_fixe/data_tv/tmp/cleanData/_temporary/0/_temporary/attempt_20180629132202_0215_m_000000/year=2018/month=6/day=29/hour=11/source=LYO2/part-00000-c6f21a40-4088-4d97-ae0c-24fa463550ab.snappy.parquet (inode 135532024): File does not exist. Holder DFSClient_attempt_20180629132202_0215_m_000000_0_-1048963677_900 does not have any open files.

I am writing to HDFS using the Dataset API:

      if (!InputWithDatePartition.rdd.isEmpty()) InputWithDatePartition.repartition(1).write.partitionBy("year", "month", "day", "hour", "source").mode("append").parquet(cleanPath)
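
For context, a write like this runs once per micro-batch, so successive batches (or two jobs appending to the same cleanPath) each launch their own commit job against the same base path. Below is a hedged sketch, not the original code; writeBatch, events and the SparkSession setup are assumed names:

    import org.apache.spark.sql.{DataFrame, SparkSession}

    // Illustrative sketch only: each call starts its own Spark write job, and
    // every such job creates (and later cleans up) <cleanPath>/_temporary.
    val spark = SparkSession.builder().appName("kafka-to-parquet").getOrCreate()

    def writeBatch(events: DataFrame, cleanPath: String): Unit = {
      if (!events.rdd.isEmpty()) {
        events
          .repartition(1)
          .write
          .partitionBy("year", "month", "day", "hour", "source")
          .mode("append")
          .parquet(cleanPath)
      }
    }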

Because of this error, my job fails after a few hours.

1 Answer:

Answer 0 (score: 0):

Two jobs writing to the same directory share the same _temporary folder.

So when the first job completes, the following code (from the FileOutputCommitter class) is executed:

  public void cleanupJob(JobContext context) throws IOException {
    if (hasOutputPath()) {
      Path pendingJobAttemptsPath = getPendingJobAttemptsPath();
      FileSystem fs = pendingJobAttemptsPath
          .getFileSystem(context.getConfiguration());
      // if job allow repeatable commit and pendingJobAttemptsPath could be
      // deleted by previous AM, we should tolerate FileNotFoundException in
      // this case.
      try {
        fs.delete(pendingJobAttemptsPath, true);
      } catch (FileNotFoundException e) {
        if (!isCommitJobRepeatable(context)) {
          throw e;
        }
      }
    } else {
      LOG.warn("Output Path is null in cleanupJob()");
    }
  }

It deletes the pendingJobAttemptsPath (_temporary) while the second job is still running. This may help:

Multiple spark jobs appending parquet data to same base path with partitioning
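
One workaround discussed in the linked question is to build the concrete partition directory yourself and write into it directly, so that each job gets its own _temporary folder instead of sharing the one under the base path. A minimal sketch, assuming the partition values are known for the current batch ("batch" and the partition values, copied from the error message, are purely illustrative):

    // Hedged sketch of writing straight into the partition directory.
    val partitionPath = s"$cleanPath/year=2018/month=6/day=29/hour=11/source=LYO2"

    batch
      .drop("year", "month", "day", "hour", "source")  // the values are encoded in the path instead
      .repartition(1)
      .write
      .mode("append")
      .parquet(partitionPath)                          // _temporary is created under this directory only

When reading the data back, the partition columns can still be recovered by setting the basePath option, e.g. spark.read.option("basePath", cleanPath).parquet(partitionPath).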