Question

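This may be a silly question for Scala experts, but as a beginner I am having trouble working out a solution. Any pointers would help.

I have three files in an HDFS location, named as follows: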

fileFirst.dat
fileSecond.dat
fileThird.dat

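They are not guaranteed to be stored in any particular order. fileFirst.dat might be created last, so each ls can show the files in a different order.

My task is to merge all the files into a single file in this order: the contents of fileFirst, then fileSecond, and finally fileThird, with a newline as the separator and no spaces.

I have tried a few ideas but could not come up with anything that works. The merge order gets mixed up every time.

Below is my function that merges whatever is passed in: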

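  // i, fs, files, out, conf and addString are supplied by the enclosing function.
  def writeFile(): Unit = {
    val in: InputStream = fs.open(files(i).getPath)
    try {
      IOUtils.copyBytes(in, out, conf, false) // false: do not close the streams here
      if (addString != null) out.write(addString.getBytes("UTF-8"))
    } finally in.close()
  }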

files is defined as follows:

val files: Array[FileStatus] = fs.listStatus(srcPath)

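This is part of a larger function, to which I pass all the parameters used in this method. Once everything is done, I call out.close() to close the output stream.

Any ideas are welcome, even if they go against the file-writing logic I am attempting; just bear in mind that I am not very good at Scala, for now :)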

Answer 1

If you can enumerate the Paths directly, you do not need listStatus. You could try something like the following (untested):

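import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.IOUtils

// Listing the names explicitly fixes the merge order, whatever order HDFS reports.
val relativePaths = Array("fileFirst.dat", "fileSecond.dat", "fileThird.dat")
val paths = relativePaths.map(new Path(srcDirectory, _))
// Open each stream before its try block so it is still in scope in finally.
val output = fs.create(destinationFile)
try {
    for (path <- paths) {
        val input = fs.open(path)
        try {
            IOUtils.copyBytes(input, output, conf, false)
        } catch {
            case ex: Exception => throw ex // Feel free to do some error handling here
        } finally {
            input.close()
        }
    }
} catch {
    case ex: Exception => throw ex // Feel free to do some error handling here
} finally {
    output.close()
}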
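
If the listStatus call from the question is kept, the names it returns can be reordered against a fixed list before merging, and the newline separator the question asks for can be written between files. The following is only a sketch under assumptions: orderByName and mergeInOrder are hypothetical helper names, and in-memory java.io streams stand in for the HDFS files so that the example runs without a cluster.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8

object OrderedMerge {
  // Hypothetical helper: returns the names from `desired` that are present in
  // `listed`, in the order given by `desired`. Names not in `desired` are dropped.
  def orderByName(listed: Seq[String], desired: Seq[String]): Seq[String] =
    desired.filter(listed.contains)

  // Hypothetical helper: copies each source into `out` in sequence and writes
  // `separator` between consecutive sources (not after the last one).
  def mergeInOrder(names: Seq[String], open: String => InputStream,
                   out: OutputStream, separator: String): Unit = {
    val buffer = new Array[Byte](4096)
    names.zipWithIndex.foreach { case (name, i) =>
      if (i > 0) out.write(separator.getBytes(UTF_8))
      val in = open(name)
      try {
        var n = in.read(buffer)
        while (n != -1) {
          out.write(buffer, 0, n)
          n = in.read(buffer)
        }
      } finally in.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // In-memory stand-ins for the three HDFS files (assumed contents).
    val contents = Map(
      "fileFirst.dat" -> "first",
      "fileSecond.dat" -> "second",
      "fileThird.dat" -> "third")
    // A listing may come back in any order.
    val listed = Seq("fileThird.dat", "fileFirst.dat", "fileSecond.dat")
    val desired = Seq("fileFirst.dat", "fileSecond.dat", "fileThird.dat")
    val out = new ByteArrayOutputStream()
    mergeInOrder(orderByName(listed, desired),
      name => new ByteArrayInputStream(contents(name).getBytes(UTF_8)), out, "\n")
    println(out.toString("UTF-8"))
  }
}
```

On HDFS, listed would presumably come from files.map(_.getPath.getName), open would be name => fs.open(new Path(srcPath, name)), and out would be the stream returned by fs.create, since the Hadoop stream classes extend the java.io ones. If the source files already end with a newline, passing an empty separator avoids blank lines in the merged file.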

File merge logic: scala
